Consistency
What are consistency models in distributed systems, and why you should care about them.
Hi Friends,
Welcome to the 84th issue of the Polymathic Engineer newsletter.
This week, we discuss consistency, a complex yet critical topic in distributed systems.
Distributed systems involve multiple nodes working together, often across different geographic locations. This distribution is challenging because we need to ensure that all nodes have a consistent view of the data.
In this article, we will analyze different consistency models that define the rules and guarantees a system can provide regarding data consistency.
The outline will be as follows:
Consistency in Distributed Systems
Consistency models
Strong Consistency and Linearizability
Sequential Consistency
Casual Consistency
Eventual Consistency
Monotonic Reads
Read-Your-Writes
Monotonic Writes
Consistency in Distributed Systems
In distributed systems, consistency refers to the guarantee that all nodes see an up-to-date copy of the data at the same time.
When data is stored across multiple servers, it is important that a system gives the client guarantees in terms of how synced the data is. Consistency ensures that any read operation reflects the most recent write operation.
Ideally, in a strongly consistent system, all nodes can see the same data at the same time. However, consistency comes with a cost. For example, balancing consistency with availability is always challenging.
This trade-off is encapsulated in the CAP theorem, which states that a distributed system can only simultaneously guarantee two of these properties.
Consistency: Every read receives the most recent write.
Availability: Every request receives a response without guarantee that it contains the most recent write.
Partition Tolerance: The system continues to operate despite arbitrary message loss or failure of part of the system.
Since perfect consistency, availability, and partition tolerance are impossible, systems must make tradeoffs based on their specific use cases.
For example, a banking system might prioritize consistency over availability to ensure accurate transactions, whereas a social media platform might prioritize availability to enhance user experience.
Balancing consistency with throughput also requires making trade-offs. Achieving high consistency is only possible with coordination between nodes, which slows down the system.
For instance, waiting for acknowledgments from multiple nodes before confirming a write operation can reduce throughput.
Consistency models are the results of these tradeoffs.
Consistency Models
Consistency models define the rules and guarantees regarding the visibility and ordering of updates across a distributed system.
These models range from strong guarantees, where every node sees every operation in the same order, to weaker guarantees, where nodes may see write operations in different orders or not see some of them.
The following diagram shows the relationships between the common consistency models using arrows. For example, linearizability implies sequential consistency, and so on.