Weak Memory Concurrency: Sequential Consistency Fails on CPUs
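Sequential consistency, defined by Leslie Lamport, treats a concurrent program as if the instructions of all threads were interleaved into a single global order. Within each thread the instructions must appear in program order, but instructions from different threads may be interleaved arbitrarily. Consider two threads and two shared locations X and Y, both initially 0: the first thread writes X and then reads Y, while the second writes Y and then reads X. Under this model the two reads cannot both return the initial value 0, because whichever read comes last in the interleaving is preceded by the other thread’s write.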
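The claim can be checked by brute force. The sketch below is illustrative only: the names `THREAD_1`, `THREAD_2`, `interleavings`, `run`, and `sc_outcomes` are hypothetical and not from the video. It enumerates every interleaving that keeps each thread's program order and collects the pair of values the two reads return.

```python
# Each thread is a list of (operation, location) steps in program order.
THREAD_1 = [("write", "X"), ("read", "Y")]
THREAD_2 = [("write", "Y"), ("read", "X")]


def interleavings(a, b):
    """Yield every merge of a and b that preserves each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    memory = {"X": 0, "Y": 0}
    results = {}
    for thread_id, (op, location) in schedule:
        if op == "write":
            memory[location] = 1
        else:
            results[thread_id] = memory[location]
    return (results[1], results[2])


def sc_outcomes():
    """Collect (thread 1's read, thread 2's read) over all interleavings."""
    t1 = [(1, step) for step in THREAD_1]
    t2 = [(2, step) for step in THREAD_2]
    return {run(schedule) for schedule in interleavings(t1, t2)}


outcomes = sc_outcomes()
print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

Only six interleavings exist for this program, and none of them yields (0, 0).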
Reality of Modern Systems
Compilers reorder instructions to improve performance, and CPUs may execute later instructions while waiting for a cache miss. Both optimizations change the observable order of operations. Store buffers add a further layer: each core keeps a private FIFO queue for its writes. When a thread reads, it first checks its own buffer; if the address is present, the pending value is returned, otherwise the read accesses main memory. A write leaves the buffer only when it is propagated to main memory, which happens asynchronously, so other threads can temporarily see stale values. These mechanisms produce outcomes that violate sequential consistency, such as both threads reading 0 in the classic two-thread test (the “a = 0, b = 0” outcome), which can show up after a few hundred iterations.
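The store-buffer mechanism can be sketched as a toy model. This is a simplification under stated assumptions, not a description of any real CPU: the `Core` class, its method names, and the hand-picked schedule in `store_buffering_run` are all hypothetical. It shows one schedule in which both reads return 0 even though each core can already see its own write.

```python
from collections import deque


class Core:
    """One core with a private FIFO store buffer in front of shared memory."""

    def __init__(self, memory):
        self.memory = memory   # shared dict, stands in for main memory
        self.buffer = deque()  # pending (address, value) writes, oldest first

    def write(self, addr, value):
        # The write is only queued locally; other cores cannot see it yet.
        self.buffer.append((addr, value))

    def read(self, addr):
        # Check the core's own buffer first; the newest pending write wins.
        for pending_addr, pending_value in reversed(self.buffer):
            if pending_addr == addr:
                return pending_value
        return self.memory[addr]

    def flush_one(self):
        # Propagate the oldest pending write to main memory (FIFO order).
        if self.buffer:
            addr, value = self.buffer.popleft()
            self.memory[addr] = value


def store_buffering_run():
    memory = {"X": 0, "Y": 0}
    core1, core2 = Core(memory), Core(memory)
    core1.write("X", 1)    # buffered on core 1
    core2.write("Y", 1)    # buffered on core 2
    r1 = core1.read("Y")   # not in core 1's buffer, memory still holds 0
    r2 = core2.read("X")   # not in core 2's buffer, memory still holds 0
    own = core1.read("X")  # core 1 sees its own pending write
    core1.flush_one()      # the writes reach main memory only now
    core2.flush_one()
    return r1, r2, own, dict(memory)


r1, r2, own, final_memory = store_buffering_run()
print(r1, r2, own, final_memory)  # 0 0 1 {'X': 1, 'Y': 1}
```

The flushes are called explicitly here to keep the run deterministic; on real hardware the timing of propagation is outside the program's control.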
Advanced Memory Behaviors
Multi‑copy atomicity describes whether a write becomes visible to all other threads at the same time, so that every thread observes memory updates in the same order. x86 processors from Intel and AMD generally respect this property, while IBM Power architectures do not, which permits additional weak behaviors. ARM historically permitted such behaviors as well, but the more recent ARMv8 specification was strengthened to rule them out. Distributed systems often tolerate weak behaviors because a universal order of operations is unnecessary; for example, Facebook wall posts can appear in different orders on different replicas without breaking correctness.
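A toy model makes the difference concrete. The sketch below assumes a memory in which every observer thread holds its own copy of each location; `ReplicatedMemory`, `iriw_non_mca`, and `iriw_mca_outcomes` are hypothetical names, and the two-writer, two-reader program is the litmus test usually called IRIW (independent reads of independent writes), which the text above does not name. Each reader's two reads are assumed to happen in program order.

```python
from itertools import permutations


class ReplicatedMemory:
    """Toy memory where each observer thread has its own copy of every location."""

    def __init__(self, observers, locations):
        self.views = {o: {loc: 0 for loc in locations} for o in observers}

    def deliver(self, observer, loc, value):
        # Non-multi-copy-atomic step: the write reaches ONE observer only.
        self.views[observer][loc] = value

    def deliver_to_all(self, loc, value):
        # Multi-copy-atomic step: the write reaches every observer at once.
        for view in self.views.values():
            view[loc] = value

    def read(self, observer, loc):
        return self.views[observer][loc]


def iriw_non_mca():
    """One schedule where the two readers disagree on the order of the writes."""
    mem = ReplicatedMemory(["A", "B"], ["X", "Y"])
    mem.deliver("A", "X", 1)  # X = 1 reaches reader A first
    mem.deliver("B", "Y", 1)  # Y = 1 reaches reader B first
    a = (mem.read("A", "X"), mem.read("A", "Y"))  # A reads X, then Y
    b = (mem.read("B", "Y"), mem.read("B", "X"))  # B reads Y, then X
    mem.deliver("B", "X", 1)  # the remaining copies are updated later
    mem.deliver("A", "Y", 1)
    return a, b


def iriw_mca_outcomes():
    """All outcomes when each write is delivered to every observer at once."""
    events = ["wX", "wY", "A_rX", "A_rY", "B_rY", "B_rX"]
    outcomes = set()
    for order in permutations(events):
        # Keep only schedules that respect each reader's program order.
        if order.index("A_rX") > order.index("A_rY"):
            continue
        if order.index("B_rY") > order.index("B_rX"):
            continue
        mem = ReplicatedMemory(["A", "B"], ["X", "Y"])
        seen = {}
        for event in order:
            if event in ("wX", "wY"):
                mem.deliver_to_all(event[-1], 1)
            else:
                seen[event] = mem.read(event[0], event[-1])
        outcomes.add(((seen["A_rX"], seen["A_rY"]), (seen["B_rY"], seen["B_rX"])))
    return outcomes


disagreement = ((1, 0), (1, 0))  # A saw X before Y, B saw Y before X
print(iriw_non_mca() == disagreement)
print(disagreement in iriw_mca_outcomes())
```

In the per-observer delivery model, reader A concludes that X was written first while reader B concludes that Y was written first. When every write is delivered to all observers in one step, no schedule produces that disagreement.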
Abstraction Layers
At a high level, far up the abstraction chain, operating‑system and programming‑language designers aim to hide weak memory effects, letting developers write code as if sequential consistency holds. At lower levels (hardware, C, or Rust) those weak behaviors become visible, requiring careful reasoning or explicit synchronization primitives. The contrast mirrors a “classical vs. quantum” analogy: the macro view appears deterministic, while the micro view reveals probabilistic, non‑intuitive interactions.
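One way to picture what an explicit synchronization primitive buys is to add a fence to a toy store-buffer machine and explore every schedule. The sketch below is a model, not real synchronization code: the `explore` function and its state encoding are hypothetical, and the fence is assumed to block a core until its own store buffer is empty, which is a simplification of how full fences are commonly modeled on store-buffer machines.

```python
def explore(with_fence):
    """Return every (r1, r2) reachable in a toy two-core store-buffer machine."""

    def program(write_var, read_var):
        steps = [("write", write_var)]
        if with_fence:
            steps.append(("fence", None))
        steps.append(("read", read_var))
        return steps

    programs = (program("X", "Y"), program("Y", "X"))
    outcomes = set()
    seen = set()

    def visit(pcs, buffers, memory, regs):
        state = (pcs, buffers, memory, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pcs[t] == len(programs[t]) for t in (0, 1)):
            outcomes.add(regs)  # leftover buffered writes cannot change regs
            return
        for t in (0, 1):
            # Option 1: the hardware drains core t's oldest buffered write.
            if buffers[t]:
                (var, value), rest = buffers[t][0], buffers[t][1:]
                new_buffers = tuple(rest if i == t else buffers[i] for i in (0, 1))
                new_memory = tuple(sorted({**dict(memory), var: value}.items()))
                visit(pcs, new_buffers, new_memory, regs)
            # Option 2: core t executes its next instruction.
            if pcs[t] < len(programs[t]):
                op, var = programs[t][pcs[t]]
                new_pcs = tuple(pcs[i] + (1 if i == t else 0) for i in (0, 1))
                if op == "write":
                    new_buffers = tuple(
                        buffers[i] + ((var, 1),) if i == t else buffers[i]
                        for i in (0, 1)
                    )
                    visit(new_pcs, new_buffers, memory, regs)
                elif op == "fence":
                    if not buffers[t]:  # the fence waits for an empty buffer
                        visit(new_pcs, buffers, memory, regs)
                else:
                    # Reads check the core's own buffer before main memory.
                    pending = [v for (a, v) in buffers[t] if a == var]
                    value = pending[-1] if pending else dict(memory)[var]
                    new_regs = tuple(value if i == t else regs[i] for i in (0, 1))
                    visit(new_pcs, buffers, memory, new_regs)

    visit((0, 0), ((), ()), (("X", 0), ("Y", 0)), (None, None))
    return outcomes


print(sorted(explore(with_fence=False)))
print(sorted(explore(with_fence=True)))
```

Without the fence the model reaches all four outcomes, including (0, 0). With a fence between each core's write and its read, only the three sequentially consistent outcomes remain.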
Takeaways
- Sequential consistency requires each thread's instructions to execute in program order while allowing arbitrary interleaving across threads, so two threads that each write one location and then read the other's can never both read the initial value.
- Compilers and CPUs reorder instructions and use store buffers, which can temporarily hide writes from other threads and produce outcomes that violate sequential consistency.
- Store buffers operate as per‑core FIFO queues where a write is first placed locally, reads check the buffer before main memory, and propagation to memory occurs asynchronously.
- Multi‑copy atomicity (whether all threads see memory updates in the same order) is generally provided by x86 processors but not by IBM Power, and the ARM architecture has moved toward stronger guarantees.
- Weak memory behaviors are acceptable in distributed systems where a global ordering is unnecessary, but higher‑level software abstractions often hide these details from application programmers.
Frequently Asked Questions
How do store buffers cause weak memory behavior?
Store buffers hold writes in a private FIFO queue on each core. When a thread reads, it first checks its own buffer; if the address is present, the pending value is returned, otherwise the read goes to main memory. Because other cores cannot see the buffered write until it propagates, they may observe an older value, creating a weak behavior that breaks sequential consistency.
What is multi-copy atomicity and which processors guarantee it?
Multi‑copy atomicity means that a write becomes visible to every other core at the same time, so all threads observe memory updates in the same order. x86 processors from Intel and AMD generally provide this guarantee, while IBM Power CPUs do not. Older ARM designs allowed weaker ordering, though more recent ARM specifications have strengthened it.
Who is Computerphile on YouTube?
Computerphile is a YouTube channel that publishes videos on computer science topics.