Understanding the Visibility of Side-Effects

This entry is all about the Java Memory Model and the happens-before partial order it formalizes.

The official specification of the JMM is authoritative but unreadable. Fortunately, there are useful resources around that aim at making it more accessible:

Happens-Before

The JMM defines a partial order called happens-before on all actions in the program. Actions are reads and writes to variables, locks and unlocks of monitors etc. The rules for happens-before are as follows:

  • Program order rule. Each action in a thread happens-before every action in that thread that comes later in the program order.
  • Monitor lock rule. An unlock on a monitor lock happens-before every subsequent lock on that same monitor lock.
  • Volatile variable rule. A write to a volatile field happens-before every subsequent read of that same field.
  • Thread start rule. A call to Thread.start on a thread happens-before any other thread detects that thread has terminated, either by successfully return from Thread.join or by Thread.isAlive returning false.

The happens-before partial order provides clear guidance about what values are allowed to be returned when the data are read. In other words: it formalizes the visibility of side-effect across threads.

A read is allowed to return the value of a write 1) if that write is the last write to that variable before the read along some path in the happens-before order, or 2) if the write is not ordered with respect to that read in the happens-before order.

Instructions can be reordered as long as the happens-before order is preserved. The code below

t1 = 1;
t2 = 2;
t3 = t1+t2;
t4 = 2*t1;

can for instance be be reordered as follows

t2 = 2;
t1 = 1;
t4 = 2*t1;
t3 = t1+t2;

If two threads shared data without using any synchronization, writes of a thread are considered unordered with respect to the other thread; according to condition 2), the execution is not deterministic and the writes of a thread might or might not be visible to the other thread.

Let us consider the two threads below, and one given execution:

  T1        T2

s = s+1 

          s = s+1

s = s+1 

The second increment might or might not see the previous increment in T1. Similarly, the thrid increment might or might not see the side-effect of the second increment.

In improperly synchronized programs, a read will return the value of one the possible matching writes, according to conditions 1) and 2). This corresponds to data races. In properly synchronized programs, a read has one unique ancestor write.

  T1        T2

lock m
  |
s = s+1 
  |
unlock m  
          \  
          lock m
             |
          s = s+1
             |
          unlock m
         /
lock m
  |
s = s+1 
  |
unlock m                  

Implementing Happens-Before

Conditions 1) and 2) specify that side-effects might or might not be visible consistently to theads in improperly synchronized programs. They enable the implementation to not read the shared memory all the time, either using CPU caches or compiler optimization. The specification does however never speak of specific implementation choice like caching and optimizations.

For instance, the code

t2 = s+1
t3 = s*2

could be rewritten as by a compiler optimization

t = s
t2 = t+1
t3 = t*2

, where t caches the value of the read to s.

Typically, the compiler will emit memory barriers to flush the CPU caches. The JSR-133 Cookbook provides guidance about implementation issues.

Data Races

Failure to properly synchronize accesses to shared data is called a data race. Inversely, a program is data race free if all accesses to shared state are synchronized. But what does data race freedom exactly mean?

Is the program data race free if,

  1. accesses to shared data happen within critical section?
  2. shared data is declared volatile?
  3. accesses to shared data are never concurrent?

Issues about the memory model are typically discussed using the double-checked locking pattern, and the “pausable” thread pattern. They provide only partial answer to these questions.

Point 1 is true. If the accesses to shared data are protected by the same cricital section, there is no possible data race.

Point 2 is true also. If you define variables as volatile, the side effect will be made visible to the other thread and there will be no data race. But remember: data race freedom does not mean that the behavior is correct. Consider the trival example below:

counter = counter + 1;

Making the counter volatile won’t suffice to ensure all increments are recorded.

Point 3 holds, but requires some clarification of the underlying assumptions (Answers to my stackoverflow question failed be clear cut and authoritative). Let us consider the situation when multiple threads manipulate an object X that is plain data, but alternate their temporal execution (so that X is never accessed concurrently) via another object Y that rely on concurrency control (wait, notify, synchronize). Should fields of object X be volatile or not?

If we assume that threads are synchronized using at least one the concurrency control primitive — an not just temporal alternance thanks, say, to a well-time sleep statements — this implies the alternate acquisition of at least a lock m, as for instance in:

while(true) {
    synchronized(m) { wait(); }
    counter = counter + 1;
    synchronized(m) { notify(); }
}

There is a happens-before dependency between the increment statement and the release of the wait lock. By construction, when the lock is released, exactly one thread will acquire it and wait again. So exactly one increment will happen-after the release of the wait lock. By consequence the increment is guaranteed to see only the value of the previous increment–no need of volatile.

More

Leave a comment