2. Agenda
• Java Memory Model
• Thread Confinement
• Java Atomic API
• Immutable objects
• Memory consumption
3. Agenda
• Java Memory Model
1. Introduction
2. JMM attributes
3. JMM & final fields & aggressive reordering
4. Sequential Consistency
5. Conflicting and Data Races
6. Examples and References
• Thread Confinement
• Java Atomic API
• Immutable objects
• Memory consumption
4. Java Memory Model
• JMM - Chapter 4 (JSR-133 after August 2004)
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf
• JSR-133 FAQ http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html
• JVM Specification release 2 - Chapter 8
• JMM and synchronization by Doug Lea http://gee.cs.oswego.edu/dl/cpj/jmm.html
• Compare correct double-checked locking using volatile (Java 1.5 and greater)
with the broken idiom under the old JMM:
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
• How memory barriers (fences) for volatile, synchronized, and atomic conditionals
map to processor instructions in assembly:
http://www.infoq.com/articles/memory_barriers_jvm_concurrency
• The JSR-133 Cookbook for Compiler Writers by Doug Lea
5. Java Memory Model
• Do NOT read articles issued before August 2004:
at that time the JMM was broken;
• The new JMM was first officially introduced in
JDK 1.5 along with the final release of JSR-133;
• Unofficially (as a trial fix of the JMM) it appeared in JRE 1.4.2
6. Java Memory Model
Why Should I Care?
• Concurrency bugs are very difficult to debug
• They often don't appear in testing, waiting
instead until your program is run under heavy
load, and are hard to reproduce and trap.
• You are much better off spending the extra
effort ahead of time to ensure that your
program is properly synchronized; while this is
not easy, it's a lot easier than trying to debug
a badly synchronized application.
7. What is a memory model, anyway?
• processors have layered memory caches which speed up data access by
reducing the traffic to shared memory;
• (*) necessary and sufficient conditions for knowing that writes
done by other processors are visible to the current processor, and that
writes done by the current processor are visible to other processors;
• some processors have a strong memory model;
• processors with a weaker memory model require special memory-cache
instructions: flush, to make writes visible to other processors, or
invalidate, to see the writes made by other processors (dirty flag);
• memory barriers (cache-coherence instructions) are performed when lock
and unlock actions are taken;
• cache coherence, as a property of the memory cache, satisfies conditions
(*); additionally, instructions may be reordered (a processor performance
advantage) while keeping the same program semantics (instructions exist to
disable reordering).
8. Java Memory Model
Why do I need it?
• race conditions: what values two processors can
see when they examine the same memory location at
the same time;
• JMM describes what behaviors are legal in
multithreaded code, and how threads may
interact through the shared memory;
• describes the relationship between variables in a
program;
• defines the behavior of volatile, final, and
synchronized
9. Java Memory Model
Variable accessibility
• it is impossible for one thread to access
parameters or local variables of another thread;
• for Java programmer it does not matter whether
parameters and local variables are thought of as
residing in the shared main memory or in the
working memory of the thread that owns them;
• a write observed by a read is valid according to
certain rules;
• the execution result is predicted by the memory model.
10. Java Memory Model
Terminology
• while the processor has registers, stack, cache
(and shared or RAM), virtual memory, MMU;
• the JMM has
1. thread's working copy of variables (load/store);
2. thread's execution engine (use/assign/lock/unlock);
3. shared main memory (read/write).
Data transfer between the main memory and a
thread's working memory is loosely coupled.
12. Java Memory Model
Ordering
• Happens-Before is too weak (necessary but not sufficient constraint);
• Reordering is allowed within intra-thread actions (intra-thread semantics are preserved);
• Causality is Subtle, see chapter 6 in JSR-133;
• synchronization order is consistent with program order;
• Unsynchronized code before a lock cannot interleave behind the unlock, and vice-versa;
• An unlock action on monitor m synchronizes-with all subsequent lock actions on m;
• A non-volatile read observes a write to the same variable according to the happens-before rules;
• A volatile read returns the last write done before it in the synchronization order;
• Accesses to fields, array elements, and monitor locks cannot be freely reordered;
• Writes to final fields cannot be reordered past the end of the constructor;
• Thread#start() synchronizes-with the first action the started thread takes (default instance
values are visible!);
• Finalization always happens after new (the end of an object's constructor happens-before its finalizer);
• An interrupt happens-before another thread determines that it has been interrupted;
• All of a thread's actions happen-before another thread returns from join() on that thread.
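The start()/join() rules above can be sketched as a small program (the class and method names are illustrative, not from the slides). The plain int field needs no volatile: start() and join() already create the happens-before edges.

```java
public class StartJoinHB {
    static int data = 0; // plain field: ordering comes from start()/join(), not volatile

    public static int run() {
        data = 42;                                    // write before start()
        Thread t = new Thread(() -> data = data + 1); // sees 42: start() happens-before
                                                      // the thread's first action
        t.start();
        try {
            t.join();                                 // all actions of t happen-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return data;                                  // guaranteed to see 43
    }
}
```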
13. Java Memory Model
Visibility
• volatile, final, synchronized
• (JVM Spec 2.9.1) working copies need to be
reconciled with the master copies in the
shared main memory only at prescribed
synchronization points, namely, when objects
are locked or unlocked;
• Thread#sleep and Thread#yield do not have any
synchronization semantics: no memory flush
(Thread#join, by contrast, does establish happens-before, see Ordering)
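A minimal sketch of the volatile visibility guarantee (the class is hypothetical, not from the slides): without the volatile modifier, the reader loop might never observe the writer's update.

```java
public class StopFlag {
    private volatile boolean stopped = false; // volatile: writes become visible to readers

    public void stop() {
        stopped = true;                       // volatile write: reconciled with main memory
    }

    // Spin until the flag becomes visible, or give up after timeoutMs.
    public boolean awaitStop(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!stopped) {
            if (System.currentTimeMillis() > deadline) return false;
            Thread.yield();                   // yield has no synchronization semantics
        }
        return true;
    }
}
```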
14. Java Memory Model
Atomicity
• Volatile r/w of double and long are atomic
• Pre/post increment/decrement not atomic
• AtomicReferenceFieldUpdater
• java.util.concurrent.atomic.*
• Compare-And-Swap/Set
• native CAS(&ref, exp, new): returns the actual prior value
• Java compareAndSet(exp, new): returns boolean
• ABA problem
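The Java-style CAS(exp, new) returning a boolean is typically used in a retry loop; a sketch with java.util.concurrent.atomic.AtomicInteger (the helper class name is mine):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    // Unlike native CAS(&ref, exp, new), which yields the actual prior value,
    // Java's compareAndSet(exp, new) only reports success as a boolean,
    // so failed attempts re-fetch and retry.
    public static int casIncrement(AtomicInteger counter) {
        int prev;
        do {
            prev = counter.get();                         // fetch
        } while (!counter.compareAndSet(prev, prev + 1)); // retry if another thread won
        return prev + 1;
    }
}
```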
15. Remark on Final Fields
• There is no defined behavior if you want to
use JNI to change final fields (System.arraycopy)
• Aggressive reordering of reads
16. Java Memory Model
Sequential consistency
• Sequential consistency is a guarantee about visibility
and ordering in an execution;
• Within a sequentially consistent execution, all
totally ordered r/w actions (e.g. volatile) are
atomic and immediately visible to all threads;
• A program is correctly synchronized if and only
if all sequentially consistent executions are
free of data races.
17. Conflicting and Data Races
• Two accesses to (reads of or writes to) the
same variable are said to be conflicting if at
least one of the accesses is a write;
• When a program contains two conflicting
accesses that are not ordered by
a happens-before relationship, it is said to
contain a data race.
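To make the definition concrete, here is a hypothetical counter with two conflicting writes: the unsynchronized variant is a data race, while the synchronized one orders the accesses through the monitor's happens-before edge.

```java
public class RaceDemo {
    private int count = 0;                   // accessed by two threads: conflicting accesses
    private final Object lock = new Object();

    public void incrementUnsafe() { count++; }     // data race: no happens-before ordering

    public void incrementSafe() {
        synchronized (lock) { count++; }           // unlock happens-before the next lock
    }

    public int get() {
        synchronized (lock) { return count; }
    }

    // Two threads each perform perThread guarded increments: the total is exact.
    public static int runSafe(int perThread) {
        RaceDemo d = new RaceDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) d.incrementSafe();
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return d.get();
    }
}
```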
18. Example (two problems)
• Final instance field with HashMap.
• Two methods, namely m1 and m2, locked by
lock1 and lock2, respectively.
• Both methods read/use the field of HashMap, do
not modify the reference held by the field, but
modify the data structure of the map.
• 1st problem: the map is guarded by two different locks, lock1 and lock2;
• 2nd problem: final field publication from the constructor;
• Constructor-to-method conflicts: wrong synchronization
and broken sequential ordering.
19. JVM Spec 2nd Edition
• JVM Spec Chapters 2.19 and 8 about Threads
• JVM Spec Chapter 8.7 about Rules for volatile
Variables
Reference
http://docs.oracle.com/javase/specs/jvms/se5.0/html/VMSpecTOC.doc.html
20. Agenda
• Java Memory Model
• Thread Confinement
1. Introduction
2. ad hoc thread confinement
3. stack confinement
4. using class ThreadLocal
5. Examples
• Java Atomic API
• Immutable objects
• Memory consumption
21. Various ways to confine an object to
only a single thread
• ad hoc thread confinement
• stack confinement
• using class ThreadLocal
22. Before confining an object to a single
thread
Accessing shared mutable data
between threads requires thread
synchronization, including at
object creation.
One way to avoid synchronization is
not to share.
23. Thread Confinement
One of the simplest ways to achieve thread safety
is to confine an object to only one thread.
Data accessible from a single thread does not need any
synchronization.
! Make sure that the object is created and
accessed by the same thread instance !
24. Thread Confinement
It is the programmer's responsibility to ensure that
thread-confined objects do not escape from their
intended thread.
There are no language mechanisms to enforce that a
variable is guarded by a lock.
No language features, such as visibility modifiers or
local variables, help confine the object to the
target thread.
Therefore we use an informative annotation:
@GuardedBy(threads, lock)
25. Thread Confinement
A thread pool (of tasks, e.g. the EDT) confines an
individual object to the single thread of the
thread pool.
(Sharing/dispatching events across threads in
the thread pool does not confine event objects
to a single thread, so they have to be
implemented thread-safe; in the simplest case
as immutable event objects.)
27. Ad-hoc Thread Confinement
Simplicity of thread confinement outweighs the
fragility of ad-hoc thread confinement.
As an example, a special case with a shared volatile variable:
• Written only by a single thread;
• Other threads see the most up-to-date value
(no synchronization of functionality, no happens-before
relationship between the functions that read this shared
volatile variable across threads);
• No write contention;
• Data races are prevented by confining modifications to the
single thread.
28. Ad-hoc Thread Confinement
Many readers, one writer thread scenario.
volatile boolean isOperating, isBroken, isChanging;

! Readers are not synchronized with the writer function as a whole, so they
can observe a stale or mixed combination of flags (an ABA-like problem) !

Writer thread:
  function {
    <process data>
    synchronized(lock) {
      isOperating = …;
      isBroken = …;
      isChanging = …;
    }
  }

Reader threads (observe the status of the processed data, but the returned
value belongs to the time when the flags were updated):
  synchronized(lock) return isOperating;
  synchronized(lock) return isBroken;
  synchronized(lock) return isChanging;
29. Ad-hoc Thread Confinement
The previous problem can be simplified by combining stack
confinement and merging the variables into one variable updated
as a single atomic operation.

! The ABA problem remains because the writer function is not synchronized with the readers !

volatile int status;

Writer thread:
  function {
    <process data>
    boolean isOperating = …;   // stack-confined locals
    boolean isBroken = …;
    boolean isChanging = …;
    status = (isOperating ? 1 : 0) | (isBroken ? 2 : 0) | (isChanging ? 4 : 0); // all at once
  }

Reader threads (non-blocking; the returned value belongs to the time when
the flags were updated):
  return (status & 1) == 1;
  return (status & 2) == 2;
  return (status & 4) == 4;
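The bit-packing sketch above can be written as a compilable class (the class name is mine; the single-writer discipline is assumed, not enforced by the code):

```java
public class Status {
    // Written by a single writer thread only; readers only read
    // (ad-hoc confinement of writes).
    private volatile int status;

    // Writer: publish all three flags in one atomic volatile write.
    public void update(boolean isOperating, boolean isBroken, boolean isChanging) {
        status = (isOperating ? 1 : 0)   // parentheses matter: '|' binds tighter than '?:'
               | (isBroken    ? 2 : 0)
               | (isChanging  ? 4 : 0);
    }

    // Readers: non-blocking, but each value is a snapshot of the last completed update.
    public boolean isOperating() { return (status & 1) == 1; }
    public boolean isBroken()    { return (status & 2) == 2; }
    public boolean isChanging()  { return (status & 4) == 4; }
}
```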
30. ABA problem
• The writer thread processed data A, updated the status
information, and a reader observed it;
• The writer thread processed data B and was then unscheduled,
leaving the old status information;
• The reader reads the status of data A, although the real data
already holds the new state of B.
To avoid this problem, one needs either proper
synchronization, because the data and its status form one
critical section, or to inject the status information into the data
object.
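A standard mitigation is java.util.concurrent.atomic.AtomicStampedReference, which pairs the value with a stamp so an A -> B -> A sequence is still detected (the wrapper method below is a hypothetical sketch):

```java
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // Even if the reference goes A -> B -> A, the stamp keeps increasing,
    // so a stale CAS attempt fails.
    public static boolean swapIfUnchanged(AtomicStampedReference<String> ref,
                                          String expected, String next) {
        int[] stampHolder = new int[1];
        String current = ref.get(stampHolder);        // read value and stamp together
        if (!expected.equals(current)) return false;
        return ref.compareAndSet(current, next,
                                 stampHolder[0], stampHolder[0] + 1);
    }
}
```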
31. Synchronization, Reentrant*Lock, groups of threads and ThreadGroup
• A ThreadGroup is inherited when the first thread
creates a second thread
• Typical synchronization is under the ThreadGroup
hierarchy
• ReentrantLocks have their own queues of threads by
using Condition: this substitutes for using ThreadGroup
• ReentrantLocks are reentrant: a lock nested N times in
the current thread does not block itself; the thread
remains the single owner of the monitor.
32. Stack Confinement
Thread confinement with local variable y and parameter i, but
not for x.
class XYZ {
    volatile int x = 1;      // shared field: NOT stack-confined
    void method(int i) {     // i is confined to this thread's stack
        int y = x;           // y is confined to this thread's stack
        y *= i;
        if (y < 0) return;
        x = y;               // publishing back to the shared field
    }
}
33. Confinement via ThreadLocal
ThreadLocal associates a per-thread value
with a value-holding object. ThreadLocal
provides get and set accessor methods that
maintain a separate copy of the value for each
thread that uses it, so a get returns the most
recent value passed to set from the currently
executing thread.
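A small sketch of per-thread state via ThreadLocal (the class is illustrative, not from the slides): each thread sees its own copy, initialized lazily on first get().

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadIds {
    private static final AtomicInteger next = new AtomicInteger(0);

    // withInitial supplies the per-thread initial value on the first get();
    // a later set() would override it for the calling thread only.
    private static final ThreadLocal<Integer> id =
            ThreadLocal.withInitial(next::getAndIncrement);

    public static int currentId() {
        return id.get();   // stable for a given thread across repeated calls
    }
}
```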
34. Porting single-threaded application
to a multithreaded environment
Preserve thread safety by converting shared
global variables into ThreadLocals, if the
semantics of the shared global variables
permits this.
An application-wide cache is not useful if it
is turned into a number of thread-local caches.
Use one ThreadLocal for all the global variables!
35. Porting single-threaded application
to a multithreaded environment via ThreadLocal
• A ThreadLocal instance is held via a WeakReference in an
internal map of the applicable thread
• The life cycle of a ThreadLocal instance is
associated with the life cycle of the threads holding it
• The cost of operations on ThreadLocals depends on
their quantity, as with hash maps
• #set() operations are slower than #get()
36. Agenda
• Java Memory Model
• Thread Confinement
• Java Atomic API
• Immutable objects
• Memory consumption
41. Java Object Memory Consumption
The JRE consumes little memory per object.
The main thing is to develop the application so
that it is able to release a reference to a whole
module, and to minimize the lifecycle of objects.
Young object generations have little impact on
GC: many short-lived objects do not pause the
application longer than GC-ing many old object
generations.
42. Agenda
• Java Memory Model
• Thread Confinement
• Java Atomic API
• Immutable objects
• Memory consumption
43. Atomic operations
Atomic operations are single, indivisible operations.
Examples of non-atomic operations on volatile variables:
• pre-increment
• pre-decrement
• post-increment
• post-decrement
Each of these is actually three divisible operations:
fetch-modify-write.
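The fetch-modify-write split is why `count++` loses updates under contention, while AtomicInteger#incrementAndGet performs the three steps as one CAS retry loop (the class name is mine):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CounterDemo {
    static volatile int racyCount = 0;                 // racyCount++ = fetch, modify, write
    static final AtomicInteger safeCount = new AtomicInteger(0);

    public static int run(int threads, int perThread) {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    racyCount++;                       // volatile, but still not atomic
                    safeCount.incrementAndGet();       // atomic CAS retry loop
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return safeCount.get();                        // exactly threads * perThread
    }
}
```

After a run, racyCount may end up below threads * perThread; only safeCount is reliable.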