A comprehensive reference covering the Java Memory Model, thread lifecycle, classic and modern concurrency primitives, virtual threads, and production-grade patterns.
The JMM defines the rules by which threads interact through memory. It specifies when writes by one thread become visible to reads in another thread, preventing subtle bugs that arise from CPU caches, out-of-order execution, and compiler optimisations.
A guarantee that memory writes of a specific statement are observable to another statement. The ordering relation that makes visibility predictable.
Without synchronisation, a thread may see stale cached values. volatile and locks force a flush-and-refresh of the CPU's write buffers.
64-bit reads/writes (long, double) are NOT atomic on 32-bit JVMs without volatile. Always use volatile or atomics for shared primitives.
The JVM and CPU can reorder instructions as long as single-thread semantics hold. Barriers (volatile, locks) prevent cross-thread reordering hazards.
volatile, synchronized, an Atomic* wrapper, or a happens-before relationship established by a lock/thread-start/thread-join.
/** * volatile guarantees: * 1. Visibility: writes are immediately flushed to main memory. * 2. Atomicity of single read/write (NOT compound read-modify-write). * 3. Prevents reordering across the volatile access (memory fence). */ public class VolatileFlag { private volatile boolean running = true; // fence on each access public void worker() { while (running) { // always reads fresh value from memory doWork(); } System.out.println("Worker stopped cleanly."); } public void stop() { running = false; // write flushed; visible to worker thread } // ❌ WRONG — volatile does NOT protect compound ops: private volatile int counter = 0; public void unsafeIncrement() { counter++; } // read-modify-write: NOT atomic! // ✅ Use AtomicInteger instead for compound operations }
| State | Description | Transition From |
|---|---|---|
| NEW | Created but not yet started | — |
| RUNNABLE | Executing or ready to execute on CPU | start(), unblocked |
| BLOCKED | Waiting to acquire a monitor lock | synchronized entry |
| WAITING | Indefinitely waiting (wait, join, park) | Object.wait(), LockSupport.park() |
| TIMED_WAITING | Waiting with a timeout | sleep(), wait(ms), join(ms) |
| TERMINATED | run() returned or threw | completion or uncaught exception |
public class InterruptibleWorker implements Runnable { @Override public void run() { try { while (!Thread.currentThread().isInterrupted()) { processChunk(); // sleep() clears the interrupt flag and throws InterruptedException Thread.sleep(100); } } catch (InterruptedException e) { // Restore interrupt status — never silently swallow! Thread.currentThread().interrupt(); cleanup(); } } // Blocking I/O that respects interruption private void processChunk() throws InterruptedException { if (Thread.currentThread().isInterrupted()) throw new InterruptedException("Task cancelled by caller"); // ... actual work ... } }
catch (InterruptedException e) { /* ignore */ } — This silently swallows the signal. Always restore the interrupt flag with Thread.currentThread().interrupt() unless you are definitively terminating the thread.
Java's intrinsic locks (synchronized) use the object's monitor. Every Java object carries a monitor; synchronized methods lock this; static methods lock the Class object.
public class BoundedBuffer<T> { private final Object[] items; private int count, putIdx, takeIdx; public BoundedBuffer(int capacity) { items = new Object[capacity]; } // wait/notify: canonical producer-consumer on a single condition public synchronized void put(T item) throws InterruptedException { while (count == items.length) // ALWAYS loop — guard against spurious wakeups wait(); items[putIdx] = item; putIdx = (putIdx + 1) % items.length; ++count; notifyAll(); // wake all waiting consumers } @SuppressWarnings("unchecked") public synchronized T take() throws InterruptedException { while (count == 0) // loop — not if! wait(); T item = (T) items[takeIdx]; items[takeIdx] = null; // help GC takeIdx = (takeIdx + 1) % items.length; --count; notifyAll(); return item; } }
while (never if) around wait(). Spurious wakeups can occur on any JVM platform per the POSIX spec. The loop re-checks the condition after every wake.
/** * Safe DCL requires volatile on the field (Java 5+). * volatile prevents partially-constructed object from being observed * because it establishes a happens-before for the write to instance. */ public class SafeSingleton { private static volatile SafeSingleton instance; private SafeSingleton() { } public static SafeSingleton getInstance() { if (instance == null) { // first check (no lock) synchronized (SafeSingleton.class) { if (instance == null) // second check (under lock) instance = new SafeSingleton(); } } return instance; } // ✅ Prefer: Initialization-on-demand holder (zero synchronisation overhead) private static class Holder { static final SafeSingleton INSTANCE = new SafeSingleton(); } public static SafeSingleton getInstanceIdiomatic() { return Holder.INSTANCE; // class-loading guarantees safe publish } }
java.util.concurrent.locks.ReentrantLock offers more flexibility than intrinsic locks: timed acquisition, interruptible acquisition, fairness policy, and multiple associated Condition objects.
import java.util.concurrent.locks.*; public class ExplicitLockBuffer<T> { private final ReentrantLock lock = new ReentrantLock(); private final Condition notFull = lock.newCondition(); private final Condition notEmpty = lock.newCondition(); private final Object[] items; private int count, putIdx, takeIdx; public ExplicitLockBuffer(int cap) { items = new Object[cap]; } public void put(T item) throws InterruptedException { lock.lock(); try { while (count == items.length) notFull.await(); items[putIdx] = item; putIdx = (putIdx + 1) % items.length; ++count; notEmpty.signal(); // wake exactly ONE consumer } finally { lock.unlock(); // ALWAYS in finally } } // ReadWriteLock — many readers, exclusive writer private final ReadWriteLock rwLock = new ReentrantReadWriteLock(); private final Lock readLock = rwLock.readLock(); private final Lock writeLock = rwLock.writeLock(); private Map<String,String> cache = new HashMap<>(); public String get(String key) { readLock.lock(); try { return cache.get(key); } finally { readLock.unlock(); } } public void put(String k, String v) { writeLock.lock(); try { cache.put(k, v); } finally { writeLock.unlock(); } } }
import java.util.concurrent.locks.StampedLock; public class Point { private double x, y; private final StampedLock sl = new StampedLock(); public double distanceFromOrigin() { long stamp = sl.tryOptimisticRead(); // no lock acquired double cx = x, cy = y; if (!sl.validate(stamp)) { // check for concurrent write stamp = sl.readLock(); // fall back to real read lock try { cx = x; cy = y; } finally { sl.unlockRead(stamp); } } return Math.sqrt(cx * cx + cy * cy); } public void move(double dx, double dy) { long stamp = sl.writeLock(); try { x += dx; y += dy; } finally { sl.unlockWrite(stamp); } } }
The java.util.concurrent.atomic package uses CPU-native Compare-And-Swap (CAS) instructions, eliminating OS scheduling overhead for many concurrent operations.
import java.util.concurrent.atomic.*; public class AtomicPatterns { // ── Basic counter ────────────────────────────────────────── private final AtomicLong counter = new AtomicLong(0); public long nextId() { return counter.incrementAndGet(); } public long addAndGetStat(long delta) { return counter.addAndGet(delta); } // ── CAS spin loop — optimistic update ───────────────────── private final AtomicInteger max = new AtomicInteger(0); public void updateMax(int candidate) { int cur; do { cur = max.get(); if (candidate <= cur) return; } while (!max.compareAndSet(cur, candidate)); // retry if lost race } // ── AtomicReference — immutable snapshot swap ────────────── record State(List<String> items, long version) {} private final AtomicReference<State> stateRef = new AtomicReference<>(new State(List.of(), 0)); public void addItem(String item) { stateRef.updateAndGet(old -> { var newItems = new ArrayList<>(old.items()); newItems.add(item); return new State(List.copyOf(newItems), old.version() + 1); }); } // ── LongAdder — high-throughput counter (Java 8+) ───────── // Outperforms AtomicLong under high contention by striping cells private final LongAdder hits = new LongAdder(); public void recordHit() { hits.increment(); } public long totalHits() { return hits.sum(); } // only accurate at quiescence }
| Class | Internals | Best Use |
|---|---|---|
ConcurrentHashMap | Segment/bin CAS (Java 8+) | High-concurrency key-value store |
CopyOnWriteArrayList | Snapshot on write | Read-heavy, rare writes |
LinkedBlockingQueue | Two-lock linked list | Unbounded producer-consumer |
ArrayBlockingQueue | Single lock + conditions | Bounded, back-pressure |
PriorityBlockingQueue | Heap + single lock | Priority scheduling |
SynchronousQueue | Rendezvous transfer | Direct hand-off (no capacity) |
DelayQueue | PQ + delay expiry | Scheduled tasks, TTL caches |
ConcurrentLinkedQueue | Lock-free Michael-Scott queue | Non-blocking FIFO |
ConcurrentHashMap<String, AtomicInteger> freq = new ConcurrentHashMap<>(); // computeIfAbsent is atomic — safe for lazy initialisation public void recordWord(String word) { freq.computeIfAbsent(word, k -> new AtomicInteger(0)) .incrementAndGet(); } // merge — atomic read-modify-write without explicit locking ConcurrentHashMap<String, Long> scores = new ConcurrentHashMap<>(); public void addScore(String user, long points) { scores.merge(user, points, Long::sum); // atomic accumulate } // Parallel aggregation via streams on ConcurrentHashMap long total = scores.reduceValuesToLong(1, Long::longValue, 0, Long::sum); // CopyOnWriteArrayList — event listener registry private final List<Runnable> listeners = new CopyOnWriteArrayList<>(); public void addListener(Runnable r) { listeners.add(r); } public void fireAll() { listeners.forEach(Runnable::run); } // snapshot
Raw Thread creation is expensive. Thread pools amortise the cost and bound resource usage. ThreadPoolExecutor is the central tunable implementation.
import java.util.concurrent.*; // ── Tunable pool ────────────────────────────────────────────── ThreadPoolExecutor pool = new ThreadPoolExecutor( 4, // corePoolSize 16, // maximumPoolSize 60, TimeUnit.SECONDS, // keepAliveTime new LinkedBlockingQueue<>(1000), // bounded work queue new ThreadFactory() { private final AtomicInteger n = new AtomicInteger(); public Thread newThread(Runnable r) { Thread t = new Thread(r, "worker-" + n.getAndIncrement()); t.setDaemon(true); // don't prevent JVM shutdown return t; } }, new ThreadPoolExecutor.CallerRunsPolicy() // back-pressure: caller executes ); // ── Scheduled execution ────────────────────────────────────── ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2); scheduler.scheduleAtFixedRate( () -> publishMetrics(), 0, 30, TimeUnit.SECONDS // fixed-rate: next run = start + period ); scheduler.scheduleWithFixedDelay( () -> pollExternalService(), 0, 5, TimeUnit.SECONDS // fixed-delay: next run = end + delay ); // ── Graceful shutdown ──────────────────────────────────────── pool.shutdown(); // no new tasks; drain queue if (!pool.awaitTermination(30, TimeUnit.SECONDS)) { pool.shutdownNow(); // interrupt running tasks if (!pool.awaitTermination(5, TimeUnit.SECONDS)) log.error("Pool did not terminate"); }
Queue full → caller thread executes task. Natural back-pressure; slows producers automatically.
Queue full → throw RejectedExecutionException. Caller must handle or circuit-break.
Queue full → discard the head (oldest unstarted task) and retry submission.
Silently drops new submissions. Use only when loss is acceptable (metrics, fire-and-forget logs).
import java.util.concurrent.*; public class OrderService { // ── Linear async pipeline ───────────────────────────────── public CompletableFuture<Invoice> processOrder(Order order) { return CompletableFuture .supplyAsync(() -> validateOrder(order), ioPool) // async on I/O pool .thenApplyAsync(valid -> chargePayment(valid), ioPool) .thenApplyAsync(paid -> fulfil(paid), cpuPool) .thenApply(Invoice::from) // same thread .exceptionally(ex -> Invoice.failed(ex.getMessage())); } // ── Fan-out / fan-in ────────────────────────────────────── public CompletableFuture<AggregatedResult> fanOut(List<String> ids) { List<CompletableFuture<Data>> futures = ids.stream() .map(id -> CompletableFuture.supplyAsync(() -> fetch(id), ioPool)) .toList(); return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])) .thenApply(_ -> futures.stream() .map(CompletableFuture::join) // all done — join is safe .collect(collectingAndThen(toList(), AggregatedResult::new)) ); } // ── Race: first non-null response wins ───────────────────── public CompletableFuture<String> firstAvailable() { return CompletableFuture.anyOf( CompletableFuture.supplyAsync(() -> callRegionA(), ioPool), CompletableFuture.supplyAsync(() -> callRegionB(), ioPool) ).thenApply(o -> (String) o); } // ── Timeout (Java 9+) ───────────────────────────────────── public CompletableFuture<Data> withTimeout() { return CompletableFuture.supplyAsync(this::slowOperation, ioPool) .orTimeout(5, TimeUnit.SECONDS) .exceptionally(ex -> Data.FALLBACK); } }
Fork/Join uses work-stealing: idle threads steal tasks from the back of other threads' deques, keeping CPUs saturated. Designed for divide-and-conquer CPU-bound tasks.
import java.util.concurrent.*; public class ParallelMergeSort extends RecursiveAction { private static final int THRESHOLD = 1024; private final int[] array; private final int lo, hi; public ParallelMergeSort(int[] a, int lo, int hi) { this.array = a; this.lo = lo; this.hi = hi; } @Override protected void compute() { if (hi - lo < THRESHOLD) { // base case: sequential sort Arrays.sort(array, lo, hi); return; } int mid = (lo + hi) >>> 1; var left = new ParallelMergeSort(array, lo, mid); var right = new ParallelMergeSort(array, mid, hi); left.fork(); // push left to deque right.compute(); // run right in-thread (avoid overhead) left.join(); // steal-compatible wait merge(array, lo, mid, hi); } // Usage: ForkJoinPool.commonPool().invoke(new ParallelMergeSort(arr, 0, arr.length)) // Or custom pool: new ForkJoinPool(Runtime.getRuntime().availableProcessors()) }
Virtual threads are lightweight JVM-managed threads. Millions can exist simultaneously because they unmount from their carrier (OS) thread during blocking operations — no OS thread is pinned.
import java.util.concurrent.*; public class VirtualThreadDemo { // ── Spawn 1 million virtual threads ────────────────────── public static void millionRequests() throws Exception { try (var executor = Executors.newVirtualThreadPerTaskExecutor()) { IntStream.range(0, 1_000_000).forEach(i -> executor.submit(() -> { Thread.sleep(Duration.ofMillis(100)); // mounts; unblocks cheaply return i; }) ); } // auto-shutdown + await } // ── Structured Concurrency (Java 21 preview) ────────────── public Response handleRequest(String id) throws Exception { try (var scope = new StructuredTaskScope.ShutdownOnFailure()) { var user = scope.fork(() -> fetchUser(id)); var order = scope.fork(() -> fetchOrder(id)); scope.join().throwIfFailed(); // cancel both if either fails return new Response(user.get(), order.get()); } } // ── Pinning pitfall: avoid synchronized inside virtual thread ── // synchronized blocks pin the carrier thread — use ReentrantLock instead private final ReentrantLock lock = new ReentrantLock(); public void safeVirtualWork() { lock.lock(); try { doBlockingIo(); } // carrier can unmount here ✅ finally { lock.unlock(); } } }
ForkJoinPool tuned to available processors. Virtual threads do not improve CPU-bound throughput.
public class Pipeline<T> { private final BlockingQueue<T> queue; private volatile boolean running = true; public Pipeline(int capacity) { this.queue = new ArrayBlockingQueue<>(capacity); // bounded = back-pressure } public void produce(T item) throws InterruptedException { queue.put(item); // BLOCKS when full — slows producer naturally } public void startConsumers(int n, Consumer<T> fn) { ExecutorService pool = Executors.newFixedThreadPool(n); for (int i = 0; i < n; i++) { pool.execute(() -> { while (running || !queue.isEmpty()) { try { T item = queue.poll(200, TimeUnit.MILLISECONDS); if (item != null) fn.accept(item); } catch (InterruptedException e) { Thread.currentThread().interrupt(); break; } } }); } } }
public class Memoizer<K, V> { private final ConcurrentHashMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>(); private final Function<K, V> loader; public Memoizer(Function<K, V> loader) { this.loader = loader; } public V get(K key) throws Exception { return cache.computeIfAbsent(key, k -> CompletableFuture.supplyAsync(() -> loader.apply(k)) ).get(); // computeIfAbsent is atomic — only one future per key is created // All concurrent requests for same key await the SAME future } }
/** * Striped locking: instead of one global lock, use N locks keyed by hash. * Reduces contention by factor N with no correctness change. * Google Guava: Striped<Lock> / Striped<ReadWriteLock> */ public class StripedCounter { private static final int STRIPES = 64; private final ReentrantLock[] locks = new ReentrantLock[STRIPES]; private final long[] counts = new long[STRIPES]; public StripedCounter() { Arrays.setAll(locks, _ -> new ReentrantLock()); } private int stripe(Object key) { return (key.hashCode() & 0x7FFF_FFFF) % STRIPES; } public void increment(String key) { int s = stripe(key); locks[s].lock(); try { counts[s]++; } finally { locks[s].unlock(); } } public long total() { long sum = 0; for (long c : counts) sum += c; return sum; } }
// ❌ Deadlock: T1 holds A, waits for B; T2 holds B, waits for A void transfer1(Account from, Account to) { synchronized(from) { synchronized(to) { move(from, to); } } } void transfer2(Account from, Account to) { synchronized(to) { synchronized(from) { move(from, to); } } } // ✅ Fix: consistent lock ordering by account ID void transferSafe(Account a, Account b, long amount) { Account first = a.id() < b.id() ? a : b; Account second = a.id() < b.id() ? b : a; synchronized(first) { synchronized(second) { first.debit(amount); second.credit(amount); } } } // ✅ Alternative: tryLock with back-off boolean transferTryLock(ReentrantLock l1, ReentrantLock l2, long amount) throws InterruptedException { while (true) { if (l1.tryLock(10, TimeUnit.MILLISECONDS)) { try { if (l2.tryLock(10, TimeUnit.MILLISECONDS)) { try { move(); return true; } finally { l2.unlock(); } } } finally { l1.unlock(); } } Thread.sleep(random(1,5)); // randomised back-off } }
Livelock — Threads respond to each other but make no progress. Solution: randomised back-off or priority ordering.
Starvation — Low-priority threads never get CPU. Use fair locks (new ReentrantLock(true)) or bounded waiting queues.
False Sharing — Two fields on the same cache line ping-pong between CPU caches. Pad fields with @Contended (JVM flag: -XX:-RestrictContended).
ThreadLocal Leaks — In thread pools, ThreadLocal values outlive the task. Always call remove() in a finally block.
Missed Signal — notify() sent before wait() is lost forever. Always check state condition under the lock before waiting.
# Thread dump — analyse BLOCKED/WAITING stacks for deadlock chains jstack <pid> jcmd <pid> Thread.print -l # with locked monitors # Flight Recorder — low-overhead production profiling jcmd <pid> JFR.start duration=60s filename=recording.jfr jfr print --events jdk.VirtualThreadPinned recording.jfr # Check for virtual thread pinning events -Djdk.tracePinnedThreads=full # JVM flag: prints pinning stack traces # GC impact on locks (STW pauses affect timed waits) -Xlog:gc*:file=gc.log:time,uptime # Detect data races with ThreadSanitizer-equivalent # Use: -ea -Djava.util.concurrent.ForkJoinPool.common.parallelism=N