Thread stuck in BLOCKED state

Scenario

Thread dumps show many threads in BLOCKED waiting on the same monitor. Requests time out or throughput collapses. You need to know which lock, who holds it, and why they do not release it—without guessing from class names alone.

After reading, you should be able to:

Why — BLOCKED means waiting for a Java monitor

In a thread dump, BLOCKED (shown as java.lang.Thread.State: BLOCKED) means the thread is trying to enter a synchronized block or method, but another thread already holds that monitor. It is not a thread waiting on I/O: a thread in a blocking socket read is usually reported as RUNNABLE, because it is executing native code.
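To make the state concrete, here is a minimal sketch that reproduces it. The class and method names (BlockedStateDemo, observeWaiterState) are made up for illustration; only Thread.getState() and synchronized are standard API.

```java
import java.util.concurrent.TimeUnit;

public class BlockedStateDemo {
    private static final Object MONITOR = new Object();

    /** Returns the state of a second thread while the caller holds MONITOR. */
    public static Thread.State observeWaiterState() {
        Thread waiter = new Thread(() -> {
            synchronized (MONITOR) {
                // Entered only after the owner releases MONITOR.
            }
        }, "waiter");

        Thread.State observed;
        synchronized (MONITOR) { // the calling thread is the owner
            waiter.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            // Poll until the waiter reaches the monitor entry (give up after 5 s).
            while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
                Thread.yield();
            }
            observed = waiter.getState();
        }
        try {
            waiter.join(); // the waiter can finish now that MONITOR is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter state while the lock is held: " + observeWaiterState());
    }
}
```

On a typical JVM the waiter should be reported as BLOCKED for as long as the calling thread stays inside the synchronized block.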

Thread states (quick reference)

State          Meaning                   Typical cause
BLOCKED        Waiting for monitor lock  synchronized, contended intrinsic lock
WAITING        Waiting indefinitely      Object.wait(), LockSupport.park, join
TIMED_WAITING  Waiting with timeout      sleep, wait(timeout), pool get(timeout)
RUNNABLE       Executing or runnable     On CPU, or in JNI I/O (can look “stuck”)

java.util.concurrent.locks.ReentrantLock contention usually shows as WAITING (parked), with AbstractQueuedSynchronizer frames in the stack, not as BLOCKED. It is still a lock problem.
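The difference can be checked with a small experiment. This is a sketch under the assumption of a standard HotSpot JVM; the class name ReentrantLockStateDemo is hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ReentrantLockStateDemo {
    /** Returns the state of a thread queued on a ReentrantLock held by the caller. */
    public static Thread.State observeWaiterState() {
        ReentrantLock lock = new ReentrantLock();
        Thread waiter = new Thread(() -> {
            lock.lock(); // parks inside AbstractQueuedSynchronizer while the lock is held
            lock.unlock();
        }, "aqs-waiter");

        Thread.State observed;
        lock.lock(); // the calling thread is the owner
        try {
            waiter.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            // Poll until the waiter has parked (give up after 5 s).
            while (waiter.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
                Thread.yield();
            }
            observed = waiter.getState();
        } finally {
            lock.unlock();
        }
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("waiter state on ReentrantLock: " + observeWaiterState());
    }
}
```

The waiter should be reported as WAITING rather than BLOCKED, which is why a search for BLOCKED alone can miss this kind of contention.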

Why contention spikes in production

BLOCKED ≠ deadlock. Deadlock is a cycle of locks; many BLOCKED threads with one clear owner is contention. See deadlock guide for circular waits.
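One way to tell the two apart from inside the JVM is to ask the ThreadMXBean for lock cycles. A minimal sketch; the class name DeadlockCheck is hypothetical, while findDeadlockedThreads() is the standard java.lang.management call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class DeadlockCheck {
    /** Returns how many threads are part of a lock cycle; 0 means no deadlock was found. */
    public static int deadlockedThreadCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers monitors and ownable synchronizers; returns null when there is no cycle.
        long[] ids = mx.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        int n = deadlockedThreadCount();
        System.out.println(n == 0
                ? "no deadlock: BLOCKED threads here are contention"
                : n + " threads are deadlocked");
    }
}
```

If this reports 0 while many threads are BLOCKED, the problem is contention on a lock that will eventually be released, not a circular wait.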

What — identify the lock and owner (in order)

  1. Capture 2–3 thread dumps 10s apart
    jcmd <pid> Thread.print > /tmp/td1.txt
    sleep 10
    jcmd <pid> Thread.print > /tmp/td2.txt
    Threads blocked on same lock in all dumps → chronic contention, not transient.
  2. Find a BLOCKED thread and read “waiting to lock”
    "http-nio-8080-exec-12" #45 BLOCKED
      at com.app.Service.process(Service.java:88)
      - waiting to lock <0x00000000f1234ab0> (a com.app.Service)
      at ...
    Address 0x00000000f1234ab0 is the monitor identity.
  3. Search dumps for that hex address as “locked”
    grep "0x00000000f1234ab0" /tmp/td1.txt
    Owner line example:
    "http-nio-8080-exec-3" #12 RUNNABLE
      ...
      - locked <0x00000000f1234ab0> (a com.app.Service)
    If there is no owner but many waiters → the monitor was probably released just before the dump and the waiters have not been scheduled yet; capture again.
  4. Read owner’s stack
    What is it doing while holding the lock? If the owner is in JDBC, HTTP, or heavy CPU work inside synchronized, that is the bug.
  5. Count waiters on same monitor
    50+ BLOCKED on one lock → hotspot; prioritize shrinking or removing that lock.
  6. Check for AQS / ReentrantLock WAITING
    Stack contains AbstractQueuedSynchronizer.acquire → java.util.concurrent.locks issue.
  7. Correlate with metrics
    Tomcat thread pool all busy; latency up; lock profiling (JFR event jdk.JavaMonitorEnter) if available.
  8. Rule out “BLOCKED” misread
    Threads in TIMED_WAITING on a pool queue may mean the pool is exhausted, not that a monitor is contended. See thread pool exhausted.
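The same lookup (which lock, who holds it, how many waiters) can be sketched in-process with ThreadMXBean when running jcmd is not convenient. The class name MonitorWaiters is hypothetical. Note that ThreadInfo.getLockName() reports the class name plus identity hash code, not the hex address printed in a thread dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class MonitorWaiters {
    /** Maps "lock held by owner" to the number of BLOCKED threads waiting for it. */
    public static Map<String, Integer> blockedByLock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        // true, true: include locked monitors and locked ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            String key = info.getLockName() + " held by " + info.getLockOwnerName();
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        blockedByLock().forEach((lock, waiters) ->
                System.out.println(waiters + " BLOCKED on " + lock));
    }
}
```

A key with a large count points at the hotspot from step 5; the owner name in the key is the thread whose stack to read in step 4.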

Worked example (reading the dump)

"pool-1-thread-5" BLOCKED on monitor for SimpleDateFormat
  → 40 http threads BLOCKED on same monitor
Owner "pool-1-thread-2" RUNNABLE
  - locked SimpleDateFormat
  - at java.text.SimpleDateFormat.format(...)
  - at com.app.ReportBuilder.build(...)

Fix: replace SimpleDateFormat with DateTimeFormatter (immutable and thread-safe) or use a per-thread instance, rather than synchronizing on a shared formatter.
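A minimal sketch of that replacement follows. The class name ReportDates and the pattern "yyyy-MM-dd" are assumptions for illustration; the original ReportBuilder code and its date pattern are not shown in this guide.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class ReportDates {
    // DateTimeFormatter is immutable and thread-safe, so one shared instance needs no lock.
    // The pattern is an assumed example, not taken from the original code.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    public static String format(LocalDate date) {
        return DAY.format(date); // no synchronized block, so no monitor to contend on
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDate.of(2024, 3, 9))); // prints 2024-03-09
    }
}
```

Because no thread ever takes a monitor here, the 40 waiters from the dump have nothing left to queue on.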

Tools

How — fix contention and prevent recurrence

Fix patterns

Problem                   Fix
I/O inside synchronized   Do the DB/HTTP call outside the lock; guard only in-memory state
Global lock on service    Finer-grained locks per key or shard; ConcurrentHashMap
Non-thread-safe helper    Use thread-safe API (DateTimeFormatter)
Read-heavy shared map     ReadWriteLock or a concurrent copy-on-write structure
Lock on shared cache key  Striped locks by hash(key) % N
ReentrantLock misuse      Use tryLock(timeout); fail fast with 503
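The last row can be sketched as follows. This is one possible shape, not a prescribed one: the class FailFastLock and its callOrFallback method are hypothetical, and the fallback stands in for whatever produces the 503 response in your framework.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class FailFastLock {
    private final ReentrantLock lock = new ReentrantLock();

    /** Runs work under the lock, or returns the fallback if the lock is not free in time. */
    public <T> T callOrFallback(long timeoutMillis, Supplier<T> work, Supplier<T> fallback) {
        boolean acquired = false;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return fallback.get(); // fail fast instead of queueing behind the owner
            }
            return work.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback.get();
        } finally {
            if (acquired) {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        FailFastLock guard = new FailFastLock();
        System.out.println(guard.callOrFallback(50, () -> "200 OK", () -> "503 busy"));
    }
}
```

The timeout bounds how long a request thread can be parked, so a slow owner degrades some requests instead of exhausting the whole pool.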

Code direction (before / after)

// Bad: whole request serialized
public synchronized Response handle(Request r) {
  return db.query(r);  // holds lock during I/O
}

// Better: no lock on I/O; only guard shared mutable state
public Response handle(Request r) {
  Data d = db.query(r);
  return buildResponse(d);  // immutable or local
}

// Shared counter: use an atomic type, or hold a lock only around the increment
private final LongAdder count = new LongAdder();  // java.util.concurrent.atomic.LongAdder
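When the handler does have to update shared state, the same idea applies: run the slow call first, then take the lock only for the in-memory update. A minimal sketch, assuming a hypothetical NarrowLockService in which a Function stands in for db.query:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class NarrowLockService {
    private final Object stateLock = new Object();
    private final Map<String, String> lastResultByKey = new HashMap<>(); // guarded by stateLock
    private final Function<String, String> slowQuery; // stand-in for the DB or HTTP call

    public NarrowLockService(Function<String, String> slowQuery) {
        this.slowQuery = slowQuery;
    }

    public String handle(String key) {
        String result = slowQuery.apply(key); // I/O runs with no lock held
        synchronized (stateLock) {            // lock covers only the map update
            lastResultByKey.put(key, result);
        }
        return result;
    }

    public String lastResult(String key) {
        synchronized (stateLock) {
            return lastResultByKey.get(key);
        }
    }

    public static void main(String[] args) {
        NarrowLockService service = new NarrowLockService(k -> k.toUpperCase());
        System.out.println(service.handle("report") + " / " + service.lastResult("report"));
    }
}
```

The critical section is now a single map operation, so waiters queue for microseconds rather than for the duration of a query.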

Verify

  1. Load test at prior peak RPS.
  2. Thread dumps: zero or few BLOCKED on former monitor.
  3. Latency p99 down; throughput up without more pods.
  4. JFR: time blocked on monitors (jdk.JavaMonitorEnter events) near zero on hot path.
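If JFR is not available, a rough before/after comparison can be sketched with ThreadInfo.getBlockedCount(), which counts how often each live thread has blocked on a monitor. The class name ContentionStats is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ContentionStats {
    /** Sums, over all live threads, how many times each has blocked entering a monitor. */
    public static long totalBlockedCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long total = 0;
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info != null) { // null means the thread ended between the two calls
                total += info.getBlockedCount();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalBlockedCount();
        // ... run the load test here ...
        long after = totalBlockedCount();
        System.out.println("monitor block events during test: " + (after - before));
    }
}
```

The number only covers threads still alive at sampling time, so treat it as a trend indicator next to the thread dumps, not as an exact measurement.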

Prevention

Interview one-liner

“BLOCKED means waiting for a monitor. I take thread dumps, get the ‘waiting to lock’ hex address, grep for the thread that ‘locked’ it, and inspect whether the owner does slow work while holding the lock—then I shrink the critical section or use concurrent structures.”

Related scenarios