Request thread pool exhausted under load

Scenario

Under traffic, latency spikes and the service returns errors such as 503, TaskRejectedException, or “all threads are busy.” Tomcat (or Netty worker + blocking adapter) shows current threads = max threads. New requests queue or fail. You need to know whether the pool is too small, threads are stuck waiting, or work is never finishing.

After reading, you should be able to:

Why — finite workers, unbounded wait

A thread pool caps how many requests run at once. Each active thread is tied up until the handler returns. Exhaustion means every worker is busy (or blocked) and the queue is full or requests are rejected—so new work cannot start. The system looks “down” even if CPU is low: threads are waiting, not computing.
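The effect can be reproduced in a few lines. The sketch below is illustrative only: the class name SaturationDemo and the pool sizes are assumptions, and a CountDownLatch stands in for a slow dependency. Every worker is parked (not computing), the bounded queue fills, and further submissions are rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names and sizes are illustrative, not from any framework.
final class SaturationDemo {

    /** Submits `tasks` blocked tasks to `workers` threads plus a queue of `queueSize`; returns the rejection count. */
    static int countRejections(int workers, int queueSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize)); // default handler is AbortPolicy
        CountDownLatch release = new CountDownLatch(1); // stands in for a slow dependency
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // the worker is waiting, not using CPU
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // all workers busy and the queue is full
                }
            }
        } finally {
            release.countDown(); // let the parked workers finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 workers + 2 queue slots accept 4 tasks; the rest are rejected.
        System.out.println("rejected = " + countRejections(2, 2, 10));
    }
}
```

With 2 workers and 2 queue slots, only 4 of the 10 submissions are accepted while the process sits near 0% CPU.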

Common pools in Java services

Pool | Typical config | Symptom when exhausted
Tomcat / Jetty HTTP | maxThreads, acceptCount | All http-nio-*-exec-* busy; connections queue at OS
ExecutorService | core/max + bounded queue | RejectedExecutionException; custom worker names in dump
ForkJoinPool.commonPool() | parallelism = CPUs - 1 | CompletableFuture / parallel streams stall; workers blocked in I/O
Scheduled pool | fixed size | Jobs pile up; timers fire late

Root causes (not “we need more threads”)

Bigger maxThreads is not a free fix. More threads → more stack memory, more concurrent DB connections, more lock contention. Often the fix is faster or bounded work per request, not more parallelism.

What — confirm exhaustion and find what threads wait on

  1. Confirm pool saturation (metrics)
    • Tomcat: tomcat.threads.busy ≈ tomcat.threads.config.max; queue / accept queue growing.
    • Micrometer: executor.active pinned at pool size, executor.queued growing, executor.completed flatlining while load continues.
    • Errors: HTTP 503, org.apache.tomcat.util.threads.ThreadPoolExecutor rejections, Spring TaskRejectedException.
  2. Thread dump while unhealthy
    jcmd <pid> Thread.print > /tmp/pool-$(date +%s).txt
    Count threads named http-nio-8080-exec-* (or your executor prefix). If almost all exist and few are idle → exhaustion confirmed.
  3. Classify what busy threads are doing
    Stack top | Likely cause
    socketRead0, JDBC driver | Slow or stuck DB / network
    HttpClient / OkHttp read | Downstream API with no timeout
    HikariPool.getConnection | Connection pool smaller than thread demand
    BLOCKED / waiting to lock | Lock contention (see BLOCKED guide)
    JVM “Found Java-level deadlock” | Deadlock
    TIMED_WAITING on pool queue | Submitters waiting for a worker (secondary symptom)
    ForkJoinPool.awaitJoin | Blocking the common pool; move blocking work off FJP
  4. Correlate with dependency latency — p99 DB/API up at same time as busy threads? Trace one slow request (distributed trace span: where did 8s go?).
  5. Check connection pool vs thread pool
    # Hikari (example JMX / metrics)
    hikaricp.connections.active ≈ maximum
    hikaricp.connections.pending > 0
    Many threads in getConnection → size or speed mismatch, not Tomcat alone.
  6. Review reject policy: AbortPolicy fails fast with RejectedExecutionException; CallerRunsPolicy runs the task on the submitting thread and blocks it until the task finishes (can deadlock servlet accept path if misused).
  7. Load vs capacity math (rough)

    Little’s law: concurrency ≈ throughput × average latency. If RPS × latency > maxThreads (use p99 latency for a conservative estimate), you need lower latency, fewer concurrent slow calls, or more workers (last resort).
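The capacity math in step 7 can be written down as a quick check. This is a rough sketch, not a sizing tool: the class name PoolMath and the traffic numbers are assumptions, and 200 is used because it is Tomcat's default maxThreads.

```java
// Hypothetical helper; the numbers in main are illustrative.
final class PoolMath {

    /** Little's law: threads in use ≈ arrival rate (req/s) × time each request holds a thread (s). */
    static int requiredThreads(double requestsPerSecond, double latencySeconds) {
        return (int) Math.ceil(requestsPerSecond * latencySeconds);
    }

    /** True when the estimated concurrency exceeds the worker limit. */
    static boolean saturates(double requestsPerSecond, double latencySeconds, int maxThreads) {
        return requiredThreads(requestsPerSecond, latencySeconds) > maxThreads;
    }

    public static void main(String[] args) {
        int maxThreads = 200;
        // Healthy: 400 req/s at 0.25 s each needs about 100 threads.
        System.out.println(requiredThreads(400, 0.25) + " threads, saturated=" + saturates(400, 0.25, maxThreads));
        // Dependency slows to 2 s: the same traffic now needs about 800 threads.
        System.out.println(requiredThreads(400, 2.0) + " threads, saturated=" + saturates(400, 2.0, maxThreads));
    }
}
```

The second case shows why raising maxThreads rarely helps: an 8x latency increase needs 8x the threads, while a timeout on the slow call restores the original budget.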

ForkJoinPool.commonPool() specifically

parallelStream() and the default CompletableFuture.supplyAsync use the common pool. Blocking I/O on those threads reduces effective parallelism for the whole JVM. In a thread dump, look for many ForkJoinPool.commonPool-worker-* threads sitting in socket reads or waiting on synchronized blocks.

// Anti-pattern: blocking HTTP on common pool
list.parallelStream().forEach(id -> httpClient.get("/item/" + id));

// Better: dedicated executor with bounded queue + timeouts
executor.submit(() -> httpClient.get(...));
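A slightly fuller version of the "better" variant might look like the sketch below. It makes several assumptions: BlockingCalls, newIoExecutor, fetchAll and fakeGet are hypothetical names, fakeGet stands in for the real HTTP client, and the pool and timeout values are placeholders. Blocking calls run on a dedicated bounded pool instead of the common pool, and each result is capped with completeOnTimeout (Java 9+).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch; all names and sizes here are assumptions for illustration.
final class BlockingCalls {

    /** Fixed-size pool with a bounded queue; the default AbortPolicy rejects when both are full. */
    static ThreadPoolExecutor newIoExecutor(int workers, int queueSize) {
        return new ThreadPoolExecutor(workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));
    }

    /** Runs blockingFetch for each id on the given executor; late results are replaced by a marker. */
    static List<String> fetchAll(List<String> ids, Function<String, String> blockingFetch,
                                 Executor executor, long timeoutMillis) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String id : ids) {
            futures.add(CompletableFuture
                    .supplyAsync(() -> blockingFetch.apply(id), executor) // not the common pool
                    .completeOnTimeout("timeout:" + id, timeoutMillis, TimeUnit.MILLISECONDS));
        }
        List<String> results = new ArrayList<>();
        for (CompletableFuture<String> f : futures) {
            results.add(f.join());
        }
        return results;
    }

    /** Stand-in for httpClient.get: the id "slow" simulates a hung downstream call. */
    static String fakeGet(String id) {
        if (id.equals("slow")) {
            try {
                Thread.sleep(2_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return "item-" + id;
    }

    static List<String> demo() {
        ThreadPoolExecutor io = newIoExecutor(2, 10);
        try {
            return fetchAll(List.of("a", "b", "slow"), BlockingCalls::fakeGet, io, 200);
        } finally {
            io.shutdownNow(); // interrupts the simulated hung call
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that completeOnTimeout only releases the caller; the worker stays occupied until the underlying call returns, so the HTTP or JDBC client still needs its own timeout.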

Capture before restart

How — restore service and fix the bottleneck

Immediate mitigation

  1. Scale replicas — spreads load if dependency is healthy; does not fix per-pod stuck threads.
  2. Throttle at edge — API gateway rate limit, load shed non-critical routes.
  3. Circuit break / disable feature — stop calling slow dependency until recovered.
  4. Temporary maxThreads bump — only with headroom on DB connections and memory; watch for worse contention.
  5. Restart stuck pods — after dumps saved; if threads were deadlocked or leaked.

Durable fixes

Lever | Action
Timeouts everywhere | JDBC, HTTP client, Redis; fail fast and return 504/503 with retry-safe semantics
Align pool sizes | maxThreads ≤ DB pool capacity you can afford; or async DB access with smaller blocking footprint
Bulkhead | Separate executor for heavy/reporting vs API; cap concurrent calls to fragile dependency
Backpressure | Bounded queue + reject; don’t accept infinite work
Virtual threads (Java 21+) | Many blocking I/O requests on cheap virtual threads (few carrier threads); still bound DB connections and CPU-heavy work
Async I/O model | WebFlux / reactive only if team commits; don’t block event loop
Fix slow path | Index, cache, batch API calls, remove N+1 queries
Never block common FJP | Custom Executor for blocking tasks
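The bulkhead and backpressure levers can be approximated with a plain semaphore. The sketch below is a minimal illustration, not a replacement for a resilience library: the Bulkhead class and its method names are hypothetical. It caps concurrent calls to one dependency and sheds the excess immediately instead of parking more request threads.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical minimal bulkhead; class and method names are illustrative.
final class Bulkhead {
    private final Semaphore permits;

    Bulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs `call` if a permit is free; otherwise returns the fallback without waiting. */
    <T> T call(Supplier<T> call, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get(); // shed load: the dependency is already at its cap
        }
        try {
            return call.get();
        } finally {
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        Bulkhead reports = new Bulkhead(1);
        // The nested call arrives while the only permit is held, so it is shed.
        String result = reports.call(
                () -> reports.call(() -> "inner-ran", () -> "shed"),
                () -> "outer-shed");
        System.out.println(result);
    }
}
```

In a servlet handler the fallback would typically map to a 503 with a Retry-After header, which keeps request threads free for routes that do not touch the fragile dependency.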

Sizing guidance (starting point)

Verify

  1. Peak-load test: busy threads < 80% max under target RPS.
  2. No growth in accept queue / executor queue at steady state.
  3. Thread dumps: mix of idle workers; blocked threads only briefly.
  4. Alerts: busy threads > 85% for 5 min; pending Hikari connections > 0.
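For checks 1 and 3, thread states can also be sampled from inside the JVM rather than read from a text dump. The sketch below uses the standard ThreadMXBean API; the class name PoolThreadStates is hypothetical, and the prefix passed in main assumes the default Tomcat connector naming mentioned earlier.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper; how it is exposed (admin endpoint, scheduled log line) is left open.
final class PoolThreadStates {

    /** Counts live threads whose name starts with namePrefix, grouped by thread state. */
    static Map<Thread.State, Integer> countByState(String namePrefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // Entries are null for threads that died between the two calls.
            if (info != null && info.getThreadName().startsWith(namePrefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prefix is an assumption; use your connector or executor thread-name prefix.
        System.out.println(countByState("http-nio-8080-exec-"));
    }
}
```

A healthy pool shows a mix of states with idle workers in WAITING or TIMED_WAITING; a pool where nearly every worker is BLOCKED, or RUNNABLE inside a socket read, matches the exhaustion pattern described above.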

Interview one-liner

“I check whether all request threads are busy, take a dump to see if they’re waiting on DB/HTTP or locks, align connection pool with thread demand, add timeouts and bulkheads, and only then tune maxThreads—after fixing slow dependencies.”

Related scenarios