Downstream slow — circuit breaker and backpressure

Scenario

A partner API or internal microservice degrades from 50ms to 30s. Your service keeps calling it: threads block, queues grow, memory rises, and everything times out, even endpoints that do not use that dependency. You need to fail fast, bound concurrency per downstream, and use a breaker that stops hammering the sick service.

Why — slow dependencies infect the caller

Without limits, each incoming request may block a thread waiting on a sick downstream. Under load, every thread ends up blocked in a socket read, the thread pool is exhausted, new requests queue at the edge, and the failure propagates upstream. A circuit breaker stops calling the failing dependency for a cooldown period once its error rate or slowness crosses a threshold, so callers fail fast instead of queuing forever. Backpressure means rejecting or shedding work when you cannot process it, rather than buffering without bound.

Timeout alone is not enough
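A timeout caps how long each call waits, but every call to a sick dependency still holds a thread for the full timeout. By Little's law, the average number of in-flight calls is the arrival rate multiplied by the time each call takes. A small sketch of that arithmetic, assuming a hypothetical load of 100 requests per second (the class and method names are illustrative):

```java
final class ThreadDemand {
    // Little's law: average in-flight calls = arrival rate (per second) * latency (seconds).
    // Latency is passed in milliseconds to keep the inputs as whole numbers.
    static double inFlight(long requestsPerSecond, long latencyMillis) {
        return requestsPerSecond * latencyMillis / 1000.0;
    }
}
```

At a healthy 50ms, `ThreadDemand.inFlight(100, 50)` is 5.0 busy threads. With the dependency hanging and a 5s timeout, `ThreadDemand.inFlight(100, 5000)` is 500.0, which exceeds most worker pools. The timeout bounds the wait per call, not the number of callers waiting.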

Circuit states

State      Behavior
Closed     Normal calls; failures counted
Open       Calls fail fast (no downstream hit)
Half-open  Limited trial calls; success → closed, fail → open
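The three states can be sketched as a small state machine. This is an illustration only, not how any particular library implements it: it trips on consecutive failures rather than a failure rate, and it allows trial calls in half-open until the first outcome is recorded instead of enforcing a fixed trial count. The class name `TinyBreaker` and its parameters are hypothetical.

```java
import java.util.function.LongSupplier;

final class TinyBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureLimit;    // consecutive failures that trip the breaker
    private final long openMillis;     // cooldown before trial calls are allowed
    private final LongSupplier clock;  // injectable time source, in milliseconds
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0L;

    TinyBreaker(int failureLimit, long openMillis, LongSupplier clock) {
        this.failureLimit = failureLimit;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Returns true if a call may proceed; moves OPEN to HALF_OPEN once the cooldown has passed. */
    synchronized boolean allowCall() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state != State.OPEN;
    }

    /** A success in CLOSED or HALF_OPEN resets the count and closes the breaker. */
    synchronized void recordSuccess() {
        failures = 0;
        state = State.CLOSED;
    }

    /** A failed trial call, or too many consecutive failures, opens the breaker. */
    synchronized void recordFailure() {
        if (state == State.OPEN) return;  // ignore late results while already open
        failures++;
        if (state == State.HALF_OPEN || failures >= failureLimit) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }

    synchronized State state() { return state; }
}
```

A caller checks `allowCall()` before each downstream call and reports the outcome with `recordSuccess()` or `recordFailure()`. Injecting the clock keeps the cooldown testable without sleeping.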

What — recognize unbounded queuing

  1. Symptoms: one downstream span dominates traces; thread dumps show many threads blocked in HTTP reads to the same host; memory and queue depth grow; the error rate rises service-wide.
  2. Metrics: Resilience4j circuit breaker state (resilience4j.circuitbreaker.state) and the count of not-permitted calls; Tomcat queue length; rejected executions.
  3. Which dependency: aggregate traces by downstream service name.
  4. Unbounded structures: LinkedBlockingQueue with its default capacity of Integer.MAX_VALUE; unbounded CompletableFuture chains; no limit on async retries.
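The difference between a default and a bounded queue is easy to check directly. A minimal sketch using only the JDK (the helper class `QueueBounds` is illustrative):

```java
import java.util.concurrent.BlockingQueue;

final class QueueBounds {
    /** Offers n items without blocking and returns how many the queue accepted. */
    static int accepted(BlockingQueue<Integer> queue, int n) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (queue.offer(i)) {  // offer() returns false instead of blocking when the queue is full
                count++;
            }
        }
        return count;
    }
}
```

`QueueBounds.accepted(new LinkedBlockingQueue<>(), 1000)` returns 1000 because the no-argument constructor sets the capacity to `Integer.MAX_VALUE`, while `QueueBounds.accepted(new ArrayBlockingQueue<>(100), 1000)` returns 100. The first queue hides overload until memory runs out; the second surfaces it as a refusal the caller can act on.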

How — layer defenses

1. Timeout (every outbound call)

# HttpClient / RestTemplate / WebClient
connectTimeout: 2s
readTimeout: 5s

Keep both shorter than the upstream gateway timeout (see the 502 guide).
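With the JDK's built-in `java.net.http.HttpClient` (Java 11+), the two settings could look roughly like this. The URL is a placeholder, and the per-request timeout bounds the wait for a response rather than each individual socket read, so treat it as an approximation of `readTimeout`:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class OutboundTimeouts {
    /** Connection establishment is capped once, on the client. */
    static HttpClient client() {
        return HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build();
    }

    /** The wait for a response is capped per request; the URL is hypothetical. */
    static HttpRequest chargeRequest() {
        return HttpRequest.newBuilder(URI.create("https://partner.example/charge"))
            .timeout(Duration.ofSeconds(5))
            .GET()
            .build();
    }
}
```

When the request timeout elapses, `send` fails with `HttpTimeoutException`, which the caller can count as a failure for the breaker.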

2. Circuit breaker (Resilience4j example)

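import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import java.time.Duration;
import java.util.function.Supplier;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
  .failureRateThreshold(50)                          // open at >= 50% failed calls
  .waitDurationInOpenState(Duration.ofSeconds(30))   // cooldown before half-open
  .slidingWindowSize(20)                             // evaluate over the last 20 calls
  .permittedNumberOfCallsInHalfOpenState(5)          // trial calls while half-open
  .slowCallDurationThreshold(Duration.ofSeconds(3))  // calls slower than 3s count as slow
  .slowCallRateThreshold(50)                         // open at >= 50% slow calls
  .build();

CircuitBreaker breaker = CircuitBreaker.of("paymentApi", config);
// paymentClient, req, and PaymentResult come from the surrounding application.
Supplier<PaymentResult> decorated = CircuitBreaker
  .decorateSupplier(breaker, () -> paymentClient.charge(req));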
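To make the interaction of the window size and the failure-rate threshold concrete, here is a plain-JDK sketch of a count-based sliding window. It is not Resilience4j's implementation; it assumes the rate is only evaluated once the window has filled, and the class name `CountWindow` is hypothetical.

```java
final class CountWindow {
    private final boolean[] failed;  // ring buffer holding the last N outcomes
    private int next = 0;            // slot that will be overwritten next
    private int recorded = 0;        // outcomes seen so far, capped at N
    private int failures = 0;        // failures currently inside the window

    CountWindow(int size) {
        this.failed = new boolean[size];
    }

    synchronized void record(boolean callFailed) {
        if (failed[next]) {
            failures--;              // the oldest outcome leaves the window
        }
        failed[next] = callFailed;
        if (callFailed) {
            failures++;
        }
        next = (next + 1) % failed.length;
        if (recorded < failed.length) {
            recorded++;
        }
    }

    /** Failure percentage over the window, or -1 until the window has filled. */
    synchronized float failureRate() {
        return recorded < failed.length ? -1f : 100f * failures / failed.length;
    }

    /** True when the window is full and the failure rate has reached the threshold. */
    synchronized boolean shouldOpen(float thresholdPercent) {
        float rate = failureRate();
        return rate >= 0 && rate >= thresholdPercent;
    }
}
```

With a window of 20 and a threshold of 50, ten failures among the last 20 calls are enough to trip the breaker, and the decision is re-evaluated on every call as old outcomes drop out.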

3. Bulkhead (limit concurrent calls per dependency)

Bulkhead bulkhead = Bulkhead.of("paymentApi",
  BulkheadConfig.custom()
    .maxConcurrentCalls(10)
    .maxWaitDuration(Duration.ZERO)  // do not wait for a permit (also the library default)
    .build());

// At most 10 concurrent calls to payment; the rest fail fast with BulkheadFullException

Prevents one slow API from consuming all HTTP workers.
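The underlying mechanism is a counting semaphore that is never waited on. A minimal plain-JDK sketch, with a hypothetical class name and a generic exception standing in for a library-specific one:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free; otherwise rejects immediately without waiting. */
    <T> T call(Supplier<T> supplier) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return supplier.get();
        } finally {
            permits.release();  // always return the permit, even if the call throws
        }
    }
}
```

Using `tryAcquire()` rather than `acquire()` is the design choice that matters: a blocking acquire would just move the queue from the downstream to the semaphore.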

4. Bounded queue + reject policy

ThreadPoolExecutor executor = new ThreadPoolExecutor(
  8, 8, 0L, TimeUnit.MILLISECONDS,
  new ArrayBlockingQueue<>(100),  // bounded
  new ThreadPoolExecutor.AbortPolicy());  // fail fast when full
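What fail fast looks like to the submitter can be shown with a deliberately tiny pool. This sketch uses one worker and a one-slot queue (smaller than the sizes above, purely to keep the demonstration short) and counts how many submissions are refused while the worker is blocked. The class name is illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectDemo {
    /** Submits blocking tasks to a 1-thread pool with a 1-slot queue; returns how many were rejected. */
    static int rejectedOutOf(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(1),
            new ThreadPoolExecutor.AbortPolicy());
        CountDownLatch release = new CountDownLatch(1);  // keeps the worker busy, like a hung downstream
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;  // a request handler would typically map this to a 503
                }
            }
        } finally {
            release.countDown();  // unblock the worker so the pool can drain
            pool.shutdown();
        }
        return rejected;
    }
}
```

`RejectDemo.rejectedOutOf(5)` returns 3: one task occupies the worker, one waits in the queue, and the remaining three are refused immediately instead of piling up.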

5. Fallback (careful)
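A minimal sketch of a guarded fallback, assuming the fallback is a cheap local lookup (such as a last-known-good value) and that having no substitute should surface the original failure rather than invent data. The class and parameter names are hypothetical.

```java
import java.util.Optional;
import java.util.function.Supplier;

final class GuardedFallback {
    /** Returns the primary result, or a local last-known-good value if the primary fails. */
    static <T> T call(Supplier<T> primary, Supplier<Optional<T>> lastKnownGood) {
        try {
            return primary.get();
        } catch (RuntimeException failure) {
            // Fall back only to a local value, never to another remote call that can also hang.
            // With no value available, rethrow so the caller can answer with an error.
            return lastKnownGood.get().orElseThrow(() -> failure);
        }
    }
}
```

Whether a stale value is acceptable depends on the operation: it may be fine for a product description and is unlikely to be fine for a payment result.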

6. Stack order (typical)

Request → Bulkhead → CircuitBreaker → TimeLimiter → HTTP call
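When the layers are composed as decorators, the outermost layer is the one applied last, so the stack is built from the inside out. The sketch below uses placeholder layers that only record their names on entry; it stands in for real bulkhead, breaker, and time-limiter decorators and is not tied to any library.

```java
import java.util.List;
import java.util.function.Supplier;

final class Layers {
    /** Wraps inner so that name is recorded on entry, before the inner layer runs. */
    static <T> Supplier<T> layer(String name, List<String> trace, Supplier<T> inner) {
        return () -> {
            trace.add(name);
            return inner.get();
        };
    }

    /** Builds Bulkhead -> CircuitBreaker -> TimeLimiter -> call by wrapping from the inside out. */
    static <T> Supplier<T> stack(List<String> trace, Supplier<T> httpCall) {
        Supplier<T> timed = layer("TimeLimiter", trace, httpCall);
        Supplier<T> guarded = layer("CircuitBreaker", trace, timed);
        return layer("Bulkhead", trace, guarded);
    }
}
```

Invoking the supplier returned by `Layers.stack` records `Bulkhead`, `CircuitBreaker`, `TimeLimiter` in that order. With the bulkhead outermost, a rejected call never reaches the breaker and so does not count toward its failure rate.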

7. Observability and ops

Verify

  1. Chaos: downstream fixed 10s delay → breaker opens; your API p99 stays bounded.
  2. Thread count stable; not all blocked on one host.
  3. Half-open recovery when dependency heals.
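The first check can be approximated in a unit test before running real chaos experiments. This sketch fakes a slow downstream with `Thread.sleep` and measures how long the caller is held when a timeout is applied with `CompletableFuture.orTimeout` (Java 9+). It covers only the bounded-latency part of the check, not breaker state, and all names are illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SlowDownstreamCheck {
    /** Calls a fake downstream that sleeps delayMillis, giving up after timeoutMillis; returns elapsed ms. */
    static long elapsedWithTimeout(long delayMillis, long timeoutMillis) {
        long start = System.nanoTime();
        try {
            CompletableFuture
                .supplyAsync(() -> { sleep(delayMillis); return "ok"; })
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .join();
        } catch (CompletionException e) {
            if (!(e.getCause() instanceof TimeoutException)) {
                throw e;  // only timeouts are expected here
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller is released close to the timeout, but `orTimeout` does not interrupt the worker, which keeps sleeping in the background. That residual occupancy is why the second check on thread counts, and the bulkhead behind it, still matter.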

Interview one-liner

“I set aggressive timeouts, a bulkhead cap per downstream, and a circuit breaker on failure and slow-call rate so we fail fast instead of queuing unbounded threads, with a safe fallback or 503, and metrics on breaker state.”
