Reactive Programming & WebFlux

Reactive is not "async for free"—it's a programming model built on non-blocking I/O, stream composition, and backpressure. Spring WebFlux brings that model to HTTP with Mono and Flux, while Java 21 virtual threads reshaped when you actually need it.

mid senior Spring Boot 3.x · Java 21

Reactive programming concepts

Reactive systems handle many concurrent I/O operations with a small thread pool by never blocking on I/O. Spring WebFlux sits on Project Reactor (Reactive Streams implementation) and Netty—not a separate language, but a different way to compose asynchronous pipelines.

The Reactive Manifesto

Four traits define reactive systems—not every trait requires WebFlux, but together they explain the design goals:

TraitMeaning in practice
ResponsiveConsistent, low-latency responses under load; users perceive the system as always available
ResilientFailure is contained and recovered—timeouts, retries, circuit breakers, isolation
ElasticScales up/down with demand without thread explosion; work is scheduled, not pinned to threads
Message-drivenComponents communicate via async messages (events, streams) with explicit flow control
flowchart LR
  C[Client] -->|HTTP request| W[WebFlux / Netty]
  W -->|Mono/Flux| S[Service layer]
  S -->|non-blocking| DB[(R2DBC / driver)]
  S -->|WebClient| API[External API]
  W -->|backpressure| C

Backpressure — the key differentiator

In imperative code, a fast producer can overwhelm a slow consumer—buffers grow until OOM. Reactive Streams define a subscription contract: the subscriber signals how many items it can handle (request(n)), and the publisher respects that limit.

This is what separates reactive from "just use CompletableFuture everywhere." Without backpressure, unbounded queues hide problems until production.

Backpressure in Flux
Flux.range(1, 1_000_000)
    .onBackpressureBuffer(100)          // buffer overflow strategy — use carefully
    .limitRate(50)                      // prefetch cap — smoother consumer pacing
    .publishOn(Schedulers.boundedElastic())
    .subscribe(
        data -> process(data),          // slow consumer
        Throwable::printStackTrace,
        () -> System.out.println("done"),
        sub -> sub.request(10)          // explicit: pull 10 at a time
    );
🔬 Under the Hood

Reactor operators like limitRate, onBackpressureDrop, and onBackpressureLatest implement overflow policies when production outpaces consumption. Choose based on whether losing, buffering, or keeping-latest is acceptable for your domain.

Project Reactor: Mono and Flux

TypeItemsTypical use
Mono<T>0 or 1Single DB row, one HTTP response body, optional result
Flux<T>0 to N (possibly infinite)Server-Sent Events, file lines, Kafka stream, paginated cursor
Creating publishers
Mono.just("hello");
Mono.empty();
Mono.error(new IllegalStateException("not found"));

Flux.just("a", "b", "c");
Flux.fromIterable(List.of(1, 2, 3));
Flux.interval(Duration.ofMillis(100)).take(5);   // tick stream

// Deferred — created per subscription (cold)
Mono.fromCallable(() -> repo.findById(id));

In WebFlux controllers, returning Mono<ResponseEntity<Order>> tells Spring to subscribe when the response is written— the servlet thread (or event loop thread) is not blocked waiting for the DB. Contrast with blocking MVC in Spring Web MVC.

Reactive operators

Operators transform, combine, and recover streams declaratively. Master flatMap (async sequencing), error operators, and merging patterns—the rest follows.

Transformation & filtering

map, flatMap, filter
// map — 1:1 sync transform
Flux.just("alice", "bob")
    .map(String::toUpperCase);

// flatMap — 1:N async; inner publishers run concurrently (unordered by default)
Flux.just("user-1", "user-2")
    .flatMap(id -> userClient.fetchUser(id));   // Mono<User> per id

// flatMapSequential — preserve source order with limited concurrency
Flux.just("a", "b", "c")
    .flatMapSequential(this::slowLookup, 2);

// filter
Flux.range(1, 10).filter(n -> n % 2 == 0);
⚠ Pitfall

flatMap with high concurrency on a hot path can stampede downstream services. Cap with flatMap(fn, concurrency) or use concatMap for strict sequential execution.

Combining streams

zip, merge, concat
Mono<User> user = userRepo.findById(id);
Mono<Account> account = accountRepo.findByUserId(id);

// zip — pair latest from each when both emit (waits for slowest)
Mono.zip(user, account, (u, a) -> new Dashboard(u, a));

// merge — interleave as they arrive (hot fan-in)
Flux.merge(temperatureSensor(), humiditySensor());

// concat — strictly sequential: first publisher completes, then second
Flux.concat(page1(), page2(), page3());

Defaults & fallbacks

switchIfEmpty, defaultIfEmpty
orderRepo.findById(id)
    .switchIfEmpty(Mono.error(new OrderNotFoundException(id)));

configRepo.findFlag("feature-x")
    .defaultIfEmpty(false);

Error recovery & resilience

onErrorResume, retry, timeout
pricingClient.getRate("USD")
    .timeout(Duration.ofSeconds(2))
    .retryWhen(Retry.backoff(3, Duration.ofMillis(200))
        .filter(ex -> ex instanceof WebClientResponseException))
    .onErrorResume(WebClientResponseException.NotFound.class,
        ex -> Mono.just(FallbackRate.USD_DEFAULT))
    .onErrorMap(TimeoutException.class,
        ex -> new ServiceUnavailableException("pricing down"));

cache

cache() turns a cold publisher into a hot, replaying one—subsequent subscribers get the same result without re-executing side effects. Useful for expensive lookups within a request scope; dangerous if the cached value becomes stale.

cache()
Mono<ExchangeRate> rate = fxClient.fetch("EUR")
    .cache(Duration.ofMinutes(5));   // Reactor 3.4+ timed cache

// Multiple subscribers within 5 min share one HTTP call
rate.subscribe(r -> chargeInEur(r));
rate.subscribe(r -> auditLog(r));
🎯 Interview

"map vs flatMap?" — map transforms synchronously (T → R). flatMap transforms asynchronously (T → Publisher<R>) and flattens inner streams. Use flatMap when the next step returns a Mono/Flux (DB call, HTTP call).

Hot vs cold publishers

Cold publishers start work on each subscribe()—like a fresh HTTP call every time. Hot publishers emit regardless of subscribers—like a live ticker or message bus.

ColdHot
When data is producedOn subscribeIndependent of subscribers
ExamplesMono.fromCallable, Flux.rangeSinks.many(), Flux.share(), SSE feed
Missed eventsN/A—starts from beginningLate subscribers miss prior events unless buffered
Make cold → hot.cache(), .share(), .publish().connect()
Hot sink for application events
Sinks.Many<MetricEvent> sink = Sinks.many().multicast().onBackpressureBuffer();

public void emit(MetricEvent e) {
  sink.tryEmitNext(e);
}

public Flux<MetricEvent> stream() {
  return sink.asFlux();
}

Schedulers — where work runs

Reactor is single-threaded on the event loop by default. publishOn / subscribeOn shift execution to other thread pools.

SchedulerPoolUse for
Schedulers.immediate()Current threadNo thread hop; stay on event loop
Schedulers.parallel()CPU-sized poolCPU-bound transforms (parsing, crypto)—keep work short
Schedulers.boundedElastic()Grows on demand, cappedBlocking calls you cannot eliminate (legacy JDBC, file I/O)
Schedulers.single()One threadOrdered side effects, single-writer scenarios
subscribeOn vs publishOn
Mono.fromCallable(() -> legacyBlockingApi())
    .subscribeOn(Schedulers.boundedElastic())  // subscription + source run on elastic
    .map(this::transform)
    .publishOn(Schedulers.parallel())        // downstream operators on parallel
    .subscribe(result -> { });
⚠ Pitfall

Never call blocking JDBC, Thread.sleep, or block() on the Netty event loop thread—it stalls all connections on that loop. Offload to boundedElastic or switch to R2DBC.

subscribe() vs block()

subscribe() is non-blocking: registers callbacks and returns immediately. block() / blockOptional() waits for completion on the calling thread—convenient in tests and CLI tools, forbidden on reactive server threads.

When blocking is acceptable
// ✅ JUnit test — main thread, no event loop
@Test
void findOrder() {
  Order order = orderService.findById(1L).block();
  assertThat(order.id()).isEqualTo(1L);
}

// ✅ Spring MVC controller bridging to reactive client (use sparingly)
@GetMapping("/sync-report")
Report syncReport() {
  return reportClient.fetch().block(Duration.ofSeconds(5));
}

// ❌ WebFlux controller — never block
@GetMapping("/orders/{id}")
Mono<Order> getOrder(@PathVariable Long id) {
  return orderService.findById(id);   // return Mono, don't .block()
}
Next: Spring WebFlux →

Spring WebFlux

WebFlux is Spring's reactive web stack on Netty (default), Jetty, or Undertow. Add spring-boot-starter-webflux—do not mix spring-boot-starter-web in the same app (one stack per application).

Annotation model — same surface as MVC

@RestController, @GetMapping, @PathVariable, and validation annotations work the same as in Web MVC—the difference is the return type and the runtime (DispatcherHandler vs DispatcherServlet).

Reactive @RestController
@RestController
@RequestMapping("/api/orders")
class OrderController {

  private final OrderService orders;

  @GetMapping("/{id}")
  Mono<OrderDto> get(@PathVariable Long id) {
    return orders.findById(id).map(OrderDto::from);
  }

  @GetMapping
  Flux<OrderDto> list(@RequestParam(defaultValue = "0") int page) {
    return orders.findPage(page).map(OrderDto::from);
  }

  @PostMapping
  Mono<ResponseEntity<OrderDto>> create(@Valid @RequestBody Mono<CreateOrderRequest> body) {
    return body
        .flatMap(orders::create)
        .map(dto -> ResponseEntity.status(HttpStatus.CREATED).body(dto));
  }

  @GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
  Flux<OrderEvent> stream() {
    return orders.eventStream();
  }
}
💡 Pro Tip

For @RequestBody Mono<T>, Spring subscribes once and binds the body reactively—useful for large payloads or multipart streams without buffering entirely in memory.

Functional endpoints — RouterFunction + HandlerFunction

An alternative to annotations: route definitions as beans, handlers as functions. Fits DSL-style APIs, gateway routing tables, and teams that prefer composition over reflection.

RouterFunction
@Configuration
class OrderRoutes {

  @Bean
  RouterFunction<ServerResponse> routes(OrderHandler handler) {
    return RouterFunctions.route()
        .GET("/api/orders/{id}", handler::getOrder)
        .GET("/api/orders", handler::listOrders)
        .POST("/api/orders", handler::createOrder)
        .build();
  }
}

@Component
class OrderHandler {
  Mono<ServerResponse> getOrder(ServerRequest req) {
    Long id = Long.parseLong(req.pathVariable("id"));
    return orderService.findById(id)
        .map(dto -> ServerResponse.ok().bodyValue(dto))
        .switchIfEmpty(ServerResponse.notFound().build());
  }
}

You can mix annotation controllers and RouterFunction in one app—both register on the same DispatcherHandler.

WebClient — reactive HTTP client

WebClient replaces blocking RestTemplate for reactive apps. It works in MVC apps too (fire-and-forget outbound calls)—but shines when composed into Mono/Flux pipelines.

WebClient bean
@Bean
WebClient paymentClient(WebClient.Builder builder) {
  return builder
      .baseUrl("https://api.payments.example")
      .defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
      .filter(ExchangeFilterFunction.ofRequestProcessor(req ->
          Mono.just(ClientRequest.from(req)
              .header("X-Request-Id", UUID.randomUUID().toString())
              .build())))
      .build();
}

retrieve() vs exchangeToMono()

APIBehaviorPrefer when
retrieve()Assumes 4xx/5xx are errors; simpler body decodingMost CRUD-style calls with standard error handling
exchangeToMono()Full access to ClientResponse; you decide how to handle statusCustom status mapping, HEAD requests, reading headers before body
retrieve() and exchangeToMono()
// retrieve — idiomatic
Mono<PaymentResult> charge = client.post()
    .uri("/charges")
    .bodyValue(new ChargeRequest(100, "USD"))
    .retrieve()
    .onStatus(HttpStatusCode::is4xxClientError,
        resp -> resp.bodyToMono(ErrorBody.class)
            .map(body -> new PaymentRejectedException(body.message())))
    .bodyToMono(PaymentResult.class);

// exchangeToMono — full control
Mono<RateQuote> quote = client.get()
    .uri("/rates/{ccy}", "EUR")
    .exchangeToMono(resp -> {
      if (resp.statusCode().is2xxSuccessful()) {
        return resp.bodyToMono(RateQuote.class);
      }
      if (resp.statusCode().equals(HttpStatus.NOT_FOUND)) {
        return Mono.just(RateQuote.fallback("EUR"));
      }
      return resp.createException().flatMap(Mono::error);
    });

Error handling, retry, timeout

Resilient outbound call
public Mono<InventorySnapshot> fetchInventory(String sku) {
  return client.get()
      .uri("/inventory/{sku}", sku)
      .retrieve()
      .bodyToMono(InventorySnapshot.class)
      .timeout(Duration.ofSeconds(3))
      .retryWhen(Retry.backoff(3, Duration.ofMillis(100))
          .maxBackoff(Duration.ofSeconds(2))
          .filter(ex -> ex instanceof WebClientRequestException))
      .onErrorResume(WebClientResponseException.NotFound.class,
          ex -> Mono.just(InventorySnapshot.empty(sku)));
}

Streaming responses

Flux body — NDJSON / SSE
Flux<LogLine> tailLogs = client.get()
    .uri("/logs/stream")
    .accept(MediaType.TEXT_EVENT_STREAM)
    .retrieve()
    .bodyToFlux(LogLine.class);

// Proxy stream through your controller
@GetMapping(value = "/logs", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
Flux<LogLine> proxyLogs() {
  return tailLogs;
}
🌍 Real World

Test WebClient with @WebFluxTest + WebTestClient or MockWebServer (OkHttp). For contract tests against external APIs, combine with WireMock from the Testing chapter.

R2DBC — reactive relational access

JPA/Hibernate is inherently blocking—do not use it on WebFlux event loop threads. R2DBC (Reactive Relational Database Connectivity) provides non-blocking drivers and Spring Data repositories returning Mono/Flux.

Dependencies
implementation "org.springframework.boot:spring-boot-starter-data-r2dbc"
runtimeOnly "org.postgresql:r2dbc-postgresql"
Reactive repository
@Table("orders")
record Order(@Id Long id, String status, Instant createdAt) {}

interface OrderRepository extends ReactiveCrudRepository<Order, Long> {
  Flux<Order> findByStatus(String status);
}

@Service
class OrderService {
  Mono<Order> ship(Long id) {
    return orders.findById(id)
        .switchIfEmpty(Mono.error(new OrderNotFoundException(id)))
        .flatMap(o -> orders.save(o.withStatus("SHIPPED")));
  }
}
⚠ Pitfall

R2DBC lacks JPA features: no lazy loading, no @OneToMany graphs, no second-level cache. Model explicit queries. For complex ORM needs, many teams stay on MVC + JPA rather than forcing WebFlux + R2DBC.

🔬 Under the Hood

R2DBC speaks the database protocol without blocking threads waiting on sockets. Spring Data R2DBC maps rows to entities; transactions use TransactionalOperator instead of @Transactional on reactive methods (blocking transaction sync does not apply).

WebFlux vs MVC — when to choose which

The decision is not "reactive is always faster." Match the stack to workload shape, team skills, and ecosystem constraints.

FactorSpring MVCSpring WebFlux
Threading modelThread-per-request (or virtual thread)Few event-loop threads + small worker pools
Sweet spotCRUD, JPA, mature librariesI/O-bound fan-out, streaming, high concurrency with slow backends
PersistenceJPA, JDBC, Spring Data JPAR2DBC, Mongo reactive, Cassandra reactive
Learning curveLower — imperativeHigher — reactive operators, debugging stack traces
Blocking codeNatural fitMust isolate or eliminate
flowchart TD
  Q{Workload type?}
  Q -->|CRUD + JPA + typical REST| MVC[Spring MVC]
  Q -->|SSE / NDJSON streaming| WF[WebFlux]
  Q -->|Many slow outbound HTTP calls in parallel| WF
  Q -->|Backpressure end-to-end| WF
  Q -->|Heavy CPU computation| MVC[MVC + scale instances]
  Q -->|Team knows JPA only| MVC
  • I/O-bound — thread waits on network or disk. WebFlux or virtual-thread MVC both scale well; WebFlux adds backpressure and streaming primitives.
  • CPU-bound — threads actively compute. Neither model removes CPU cost; add cores, partition work, or use async job queues. Reactive does not magic away CPU.
🎯 Interview

"Would you rewrite our Spring MVC monolith to WebFlux?" — Strong answer: only with clear I/O bottlenecks, streaming requirements, or end-to-end reactive data stores—and team readiness. Otherwise MVC + virtual threads + efficient JDBC pool often wins on total cost of ownership.

Virtual threads vs reactive

Spring Boot 3.2 on Java 21 made virtual threads a first-class option for servlet stacks. You can get WebFlux-like concurrency for blocking code—without rewriting to Mono and Flux.

Enable virtual threads (MVC)
spring:
  threads:
    virtual:
      enabled: true

With virtual threads enabled, Tomcat handles each request on a lightweight virtual thread. Blocking JDBC, HTTP, and sleep no longer pin scarce platform threads—MVC becomes competitive for many I/O-heavy workloads that previously motivated WebFlux adoption.

flowchart LR
  subgraph vt["MVC + virtual threads"]
    T1[vthread 1] --> JDBC[(blocking JDBC)]
    T2[vthread 2] --> HTTP[blocking HTTP]
    T3[vthread 3..N] --> Wait[I/O wait — cheap]
  end
  subgraph rx["WebFlux + Reactor"]
    EL[Event loop] --> Mono[Mono pipeline]
    Mono --> R2DBC[(R2DBC)]
    Mono --> WC[WebClient]
  end

When reactive is still preferred

  • Backpressure — consumer-driven flow control end-to-end (Kafka → process → SSE client)
  • StreamingFlux as first-class response type; incremental parsing of large payloads
  • Full reactive data path — R2DBC, reactive Mongo, reactive Redis already in architecture
  • Gateway / BFF fan-out — compose many outbound calls with declarative operators and unified error handling

Programming model vs scalability tool

Reactive (WebFlux)Virtual threads (MVC)
What it isA programming model — pipelines, operators, typesA JVM threading feature — scalability for blocking code
Code styleMono/Flux, no block()Imperative — familiar loops, JPA, JDBC
DebuggingAsync stack traces, reactor checkpointsConventional thread dumps (virtual thread aware)
Best fitStreaming + backpressure + reactive driversExisting blocking Spring apps seeking throughput
📌 Version Note

Virtual threads require Java 21+. Pin carriers with -Djdk.virtualThreadScheduler.parallelism=N only when tuning under load. Avoid synchronized blocks on hot paths—they pin the virtual thread to a carrier (improving in newer JDKs but still worth profiling).

💡 Pro Tip

Pragmatic hybrid: MVC + virtual threads for your API and JPA layer; WebClient for selective non-blocking outbound calls where you block() at the boundary or use toFuture(). You do not need full WebFlux to benefit from reactive HTTP clients.

🌍 Real World

Netflix, LinkedIn, and others adopted reactive for specific high-throughput pipelines—not as a blanket replacement for every service. Post–Java 21, many greenfield CRUD APIs default to MVC + virtual threads unless streaming is in the requirements doc on day one.