Reactive Programming & WebFlux
Reactive is not "async for free"—it's a programming model built on non-blocking I/O, stream composition, and backpressure. Spring WebFlux brings that model to HTTP with Mono and Flux, while Java 21 virtual threads reshaped when you actually need it.
Reactive programming concepts
Reactive systems handle many concurrent I/O operations with a small thread pool by never blocking on I/O. Spring WebFlux sits on Project Reactor (Reactive Streams implementation) and Netty—not a separate language, but a different way to compose asynchronous pipelines.
The Reactive Manifesto
Four traits define reactive systems—not every trait requires WebFlux, but together they explain the design goals:
| Trait | Meaning in practice |
|---|---|
| Responsive | Consistent, low-latency responses under load; users perceive the system as always available |
| Resilient | Failure is contained and recovered—timeouts, retries, circuit breakers, isolation |
| Elastic | Scales up/down with demand without thread explosion; work is scheduled, not pinned to threads |
| Message-driven | Components communicate via async messages (events, streams) with explicit flow control |
flowchart LR C[Client] -->|HTTP request| W[WebFlux / Netty] W -->|Mono/Flux| S[Service layer] S -->|non-blocking| DB[(R2DBC / driver)] S -->|WebClient| API[External API] W -->|backpressure| C
Backpressure — the key differentiator
In imperative code, a fast producer can overwhelm a slow consumer—buffers grow until OOM. Reactive Streams define a subscription contract: the subscriber signals how many items it can handle (request(n)), and the publisher respects that limit.
This is what separates reactive from "just use CompletableFuture everywhere." Without backpressure, unbounded queues hide problems until production.
Flux.range(1, 1_000_000)
.onBackpressureBuffer(100) // buffer overflow strategy — use carefully
.limitRate(50) // prefetch cap — smoother consumer pacing
.publishOn(Schedulers.boundedElastic())
.subscribe(
data -> process(data), // slow consumer
Throwable::printStackTrace,
() -> System.out.println("done"),
sub -> sub.request(10) // explicit: pull 10 at a time
);
Reactor operators like limitRate, onBackpressureDrop, and onBackpressureLatest implement overflow policies when production outpaces consumption. Choose based on whether losing, buffering, or keeping-latest is acceptable for your domain.
Project Reactor: Mono and Flux
| Type | Items | Typical use |
|---|---|---|
| Mono<T> | 0 or 1 | Single DB row, one HTTP response body, optional result |
| Flux<T> | 0 to N (possibly infinite) | Server-Sent Events, file lines, Kafka stream, paginated cursor |
Mono.just("hello");
Mono.empty();
Mono.error(new IllegalStateException("not found"));
Flux.just("a", "b", "c");
Flux.fromIterable(List.of(1, 2, 3));
Flux.interval(Duration.ofMillis(100)).take(5); // tick stream
// Deferred — created per subscription (cold)
Mono.fromCallable(() -> repo.findById(id));
In WebFlux controllers, returning Mono<ResponseEntity<Order>> tells Spring to subscribe when the response is written— the servlet thread (or event loop thread) is not blocked waiting for the DB. Contrast with blocking MVC in Spring Web MVC.
Reactive operators
Operators transform, combine, and recover streams declaratively. Master flatMap (async sequencing), error operators, and merging patterns—the rest follows.
Transformation & filtering
// map — 1:1 sync transform
Flux.just("alice", "bob")
.map(String::toUpperCase);
// flatMap — 1:N async; inner publishers run concurrently (unordered by default)
Flux.just("user-1", "user-2")
.flatMap(id -> userClient.fetchUser(id)); // Mono<User> per id
// flatMapSequential — preserve source order with limited concurrency
Flux.just("a", "b", "c")
.flatMapSequential(this::slowLookup, 2);
// filter
Flux.range(1, 10).filter(n -> n % 2 == 0);
flatMap with high concurrency on a hot path can stampede downstream services. Cap with flatMap(fn, concurrency) or use concatMap for strict sequential execution.
Combining streams
Mono<User> user = userRepo.findById(id);
Mono<Account> account = accountRepo.findByUserId(id);
// zip — pair latest from each when both emit (waits for slowest)
Mono.zip(user, account, (u, a) -> new Dashboard(u, a));
// merge — interleave as they arrive (hot fan-in)
Flux.merge(temperatureSensor(), humiditySensor());
// concat — strictly sequential: first publisher completes, then second
Flux.concat(page1(), page2(), page3());
Defaults & fallbacks
orderRepo.findById(id)
.switchIfEmpty(Mono.error(new OrderNotFoundException(id)));
configRepo.findFlag("feature-x")
.defaultIfEmpty(false);
Error recovery & resilience
pricingClient.getRate("USD")
.timeout(Duration.ofSeconds(2))
.retryWhen(Retry.backoff(3, Duration.ofMillis(200))
.filter(ex -> ex instanceof WebClientResponseException))
.onErrorResume(WebClientResponseException.NotFound.class,
ex -> Mono.just(FallbackRate.USD_DEFAULT))
.onErrorMap(TimeoutException.class,
ex -> new ServiceUnavailableException("pricing down"));
cache
cache() turns a cold publisher into a hot, replaying one—subsequent subscribers get the same result without re-executing side effects. Useful for expensive lookups within a request scope; dangerous if the cached value becomes stale.
Mono<ExchangeRate> rate = fxClient.fetch("EUR")
.cache(Duration.ofMinutes(5)); // Reactor 3.4+ timed cache
// Multiple subscribers within 5 min share one HTTP call
rate.subscribe(r -> chargeInEur(r));
rate.subscribe(r -> auditLog(r));
"map vs flatMap?" — map transforms synchronously (T → R). flatMap transforms asynchronously (T → Publisher<R>) and flattens inner streams. Use flatMap when the next step returns a Mono/Flux (DB call, HTTP call).
Hot vs cold publishers
Cold publishers start work on each subscribe()—like a fresh HTTP call every time. Hot publishers emit regardless of subscribers—like a live ticker or message bus.
| Cold | Hot | |
|---|---|---|
| When data is produced | On subscribe | Independent of subscribers |
| Examples | Mono.fromCallable, Flux.range | Sinks.many(), Flux.share(), SSE feed |
| Missed events | N/A—starts from beginning | Late subscribers miss prior events unless buffered |
| Make cold → hot | .cache(), .share(), .publish().connect() | |
Sinks.Many<MetricEvent> sink = Sinks.many().multicast().onBackpressureBuffer();
public void emit(MetricEvent e) {
sink.tryEmitNext(e);
}
public Flux<MetricEvent> stream() {
return sink.asFlux();
}
Schedulers — where work runs
Reactor is single-threaded on the event loop by default. publishOn / subscribeOn shift execution to other thread pools.
| Scheduler | Pool | Use for |
|---|---|---|
| Schedulers.immediate() | Current thread | No thread hop; stay on event loop |
| Schedulers.parallel() | CPU-sized pool | CPU-bound transforms (parsing, crypto)—keep work short |
| Schedulers.boundedElastic() | Grows on demand, capped | Blocking calls you cannot eliminate (legacy JDBC, file I/O) |
| Schedulers.single() | One thread | Ordered side effects, single-writer scenarios |
Mono.fromCallable(() -> legacyBlockingApi())
.subscribeOn(Schedulers.boundedElastic()) // subscription + source run on elastic
.map(this::transform)
.publishOn(Schedulers.parallel()) // downstream operators on parallel
.subscribe(result -> { });
Never call blocking JDBC, Thread.sleep, or block() on the Netty event loop thread—it stalls all connections on that loop. Offload to boundedElastic or switch to R2DBC.
subscribe() vs block()
subscribe() is non-blocking: registers callbacks and returns immediately. block() / blockOptional() waits for completion on the calling thread—convenient in tests and CLI tools, forbidden on reactive server threads.
// ✅ JUnit test — main thread, no event loop
@Test
void findOrder() {
Order order = orderService.findById(1L).block();
assertThat(order.id()).isEqualTo(1L);
}
// ✅ Spring MVC controller bridging to reactive client (use sparingly)
@GetMapping("/sync-report")
Report syncReport() {
return reportClient.fetch().block(Duration.ofSeconds(5));
}
// ❌ WebFlux controller — never block
@GetMapping("/orders/{id}")
Mono<Order> getOrder(@PathVariable Long id) {
return orderService.findById(id); // return Mono, don't .block()
}
Spring WebFlux
WebFlux is Spring's reactive web stack on Netty (default), Jetty, or Undertow. Add spring-boot-starter-webflux—do not mix spring-boot-starter-web in the same app (one stack per application).
Annotation model — same surface as MVC
@RestController, @GetMapping, @PathVariable, and validation annotations work the same as in Web MVC—the difference is the return type and the runtime (DispatcherHandler vs DispatcherServlet).
@RestController
@RequestMapping("/api/orders")
class OrderController {
private final OrderService orders;
@GetMapping("/{id}")
Mono<OrderDto> get(@PathVariable Long id) {
return orders.findById(id).map(OrderDto::from);
}
@GetMapping
Flux<OrderDto> list(@RequestParam(defaultValue = "0") int page) {
return orders.findPage(page).map(OrderDto::from);
}
@PostMapping
Mono<ResponseEntity<OrderDto>> create(@Valid @RequestBody Mono<CreateOrderRequest> body) {
return body
.flatMap(orders::create)
.map(dto -> ResponseEntity.status(HttpStatus.CREATED).body(dto));
}
@GetMapping(value = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
Flux<OrderEvent> stream() {
return orders.eventStream();
}
}
For @RequestBody Mono<T>, Spring subscribes once and binds the body reactively—useful for large payloads or multipart streams without buffering entirely in memory.
Functional endpoints — RouterFunction + HandlerFunction
An alternative to annotations: route definitions as beans, handlers as functions. Fits DSL-style APIs, gateway routing tables, and teams that prefer composition over reflection.
@Configuration
class OrderRoutes {
@Bean
RouterFunction<ServerResponse> routes(OrderHandler handler) {
return RouterFunctions.route()
.GET("/api/orders/{id}", handler::getOrder)
.GET("/api/orders", handler::listOrders)
.POST("/api/orders", handler::createOrder)
.build();
}
}
@Component
class OrderHandler {
Mono<ServerResponse> getOrder(ServerRequest req) {
Long id = Long.parseLong(req.pathVariable("id"));
return orderService.findById(id)
.map(dto -> ServerResponse.ok().bodyValue(dto))
.switchIfEmpty(ServerResponse.notFound().build());
}
}
You can mix annotation controllers and RouterFunction in one app—both register on the same DispatcherHandler.
WebClient — reactive HTTP client
WebClient replaces blocking RestTemplate for reactive apps. It works in MVC apps too (fire-and-forget outbound calls)—but shines when composed into Mono/Flux pipelines.
@Bean
WebClient paymentClient(WebClient.Builder builder) {
return builder
.baseUrl("https://api.payments.example")
.defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
.filter(ExchangeFilterFunction.ofRequestProcessor(req ->
Mono.just(ClientRequest.from(req)
.header("X-Request-Id", UUID.randomUUID().toString())
.build())))
.build();
}
retrieve() vs exchangeToMono()
| API | Behavior | Prefer when |
|---|---|---|
| retrieve() | Assumes 4xx/5xx are errors; simpler body decoding | Most CRUD-style calls with standard error handling |
| exchangeToMono() | Full access to ClientResponse; you decide how to handle status | Custom status mapping, HEAD requests, reading headers before body |
// retrieve — idiomatic
Mono<PaymentResult> charge = client.post()
.uri("/charges")
.bodyValue(new ChargeRequest(100, "USD"))
.retrieve()
.onStatus(HttpStatusCode::is4xxClientError,
resp -> resp.bodyToMono(ErrorBody.class)
.map(body -> new PaymentRejectedException(body.message())))
.bodyToMono(PaymentResult.class);
// exchangeToMono — full control
Mono<RateQuote> quote = client.get()
.uri("/rates/{ccy}", "EUR")
.exchangeToMono(resp -> {
if (resp.statusCode().is2xxSuccessful()) {
return resp.bodyToMono(RateQuote.class);
}
if (resp.statusCode().equals(HttpStatus.NOT_FOUND)) {
return Mono.just(RateQuote.fallback("EUR"));
}
return resp.createException().flatMap(Mono::error);
});
Error handling, retry, timeout
public Mono<InventorySnapshot> fetchInventory(String sku) {
return client.get()
.uri("/inventory/{sku}", sku)
.retrieve()
.bodyToMono(InventorySnapshot.class)
.timeout(Duration.ofSeconds(3))
.retryWhen(Retry.backoff(3, Duration.ofMillis(100))
.maxBackoff(Duration.ofSeconds(2))
.filter(ex -> ex instanceof WebClientRequestException))
.onErrorResume(WebClientResponseException.NotFound.class,
ex -> Mono.just(InventorySnapshot.empty(sku)));
}
Streaming responses
Flux<LogLine> tailLogs = client.get()
.uri("/logs/stream")
.accept(MediaType.TEXT_EVENT_STREAM)
.retrieve()
.bodyToFlux(LogLine.class);
// Proxy stream through your controller
@GetMapping(value = "/logs", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
Flux<LogLine> proxyLogs() {
return tailLogs;
}
Test WebClient with @WebFluxTest + WebTestClient or MockWebServer (OkHttp). For contract tests against external APIs, combine with WireMock from the Testing chapter.
R2DBC — reactive relational access
JPA/Hibernate is inherently blocking—do not use it on WebFlux event loop threads. R2DBC (Reactive Relational Database Connectivity) provides non-blocking drivers and Spring Data repositories returning Mono/Flux.
implementation "org.springframework.boot:spring-boot-starter-data-r2dbc"
runtimeOnly "org.postgresql:r2dbc-postgresql"
@Table("orders")
record Order(@Id Long id, String status, Instant createdAt) {}
interface OrderRepository extends ReactiveCrudRepository<Order, Long> {
Flux<Order> findByStatus(String status);
}
@Service
class OrderService {
Mono<Order> ship(Long id) {
return orders.findById(id)
.switchIfEmpty(Mono.error(new OrderNotFoundException(id)))
.flatMap(o -> orders.save(o.withStatus("SHIPPED")));
}
}
R2DBC lacks JPA features: no lazy loading, no @OneToMany graphs, no second-level cache. Model explicit queries. For complex ORM needs, many teams stay on MVC + JPA rather than forcing WebFlux + R2DBC.
R2DBC speaks the database protocol without blocking threads waiting on sockets. Spring Data R2DBC maps rows to entities; transactions use TransactionalOperator instead of @Transactional on reactive methods (blocking transaction sync does not apply).
WebFlux vs MVC — when to choose which
The decision is not "reactive is always faster." Match the stack to workload shape, team skills, and ecosystem constraints.
| Factor | Spring MVC | Spring WebFlux |
|---|---|---|
| Threading model | Thread-per-request (or virtual thread) | Few event-loop threads + small worker pools |
| Sweet spot | CRUD, JPA, mature libraries | I/O-bound fan-out, streaming, high concurrency with slow backends |
| Persistence | JPA, JDBC, Spring Data JPA | R2DBC, Mongo reactive, Cassandra reactive |
| Learning curve | Lower — imperative | Higher — reactive operators, debugging stack traces |
| Blocking code | Natural fit | Must isolate or eliminate |
flowchart TD
Q{Workload type?}
Q -->|CRUD + JPA + typical REST| MVC[Spring MVC]
Q -->|SSE / NDJSON streaming| WF[WebFlux]
Q -->|Many slow outbound HTTP calls in parallel| WF
Q -->|Backpressure end-to-end| WF
Q -->|Heavy CPU computation| MVC[MVC + scale instances]
Q -->|Team knows JPA only| MVC
- I/O-bound — thread waits on network or disk. WebFlux or virtual-thread MVC both scale well; WebFlux adds backpressure and streaming primitives.
- CPU-bound — threads actively compute. Neither model removes CPU cost; add cores, partition work, or use async job queues. Reactive does not magic away CPU.
"Would you rewrite our Spring MVC monolith to WebFlux?" — Strong answer: only with clear I/O bottlenecks, streaming requirements, or end-to-end reactive data stores—and team readiness. Otherwise MVC + virtual threads + efficient JDBC pool often wins on total cost of ownership.
Virtual threads vs reactive
Spring Boot 3.2 on Java 21 made virtual threads a first-class option for servlet stacks. You can get WebFlux-like concurrency for blocking code—without rewriting to Mono and Flux.
spring:
threads:
virtual:
enabled: true
With virtual threads enabled, Tomcat handles each request on a lightweight virtual thread. Blocking JDBC, HTTP, and sleep no longer pin scarce platform threads—MVC becomes competitive for many I/O-heavy workloads that previously motivated WebFlux adoption.
flowchart LR
subgraph vt["MVC + virtual threads"]
T1[vthread 1] --> JDBC[(blocking JDBC)]
T2[vthread 2] --> HTTP[blocking HTTP]
T3[vthread 3..N] --> Wait[I/O wait — cheap]
end
subgraph rx["WebFlux + Reactor"]
EL[Event loop] --> Mono[Mono pipeline]
Mono --> R2DBC[(R2DBC)]
Mono --> WC[WebClient]
end
When reactive is still preferred
- Backpressure — consumer-driven flow control end-to-end (Kafka → process → SSE client)
- Streaming — Flux as first-class response type; incremental parsing of large payloads
- Full reactive data path — R2DBC, reactive Mongo, reactive Redis already in architecture
- Gateway / BFF fan-out — compose many outbound calls with declarative operators and unified error handling
Programming model vs scalability tool
| Reactive (WebFlux) | Virtual threads (MVC) | |
|---|---|---|
| What it is | A programming model — pipelines, operators, types | A JVM threading feature — scalability for blocking code |
| Code style | Mono/Flux, no block() | Imperative — familiar loops, JPA, JDBC |
| Debugging | Async stack traces, reactor checkpoints | Conventional thread dumps (virtual thread aware) |
| Best fit | Streaming + backpressure + reactive drivers | Existing blocking Spring apps seeking throughput |
Virtual threads require Java 21+. Pin carriers with -Djdk.virtualThreadScheduler.parallelism=N only when tuning under load. Avoid synchronized blocks on hot paths—they pin the virtual thread to a carrier (improving in newer JDKs but still worth profiling).
Pragmatic hybrid: MVC + virtual threads for your API and JPA layer; WebClient for selective non-blocking outbound calls where you block() at the boundary or use toFuture(). You do not need full WebFlux to benefit from reactive HTTP clients.
Netflix, LinkedIn, and others adopted reactive for specific high-throughput pipelines—not as a blanket replacement for every service. Post–Java 21, many greenfield CRUD APIs default to MVC + virtual threads unless streaming is in the requirements doc on day one.