Service Communication
Once services are bounded, they must talk—synchronously for queries and commands that need an immediate answer, asynchronously when decoupling and resilience matter more than milliseconds. This chapter covers protocols, discovery, gateways, and messaging with the Spring/Java implementations teams ship in production.
How services talk — the decision space
Communication style is an architecture choice with long-lived consequences for availability, consistency, and team autonomy—not a framework checkbox.
In a monolith, a method call is cheap and failure is binary: the process is up or down. In microservices, every interaction crosses the network, passes through load balancers and possibly sidecars, and can fail partially while other paths succeed. Your communication patterns determine whether a slow Inventory service stalls Checkout entirely or only delays a background notification.
Two families dominate: synchronous request/response (HTTP REST, gRPC, GraphQL) and asynchronous messaging (Kafka, RabbitMQ, SNS/SQS). Most production platforms use both deliberately: sync on the critical user path where the UI needs an answer now; async for fan-out, analytics, and workflows that tolerate seconds of delay.
flowchart LR
subgraph sync [Synchronous path]
C1[Client] --> GW[Gateway]
GW --> S1[Order Service]
S1 --> S2[Payment Service]
end
subgraph async [Asynchronous path]
S1 --> K[(Kafka)]
K --> S3[Analytics]
K --> S4[Email]
end
More sync calls mean simpler mental models and strong read-your-writes UX; more async means higher availability under partial failure but harder debugging and eventual consistency everywhere downstream.
REST over HTTP/HTTPS — still the default
JSON over HTTP remains the lingua franca for public APIs, BFFs, and many internal calls because it is debuggable with curl, visible in proxies, and understood by every language.
REST fits resource-oriented domains: you model nouns, use standard verbs, and leverage HTTP semantics (caching, status codes, content negotiation). For internal microservices, teams often standardize on JSON + OpenAPI (see contract-first design) and enforce timeouts and retries at the client (covered in Resilience Patterns).
Spring client options
| Client | Characteristics | When to use |
|---|---|---|
| RestTemplate | Blocking, mature, maintenance mode. | Legacy codebases; avoid for new services. |
| WebClient | Reactive/non-blocking, Spring 5+ default. | New code; integrates with Project Reactor; configure connection pools. |
| OpenFeign | Declarative interfaces + Spring Cloud integration. | Many similar CRUD calls; pair with LoadBalancer + resilience4j. |
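// Configuration class: one WebClient bean per downstream service.
@Bean
WebClient inventoryWebClient(WebClient.Builder builder) {
    // reactor.netty.http.client.HttpClient: give up if no response arrives within 2 s
    HttpClient http = HttpClient.create()
        .responseTimeout(Duration.ofSeconds(2));
    return builder
        .clientConnector(new ReactorClientHttpConnector(http))
        .baseUrl("http://inventory-service")
        .build();
}

// Calling service: inventoryWebClient is the bean above, injected as a field.
public Mono<StockLevel> fetchStock(String sku) {
    return inventoryWebClient.get()
        .uri("/api/v1/stock/{sku}", sku)   // URI template variable is encoded by WebClient
        .retrieve()
        .bodyToMono(StockLevel.class);
}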
Without connection reuse, each REST call allocates a socket, completes a TCP (and usually TLS) handshake, serializes JSON on the CPU, and waits on network round trips. At hundreds of RPS per instance, connection pool sizing and keep-alive matter as much as algorithm choice.
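The pooling and timeout advice above can also be sketched without Spring, using the JDK's built-in `java.net.http` client (Java 11+). This is a minimal illustration, not the chapter's implementation: the class name `StockRequests` is hypothetical, the 1 s and 2 s limits are arbitrary, and the SKU is assumed to be URL-safe.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Builds inventory requests that are sent through one shared HTTP client. */
final class StockRequests {
    // One client per process: it owns the connection pool, so keep-alive
    // connections are reused instead of paying TCP/TLS setup on every call.
    static final HttpClient SHARED = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(1))   // bound connection setup
            .build();

    /** GET {baseUrl}/api/v1/stock/{sku} with a 2-second response timeout. */
    static HttpRequest stockRequest(String baseUrl, String sku) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/v1/stock/" + sku)) // assumes a URL-safe SKU
                .timeout(Duration.ofSeconds(2))                    // per-request deadline
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

A caller would send the request with `StockRequests.SHARED.send(request, HttpResponse.BodyHandlers.ofString())`; creating a new client per call would discard the pool and reintroduce the handshake cost.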
gRPC — typed, fast, HTTP/2-native
gRPC uses Protocol Buffers for schema-first contracts and HTTP/2 for multiplexing, header compression, and optional bidirectional streaming.
Where REST sends self-describing JSON text, gRPC sends compact binary frames with strongly typed messages generated from .proto files. That yields smaller payloads and faster parsing—often 5–10× throughput improvements on high-volume internal paths, though benchmarks depend on message size and hardware. Unary calls feel like RPC; streaming suits live feeds, log tailing, and large result sets without pagination hacks.
REST vs gRPC (internal services)
| Aspect | REST + JSON | gRPC + Protobuf |
|---|---|---|
| Contract | OpenAPI, optional codegen | Protobuf required; strict evolution rules |
| Browser / public API | Native | Needs grpc-web proxy |
| Streaming | SSE, chunked HTTP (awkward) | First-class server/client/bidi streams |
| Debugging | curl, browser devtools | grpcurl, wireshark with proto defs |
| Load balancers | L7 HTTP familiar | Need L7 aware of HTTP/2 streams or client-side LB |
syntax = "proto3";
package orders.v1;

service OrderService {
  rpc GetOrder(GetOrderRequest) returns (OrderResponse);                // unary
  rpc StreamOrderEvents(StreamRequest) returns (stream OrderEvent);     // server streaming
}

message GetOrderRequest {
  string order_id = 1;
}
// OrderResponse, StreamRequest, and OrderEvent definitions are omitted here.
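To make "compact binary frames" concrete, the sketch below hand-encodes the `GetOrderRequest` message using the Protobuf wire format (a tag byte, a varint length, then the UTF-8 bytes) and compares it with an equivalent JSON string. It is for illustration only: real services use the stubs generated by `protoc`, the class name `WireSketch` is hypothetical, and real encoders omit empty (default) fields, which this sketch does not.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hand-rolled encoding of GetOrderRequest { string order_id = 1; }. */
final class WireSketch {
    /** Wire format for a string field: tag byte, varint length, UTF-8 bytes. */
    static byte[] encodeGetOrderRequest(String orderId) {
        byte[] value = orderId.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((1 << 3) | 2);           // field number 1, wire type 2 (length-delimited) = 0x0A
        writeVarint(out, value.length);
        out.write(value, 0, value.length);
        return out.toByteArray();
    }

    /** Base-128 varint: 7 payload bits per byte, high bit set while more bytes follow. */
    static void writeVarint(ByteArrayOutputStream out, int n) {
        while ((n & ~0x7F) != 0) {
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
    }

    /** The same request as JSON text, for a size comparison. */
    static byte[] encodeJson(String orderId) {
        return ("{\"order_id\":\"" + orderId + "\"}").getBytes(StandardCharsets.UTF_8);
    }
}
```

For the order ID `abc`, the binary form is 5 bytes and the JSON form is 18 bytes. The field name never travels on the wire, which is why field numbers must stay stable as the schema evolves.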
Spring Boot 3 services commonly integrate gRPC through the community-maintained grpc-spring-boot-starter: generated stubs, server interceptors for auth, and metadata propagation for tracing. Pair with service mesh mTLS for zero-trust internal calls.
Use gRPC for hot internal east-west traffic (payments ↔ ledger); keep REST at the north-south edge for mobile/web clients unless you standardize on grpc-web everywhere.
GraphQL — flexible queries, federation complexity
GraphQL lets clients request exactly the fields they need in one round trip—powerful for varied UIs, challenging when ownership is split across many microservices.
In a monolith or single BFF, GraphQL resolvers map cleanly to database queries. In microservices, each field might call another service, so the classic N+1 query problem becomes N+1 HTTP/gRPC calls unless you batch with DataLoader or pre-join via a dedicated read model (CQRS). Apollo Federation (and similar tools) lets each service expose a subgraph; a gateway composes them into the supergraph, but operational ownership of schema changes becomes a platform concern.
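The batching idea behind DataLoader can be sketched in plain Java: resolvers only register the keys they need, and a single dispatch step turns them into one downstream call. This is a simplified, synchronous illustration under stated assumptions. `BatchLoader` and the bulk-fetch function are hypothetical names, and real DataLoader implementations return futures and dispatch automatically.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Collects keys during resolution, then fetches them in one batched call. */
final class BatchLoader<K, V> {
    private final Function<Set<K>, Map<K, V>> batchFetch; // hypothetical bulk endpoint
    private final Set<K> pending = new LinkedHashSet<>();
    private final Map<K, V> loaded = new HashMap<>();

    BatchLoader(Function<Set<K>, Map<K, V>> batchFetch) {
        this.batchFetch = batchFetch;
    }

    /** Called once per field resolver: records the key, performs no I/O. */
    void request(K key) {
        if (!loaded.containsKey(key)) pending.add(key);
    }

    /** Called once per resolution level: one downstream call for all pending keys. */
    void dispatch() {
        if (pending.isEmpty()) return;
        loaded.putAll(batchFetch.apply(new LinkedHashSet<>(pending)));
        pending.clear();
    }

    /** Returns the loaded value, or null if the key was never dispatched. */
    V get(K key) {
        return loaded.get(key);
    }
}
```

With this shape, resolving the product name for 50 order lines costs one call to the catalog service instead of 50, provided that service exposes a bulk lookup.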
GraphQL fits when:
- Multiple clients (web, iOS, Android) need different field shapes from the same backend.
- You expose a BFF or federated graph with a team maintaining schema governance.
- Read-heavy workloads tolerate resolver-level caching.
Avoid GraphQL as the default internal protocol for simple CRUD between services, for deep graphs under strict latency SLOs, or where observability for resolver fan-out is immature.
Anti-pattern: exposing GraphQL directly on every microservice without a gateway. Clients must then know the service topology, and you lose centralized rate limiting and auth policy enforcement.
Choosing synchronous vs asynchronous
Sync optimizes for immediate feedback and simpler failure semantics; async optimizes for decoupling, buffering, and surviving downstream outages.
| Factor | Prefer sync | Prefer async |
|---|---|---|
| User waiting | Yes — checkout total, auth gate | No — send receipt email later |
| Consistency | Need answer now (with caveats) | Eventual OK; saga compensations |
| Fan-out | One dependency | Many subscribers (OrderPlaced → 5 services) |
| Peak load | Spike hits callee immediately | Queue absorbs spike; consumer scales |
| Failure | Callee down → caller fails fast | Messages retained; retry when healthy |
For a “design food delivery” exercise: place the order synchronously through Payment authorization, handle driver assignment and ETA updates asynchronously via events, and explain what the user sees if the async pipeline lags (optimistic UI vs polling).
Service discovery — finding instances in a dynamic world
Hardcoded IPs and static host lists break the moment Kubernetes reschedules a pod or autoscaling adds a replica. Discovery tells clients where healthy instances live right now.
The problem: Order Service might run as 10.0.14.7:8080 at noon and 10.0.22.19:8080 after a deploy. DNS names help, but something must register and health-check instances and expose metadata (zone, version, weight).
Client-side discovery
The client (or a library in-process) queries a registry, caches instance lists, and load-balances locally. Netflix Eureka and HashiCorp Consul popularized this in JVM ecosystems. Spring Cloud Netflix Eureka: @EnableEurekaServer on registry; services register with spring-cloud-starter-netflix-eureka-client; callers use Spring Cloud LoadBalancer (Ribbon is deprecated) with @LoadBalanced RestTemplate or Feign.
spring:
  application:
    name: inventory-service
eureka:
  client:
    service-url:
      defaultZone: http://eureka:8761/eureka/
  instance:
    prefer-ip-address: true
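What a registry does with those registrations can be sketched as a heartbeat table with a time-to-live: instances renew periodically, and lookups return only instances seen recently. This is a toy model, not Eureka's implementation. `ToyRegistry`, the TTL value, and the explicit `nowMillis` parameter (used to keep the example deterministic) are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy service registry: instances heartbeat; stale ones drop out of lookups. */
final class ToyRegistry {
    private final long ttlMillis;
    // service name -> (instance address -> time of last heartbeat in ms)
    private final Map<String, Map<String, Long>> heartbeats = new ConcurrentHashMap<>();

    ToyRegistry(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Registration and renewal are the same operation: record "seen at nowMillis". */
    void heartbeat(String service, String address, long nowMillis) {
        heartbeats.computeIfAbsent(service, s -> new ConcurrentHashMap<>())
                  .put(address, nowMillis);
    }

    /** Addresses whose last heartbeat is within the TTL, sorted for stable output. */
    List<String> healthyInstances(String service, long nowMillis) {
        List<String> healthy = new ArrayList<>();
        heartbeats.getOrDefault(service, Map.of()).forEach((address, seenAt) -> {
            if (nowMillis - seenAt <= ttlMillis) healthy.add(address);
        });
        Collections.sort(healthy);
        return healthy;
    }
}
```

A client-side discovery library would cache the result of `healthyInstances` and refresh it periodically, which is why clients can briefly route to an instance that has just died.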
Server-side discovery
The client talks to a stable virtual IP or DNS name; a load balancer or platform router consults the registry and forwards. AWS ALB/NLB, Kubernetes Service objects, and cloud mesh ingress work this way—the client stays dumb; infrastructure picks the pod.
DNS-based discovery (Kubernetes)
CoreDNS resolves inventory-service.default.svc.cluster.local to the ClusterIP Service, which kube-proxy or eBPF dataplane load-balances to endpoints. No Eureka required on K8s-native estates if you accept platform coupling.
sequenceDiagram
participant Client
participant LB as Load Balancer / K8s Service
participant Reg as Registry (optional)
participant Svc as Inventory Pod
Client->>LB: GET /stock (inventory-service)
LB->>Reg: lookup healthy instances
Reg-->>LB: 10.0.1.4, 10.0.1.9
LB->>Svc: forward request
Svc-->>Client: 200 OK
Netflix built Eureka for AWS EC2 volatility; many teams on Kubernetes dropped Eureka in favor of K8s Services + mesh, keeping Eureka only for VM/bare-metal mixed estates.
Load balancing — spreading work across instances
Without balancing, the first instance in a list absorbs traffic until it melts while siblings idle—a failure mode even junior on-call runbooks mention.
Client-side balancing
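Spring Cloud LoadBalancer chooses an instance from the registry per request. Round-robin is the default and random is built in; weighted strategies (Ribbon's legacy rules included weighted response time, which favors faster nodes) need explicit configuration or a custom implementation. Override the algorithm per service with @LoadBalancerClient and a custom ReactorLoadBalancer bean built from LoadBalancerClientFactory when the defaults are naive for your latency profile.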
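The round-robin strategy itself is small enough to sketch in plain Java. This is an illustration of the algorithm, not Spring Cloud LoadBalancer's code; `RoundRobinChooser` is a hypothetical name, and the instance list is passed in on every call because registry contents change between requests.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Client-side round-robin over whatever instance list the registry currently reports. */
final class RoundRobinChooser {
    private final AtomicInteger position = new AtomicInteger();

    /** Picks the next instance in rotation; thread-safe via the atomic counter. */
    String choose(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalStateException("no healthy instances");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(position.getAndIncrement(), instances.size());
        return instances.get(index);
    }
}
```

Round-robin ignores how busy or slow each instance is, which is the motivation for weighted or latency-aware strategies on services with uneven response times.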
Server-side balancing
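Nginx, HAProxy, AWS ALB, and Kubernetes Services terminate or forward client connections and pick backends. TLS often ends at the LB. HTTP/2 and gRPC multiplex many requests over one long-lived connection, so a connection-level (L4) balancer pins all of a client's traffic to a single backend; spreading load per request needs an L7 proxy that understands HTTP/2 streams, or client-side balancing.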
Sticky sessions — anti-pattern in microservices
Session affinity (cookie → same pod) fights autoscaling and rolling deploys: draining a node breaks users mid-session. Prefer stateless services with JWT or server-side session stores in Redis if you must share session state. Sticky sessions are a legacy monolith habit, not a microservices default.
Anti-pattern: enabling sticky sessions to “fix” in-memory shopping carts instead of storing cart state in Redis or on the client. Deployments become painful, and failover fails because the cart lives only on the lost pod.
API gateway — single north-south entry point
A gateway terminates TLS, authenticates callers, routes to internal services, and applies cross-cutting policies so domain services stay focused on business logic.
Typical responsibilities:
- Routing and path rewriting (/api/orders/** → Order Service)
- Authentication and JWT validation; optional token relay downstream
- Rate limiting and WAF integration
- Request/response transformation (header injection, legacy XML → JSON)
- Response caching for idempotent GETs
- Circuit breaking at the edge (often duplicated with service-level resilience—coordinate policies)
Spring Cloud Gateway
Reactive (WebFlux) gateway using RouteLocator, predicates (Path, Header, Cookie), and filters (RewritePath, AddRequestHeader, RequestRateLimiter). GlobalFilter beans run on every route—ideal for correlation IDs and auth. Custom filters implement org-specific audit logging.
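@Bean
RouteLocator routes(RouteLocatorBuilder builder) {
    return builder.routes()
        .route("orders", r -> r.path("/api/v1/orders/**")
            .filters(f -> f
                .stripPrefix(0)   // no-op: the full path is forwarded unchanged
                .requestRateLimiter(c -> c
                    .setRateLimiter(redisRateLimiter())
                    // KeyResolver must return Mono<String>; here the key is the client IP
                    .setKeyResolver(exchange -> Mono.just(
                        exchange.getRequest().getRemoteAddress().getAddress().getHostAddress())))
                .circuitBreaker(c -> c.setName("orderCb")))
            .uri("lb://order-service"))   // lb:// resolves through Spring Cloud LoadBalancer
        .build();
}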
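The rate limiter in that route is configured with a replenish rate and a burst capacity, which are the two parameters of a token bucket. The sketch below shows the algorithm in-process and in plain Java, under stated assumptions: `TokenBucket` is a hypothetical class, time is passed in explicitly for determinism, and a Redis-backed limiter keeps equivalent state in Redis so that all gateway replicas share it.

```java
/** In-memory token bucket: steady refill rate plus a bounded burst. */
final class TokenBucket {
    private final double replenishPerSecond;
    private final double burstCapacity;
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(double replenishPerSecond, double burstCapacity, long nowMillis) {
        this.replenishPerSecond = replenishPerSecond;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;       // start full so an initial burst is allowed
        this.lastRefillMillis = nowMillis;
    }

    /** Returns true and consumes one token if a request is allowed at nowMillis. */
    synchronized boolean tryAcquire(long nowMillis) {
        double elapsedSeconds = (nowMillis - lastRefillMillis) / 1000.0;
        tokens = Math.min(burstCapacity, tokens + elapsedSeconds * replenishPerSecond);
        lastRefillMillis = nowMillis;
        if (tokens < 1.0) {
            return false;                  // a gateway would answer 429 Too Many Requests
        }
        tokens -= 1.0;
        return true;
    }
}
```

One bucket per key (client IP, API key, or user) gives per-caller limits; the key resolver in the route decides which bucket a request draws from.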
Alternatives
- Kong / APISIX — plugin ecosystem, declarative config, often paired with K8s Ingress.
- AWS API Gateway — managed, Lambda/VPC links; good for serverless-heavy estates.
- Nginx / Envoy — high performance; more manual config unless wrapped by mesh/Gateway API.
A gateway adds one network hop and becomes a shared failure point: run it HA (multiple replicas), health-check aggressively, and keep business rules out of 500-line filter chains that nobody tests.
Backend for Frontend (BFF)
Instead of one generic API for every client, a BFF shapes responses per experience—mobile gets minimal payloads; admin web gets rich aggregates.
A single “god gateway” accumulating every client’s special case becomes unmaintainable. The BFF pattern assigns a small backend (often a Spring Boot app) per client type: mobile-bff, partner-bff. Each BFF orchestrates calls to domain microservices, handles pagination and field selection, and owns release cadence with its UI team.
BFFs are allowed to be chatty internally; they should not leak orchestration into domain services via “special query params for the app team.” Domain services stay generic; BFFs absorb presentation-driven change.
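A BFF endpoint typically calls several domain services in parallel and returns only the fields its client renders. The sketch below illustrates that shape with `CompletableFuture`; every name in it (`MobileOrderBff`, the record types, the two client functions) is hypothetical, and the domain clients are injected as plain functions so the example stays framework-free.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Hypothetical mobile BFF: two domain calls in parallel, one trimmed response. */
final class MobileOrderBff {
    record OrderDetails(String orderId, String status, String internalNotes) {}
    record Shipment(String orderId, String carrier, String eta) {}
    /** Mobile payload: only what the order screen renders. */
    record MobileOrderView(String orderId, String status, String eta) {}

    private final Function<String, CompletableFuture<OrderDetails>> orderClient;
    private final Function<String, CompletableFuture<Shipment>> shippingClient;

    MobileOrderBff(Function<String, CompletableFuture<OrderDetails>> orderClient,
                   Function<String, CompletableFuture<Shipment>> shippingClient) {
        this.orderClient = orderClient;
        this.shippingClient = shippingClient;
    }

    /** Starts both calls before waiting on either, then merges the results. */
    CompletableFuture<MobileOrderView> orderView(String orderId) {
        CompletableFuture<OrderDetails> order = orderClient.apply(orderId);
        CompletableFuture<Shipment> shipment = shippingClient.apply(orderId);
        return order.thenCombine(shipment,
                (o, s) -> new MobileOrderView(o.orderId(), o.status(), s.eta()));
    }
}
```

The domain services return their generic representations; dropping `internalNotes` and `carrier` is a presentation decision that lives in the BFF, so it can change with the mobile release cycle.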
SoundCloud and Spotify documented BFF-style layers to tame mobile bandwidth limits while keeping core catalog services client-agnostic.
Asynchronous communication fundamentals
Messaging decouples producers from consumers in time and space—the producer does not wait for subscribers to finish, and new subscribers can appear without redeploying the sender.
Message vs event
A command message tells a specific consumer to do something: “ReserveInventory” with a reply address—point-to-point, often one queue. An event announces a fact: “OrderPlaced”—zero or many subscribers on a topic. Commands couple names and intent; events couple less if schemas evolve carefully.
Point-to-point vs publish-subscribe
- Queue — one consumer group processes each message once (competing consumers scale throughput).
- Topic — each subscriber group gets a copy (fan-out for analytics, email, search indexing).
Why async helps resilience: if Email Service is down, Kafka retains OrderPlaced events until consumers recover (within retention limits). Producers stay up; users see “order confirmed” while email arrives minutes later—design UX accordingly.
You trade immediate consistency for complexity: duplicate messages require idempotent consumers; ordering guarantees vary by broker partition design.
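An idempotent consumer can be as simple as remembering which message IDs have already been handled. The following is a minimal in-memory sketch, assuming each message carries a unique ID; `IdempotentConsumer` is a hypothetical name, and in production the set of processed IDs would normally live in the same database transaction as the side effect (for example, a table with a unique constraint).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Wraps a handler so that redelivered messages do not repeat side effects. */
final class IdempotentConsumer<T> {
    // Stand-in for a durable "processed messages" table.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<T> handler;

    IdempotentConsumer(Consumer<T> handler) {
        this.handler = handler;
    }

    /** Returns true if the handler ran, false if messageId was a duplicate. */
    boolean onMessage(String messageId, T payload) {
        if (!processedIds.add(messageId)) {
            return false;                        // already handled: skip side effects
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            processedIds.remove(messageId);      // allow the broker's redelivery to retry
            throw e;
        }
    }
}
```

With this guard in place, at-least-once delivery from the broker yields effectively-once processing for the handler's side effects.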
Apache Kafka — the distributed commit log
Kafka stores records in append-only partitions, replicates for durability, and lets consumer groups track offsets—making it the default event backbone at scale.
Core concepts
- Topic — named stream split into partitions for parallelism.
- Partition — ordered log; key hash routes related events to same partition for ordering per key.
- Producer — sends records with acks config (0, 1, all) trading speed vs durability.
- Consumer group — each partition assigned to one consumer in group; scale consumers ≤ partitions.
- Offset — position in log; committed to __consumer_offsets or external store.
- Log compaction — retains latest record per key—useful for changelog topics (KTables).
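Key-based routing from the list above comes down to a hash of the key modulo the partition count. The sketch below uses `String.hashCode()` as a stand-in, which is an assumption made for brevity: Kafka's default partitioner hashes the serialized key bytes with murmur2, so real partition numbers will differ. `KeyPartitioner` is a hypothetical name.

```java
/** Illustrates key-to-partition routing; the hash function is a stand-in, not Kafka's. */
final class KeyPartitioner {
    /**
     * Same key and same partition count always give the same partition,
     * which is what preserves per-key ordering.
     */
    static int partitionFor(String key, int partitionCount) {
        // floorMod avoids negative results for negative hash codes.
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}
```

Because the partition count is part of the formula, changing it remaps keys: in this sketch the key `h` maps to partition 2 with 6 partitions and to partition 0 with 8, so events for one key written before and after a resize can sit in different partitions.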
Delivery semantics
- At-most-once: commit the offset before processing; messages may be lost on a crash.
- At-least-once: process, then commit; duplicates are possible without idempotency.
- Exactly-once (EOS): transactional producer + idempotent producer + read-process-write with Kafka Streams or careful Spring Kafka config; still requires idempotent side effects in external databases.
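The difference between at-most-once and at-least-once is only the order of two steps, which a small simulation makes visible. This is a toy model, not Kafka client code: the log is a list, the committed offset is an integer, and a single simulated crash happens between the two steps for one record. `DeliverySemantics` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates one consumer crash to contrast commit-then-process with process-then-commit. */
final class DeliverySemantics {
    /**
     * Consumes the whole log, crashing once between "commit" and "process"
     * (in whichever order) at index crashAt, and returns the records handled.
     */
    static List<String> run(List<String> log, boolean commitFirst, int crashAt) {
        List<String> handled = new ArrayList<>();
        int committed = 0;                 // stand-in for the stored consumer offset
        boolean crashPending = true;
        while (committed < log.size()) {   // each pass is one consumer (re)start
            for (int offset = committed; offset < log.size(); offset++) {
                boolean crashNow = crashPending && offset == crashAt;
                if (commitFirst) {
                    committed = offset + 1;                         // at-most-once
                    if (crashNow) { crashPending = false; break; }  // record is lost
                    handled.add(log.get(offset));
                } else {
                    handled.add(log.get(offset));                   // at-least-once
                    if (crashNow) { crashPending = false; break; }  // offset not committed
                    committed = offset + 1;
                }
            }
        }
        return handled;
    }
}
```

For the log `[a, b, c]` with a crash at index 1, commit-first handles `[a, c]` (b is lost) and process-first handles `[a, b, b, c]` (b is duplicated), which is why at-least-once consumers need idempotent handlers.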
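import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class OrderEventPublisher {
    private static final Logger log = LoggerFactory.getLogger(OrderEventPublisher.class);
    private final KafkaTemplate<String, OrderPlacedEvent> kafka;
    public OrderEventPublisher(KafkaTemplate<String, OrderPlacedEvent> kafka) {
        this.kafka = kafka;
    }
    public void publish(OrderPlacedEvent event) {
        // Keyed by orderId so all events for one order land on the same partition.
        kafka.send("order.events.v1", event.orderId(), event)
            .whenComplete((result, ex) -> {   // CompletableFuture in Spring Kafka 3.x
                if (ex != null) log.error("publish failed", ex);
            });
    }
}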
Anti-pattern: creating more consumers than partitions in a group; the extras sit idle. Plan partition count for peak parallel throughput upfront, because increasing partitions later changes the key-to-partition mapping and can break per-key ordering assumptions.
RabbitMQ — flexible routing, classic message broker
RabbitMQ excels at work queues, routing flexibility, and per-message TTL—often chosen when Kafka’s log retention model is heavier than needed.
Producers publish to exchanges; bindings route to queues consumers read from. Exchange types: direct (routing key match), topic (pattern keys), fanout (broadcast), headers (metadata match). Dead-letter exchanges (DLX) capture poison messages after N failures or TTL expiry—essential for ops visibility.
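Topic exchanges match dot-separated routing keys against binding patterns, where `*` stands for exactly one word and `#` for zero or more words. The matcher below is a sketch of those rules in plain Java, not RabbitMQ's implementation; `TopicPattern` is a hypothetical name.

```java
/** Matches AMQP-style topic patterns: '*' = exactly one word, '#' = zero or more words. */
final class TopicPattern {
    static boolean matches(String bindingPattern, String routingKey) {
        String[] pattern = bindingPattern.split("\\.");
        String[] key = routingKey.isEmpty() ? new String[0] : routingKey.split("\\.");
        return match(pattern, 0, key, 0);
    }

    private static boolean match(String[] p, int i, String[] k, int j) {
        if (i == p.length) {
            return j == k.length;          // pattern used up: key must be used up too
        }
        if (p[i].equals("#")) {
            // '#' may absorb zero or more words: try every possible continuation point.
            for (int next = j; next <= k.length; next++) {
                if (match(p, i + 1, k, next)) return true;
            }
            return false;
        }
        if (j == k.length) {
            return false;                  // pattern expects a word but the key has ended
        }
        if (p[i].equals("*") || p[i].equals(k[j])) {
            return match(p, i + 1, k, j + 1);
        }
        return false;
    }
}
```

For example, a queue bound with `order.*.created` receives `order.eu.created` but not `order.eu.west.created`, while a binding of `order.#` receives both, as well as the bare key `order`.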
Use RabbitMQ when you need classic competing consumers on a queue, per-message acks, and complex routing without retaining infinite history. Use Kafka when multiple teams need to replay history, or when stream processing and high-throughput event-sourcing backbones dominate.
@RabbitListener(queues = "inventory.reserve")
public void onReserve(ReserveInventoryCommand cmd) {
    inventoryService.reserve(cmd.orderId(), cmd.lines());
}
Spring Cloud Stream — broker abstraction
Spring Cloud Stream wraps Kafka and RabbitMQ behind binders so application code focuses on functions and channels, not broker-specific APIs—at the cost of lowest-common-denominator features.
Modern apps use the functional model: define Consumer<T>, Function<T,R>, or Supplier<T> beans; binders map them to destinations via spring.cloud.stream.bindings. Legacy @EnableBinding is deprecated—migrate to functions for Spring Cloud 2020+.
@Bean
Consumer<OrderPlacedEvent> orderPlacedHandler(InventoryService inventory) {
    return event -> inventory.reserveForOrder(event);
}
spring:
  cloud:
    stream:
      bindings:
        orderPlacedHandler-in-0:
          destination: order.events.v1
          group: inventory-service
      kafka:
        binder:
          brokers: kafka:9092
      default:
        consumer:
          max-attempts: 3
Configure DLQ destinations for failed messages; consumer groups isolate scaling per service name. Test with Testcontainers Kafka/Rabbit in CI to catch binding typos before deploy.
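The combination of a bounded number of attempts and a dead-letter destination can be sketched independently of any binder. This is a simplified illustration under stated assumptions: `RetryingDispatcher` is a hypothetical class, the dead-letter queue is an in-memory list, and back-off between attempts is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Retries a handler up to maxAttempts (first try included), then dead-letters the message. */
final class RetryingDispatcher<T> {
    private final int maxAttempts;
    private final Consumer<T> handler;
    private final List<T> deadLetters = new ArrayList<>(); // stand-in for a DLQ destination

    RetryingDispatcher(int maxAttempts, Consumer<T> handler) {
        this.maxAttempts = maxAttempts;
        this.handler = handler;
    }

    /** Returns the number of attempts that were made for this message. */
    int dispatch(T message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(message);
                return attempt;                 // success: stop retrying
            } catch (RuntimeException e) {
                // A real consumer would log and back off before the next attempt.
            }
        }
        deadLetters.add(message);               // retries exhausted: park for inspection
        return maxAttempts;
    }

    List<T> deadLetters() {
        return List.copyOf(deadLetters);
    }
}
```

Dead-lettering keeps one poison message from blocking the messages behind it, at the cost of needing an operational process for inspecting and replaying the dead-letter destination.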
Outbox pattern — reliable publishing with DB transactions
The dual-write problem: saving business data and publishing to Kafka must appear atomic. The outbox writes the event in the same database transaction as the aggregate, then a relay publishes asynchronously.
Without outbox, this sequence fails often: (1) commit order to PostgreSQL, (2) publish OrderPlaced to Kafka. If step 2 fails, other services never see the order; if you publish first and DB rolls back, consumers act on ghost orders. The fix: insert into an outbox table in the same transaction as the order row; a separate process reads outbox rows and publishes, marking them sent.
Relay options
- Polling publisher — simple scheduled job; slight latency; watch for hot rows and use SKIP LOCKED.
- Debezium CDC — reads database WAL, streams outbox inserts to Kafka with minimal app code—popular in Spring + PostgreSQL stacks.
sequenceDiagram
autonumber
participant OS as Order Service
participant PG as PostgreSQL
participant RL as Outbox Relay
participant KA as Kafka
rect rgb(30, 40, 70)
Note over OS,PG: Single ACID transaction
OS->>+PG: BEGIN
OS->>PG: INSERT INTO orders
OS->>PG: INSERT INTO outbox
OS->>PG: COMMIT
PG-->>-OS: commit acknowledged
end
alt Polling relay
RL->>PG: SELECT unpublished rows FOR UPDATE SKIP LOCKED
RL->>KA: produce OrderPlaced
RL->>PG: UPDATE outbox SET published = true
else Debezium CDC
RL->>PG: read WAL via replication slot
RL->>KA: stream outbox insert as change event
end
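The polling branch of the diagram can be modelled in memory to show the two rules that matter: the order row and the outbox row are written together, and a row is marked published only after the broker accepted it. This is a sketch under stated assumptions, not a production relay. `OutboxSketch` is hypothetical, lists stand in for the two tables, `synchronized` stands in for the database transaction, and the publisher is any function that throws on failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory model of the outbox table and a polling relay. */
final class OutboxSketch {
    static final class OutboxRow {
        final String eventType;
        final String payload;
        boolean published;

        OutboxRow(String eventType, String payload) {
            this.eventType = eventType;
            this.payload = payload;
        }
    }

    private final List<String> orders = new ArrayList<>();     // stand-in for the orders table
    private final List<OutboxRow> outbox = new ArrayList<>();  // stand-in for the outbox table

    /** Both inserts happen together; a real service relies on one DB transaction for this. */
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.add(new OutboxRow("OrderPlaced", orderId));
    }

    /** One relay pass: publish unpublished rows in order, marking each only after success. */
    synchronized int relayOnce(Consumer<OutboxRow> publisher) {
        int sent = 0;
        for (OutboxRow row : outbox) {
            if (row.published) continue;
            try {
                publisher.accept(row);
                row.published = true;
                sent++;
            } catch (RuntimeException e) {
                break;   // leave the row unpublished; the next pass retries it in order
            }
        }
        return sent;
    }
}
```

If the relay crashes after publishing but before marking the row, the next pass publishes it again, so this design gives at-least-once delivery and still depends on idempotent consumers.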
Full implementation details, saga pairing, and idempotency keys live in Data Management Patterns. Communication layer rule: never call kafkaTemplate.send directly from domain code without outbox or transactional messaging unless you accept gaps.
Use the same event envelope (schema ID, correlationId, occurredAt) in outbox rows as on the wire—relay becomes dumb plumbing, schema registry stays authoritative.
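One way to keep the outbox row and the wire format aligned is to define the envelope once and derive both from it. The record below is a sketch of that idea, assuming Java 16+ records; the type name `EventEnvelope`, the `payload` field, and the header names are assumptions rather than a prescribed schema.

```java
import java.time.Instant;
import java.util.Map;

/** Shared envelope: the same fields are stored in the outbox row and sent on the wire. */
record EventEnvelope(String schemaId, String correlationId, Instant occurredAt, String payload) {
    EventEnvelope {
        // Reject envelopes the relay could not route or trace later.
        if (schemaId == null || schemaId.isBlank()) {
            throw new IllegalArgumentException("schemaId is required");
        }
        if (correlationId == null || occurredAt == null) {
            throw new IllegalArgumentException("correlationId and occurredAt are required");
        }
    }

    /** Metadata as message headers; a relay can copy these without interpreting them. */
    Map<String, String> headers() {
        return Map.of(
                "schemaId", schemaId,
                "correlationId", correlationId,
                "occurredAt", occurredAt.toString());
    }
}
```

Because validation happens when the envelope is created inside the business transaction, a malformed event fails the write instead of surfacing later in the relay.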