Cheat Sheets
Three dense quick references by persona: HTTP and Spring annotations for developers, decision trees for tech leads,
and platform comparisons for architects.
REST API status codes
Code | Meaning | When to use | Retry?
200 OK | Success with body | GET, PUT sync update | n/a
201 Created | Resource created | POST create; include Location | n/a
204 No Content | Success, empty body | DELETE, PUT no response body | n/a
400 Bad Request | Client error | Validation failed, malformed JSON | No
401 Unauthorized | Not authenticated | Missing/invalid token | No
403 Forbidden | Not authorized | Valid token, insufficient scope | No
404 Not Found | Resource missing | Unknown ID (not "wrong method") | No
409 Conflict | State conflict | Duplicate, version mismatch | No
422 Unprocessable Entity | Semantic error | Valid JSON, business rule fail | No
429 Too Many Requests | Rate limited | Include Retry-After | Yes, after delay
500 Internal Server Error | Server bug | Unexpected exception | Maybe
502 Bad Gateway | Upstream invalid | Proxy/gateway upstream fail | Yes
503 Service Unavailable | Temporarily down | Overload, maintenance, breaker open | Yes + backoff
504 Gateway Timeout | Upstream timeout | Downstream exceeded deadline | Yes
// Problem+JSON error body (RFC 9457, which obsoletes RFC 7807)
{
  "type": "https://api.example.com/errors/insufficient-stock",
  "title": "Insufficient stock",
  "status": 409,
  "detail": "SKU-42 has 0 units available",
  "instance": "/orders/req-8f2a"
}
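The Retry? column of the status-code table can be sketched as a small client-side lookup. This is a minimal illustration, not a library API: the class name RetryAdvice, the Decision enum, and the backoff parameters are all hypothetical, and the handling of statuses not listed in the table is an assumption.

```java
/** Hypothetical helper mirroring the "Retry?" column of the status-code table. */
final class RetryAdvice {
    enum Decision { NOT_APPLICABLE, NO, MAYBE, YES, YES_AFTER_DELAY, YES_WITH_BACKOFF }

    static Decision forStatus(int status) {
        switch (status) {
            case 429: return Decision.YES_AFTER_DELAY;   // wait for Retry-After first
            case 500: return Decision.MAYBE;             // only if the call is idempotent
            case 502:
            case 504: return Decision.YES;
            case 503: return Decision.YES_WITH_BACKOFF;
            default:
                // Assumption for statuses outside the table:
                // below 400 nothing to retry, other 4xx are client errors, other 5xx are "maybe".
                if (status < 400) return Decision.NOT_APPLICABLE;
                if (status < 500) return Decision.NO;
                return Decision.MAYBE;
        }
    }

    /** Exponential backoff: base * 2^(attempt-1), capped at capMillis. attempt starts at 1. */
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        if (attempt < 1) throw new IllegalArgumentException("attempt must be >= 1");
        long delay = baseMillis;
        for (int i = 1; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }
}
```

For example, RetryAdvice.backoffMillis(3, 100, 5000) yields 400 ms; production clients usually add jitter on top, which this sketch leaves out.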
Spring Cloud annotations
Annotation / config | Module | Purpose
@EnableDiscoveryClient | Discovery | Register with Eureka / Consul / K8s discovery
@LoadBalanced | LoadBalancer | RestTemplate / WebClient.Builder resolves http://order-service by service name
@FeignClient("order-service") | OpenFeign | Declarative HTTP client interface
@EnableFeignClients | OpenFeign | Scan Feign interfaces on startup
spring.cloud.gateway.routes | Gateway | Route predicates, filters, URI targets
TokenRelay filter | Gateway | Forward OAuth2 bearer to downstream
@RefreshScope | Config | Rebind beans when a refresh event fires (e.g. after a config server change)
@ConfigurationProperties | Boot | Type-safe bind from YAML/env
spring.cloud.function.definition | Stream | Functional binding: processOrder;publishEvent
@KafkaListener | Kafka | Topic consumer method (Spring Kafka)
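// Declarative client; the service name is resolved through discovery
@FeignClient(name = "inventory-service", configuration = FeignConfig.class)
public interface InventoryClient {
    // Name the path variable explicitly so binding does not depend on compiler flags
    @GetMapping("/api/v1/stock/{sku}")
    StockResponse getStock(@PathVariable("sku") String sku);
}

// Load-balanced RestTemplate: http://inventory-service/... resolves to a live instance
@Bean
@LoadBalanced
RestTemplate restTemplate() { return new RestTemplate(); }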
Resilience4j annotations
Annotation | States / behavior | Key YAML keys
@CircuitBreaker(name, fallbackMethod) | CLOSED → OPEN → HALF_OPEN | failureRateThreshold, waitDurationInOpenState
@Retry(name, fallbackMethod) | Re-attempt transient errors | maxAttempts, waitDuration, exponential backoff
@Bulkhead(name, type) | THREADPOOL or SEMAPHORE | maxConcurrentCalls, maxWaitDuration
@TimeLimiter(name) | CompletableFuture timeout | timeoutDuration
@RateLimiter(name) | Token bucket per period | limitForPeriod, limitRefreshPeriod
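// Default aspect order with the Spring Boot starter (outer → inner): Retry → CircuitBreaker → RateLimiter → TimeLimiter → Bulkhead → call
// The order of annotations on the method does not change this; it is set through the aspect-order properties.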
@CircuitBreaker(name = "payment", fallbackMethod = "payFallback")
@Retry(name = "payment")
@Bulkhead(name = "payment", type = Bulkhead.Type.SEMAPHORE)
public PaymentResult charge(Order order) { ... }
# application.yml
resilience4j.circuitbreaker:
  instances:
    payment:
      slidingWindowSize: 20
      failureRateThreshold: 50
      waitDurationInOpenState: 30s
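To make the CLOSED → OPEN → HALF_OPEN cycle and the three YAML keys concrete, here is a simplified state machine in plain Java. It is a sketch, not Resilience4j's implementation: SimpleBreaker and its members are hypothetical names, the failure rate is only evaluated once the window is full, and a single trial call decides the HALF_OPEN outcome (the real library permits a configurable number of trial calls).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/** Simplified count-based circuit breaker; illustrative only. */
final class SimpleBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int slidingWindowSize;
    private final double failureRateThreshold;   // percent, e.g. 50
    private final long waitMillisInOpenState;
    private final LongSupplier clock;             // injected so tests can control time
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;

    SimpleBreaker(int slidingWindowSize, double failureRateThreshold,
                  long waitMillisInOpenState, LongSupplier clock) {
        this.slidingWindowSize = slidingWindowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.waitMillisInOpenState = waitMillisInOpenState;
        this.clock = clock;
    }

    /** Current state; OPEN becomes HALF_OPEN once the wait duration has elapsed. */
    State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= waitMillisInOpenState) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    boolean allowCall() {
        return state() != State.OPEN;
    }

    /** Record the outcome of a call that was allowed through. */
    void record(boolean failed) {
        State current = state();
        if (current == State.OPEN) {
            return; // calls are rejected while open, nothing to record
        }
        if (current == State.HALF_OPEN) {
            if (failed) open(); else close();
            return;
        }
        window.addLast(failed);
        if (window.size() > slidingWindowSize) {
            window.removeFirst();
        }
        if (window.size() == slidingWindowSize && failureRate() >= failureRateThreshold) {
            open();
        }
    }

    private double failureRate() {
        long failures = window.stream().filter(f -> f).count();
        return 100.0 * failures / window.size();
    }

    private void open() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        window.clear();
    }

    private void close() {
        state = State.CLOSED;
        window.clear();
    }
}
```

With the payment settings above, the equivalent instance would be new SimpleBreaker(20, 50, 30_000, System::currentTimeMillis).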
Kafka consumer config
Property | Typical value | Notes
spring.kafka.bootstrap-servers | broker:9092 | Comma-separated brokers
spring.kafka.consumer.group-id | order-service | One group per logical consumer app
auto-offset-reset | earliest / latest | When no committed offset
enable-auto-commit | false | Prefer manual ack after processing
max-poll-records | 100–500 | Batch size vs processing time
max.poll.interval.ms | 300000+ | Must exceed max batch process time
isolation.level | read_committed | Skip uncommitted transactional msgs
key/value-deserializer | Json / Avro | Schema Registry for Avro
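// Acknowledgment is only injected when spring.kafka.listener.ack-mode is manual or manual_immediate
@KafkaListener(topics = "order-placed", groupId = "inventory-service")
public void onOrderPlaced(ConsumerRecord<String, OrderPlacedEvent> record, Acknowledgment ack) {
    process(record.value());
    ack.acknowledge(); // commit the offset only after idempotent processing succeeds
}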
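The "after idempotent success" comment in the listener assumes that redelivered records are detected and skipped. One way to sketch that in plain Java is shown below; IdempotentHandler is a hypothetical name, the event ID is assumed to be carried by the event or a record header, and the in-memory set stands in for a durable store that would normally be updated in the same transaction as the side effect.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Wraps a handler so that an event ID is processed at most once; illustrative only. */
final class IdempotentHandler<T> {
    // Assumption: in production this is a durable table, not process memory.
    private final Set<String> processedIds = new HashSet<>();
    private final Consumer<T> delegate;

    IdempotentHandler(Consumer<T> delegate) {
        this.delegate = delegate;
    }

    /**
     * Returns true if the payload was processed, false if the ID was already seen.
     * The caller acknowledges the record in both cases.
     */
    synchronized boolean handle(String eventId, T payload) {
        if (processedIds.contains(eventId)) {
            return false;
        }
        delegate.accept(payload);     // if this throws, the ID is not recorded and redelivery retries
        processedIds.add(eventId);
        return true;
    }
}
```

Inside the listener, the call to process(...) would go through handle(...) and ack.acknowledge() would follow regardless of the boolean result, since a skipped duplicate is also safe to commit.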
Kafka producer config
Property | Typical value | Notes
acks | all | Wait for ISR replicas (durable)
retries | 3+ | With enable.idempotence=true
enable.idempotence | true | No duplicates from producer retries (per partition, per producer session)
transactional.id | order-service-tx | Transactional producer + consume-transform-produce
compression.type | lz4 / zstd | CPU vs bandwidth trade-off
linger.ms | 5–20 | Batch more records before send
batch.size | 16384+ | Bytes per batch target
key/value-serializer | Json / Avro | Match consumer + schema version
spring:
  kafka:
    producer:
      acks: all
      properties:
        enable.idempotence: true
        max.in.flight.requests.per.connection: 5
    template:
      default-topic: domain-events
Pattern decision tree
flowchart TD
START[New integration problem] --> Q1{Migrating monolith?}
Q1 -->|yes| STR[Strangler Fig plus ACL]
Q1 -->|no| Q2{Multi-service write?}
Q2 -->|yes| SAG[Saga plus Outbox]
Q2 -->|no| Q3{Need instant read-your-writes?}
Q3 -->|yes| SYNC[Sync REST or gRPC plus Timeout Breaker]
Q3 -->|no| ASYNC[Events Kafka plus Idempotent consumer]
SYNC --> Q4{Client-specific API?}
Q4 -->|yes| BFF[BFF or API Composition]
Q4 -->|no| DONE[Ship with correlation ID and RED metrics]
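The flowchart above can also be read as a chain of early returns. The sketch below encodes the same questions in the same order; PatternAdvisor and its parameter names are hypothetical, and the returned strings simply restate the node labels.

```java
/** Encodes the pattern decision tree as a function; illustrative only. */
final class PatternAdvisor {
    static String recommend(boolean migratingMonolith,
                            boolean multiServiceWrite,
                            boolean needsReadYourWrites,
                            boolean clientSpecificApi) {
        if (migratingMonolith) {
            return "Strangler Fig + ACL";
        }
        if (multiServiceWrite) {
            return "Saga + Outbox";
        }
        if (!needsReadYourWrites) {
            return "Kafka events + idempotent consumer";
        }
        // Sync path: REST or gRPC guarded by timeout and circuit breaker.
        if (clientSpecificApi) {
            return "Sync REST/gRPC + timeout/breaker, behind BFF or API Composition";
        }
        return "Sync REST/gRPC + timeout/breaker, ship with correlation ID and RED metrics";
    }
}
```

The order of the checks matters: a monolith migration wins over every later question, exactly as in the diagram.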
Sync vs async guide
Choose sync when | Choose async when
User waits for result in same request | Temporal decoupling acceptable
Strong read-your-writes needed | Eventual consistency OK
Simple query/command, 1–2 hops | Fan-out to many subscribers
Low latency critical path under 200 ms | Audit trail and replay matter
Failure must surface immediately to user | Peak buffering beats sync overload

Sync stack | Async stack
REST / gRPC + WebClient | Kafka + outbox relay
Timeout + breaker + bulkhead | Idempotent consumer + DLQ
Trace context on HTTP headers | Trace in Kafka record headers
Max chain depth 3–4 hops | Saga choreography or orchestration
Rule of thumb: if the user did not trigger it and does not wait for it, go async.
If failure blocks the UI, use sync with resilience, not fire-and-forget.
CAP theorem cheat
Letter | Meaning | During partition you pick
C (Consistency) | All nodes see same data | Reject writes/reads → CP
A (Availability) | Every request gets a response | Allow stale reads → AP
P (Partition tolerance) | Network split between nodes | Not optional: must choose C or A

Store / pattern | Typical CAP | PACELC else (no partition)
PostgreSQL single primary | CP | Low latency, strong consistency
Cassandra (tunable) | AP default | Latency vs consistency per query
Redis primary-replica | AP (async repl) | Low latency reads
Microservices + events | AP across services | Eventual consistency by design
2PC / XA across DBs | CP but fragile | Avoid; use saga
PACELC: If Partition → A or C; Else → Latency or Consistency.
Document the choice per service in an ADR, not on one slide for the whole platform.
Data pattern selection guide
Problem | Pattern | Avoid
Service autonomy | DB per service | Shared tables
Multi-DB business transaction | Saga | 2PC across services
Reliable event after commit | Transactional outbox | Dual write without outbox
Read scale ≠ write scale | CQRS | One model for everything
Full audit / replay | Event sourcing | Update-in-place only
One screen, few services | API composition | Deep sync graphs
Cross-service report | Read model / data warehouse | Cross-service SQL join
Combine: DB per service + outbox + saga + idempotent consumers is the default production stack for write paths.
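The transactional outbox row in the table comes down to two rules: write the business row and the event row in one local transaction, then let a separate relay publish unsent events. The in-memory sketch below shows that shape under loud assumptions: OutboxDemo and its members are hypothetical, a synchronized method stands in for a database transaction, and the publisher callback stands in for a Kafka send.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory illustration of the transactional outbox; not production code. */
final class OutboxDemo {
    static final class OutboxRow {
        final String eventId;
        final String payload;
        boolean sent;

        OutboxRow(String eventId, String payload) {
            this.eventId = eventId;
            this.payload = payload;
        }
    }

    private final Map<String, String> orders = new LinkedHashMap<>(); // orderId -> status
    private final List<OutboxRow> outbox = new ArrayList<>();

    /** Stands in for one local DB transaction: the order row and the outbox row go together. */
    synchronized void placeOrder(String orderId) {
        orders.put(orderId, "PLACED");
        outbox.add(new OutboxRow("evt-" + orderId, "{\"orderId\":\"" + orderId + "\"}"));
    }

    /**
     * Relay: publish each unsent row, then mark it sent. A crash between the two steps
     * causes a re-publish, which is why consumers must be idempotent.
     */
    synchronized int relay(Consumer<OutboxRow> publisher) {
        int published = 0;
        for (OutboxRow row : outbox) {
            if (row.sent) {
                continue;
            }
            publisher.accept(row);
            row.sent = true;
            published++;
        }
        return published;
    }
}
```

Because the relay marks a row only after the publish call returns, delivery is at-least-once rather than exactly-once, which matches the pairing with idempotent consumers in the combined stack.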
Saga vs 2PC decision
Dimension | 2PC / XA | Saga
Consistency | Global ACID (theoretical) | Eventual + compensations
Availability under partition | Blocking, locks held | Each step commits locally
Failure handling | Rollback all participants | Compensating transactions
Coupling | Tight: all DBs in one tx | Loose: events/commands
Ops complexity | Coordinator SPOF, heuristics | Idempotency, outbox, tracing
Microservices fit | Poor, avoid | Standard approach

Use 2PC only when | Use saga when
Single modular monolith, one DB | Multiple service-owned databases
Regulatory mandate + single vendor XA | Long-running business processes
Short transactions, same data center | Need independent deploy per service
Orchestrated saga: central coordinator — easier trace, coupling risk.
Choreographed saga: domain events — decoupled, harder debug.
Both need: outbox, idempotent consumers, explicit compensations.
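The "explicit compensations" requirement is easiest to see in an orchestrated saga: each step commits locally, and a failure triggers the compensations of the already completed steps in reverse order. The sketch below is a minimal in-process illustration; SagaOrchestrator and Step are hypothetical names, and real steps would be remote calls or commands with their own retries and persistence.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Minimal orchestrated saga: run steps in order, compensate in reverse on failure. */
final class SagaOrchestrator {
    static final class Step {
        final String name;
        final Runnable action;
        final Runnable compensation;

        Step(String name, Runnable action, Runnable compensation) {
            this.name = name;
            this.action = action;
            this.compensation = compensation;
        }
    }

    /** Returns true if every step committed, false if the saga was compensated. */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step);          // most recent step ends up on top
            } catch (RuntimeException e) {
                // The failed step made no local commit, so only earlier steps are undone.
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

A choreographed saga follows the same compensation logic, but each service reacts to failure events on its own instead of being driven by one coordinator.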
Service mesh comparison
Option | Data plane | Best for | Trade-off
Istio | Envoy sidecar | Full L7 policy, canary, mTLS, multi-cluster | YAML surface, resource cost
Linkerd | Linkerd2-proxy | Simple mTLS + metrics, low overhead | Fewer traffic CRDs
Consul Connect | Envoy | HashiCorp stack, VM + K8s | Consul ops dependency
Cilium (mesh mode) | eBPF | Performance, K8s-native networking | Feature maturity varies
No mesh | App libs + gateway | <15 homogeneous Java services | Inconsistent polyglot policy

Capability | Mesh | App (Resilience4j + OTel)
mTLS east-west | Automatic sidecar | Manual certs or lib
RED metrics | L7 without code | Micrometer / actuator
Business spans | No | OpenTelemetry in app
Canary traffic split | VirtualService weights | Flagger / Argo Rollouts
Retry policy | Envoy route config | Resilience4j (pick one layer)
Observability stack options
Stack | Metrics | Logs | Traces | Notes
LGTM | Mimir/Prometheus | Loki | Tempo | Grafana unified; cost-effective at scale
ELK + Jaeger | Elastic APM | Elasticsearch | Jaeger | Rich log search; higher index cost
Datadog / New Relic | SaaS | SaaS | SaaS | Fast setup; vendor lock-in, cost
Cloud native | CloudWatch / Azure Monitor | Same | X-Ray / App Insights | Good if all-in on one cloud
OpenTelemetry hub | OTel Collector → any | OTel logs | OTel traces | Instrument once, swap backends

Signal | Alert on? | Store
Metrics (RED, SLO burn) | Yes, primary paging | Prometheus / Mimir
Traces | Via tail sampling rules | Tempo / Jaeger
Logs | Rare: security anomalies | Loki / Elasticsearch
Correlation ID | n/a | All three signals
Minimum viable: Micrometer + Prometheus + Grafana + OTel traces + JSON logs + Loki.
Gate prod on: RED dashboards, one SLO, trace_id in every log line.
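Paging on "SLO burn" usually means alerting on the burn rate: the observed error ratio divided by the error budget (1 minus the SLO target). A burn rate of 1 consumes the budget exactly over the SLO period; higher values exhaust it proportionally faster. The helper below is a minimal sketch of that arithmetic with hypothetical names; the paging threshold and window lengths are left to the team's alerting policy.

```java
/** Burn-rate arithmetic for an availability SLO; illustrative only. */
final class SloBurn {
    /**
     * burnRate = (failed / total) / (1 - sloTarget).
     * sloTarget is a fraction such as 0.999 for "99.9% of requests succeed".
     */
    static double burnRate(long failed, long total, double sloTarget) {
        if (sloTarget <= 0.0 || sloTarget >= 1.0) {
            throw new IllegalArgumentException("sloTarget must be between 0 and 1 exclusive");
        }
        if (total <= 0) {
            return 0.0; // no traffic in the window, nothing burned
        }
        double errorRatio = (double) failed / total;
        return errorRatio / (1.0 - sloTarget);
    }

    /** Threshold is an assumption supplied by the caller's alerting policy. */
    static boolean shouldPage(double burnRate, double threshold) {
        return burnRate >= threshold;
    }
}
```

As a worked case, 10 failures in 1,000 requests against a 99.9% target is an error ratio of 1% over a budget of 0.1%, so the budget is burning roughly ten times faster than sustainable.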
Deployment strategy comparison
Strategy | Traffic shift | Rollback speed | Cost / complexity | Best for
Rolling update | Gradual pod replace | Re-roll previous image | Low (K8s default) | Stateless, backward-compatible
Blue-green | Instant flip | Seconds (switch back) | 2× infra during cutover | Critical paths, schema compatible
Canary | 1→5→25→100% | Auto on SLO regression | Medium (metrics gates) | High-risk releases
Feature flags | Code deployed dark | Toggle off instantly | Flag lifecycle debt | Decouple deploy from release
GitOps | Git PR → reconcile | Revert Git commit | Medium (Argo/Flux setup) | Auditable prod changes

Combine | Why
GitOps + rolling | Default safe baseline
GitOps + canary + SLO | Automated promotion/rollback
Feature flags + any deploy | Kill switch without redeploy
Blue-green + DB migrations | Expand-contract schema phases
Never run kubectl set image in prod without a Git record.
Canary promotion gates: error rate, p99 latency, business metric (conversion).
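The three promotion gates and the 1→5→25→100% schedule can be combined into a small decision function. This is a sketch under stated assumptions, not how Flagger or Argo Rollouts implement analysis: CanaryGate, Metrics, and the single relative tolerance are hypothetical, and a relative comparison behaves poorly when the baseline error rate is exactly zero.

```java
/** Canary promotion logic: compare canary to baseline, then advance or roll back. */
final class CanaryGate {
    static final int[] STEPS = {1, 5, 25, 100}; // traffic weights in percent

    static final class Metrics {
        final double errorRate;       // fraction of failed requests
        final double p99Millis;       // p99 latency
        final double conversionRate;  // business metric, higher is better

        Metrics(double errorRate, double p99Millis, double conversionRate) {
            this.errorRate = errorRate;
            this.p99Millis = p99Millis;
            this.conversionRate = conversionRate;
        }
    }

    /** All three gates must hold within the given relative tolerance (e.g. 0.10 for 10%). */
    static boolean passes(Metrics baseline, Metrics canary, double tolerance) {
        boolean errorsOk = canary.errorRate <= baseline.errorRate * (1 + tolerance);
        boolean latencyOk = canary.p99Millis <= baseline.p99Millis * (1 + tolerance);
        boolean businessOk = canary.conversionRate >= baseline.conversionRate * (1 - tolerance);
        return errorsOk && latencyOk && businessOk;
    }

    /** Next traffic weight: the following step on pass, 0 (full rollback) on failure. */
    static int nextWeight(int currentWeight, boolean gatesPass) {
        if (!gatesPass) {
            return 0;
        }
        for (int step : STEPS) {
            if (step > currentWeight) {
                return step;
            }
        }
        return 100;
    }
}
```

A controller loop would call passes(...) at the end of each analysis interval and feed the result into nextWeight(...), so a single failed gate returns all traffic to the stable version.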