Cheat Sheets

Three dense quick references by persona: HTTP and Spring annotations for developers, decision trees for tech leads, and platform comparisons for architects.

Developer Cheat Sheet

REST semantics, Spring Cloud wiring, Resilience4j annotations, Kafka config. Chapters: Communication, Resilience.

REST API status codes

Code	Meaning	When to use	Retry?
`200 OK`	Success with body	GET, PUT sync update	—
`201 Created`	Resource created	POST create; include `Location`	—
`204 No Content`	Success, empty body	DELETE, PUT no response body	—
`400 Bad Request`	Client error	Validation failed, malformed JSON	No
`401 Unauthorized`	Not authenticated	Missing/invalid token	No
`403 Forbidden`	Not authorized	Valid token, insufficient scope	No
`404 Not Found`	Resource missing	Unknown ID (not “wrong method”)	No
`409 Conflict`	State conflict	Duplicate, version mismatch	No
`422 Unprocessable Content`	Semantic error	Valid JSON, business rule fail	No
`429 Too Many Requests`	Rate limited	Include `Retry-After`	Yes, after delay
`500 Internal Server Error`	Server bug	Unexpected exception	Maybe
`502 Bad Gateway`	Invalid upstream response	Gateway or proxy got a bad response from upstream	Yes
`503 Service Unavailable`	Temporarily down	Overload, maintenance, breaker open	Yes + backoff
`504 Gateway Timeout`	Upstream timeout	Upstream exceeded the gateway deadline	Yes
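The "Retry?" column above can be read as a small lookup. The following is a minimal sketch in plain Java, assuming a hypothetical `RetryDecision` helper (not part of Spring or any library); a real client would also check whether the request is idempotent before retrying.

```java
/** Hypothetical helper mirroring the "Retry?" column of the status code table. */
final class RetryDecision {
    enum Advice { NO, MAYBE, YES_AFTER_DELAY, YES, YES_WITH_BACKOFF }

    static Advice forStatus(int status) {
        switch (status) {
            case 429: return Advice.YES_AFTER_DELAY;   // wait for the Retry-After value first
            case 500: return Advice.MAYBE;             // table says "Maybe"; depends on the call
            case 502:
            case 504: return Advice.YES;               // gateway-level upstream failures
            case 503: return Advice.YES_WITH_BACKOFF;  // overload, maintenance, breaker open
            default:
                // Listed 4xx errors repeat the same failure; 2xx needs no retry.
                return Advice.NO;
        }
    }
}
```

A caller could branch on the returned `Advice` before handing the request to a retry component.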

// Problem+JSON error body (RFC 9457, which obsoletes RFC 7807)
{
  "type": "https://api.example.com/errors/insufficient-stock",
  "title": "Insufficient stock",
  "status": 409,
  "detail": "SKU-42 has 0 units available",
  "instance": "/orders/req-8f2a"
}

Spring Cloud annotations

Annotation / config	Module	Purpose
`@EnableDiscoveryClient`	Discovery	Register with Eureka / Consul / K8s discovery
`@LoadBalanced`	LoadBalancer	`RestTemplate` / `WebClient.Builder` resolves `http://order-service`
`@FeignClient("order-service")`	OpenFeign	Declarative HTTP client interface
`@EnableFeignClients`	OpenFeign	Scan Feign interfaces on startup
`spring.cloud.gateway.routes`	Gateway	Route predicates, filters, URI targets
`TokenRelay` filter	Gateway	Forward OAuth2 bearer to downstream
`@RefreshScope`	Config	Re-create beans on a refresh event (`/actuator/refresh` or Spring Cloud Bus)
`@ConfigurationProperties`	Boot	Type-safe bind from YAML/env
`spring.cloud.stream.function.definition`	Stream	Functional binding: `processOrder;publishEvent`
`@KafkaListener`	Kafka	Topic consumer method (Spring Kafka)

@FeignClient(name = "inventory-service", configuration = FeignConfig.class)
public interface InventoryClient {
  @GetMapping("/api/v1/stock/{sku}")
  // Name the variable explicitly so binding does not depend on the -parameters compiler flag
  StockResponse getStock(@PathVariable("sku") String sku);
}

@Bean
@LoadBalanced
RestTemplate restTemplate() { return new RestTemplate(); }

Resilience4j annotations

Annotation	States / behavior	Key YAML keys
`@CircuitBreaker(name, fallbackMethod)`	CLOSED → OPEN → HALF_OPEN	`failureRateThreshold`, `waitDurationInOpenState`
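`@Retry(name, fallbackMethod)`	Re-attempt transient errors	`maxAttempts`, `waitDuration`, `enableExponentialBackoff`, `exponentialBackoffMultiplier`
`@Bulkhead(name, type)`	THREADPOOL or SEMAPHORE	`maxConcurrentCalls`, `maxWaitDuration` (semaphore); `maxThreadPoolSize`, `queueCapacity` (thread pool)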
`@TimeLimiter(name)`	CompletableFuture timeout	`timeoutDuration`
`@RateLimiter(name)`	Fixed permits per refresh period	`limitForPeriod`, `limitRefreshPeriod`, `timeoutDuration`
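To make the retry row concrete, here is a small sketch of how an exponential backoff schedule grows. It is plain Java arithmetic rather than Resilience4j code; `BackoffSchedule` is a hypothetical name, and it assumes that `maxAttempts` counts the initial call, so there are `maxAttempts - 1` waits.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes waits between attempts for exponential backoff. */
final class BackoffSchedule {
    /** Returns base, base*m, base*m^2, ... with (maxAttempts - 1) entries. */
    static List<Duration> waits(int maxAttempts, Duration base, double multiplier) {
        List<Duration> out = new ArrayList<>();
        double millis = base.toMillis();
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            out.add(Duration.ofMillis(Math.round(millis)));
            millis *= multiplier; // each wait is multiplier times the previous one
        }
        return out;
    }
}
```

With four attempts, a 500 ms base, and a multiplier of 2, the waits are 500 ms, 1 s, and 2 s, so the worst case adds 3.5 s before the final failure surfaces.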

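// Default aspect order with annotations (outer → inner): Retry → CircuitBreaker → RateLimiter → TimeLimiter → Bulkhead → call
// The order of annotations on the method does not change this; override it with the *AspectOrder properties if needed.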

@CircuitBreaker(name = "payment", fallbackMethod = "payFallback")
@Retry(name = "payment")
@Bulkhead(name = "payment", type = Bulkhead.Type.SEMAPHORE)
public PaymentResult charge(Order order) { ... }

// application.yml
resilience4j.circuitbreaker:
  instances:
    payment:
      slidingWindowSize: 20
      failureRateThreshold: 50
      waitDurationInOpenState: 30s
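The three YAML keys above map onto a small state machine. The sketch below is a simplified, single-threaded illustration in plain Java and is not how Resilience4j is implemented. `SketchBreaker` is a hypothetical class that evaluates the failure rate only once the window is full and lets a single trial call decide the outcome in HALF_OPEN, whereas the library permits a configurable number of trial calls.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Simplified single-threaded sketch; not Resilience4j's actual implementation. */
final class SketchBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int slidingWindowSize;
    private final double failureRateThreshold;   // percent, e.g. 50
    private final long waitInOpenMillis;
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAtMillis;

    SketchBreaker(int slidingWindowSize, double failureRateThreshold, long waitInOpenMillis) {
        this.slidingWindowSize = slidingWindowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.waitInOpenMillis = waitInOpenMillis;
    }

    State state() { return state; }

    /** Returns true if a call may proceed at the given time. */
    boolean tryAcquire(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAtMillis >= waitInOpenMillis) {
            state = State.HALF_OPEN;             // wait elapsed: allow a trial call
        }
        return state != State.OPEN;
    }

    /** Records the outcome of a call that was allowed through. */
    void record(boolean failed, long nowMillis) {
        if (state == State.OPEN) return;
        if (state == State.HALF_OPEN) {
            // Assumption: a single trial call decides the next state.
            if (failed) open(nowMillis); else close();
            return;
        }
        window.addLast(failed);
        if (window.size() > slidingWindowSize) window.removeFirst();
        if (window.size() == slidingWindowSize && failureRate() >= failureRateThreshold) {
            open(nowMillis);
        }
    }

    private double failureRate() {
        long failures = window.stream().filter(Boolean::booleanValue).count();
        return 100.0 * failures / window.size();
    }

    private void open(long nowMillis) { state = State.OPEN; openedAtMillis = nowMillis; window.clear(); }
    private void close() { state = State.CLOSED; window.clear(); }
}
```

With the values from the YAML (window 20, threshold 50, wait 30 s), ten failures among the last twenty calls open the breaker, and the first call after 30 seconds acts as the probe.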

Kafka consumer config

Property	Typical value	Notes
`spring.kafka.bootstrap-servers`	`broker:9092`	Comma-separated brokers
`spring.kafka.consumer.group-id`	`order-service`	One group per logical consumer app
`auto-offset-reset`	`earliest` / `latest`	When no committed offset
`enable-auto-commit`	`false`	Prefer manual ack after processing
`max-poll-records`	`100`–`500`	Batch size vs processing time
`max.poll.interval.ms`	`300000`+	Must exceed max batch process time; Kafka client key, set under `spring.kafka.consumer.properties`
`isolation.level`	`read_committed`	Skip uncommitted transactional msgs
`key/value-deserializer`	Json / Avro	Schema Registry for Avro

@KafkaListener(topics = "order-placed", groupId = "inventory-service")
public void onOrderPlaced(ConsumerRecord<String, OrderPlacedEvent> record, Acknowledgment ack) {
  process(record.value());
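  // The Acknowledgment argument requires spring.kafka.listener.ack-mode=manual (or manual_immediate)
  ack.acknowledge();  // after idempotent success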
}
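The "after idempotent success" comment assumes that processing the same event twice is harmless. One way to get that property is to track processed event IDs, sketched here in plain Java. `IdempotentHandler` and the `eventId` parameter are hypothetical, and the in-memory set stands in for a table in the service's own database.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** In-memory sketch; a real service would persist processed IDs with its data. */
final class IdempotentHandler<E> {
    private final Set<String> processedIds = new HashSet<>();
    private final Consumer<E> delegate;

    IdempotentHandler(Consumer<E> delegate) {
        this.delegate = delegate;
    }

    /** Returns true if the event was processed, false if it was a duplicate. */
    boolean handle(String eventId, E event) {
        if (processedIds.contains(eventId)) {
            return false;             // duplicate delivery: skip work, still safe to ack
        }
        delegate.accept(event);       // if this throws, the ID is not recorded and redelivery retries
        processedIds.add(eventId);
        return true;
    }
}
```

In a listener, the offset would be acknowledged whether `handle` returns true or false, since both mean the event has been dealt with.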

Kafka producer config

Property	Typical value	Notes
`acks`	`all`	Wait for all in-sync replicas (durable)
`retries`	`3`+	Must be > 0 when `enable.idempotence=true`
`enable.idempotence`	`true`	No duplicates from producer retries (per partition, per producer session)
`transactional.id`	`order-service-tx`	Transactional producer + consume-transform-produce
`compression.type`	`lz4` / `zstd`	CPU vs bandwidth trade-off
`linger.ms`	`5`–`20`	Batch more records before send
`batch.size`	`16384`+	Bytes per batch target
`key/value-serializer`	Json / Avro	Match consumer + schema version

spring:
  kafka:
    producer:
      acks: all
      properties:
        enable.idempotence: true
        max.in.flight.requests.per.connection: 5
    template:
      default-topic: domain-events

Tech Lead Cheat Sheet

When to pick which pattern, sync vs async, CAP/PACELC, data pattern matrix. See Patterns → Decision helper.

Pattern decision tree

flowchart TD
  START[New integration problem] --> Q1{Migrating monolith?}
  Q1 -->|yes| STR[Strangler Fig plus ACL]
  Q1 -->|no| Q2{Multi-service write?}
  Q2 -->|yes| SAG[Saga plus Outbox]
  Q2 -->|no| Q3{Need instant read-your-writes?}
  Q3 -->|yes| SYNC[Sync REST or gRPC plus Timeout Breaker]
  Q3 -->|no| ASYNC[Events Kafka plus Idempotent consumer]
  SYNC --> Q4{Client-specific API?}
  Q4 -->|yes| BFF[BFF or API Composition]
  Q4 -->|no| DONE[Ship with correlation ID and RED metrics]

Scenario	Start here	Also consider
Legacy ERP integration	Anti-Corruption Layer	Strangler if replacing module
Checkout spans 3 DBs	Saga + Outbox	Not 2PC
Mobile vs web payloads	BFF	Not one fat API
Downstream flaky	Breaker + Timeout	Fallback for reads
Audit full history	Event Sourcing	CQRS read models
Public API abuse	Rate limit at gateway	Auth + quotas
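The flowchart above can also be written as a chain of early returns, which makes the question order explicit. This is a minimal sketch; `PatternAdvisor` and its boolean parameters are hypothetical names, and the returned strings simply echo the flowchart's node labels.

```java
/** Hypothetical encoding of the pattern decision tree as ordered questions. */
final class PatternAdvisor {
    static String recommend(boolean migratingMonolith,
                            boolean multiServiceWrite,
                            boolean needsReadYourWrites,
                            boolean clientSpecificApi) {
        if (migratingMonolith) return "Strangler Fig + ACL";
        if (multiServiceWrite) return "Saga + Outbox";
        if (!needsReadYourWrites) return "Events (Kafka) + idempotent consumer";
        // Sync path: REST or gRPC with timeout and breaker, then the client-specific question.
        return clientSpecificApi
                ? "Sync + BFF or API Composition"
                : "Sync + correlation ID and RED metrics";
    }
}
```

Earlier questions win: a monolith migration that also has multi-service writes still starts with Strangler Fig under this ordering.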

Sync vs async guide

Choose sync when	Choose async when
User waits for result in same request	Temporal decoupling acceptable
Strong read-your-writes needed	Eventual consistency OK
Simple query/command, 1–2 hops	Fan-out to many subscribers
Low latency critical path under 200 ms	Audit trail and replay matter
Failure must surface immediately to user	Peak buffering beats sync overload

Sync stack	Async stack
REST / gRPC + WebClient	Kafka + outbox relay
Timeout + breaker + bulkhead	Idempotent consumer + DLQ
Trace context on HTTP headers	Trace in Kafka record headers
Max chain depth 3–4 hops	Saga choreography or orchestration

Rule of thumb: if the user did not trigger it and does not wait — async.
If failure blocks the UI — sync with resilience, not fire-and-forget.

CAP theorem cheat

Letter	Meaning	During partition you pick
Consistency	All nodes see same data	Reject writes/reads → CP
Availability	Every request gets a response	Allow stale reads → AP
Partition tolerance	Network split between nodes	Not optional: during a split you must choose C or A

Store / pattern	Typical CAP	PACELC else (no partition)
PostgreSQL single primary	CP	Low latency, strong consistency
Cassandra tunable	AP default	Latency vs consistency per query
Redis primary-replica	AP (async repl)	Low latency reads
Microservices + events	AP across services	Eventual consistency by design
2PC / XA across DBs	CP but fragile	Avoid — use saga

PACELC: If Partition → A or C; Else → Latency or Consistency.
Document per-service in ADR — not one slide for whole platform.
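For tunable stores, the "latency vs consistency per query" trade-off is often reasoned about with the general quorum rule: a read overlaps the latest acknowledged write when read acks plus write acks exceed the replica count. The sketch below only does that arithmetic; `Quorum` is a hypothetical helper, and the rule is a general property of quorum replication rather than something stated in this sheet.

```java
/** Hypothetical helper for quorum arithmetic (R + W > N). */
final class Quorum {
    /** True when every read set must intersect the most recent acknowledged write set. */
    static boolean readsSeeLatestWrite(int replicas, int readAcks, int writeAcks) {
        return readAcks + writeAcks > replicas;
    }

    /** Majority size for a given replica count. */
    static int majority(int replicas) {
        return replicas / 2 + 1;
    }
}
```

With three replicas, majority reads and writes (2 + 2) overlap, while single-replica reads and writes (1 + 1) do not and trade consistency for latency.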

Data pattern selection guide

Problem	Pattern	Avoid
Service autonomy	DB per service	Shared tables
Multi-DB business transaction	Saga	2PC across services
Reliable event after commit	Transactional outbox	Dual write without outbox
Read scale ≠ write scale	CQRS	One model for everything
Full audit / replay	Event sourcing	Update-in-place only
One screen, few services	API composition	Deep sync graphs
Cross-service report	Read model / data warehouse	Cross-service SQL join

Combine: DB per service + outbox + saga + idempotent consumers
is the default production stack for write paths.
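The outbox part of that stack can be sketched without a database: the business row and the event row are written in one local transaction, and a separate relay publishes pending rows. In this in-memory illustration, `OutboxSketch`, its lists, and the `order-placed:` prefix are all hypothetical stand-ins for real tables and a real broker client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** In-memory stand-in for one service database; the lists play the role of tables. */
final class OutboxSketch {
    private final List<String> orders = new ArrayList<>();
    private final List<String> outbox = new ArrayList<>();

    /** One local "transaction": the order row and the outbox row are written together. */
    synchronized void placeOrder(String orderId) {
        orders.add(orderId);
        outbox.add("order-placed:" + orderId);
    }

    /** Relay step: publish pending rows, removing each only after publish returns. */
    synchronized int relay(Consumer<String> publisher) {
        int sent = 0;
        while (!outbox.isEmpty()) {
            publisher.accept(outbox.get(0)); // if this throws, the row stays for the next run
            outbox.remove(0);
            sent++;
        }
        return sent;
    }

    synchronized int pending() {
        return outbox.size();
    }
}
```

Because a row is removed only after a successful publish, a crash between the two steps causes a re-send, which is why the consumer side needs to be idempotent.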

Architect Cheat Sheet

Saga vs 2PC, mesh options, observability stacks, deployment strategies. Chapters: Data, Mesh, Observability, Deployment.

Saga vs 2PC decision

Dimension	2PC / XA	Saga
Consistency	Global ACID (theoretical)	Eventual + compensations
Availability under partition	Blocking — locks held	Each step commits locally
Failure handling	Rollback all participants	Compensating transactions
Coupling	Tight — all DBs in one tx	Loose — events/commands
Ops complexity	Coordinator SPOF, heuristics	Idempotency, outbox, tracing
Microservices fit	Poor — avoid	Standard approach

Use 2PC only when	Use saga when
Single modular monolith, one DB	Multiple service-owned databases
Regulatory mandate + single vendor XA	Long-running business processes
Short transactions, same data center	Need independent deploy per service

Orchestrated saga: central coordinator — easier trace, coupling risk.
Choreographed saga: domain events — decoupled, harder debug.
Both need: outbox, idempotent consumers, explicit compensations.
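The "explicit compensations" requirement is easiest to see in an orchestrated saga. Below is a minimal sketch in which each step carries its own undo action and completed steps are compensated in reverse order on failure. `SagaSketch` and `Step` are hypothetical names, and a production orchestrator would also persist its progress so it can resume after a crash.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical in-process orchestrator; no persistence or messaging. */
final class SagaSketch {
    /** One saga step: a local action plus the compensation that undoes it. */
    static final class Step {
        final Runnable action;
        final Runnable compensation;

        Step(Runnable action, Runnable compensation) {
            this.action = action;
            this.compensation = compensation;
        }
    }

    /** Runs steps in order; on failure compensates completed steps in reverse. */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step);            // most recent step ends up on top
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

If a reserve-stock step succeeds and a charge-payment step throws, only the stock reservation is compensated; later steps never ran, so they need no undo.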

Service mesh comparison

Option	Data plane	Best for	Trade-off
Istio	Envoy sidecar	Full L7 policy, canary, mTLS, multi-cluster	YAML surface, resource cost
Linkerd	Linkerd2-proxy	Simple mTLS + metrics, low overhead	Fewer traffic CRDs
Consul Connect	Envoy	HashiCorp stack, VM + K8s	Consul ops dependency
Cilium (mesh mode)	eBPF	Performance, K8s-native networking	Feature maturity varies
No mesh	App libs + gateway	<15 homogeneous Java services	Inconsistent polyglot policy

Capability	Mesh	App (Resilience4j + OTel)
mTLS east-west	Automatic sidecar	Manual certs or lib
RED metrics	L7 without code	Micrometer / actuator
Business spans	No	OpenTelemetry in app
Canary traffic split	VirtualService weights	Flagger / Argo Rollouts
Retry policy	Envoy route config	Resilience4j — pick one layer
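The "pick one layer" advice in the retry row follows from simple multiplication: when several layers retry independently, the worst-case number of calls reaching the target is the product of the attempts at each layer. A small sketch of that arithmetic, with `RetryAmplification` as a hypothetical helper and the attempt counts chosen only for illustration:

```java
/** Hypothetical helper showing how stacked retry layers multiply load. */
final class RetryAmplification {
    /** Worst-case calls to the innermost service when every layer retries independently. */
    static long worstCaseCalls(int... attemptsPerLayer) {
        long total = 1;
        for (int attempts : attemptsPerLayer) {
            total *= attempts;
        }
        return total;
    }
}
```

Three attempts in the mesh combined with three attempts in the application can turn one user request into nine upstream calls during an outage.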

Observability stack options

Stack	Metrics	Logs	Traces	Notes
LGTM	Mimir/Prometheus	Loki	Tempo	Grafana unified; cost-effective at scale
ELK + Jaeger	Elastic APM	Elasticsearch	Jaeger	Rich log search; higher index cost
Datadog / New Relic	SaaS	SaaS	SaaS	Fast setup; vendor lock-in, cost
Cloud native	CloudWatch / Azure Monitor	Same	X-Ray / App Insights	Good if all-in on one cloud
OpenTelemetry hub	OTel Collector → any	OTel logs	OTel traces	Instrument once, swap backends

Signal	Alert on?	Store
Metrics (RED, SLO burn)	Yes — primary paging	Prometheus / Mimir
Traces	Via tail sampling rules	Tempo / Jaeger
Logs	Rare — security anomalies	Loki / Elasticsearch
Correlation ID	—	All three signals

Minimum viable: Micrometer + Prometheus + Grafana + OTel traces + JSON logs + Loki.
Gate prod on: RED dashboards, one SLO, trace_id in every log line.
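For the "SLO burn" signal in the table above, the usual calculation divides the observed error ratio by the error budget (one minus the SLO target). A minimal sketch, assuming a hypothetical `SloBurn` helper and ratios expressed as fractions rather than percentages:

```java
/** Hypothetical helper for SLO burn-rate arithmetic. */
final class SloBurn {
    /** Burn rate = observed error ratio / error budget, where budget = 1 - SLO target. */
    static double burnRate(double errorRatio, double sloTarget) {
        double budget = 1.0 - sloTarget;
        if (budget <= 0.0) {
            throw new IllegalArgumentException("SLO target must be below 1.0");
        }
        return errorRatio / budget;
    }
}
```

A burn rate of 1 spends the budget exactly over the SLO window; a 1% error ratio against a 99.9% target burns roughly ten times faster than that.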

Deployment strategy comparison

Strategy	Traffic shift	Rollback speed	Cost / complexity	Best for
Rolling update	Gradual pod replace	Re-roll previous image	Low — K8s default	Stateless, backward-compatible
Blue-green	Instant flip	Seconds — switch back	2× infra during cutover	Critical paths, schema compatible
Canary	1→5→25→100%	Auto on SLO regression	Medium — metrics gates	High-risk releases
Feature flags	Code deployed dark	Toggle off instantly	Flag lifecycle debt	Decouple deploy from release
GitOps	Git PR → reconcile	Revert Git commit	Medium — Argo/Flux setup	Auditable prod changes

Combine	Why
GitOps + rolling	Default safe baseline
GitOps + canary + SLO	Automated promotion/rollback
Feature flags + any deploy	Kill switch without redeploy
Blue-green + DB migrations	Expand-contract schema phases

Never run `kubectl set image` in prod without a Git record.
Canary promotion gates: error rate, p99 latency, business metric (conversion).
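The three promotion gates can be combined into a single decision. In the sketch below, `CanaryGate` and every threshold are hypothetical; real values would come from the service's own SLOs and baseline metrics.

```java
/** Hypothetical canary gate; thresholds are illustrative, not recommendations. */
final class CanaryGate {
    enum Decision { PROMOTE, ROLLBACK }

    static Decision evaluate(double errorRate,
                             double p99Millis,
                             double conversionRate,
                             double baselineConversionRate) {
        boolean errorsOk = errorRate <= 0.01;                                  // at most 1% errors
        boolean latencyOk = p99Millis <= 500.0;                                // p99 within 500 ms
        boolean businessOk = conversionRate >= 0.95 * baselineConversionRate;  // within 5% of baseline
        return (errorsOk && latencyOk && businessOk) ? Decision.PROMOTE : Decision.ROLLBACK;
    }
}
```

All three gates must pass; a canary with healthy technical metrics but a drop in conversion still rolls back.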