Same input, different outcome in the logs
Scenario
Support replays a user request: same JSON body, sometimes success, sometimes failure. Log lines for “the same call” disagree—different balances, different error messages, or missing steps. The system looks flaky. You must decide if logs are misleading, inputs are not actually identical, or the service is non-deterministic under concurrency.
After reading, you should be able to:
- Separate logging gaps from real behavioral non-determinism.
- Group logs by correlation id and compare winning vs losing traces.
- Identify races, replica lag, flags, retries, and multi-version traffic.
- Fix with idempotency, structured context, and trace-backed debugging.
Why — “same input” rarely means same system state
Logs feel inconsistent when you compare lines without shared context (request id, pod, code version, flag values) or when the application truly branches on timing and shared mutable state. Production adds replicas, caches, read replicas, retries, and feature flags—local single-threaded runs hide that.
Two different problems
| Type | Meaning | Example |
|---|---|---|
| Observability gap | Behavior may be consistent; logs incomplete or not correlatable | No trace id; async log after response |
| Real non-determinism | Same visible input, different results | Race, stale cache, retry double-charge |
Common causes of different outcomes
- Hidden input differences — tenant header, auth principal,
Idempotency-Key, clock, geo, canary routing. - Feature flags / A-B — different code paths per request.
- Multiple app versions — rolling deploy; old and new pods behind one LB.
- In-memory state per pod — local cache not shared; hit pod A vs pod B.
- Read replica lag — write on primary, read on stale replica → “sometimes missing row.”
- Cache inconsistency — TTL, invalidation race, thundering herd reload.
- Concurrent updates — lost update without versioning — overlaps with data races.
- Retries — client or gateway retry hits partial success; non-idempotent side effects.
- External dependency variance — partner returns 429 then 200.
- Async ordering — logs print out of order; success log before failure in another thread.
Do not trust grep alone. Searching by user id across all services without trace_id mixes unrelated requests and creates false “inconsistency.”
What — prove whether behavior or logs differ
- Pick one failing and one succeeding example — same business id, close timestamps. Export full log JSON for both (not screenshots).
-
Align on correlation fields
—
trace_id,request_id,span_id,pod,deployment_revision,feature_flags. If missing → fix logging first; investigation is blocked. -
Diff the effective request
# Fields that must match for "same input" method, path, normalized body hash, tenantId, Authorization subject, Idempotency-Key, X-Request-Id
Mismatch here explains “inconsistent” without a code bug. - Compare decision points in logs — branch taken: cache HIT/MISS, DB source (primary vs replica), downstream status codes, retry count.
- Check deploy and traffic split — two build ids during incident window? Canary 5% on new logic?
-
Database view
— row
version/updated_at; two concurrent updates? Use transaction history or audit table if available. - Reproduce under controlled concurrency — parallel requests with identical payload; if failure rate > 0 → race or idempotency bug.
- Distributed trace — one trace per attempt; see duplicate spans from retries — distributed trace guide.
Log patterns that hint root cause
| Pattern in logs | Likely cause |
|---|---|
Different pod / cache HIT mix | Per-pod cache; need distributed cache or sticky invalidation |
Read from replica after write | Replication lag; read-your-writes routing |
retry=2 on failures only | Non-idempotent retry; duplicate side effect |
Flag newCheckout=true only sometimes | Feature flag targeting |
| Interleaved thread names on one request id | MDC not propagated to async threads |
| Success then error same id | Two different HTTP attempts sharing business id in grep |
MDC propagation (async gap)
// Bad: child thread loses trace context
executor.submit(() -> log.info("charged")); // no trace_id in log
// Better: wrap tasks (Spring @Async with TaskDecorator, Micrometer context)
How — make behavior and logs deterministic enough to debug
Fix behavioral inconsistency
| Cause | Fix |
|---|---|
| Race on shared state | Atomics, DB transaction, optimistic lock — races guide |
| Non-idempotent POST | Idempotency key + unique constraint |
| Stale replica read | Route critical reads to primary or session stickiness after write |
| Per-pod cache | Redis/Caffeine cluster-wide; TTL + event invalidation |
| Deploy skew | Faster rollouts, readiness until version uniform, feature flags |
Fix log inconsistency (observability)
- Structured JSON logs:
trace_id,span_id,user_id,tenant_id,pod,build_version. - Propagate MDC to
@Async,CompletableFuture, reactive chains. - Log decisions: cache hit/miss, which DB datasource, flag snapshot per request.
- One log line per outbound call with latency and status (avoid scattered debug).
- Never reuse business ids as sole correlation key across retries—use per-attempt request id.
Structured log example
{
"msg": "order_created",
"trace_id": "7f3a…",
"request_id": "req_9b2…",
"pod": "checkout-7d4f9",
"build": "1.42.0",
"tenant_id": "t_12",
"cache": "MISS",
"db": "primary",
"order_id": "ord_88",
"outcome": "success"
}
Verify
- Replay 100 parallel identical requests in staging: 0 unexpected failures.
- Support query: one
trace_idshows full story across services. - Post-deploy: no mixed build ids during steady state.
Interview one-liner
“I first verify the requests are truly identical—headers, flags, pod, version—then group logs by trace id. If behavior still differs, I look for races, replica lag, caches, and non-idempotent retries; I fix the root cause and add structured decision logging so the next incident is obvious.”