Same input, different outcome in the logs

Scenario

Support replays a user request: same JSON body, sometimes success, sometimes failure. Log lines for “the same call” disagree: different balances, different error messages, or missing steps. The system looks flaky. You must decide whether the logs are misleading, the inputs are not actually identical, or the service is non-deterministic under concurrency.

After reading, you should be able to:

Why — “same input” rarely means same system state

Logs feel inconsistent when you compare lines without shared context (request id, pod, code version, flag values) or when the application truly branches on timing and shared mutable state. Production adds service replicas, caches, database read replicas, retries, and feature flags; a local single-threaded run hides all of that.

Two different problems

| Type | Meaning | Example |
| --- | --- | --- |
| Observability gap | Behavior may be consistent; logs incomplete or not correlatable | No trace id; async log after response |
| Real non-determinism | Same visible input, different results | Race, stale cache, retry double-charge |

Common causes of different outcomes

Do not trust grep alone. Searching by user id across all services without trace_id mixes unrelated requests and creates false “inconsistency.”
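
To make the warning concrete, here is a minimal sketch of splitting a user-id grep result into one timeline per trace_id. The class `TraceGrouping`, the `LogLine` type, and the sample messages are hypothetical; real logs would first be parsed from structured JSON.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: splits a user-id grep result into one timeline per trace_id. */
final class TraceGrouping {

    /** Minimal parsed log line; field names mirror the correlation fields used in this guide. */
    static final class LogLine {
        final String traceId;
        final String userId;
        final String msg;

        LogLine(String traceId, String userId, String msg) {
            this.traceId = traceId;
            this.userId = userId;
            this.msg = msg;
        }
    }

    /** Groups messages by trace_id, preserving the order in which lines were read. */
    static Map<String, List<String>> groupByTrace(List<LogLine> lines) {
        Map<String, List<String>> byTrace = new LinkedHashMap<>();
        for (LogLine line : lines) {
            // Lines without a trace_id cannot be attributed to an attempt; keep them visible.
            String key = line.traceId == null ? "(no trace_id)" : line.traceId;
            byTrace.computeIfAbsent(key, k -> new ArrayList<>()).add(line.msg);
        }
        return byTrace;
    }

    public static void main(String[] args) {
        List<LogLine> grepByUser = List.of(
            new LogLine("t-1", "u_42", "charge_started"),
            new LogLine("t-2", "u_42", "charge_started"),
            new LogLine("t-1", "u_42", "charge_ok"),
            new LogLine("t-2", "u_42", "charge_failed"));
        // Read flat, this looks like one request that succeeded and then failed.
        // Grouped, it is two separate attempts with one outcome each.
        groupByTrace(grepByUser)
            .forEach((trace, msgs) -> System.out.println(trace + " -> " + msgs));
    }
}
```

If a large share of lines ends up under "(no trace_id)", that points to an observability gap rather than to flaky behavior.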

What — prove whether behavior or logs differ

  1. Pick one failing and one succeeding example — same business id, close timestamps. Export full log JSON for both (not screenshots).
  2. Align on correlation fields: trace_id, request_id, span_id, pod, deployment_revision, feature_flags. If missing → fix logging first; investigation is blocked.
  3. Diff the effective request
    # Fields that must match for "same input"
    method, path, normalized body hash, tenantId,
    Authorization subject, Idempotency-Key, X-Request-Id
    A mismatch in any of these fields explains the “inconsistent” outcome without a code bug.
  4. Compare decision points in logs — branch taken: cache HIT/MISS, DB source (primary vs replica), downstream status codes, retry count.
  5. Check deploy and traffic split — two build ids during incident window? Canary 5% on new logic?
  6. Database view — row version / updated_at; two concurrent updates? Use transaction history or audit table if available.
  7. Reproduce under controlled concurrency — parallel requests with identical payload; if failure rate > 0 → race or idempotency bug.
  8. Distributed trace: one trace per attempt; look for duplicate spans from retries (see the distributed trace guide).
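
Step 3 can be sketched as a field-by-field diff of two attempts. This is an illustration under assumptions: the class `EffectiveRequestDiff` and its methods are hypothetical, and the body is assumed to be canonicalized already (for example sorted keys and no insignificant whitespace), because the sketch hashes the string as given.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper: compares the fields that must match for "same input". */
final class EffectiveRequestDiff {

    /** SHA-256 of the body as lowercase hex. Assumes the caller canonicalized the JSON first. */
    static String bodyHash(String canonicalBody) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha.digest(canonicalBody.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Builds the comparison view for one attempt, using the field list from step 3. */
    static Map<String, String> view(String method, String path, String canonicalBody,
                                    String tenantId, String authSubject,
                                    String idempotencyKey, String requestId) {
        Map<String, String> v = new LinkedHashMap<>();
        v.put("method", method);
        v.put("path", path);
        v.put("bodyHash", bodyHash(canonicalBody));
        v.put("tenantId", tenantId);
        v.put("Authorization subject", authSubject);
        v.put("Idempotency-Key", idempotencyKey);
        v.put("X-Request-Id", requestId);
        return v;
    }

    /** Returns the names of fields whose values differ between two attempts. */
    static List<String> differingFields(Map<String, String> a, Map<String, String> b) {
        List<String> diff = new ArrayList<>();
        for (String key : a.keySet()) {
            if (!Objects.equals(a.get(key), b.get(key))) {
                diff.add(key);
            }
        }
        for (String key : b.keySet()) {
            if (!a.containsKey(key)) {
                diff.add(key); // present only in the second attempt
            }
        }
        return diff;
    }
}
```

An empty result from `differingFields` means the two attempts really were the same input at this level, so the investigation can move on to decision points and state.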

Log patterns that hint root cause

| Pattern in logs | Likely cause |
| --- | --- |
| Different pod / cache HIT mix | Per-pod cache; need distributed cache or sticky invalidation |
| Read from replica after write | Replication lag; read-your-writes routing |
| retry=2 on failures only | Non-idempotent retry; duplicate side effect |
| Flag newCheckout=true only sometimes | Feature flag targeting |
| Interleaved thread names on one request id | MDC not propagated to async threads |
| Success then error same id | Two different HTTP attempts sharing business id in grep |

MDC propagation (async gap)

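// Bad: the pool thread does not inherit the caller's MDC, so trace_id is missing from this log line
executor.submit(() -> log.info("charged"));

// Better: wrap tasks so the MDC is copied to the worker thread
// (e.g. Spring @Async with a TaskDecorator, or Micrometer context propagation)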
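
The wrapping idea can be sketched without any logging dependency. The example below uses a plain `ThreadLocal` named `CTX` as a stand-in for MDC, which is an assumption made so the code runs on its own; with SLF4J the analogous calls would be `MDC.getCopyOfContextMap()` on the submitting thread and `MDC.setContextMap(...)` inside the task. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of context propagation; CTX stands in for a logging MDC. */
final class ContextPropagation {

    /** Per-thread diagnostic fields, the same storage model an MDC uses. */
    static final ThreadLocal<Map<String, String>> CTX = ThreadLocal.withInitial(HashMap::new);

    /** Captures the submitting thread's context and restores it around the task. */
    static <T> Callable<T> withContext(Callable<T> task) {
        Map<String, String> captured = new HashMap<>(CTX.get()); // copied on the caller's thread
        return () -> {
            Map<String, String> previous = CTX.get();
            CTX.set(captured);
            try {
                return task.call();
            } finally {
                CTX.set(previous); // do not leak context into the next task on this pool thread
            }
        };
    }

    /** Runs a task on a pool thread and returns the trace_id it saw (null if none). */
    static String traceIdSeenByWorker(boolean wrap) {
        CTX.get().put("trace_id", "trace-123"); // made-up id for the demo
        Callable<String> task = () -> CTX.get().get("trace_id");
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(wrap ? withContext(task) : task).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            CTX.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("unwrapped: " + traceIdSeenByWorker(false));
        System.out.println("wrapped:   " + traceIdSeenByWorker(true));
    }
}
```

Restoring the previous context in `finally` matters on pooled threads: without it, a later task could log with a trace_id that belongs to an earlier request.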

How — make behavior and logs deterministic enough to debug

Fix behavioral inconsistency

| Cause | Fix |
| --- | --- |
| Race on shared state | Atomics, DB transaction, optimistic lock (see the races guide) |
| Non-idempotent POST | Idempotency key + unique constraint |
| Stale replica read | Route critical reads to primary or session stickiness after write |
| Per-pod cache | Shared cache such as Redis, or local caches (e.g. Caffeine) with short TTL + event invalidation |
| Deploy skew | Faster rollouts, readiness gating until the version is uniform, feature flags |
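
As one illustration of the "idempotency key + unique constraint" row, the sketch below uses an in-memory `ConcurrentHashMap` as a stand-in for a table with a unique constraint on the key. The class `IdempotentCharges` and the result format are hypothetical; a real service would persist the key and the result in the same database transaction as the side effect.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: a retry with the same Idempotency-Key must not charge twice. */
final class IdempotentCharges {

    // In-memory stand-in for a table with a unique constraint on idempotency_key.
    private final ConcurrentMap<String, String> resultByKey = new ConcurrentHashMap<>();

    // Counts how often the (simulated) payment side effect actually ran.
    private final AtomicInteger sideEffects = new AtomicInteger();

    /** Executes the charge at most once per key; retries receive the stored result. */
    String charge(String idempotencyKey, long amountCents) {
        // ConcurrentHashMap.computeIfAbsent applies the function atomically,
        // at most once per key, even when retries arrive in parallel.
        return resultByKey.computeIfAbsent(idempotencyKey, key -> {
            int n = sideEffects.incrementAndGet(); // stands in for the real payment call
            return "charge_" + n + ":" + amountCents;
        });
    }

    int sideEffectCount() {
        return sideEffects.get();
    }

    public static void main(String[] args) {
        IdempotentCharges charges = new IdempotentCharges();
        String first = charges.charge("key-1", 500);
        String retry = charges.charge("key-1", 500); // same key: no second side effect
        System.out.println(first.equals(retry) + " sideEffects=" + charges.sideEffectCount());
    }
}
```

The design point is that the retry returns the original result instead of an error, so callers cannot tell a replayed request from the first one.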

Fix log inconsistency (observability)

Structured log example

{
  "msg": "order_created",
  "trace_id": "7f3a…",
  "request_id": "req_9b2…",
  "pod": "checkout-7d4f9",
  "build": "1.42.0",
  "tenant_id": "t_12",
  "cache": "MISS",
  "db": "primary",
  "order_id": "ord_88",
  "outcome": "success"
}

Verify

  1. Replay 100 parallel identical requests in staging: 0 unexpected failures.
  2. Support query: one trace_id shows full story across services.
  3. Post-deploy: no mixed build ids during steady state.
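
The parallel replay in check 1 can be driven by a small harness like the one below. It is a sketch under assumptions: `ParallelReplay` and `countFailures` are hypothetical names, and the `Callable` passed in stands for whatever sends the identical request to staging and reports success. The demo in `main` uses an atomic counter as a placeholder handler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical harness: fires identical calls at the same moment and counts failures. */
final class ParallelReplay {

    /** Runs n copies of the call concurrently; a call fails if it throws or returns false. */
    static int countFailures(int n, Callable<Boolean> call) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                results.add(pool.submit(() -> {
                    start.await(); // hold every worker until all tasks are submitted
                    return call.call();
                }));
            }
            start.countDown(); // release all workers together to maximize overlap
            int failures = 0;
            for (Future<Boolean> f : results) {
                try {
                    if (!f.get()) {
                        failures++;
                    }
                } catch (Exception e) {
                    failures++; // an exception in the call counts as a failed attempt
                }
            }
            return failures;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Placeholder handler: 100 units of stock, 100 identical reservations.
        AtomicInteger stock = new AtomicInteger(100);
        int failures = countFailures(100, () -> stock.getAndDecrement() > 0);
        System.out.println("failures=" + failures);
    }
}
```

A single clean run does not prove the absence of a race, so it is worth repeating the replay several times and treating any non-zero failure count as a lead.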

Interview one-liner

“I first verify the requests are truly identical—headers, flags, pod, version—then group logs by trace id. If behavior still differs, I look for races, replica lag, caches, and non-idempotent retries; I fix the root cause and add structured decision logging so the next incident is obvious.”
