Delivery Guarantees & Reliability

The three delivery semantics — in depth

Delivery semantics describe what happens when networks fail, processes crash, and retries fire. Kafka spans producer, broker, and consumer—each layer must align or your “guarantee” collapses at the weakest link.

flowchart TB
  subgraph producer["Producer layer"]
    P1[acks / retries / idempotence]
  end
  subgraph broker["Broker layer"]
    B1[replication / ISR / min.insync.replicas]
  end
  subgraph consumer["Consumer layer"]
    C1[offset commit timing / idempotent handler]
  end
  P1 --> B1 --> C1

At-most-once

Definition: a record may be lost; it will not be delivered twice. Typically: acks=0 (fire-and-forget), or commit offsets before processing with no retry on failure.

Where it is acceptable:

Metrics and telemetry where statistical sampling tolerates gaps (dashboard approximations)
High-volume click streams where 0.1% loss is cheaper than 2× infrastructure for durability
Ephemeral notifications (live presence pings) that self-heal on next event
Debug/logging pipelines—not audit, billing, or inventory

acks=0
retries=0
enable.idempotence=false

At-least-once

Definition: a record will not be lost under retry scenarios; it may be delivered more than once. This is the default for most production Kafka systems: acks=all or acks=1 + retries > 0 on producer, process-then-commit (or auto-commit risks) on consumer.

Requirement: consumers and downstream sinks must be idempotent—same message applied twice produces the same state. Patterns: natural keys in DB upserts, dedup table in Redis/PostgreSQL, idempotency keys in event payload.

for (ConsumerRecord<String, OrderEvent> record : records) {
  String idempotencyKey = record.topic() + "-" + record.partition() + "-" + record.offset();

  if (dedupStore.exists(idempotencyKey)) {
    continue;  // already processed this exact broker record
  }

  processOrder(record.value());
  dedupStore.put(idempotencyKey, Duration.ofDays(7));
  consumer.commitSync(Map.of(
      new TopicPartition(record.topic(), record.partition()),
      new OffsetAndMetadata(record.offset() + 1)
  ));
}

💡 Pro Tip

Prefer business-level idempotency keys (orderId) over offset-based keys when replaying from an earlier offset must not skip legitimate reprocessed events.

Exactly-once

Definition: each record’s effect appears exactly once in downstream state—no loss, no duplicate side effects. Kafka does not magically guarantee this for your database writes; it provides building blocks. Two approaches:

Approach 1: Idempotent consumer (application-level)

Deduplicate using message ID in external store (Redis, DB unique constraint, DynamoDB conditional write). Works with any broker config; you own TTL, storage cost, and replay semantics. Still at-least-once on the wire—exactly-once is an application invariant.

Approach 2: Kafka transactions (platform-level)

Atomic consume → process → produce with offset commit tied to output records in one transaction. Kafka Streams processing.guarantee=exactly_once_v2 uses this internally. Requires transactional producer, read_committed consumers, and compatible brokers.

Semantic	Loss	Duplicates	Typical config
At-most-once	Possible	No	`acks=0` or commit-before-process
At-least-once	No (with retries)	Possible	`acks=all` + retries + idempotent handler
Exactly-once	No	No (within guarantee scope)	Transactions + read_committed + idempotent stores for external writes

🎯 Interview Tip

“Does Kafka support exactly-once?” — answer in layers: idempotent producer = no duplicate batches per session; transactions = atomic multi-partition write + offset commit; end-to-end EOS still needs external side effects idempotent or inside transaction boundary. Kafka EOS ≠ distributed ACID across arbitrary databases.

Kafka transactions deep dive

Transactions add a coordination layer on top of idempotent producers—grouping writes and consumer offset commits into atomic units visible only after commit.

transactional.id

Stable string per producer instance role (e.g. inventory-processor-pod-7). Broker maps it to a Producer ID (PID) and epoch. When a new producer registers with the same transactional.id, epoch increments and fences the old producer— zombie instances get ProducerFencedException on next operation.

Transaction coordinator

One broker acts as transaction coordinator per transactional.id (hash of id → broker). Manages transaction state machine: Ongoing → PrepareCommit → CompleteCommit (or Abort). Not the same as partition leader—dedicated coordination role.

__transaction_state internal topic

Compact internal topic (default 50 partitions, RF=3) storing transaction metadata: transactional id, producer id, epoch, status. Survives coordinator failover. Never manually delete in production.

Transaction phases

initTransactions() — register PID, recover pending transactions, fence zombies
beginTransaction() — start new txn; records tagged with transactional markers
send() — batches include PID, epoch, sequence; written to partition logs as uncommitted
sendOffsetsToTransaction(Map, groupMetadata) — atomically commit consumer offsets with output records
commitTransaction() — write COMMIT marker; records become visible to read_committed
abortTransaction() — write ABORT marker; records filtered out for consumers

sequenceDiagram
  participant App as Application
  participant TC as Transaction Coordinator
  participant L as Partition Leaders
  participant C as read_committed Consumer

  App->>TC: initTransactions()
  App->>TC: beginTransaction()
  App->>L: produce (uncommitted data)
  App->>TC: sendOffsetsToTransaction
  App->>TC: commitTransaction()
  TC->>L: COMMIT control record
  C->>L: fetch — sees committed data only

Partition log timeline:
[txn-start][data][data][ABORT]     ← aborted txn — invisible to read_committed
[txn-start][data][data][COMMIT]    ← committed txn — visible
         ↑ uncommitted until COMMIT marker appended

read_committed isolation

Consumers with isolation.level=read_committed filter aborted transactions and delay visibility of open transactions until commit marker arrives—prevents consumers from seeing partial writes. Default read_uncommitted sees all records immediately (including aborted txn data before abort marker—ordering nuance).

producer.initTransactions();

while (running) {
  ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(200));
  if (records.isEmpty()) continue;

  try {
    producer.beginTransaction();

    for (ConsumerRecord<String, String> record : records) {
      String output = transform(record.value());
      producer.send(new ProducerRecord<>("output-topic", record.key(), output));
    }

    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
    for (TopicPartition tp : records.partitions()) {
      List<ConsumerRecord<String, String>> part = records.records(tp);
      long last = part.get(part.size() - 1).offset();
      offsets.put(tp, new OffsetAndMetadata(last + 1));
    }

    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
    producer.commitTransaction();
  } catch (Exception e) {
    producer.abortTransaction();
    throw e;
  }
}

spring.kafka:
  producer:
    transaction-id-prefix: inventory-txn-
    properties:
      enable.idempotence: true
      acks: all
  consumer:
    enable-auto-commit: false
    properties:
      isolation.level: read_committed

Performance cost

Transactions add coordination round-trips per commit, control records on partitions, and coordinator load. Typical overhead: ~3–5% throughput reduction and higher p99 latency per commit boundary compared to idempotent at-least-once. Batch more records per transaction where business logic allows.

⚖️ Trade-off

Transactions serialize commit points—high-frequency micro-transactions (one record per txn) destroy throughput. Aim for poll-batch-sized transactions unless latency SLA demands otherwise.

⚠️ Pitfall

transaction.timeout.ms (default 60s) — transaction open longer than broker allows is aborted automatically. Long processing inside one transaction without tuning causes TransactionAbortedException.

Property	Default	Production	Impact
`transactional.id`	null	Unique per logical producer	Enables transactions + fencing
`transaction.timeout.ms`	60000	60000–300000	Max open transaction duration
`isolation.level`	read_uncommitted	read_committed	Consumer visibility of txn data

Replication & durability

Producer acks mean nothing if replicas never caught the write. Broker replication is the durability floor— ISR membership and min.insync.replicas decide whether you prefer availability or data loss on failure.

Replication factor recommendations

Production: replication.factor=3 — survives single broker loss with quorum
Never < 2 in any environment that cares about data—single replica is a single disk failure from loss
Development: RF=1 acceptable on single-broker Docker; not representative of production behavior

The safe production triad

⚙️ Config

replication.factor=3 + min.insync.replicas=2 + producer acks=all — write acknowledged only when at least 2 replicas (including leader) have the record. One broker down: writes continue. Two simultaneous failures: risk of unavailability (correct) not silent loss.

# Broker defaults
default.replication.factor=3
min.insync.replicas=2
unclean.leader.election.enable=false

# Per-topic override (kafka-configs.sh)
min.insync.replicas=2
cleanup.policy=delete

What happens when a broker fails

Leader broker dies — controller detects via ZK/KRaft session; partitions with leader on dead broker become unavailable for produce/fetch briefly
Leader election — controller elects new leader from ISR only (if unclean disabled)
Followers catch up — new leader serves; out-of-sync replicas truncate to HW and replicate
Clients refresh metadata — producers/consumers redirect to new leaders (automatic metadata refresh)
Under-replicated partitions — metric rises until followers rejoin ISR

sequenceDiagram
  participant P as Producer
  participant B1 as Broker 1 Leader
  participant B2 as Broker 2 Follower
  participant C as Controller
  participant B3 as Broker 3 Follower

  P->>B1: produce acks=all
  B1->>B2: replicate
  B1->>B3: replicate
  Note over B1: Broker 1 crashes
  C->>B2: elect leader for partition
  P->>B2: produce resumes after metadata refresh

Preferred leader election

Each partition has a preferred leader (replica 0 in assignment). After broker restart, kafka-leader-election.sh --election-type preferred or auto.leader.rebalance.enable (legacy) moves leadership back to preferred replica— rebalancing load to intended topology and avoiding stranded leaders on replacement brokers.

Unclean leader election

unclean.leader.election.enable=false (production default): never elect a leader from outside ISR. Partition may go offline rather than serve stale data. true: availability over consistency—may promote lagging replica and lose unreplicated records.

Setting	Broker down (1 of 3)	ISR = 1 replica
unclean=false, min.insync=2	Writes continue	Produces with acks=all fail (NOT_ENOUGH_REPLICAS)
unclean=true	May elect out-of-sync leader	Writes may succeed with data loss risk

⚠️ Pitfall

ISR shrink under load — slow followers drop from ISR; with min.insync.replicas=2 and only leader in ISR, acks=all producers halt. Alert on UnderReplicatedPartitions before producers fail.

📦 Real World

Confluent and major cloud providers enforce RF≥3 and min.insync.replicas=2 on production tiers. LinkedIn runs RF=3 with unclean election disabled—outages prefer partition unavailability over silent truncation.

Idempotent producer details

Idempotence solves producer retry duplication—not consumer duplicates, not cross-session duplicates. Understand PID, epoch, and sequence numbers to debug fencing errors and ordering guarantees.

Producer ID (PID)

On first connection with enable.idempotence=true, broker assigns a unique Producer ID (64-bit). All batches from this producer session carry this PID. Broker uses PID + partition + sequence to deduplicate.

Epoch

When a producer with transactional.id calls initTransactions(), broker increments epoch. Any older producer instance with same transactional id but lower epoch receives ProducerFencedException—prevents zombie writers after failover or rolling deploy.

Sequence number

Per-partition monotonic counter starting at 0 for each PID epoch. Broker rejects:

Duplicate sequence — retry of already-committed batch (silent success, no duplicate in log)
Out-of-order sequence — gap detected (e.g. max.in.flight > 1 without idempotence caused reorder); with idempotence, broker enforces strict order per partition

PID=7021, Partition orders-0:
  seq=0  batch [A,B]   → accepted
  seq=1  batch [C]     → network timeout, producer retries
  seq=1  batch [C]     → duplicate → broker acks without rewrite
  seq=2  batch [D]     → accepted

Automatic config coupling

When enable.idempotence=true, client enforces:

acks=all
retries > 0 (Integer.MAX_VALUE)
max.in.flight.requests.per.connection ≤ 5 (and safe with 5)

Limitations

Deduplication only within a producer session (same PID)
Producer restart without transactional.id → new PID → retries across sessions can duplicate
Does not deduplicate logically identical messages sent with different sequence contexts (two explicit sends)
No cross-partition atomicity—that requires transactions

enable.idempotence=true
# Implicit: acks=all, retries=MAX, max.in.flight=5
# No transactional.id — no consume-transform-produce atomicity

🔬 Under the Hood

Broker maintains in-memory sequence state per (PID, partition) with periodic snapshot to disk. High PID churn (frequent producer restarts) adds coordinator overhead—reuse long-lived producer instances in apps and connection pools.

🎯 Interview Tip

“Idempotent producer vs transactions?” — idempotent = per-partition dedup of retry batches; transactions = multi-partition atomic commit + offset commit + fencing via transactional.id. Idempotent is lighter; transactions required for read-process-write EOS within Kafka.

End-to-end exactly-once

Kafka-native EOS covers consume → produce → offset commit inside the broker boundary. The moment you write to PostgreSQL or call Stripe, you need an explicit strategy—or you only have at-least-once with extra steps.

The full Kafka-native pattern

Transactional producer with stable transactional.id
Consumer with enable.auto.commit=false and isolation.level=read_committed
Process records from poll batch
Produce output records in same transaction
sendOffsetsToTransaction — offset commit atomic with produces
commitTransaction — all visible together or none

flowchart LR
  IN[input topic]
  CON[Consumer poll]
  TXN[Transactional producer]
  OUT[output topic]
  OFF[__consumer_offsets]
  IN --> CON --> TXN
  TXN --> OUT
  TXN --> OFF

Kafka Streams exactly_once_v2

processing.guarantee=exactly_once_v2 (EOS v2, Kafka 2.5+) wraps the same primitives: transactional producer per stream thread, changelog topics for state stores, offset commits tied to output and state updates in one transaction boundary.

application.id=inventory-aggregator
processing.guarantee=exactly_once_v2
replication.factor=3
# Creates internal changelog + repartition topics with RF=3

External side effects — the honesty section

Kafka EOS does not roll back your database. Patterns for external writes:

Pattern	Mechanism	EOS scope
Idempotent upsert	Natural key in DB; retry safe	Effective EOS at business level
Outbox table	DB txn writes row + event; CDC publishes	At-least-once to Kafka; DB ACID
Transactional messaging	ChainedKafkaTransactionManager (Spring) — Kafka + DB	Best-effort 2PC; DB must support XA or custom
Saga compensation	Emit compensating event on failure	Eventual consistency

Cost — when to enable EOS

Enable when duplicate output records cause financial errors, double inventory deduction, or incorrect aggregates
Skip when idempotent sinks are cheap (upsert by key) and 1–2% duplicate rate is harmless
Measure throughput and p99 before/after—expect 3–10% hit depending on commit frequency and partition count

# Producer
enable.idempotence: true
transactional.id: ${APP_NAME}-${INSTANCE_ID}
acks: all

# Consumer
enable.auto.commit: false
isolation.level: read_committed

# Broker / topic
replication.factor: 3
min.insync.replicas: 2
unclean.leader.election.enable: false

# Streams (if applicable)
processing.guarantee: exactly_once_v2

⚖️ Trade-off

Durability vs throughput vs exactly-once — you can maximize two, not all three. High-throughput firehose: at-least-once + idempotent sink. Ledger pipeline: acks=all + transactions + read_committed + accept latency tax.

💡 Pro Tip

Start with idempotent producer + at-least-once consumer + DB upsert idempotency. Add transactions only when you prove duplicates occur in consume-transform-produce loops and idempotency at sink is insufficient.

📦 Real World

Netflix uses Kafka EOS in critical stream processors where duplicate billing events would corrupt revenue metrics. Many payment pipelines still prefer at-least-once + idempotent ledger entries—simpler ops, easier replay, audit-friendly dedup keys.