Skip to content

Kafka Producer Tuning Cheat Sheet: Throughput, Latency & Durability

BackendBytes Engineering Team
BackendBytes Engineering Team
4 min read
Kafka Producer Tuning Cheat Sheet: Throughput, Latency & Durability

Key Takeaways

  • acks=1 silently loses data on broker failover; only acks=all + min.insync.replicas=2 survives a broker failure without data loss
  • linger.ms (not batch.size) is the biggest throughput lever — raising from 0 to 20ms can 4x throughput on small records by batching before sending
  • Idempotence (enable.idempotence=true) adds 5 bytes per record and eliminates duplicates on retry with only ~3-5% throughput cost — always enable it
  • One KafkaProducer per application (thread-safe); creating per-message triggers metadata fetch storms; reuse it or you'll exhaust connections

The classic Kafka producer production data-loss incident. A service ships with acks=1, linger.ms=0, retries=0, and enable.idempotence=false — the default-derived config that looks fine in staging. A broker fails over during peak traffic. Roughly 30 seconds of in-flight messages vanish silently because acks=1 only confirms the leader, not the followers. We debugged this exact failure mode on multiple production teams.

That config predates Kafka 3.0, which flipped the Java producer to safe-by-default — enable.idempotence=true and acks=all (KIP-679). On a current cluster you reach this failure mode through legacy configs, explicit overrides (someone set acks=1 chasing latency), or non-Java clients that still default weaker — not the out-of-box Java producer. Verify every value below rather than trusting an inherited config.

Kafka Producer Tuning Essentials

Every production Kafka producer loses data or throughput at one moment: when config defaults deploy without understanding the tradeoff. acks=1 silently drops messages on failover. linger.ms=0 tanks throughput. retries=0 stops on network blips.

TL;DR

Start with: acks=all, enable.idempotence=true, batch.size=65536 (default is 16384 — this is a recommended increase), linger.ms=5 (default is 0), compression.type=lz4. Idempotence eliminates duplicates on retry with minimal overhead — Confluent's published benchmarks put the throughput cost in the low single digits on typical workloads (source). Your numbers will vary with record size and acks setting; measure before and after when it matters. Document every deviation from these values.

  • Survive broker failover: acks=all + broker min.insync.replicas=2 (3-replica cluster)
  • Biggest throughput lever: linger.ms — raise from 0 to 5–20ms, not batch.size
  • One KafkaProducer per app; reuse it thread-safe, never create per-message
graph LR
    App["Application<br/>producer.send()"] --> RA["RecordAccumulator<br/>buffer per partition"]
    RA -->|"batch.size or<br/>linger.ms expires"| ST["Sender Thread"]
    ST -->|"max.in.flight<br/>batches"| Broker["Kafka Broker"]
    Broker -->|"acks=all"| ST
    ST -->|"callback"| App

The Quick Start

Goalacksbatch.sizelinger.mscompressionmax.in.flightTrade-off
Durability first (default)all65536 (64K)5lz45Loses ~5ms latency, gains no data loss
Maximum throughputall131072 (128K)20zstd5Adds ~20ms latency, 15% compression gain
Latency-critical (<10ms p99)all163840none1Drops to single partition throughput, keeps durability
Metrics you can lose0655365lz45Fastest; data loss on any broker crash

Durability: Choosing acks

[Kafka producer config]

acks is the kill switch. acks=0 loses data on restart. acks=1 loses data on leader failover — it looks safe in tests and fails silently in production. Use acks=all everywhere except metrics.

Pair it with broker-side min.insync.replicas=2 on a 3-replica cluster. This is the only combination that survives a broker failure without data loss.

Throughput: Batching + Linger

[Apache Kafka Docs]

Three knobs move throughput: batch.size, linger.ms, and compression.type. batch.size is a ceiling per partition (default 16384 bytes). linger.ms is the biggest lever most teams miss — raising from 0 to 5–20ms can 4x throughput on small records. Use lz4 (good ratio/CPU) unless you're network-bound; zstd achieves better compression ratios but costs more CPU.

Measure batch fill ratio: record-size-avg / batch-size-avg. Below 0.5 means raise linger.ms.

Buffer + sender thread lifecycle

send() doesn't talk to Kafka directly. It appends to an in-memory RecordAccumulator where records group into per-partition batches. A background Sender thread drains batches when they fill OR when linger.ms expires.

sequenceDiagram
    participant App
    participant Acc as Accumulator
    participant Sender
    participant Broker

    App->>Acc: send(record)
    App->>Acc: send(record)
    App->>Acc: send(record)
    Note over Acc: batch fills OR<br/>linger.ms expires
    Acc->>Sender: flush batch
    Sender->>Broker: Produce (acks=all)
    Broker-->>Sender: ack (ISR copied)
    Sender-->>App: onCompletion

If buffer.memory fills before the Sender can drain, send() blocks until space frees. This is the hidden backpressure path — misconfigured linger.ms + slow broker can stall the application thread.

Idempotence & Ordering

[Kafka producer config]

Enable enable.idempotence=true unconditionally. It adds 5 bytes per record and eliminates duplicates on retry. With idempotence, max.in.flight.requests.per.connection stays at 5; without it, drop to 1 for ordering.

Order is guaranteed within a partition. Key records by entity ID for strict per-entity ordering:

producer.send(new ProducerRecord<>("orders", order.customerId(), order));

Don't use transactions for ordering — they're for atomic multi-partition writes and exactly-once semantics, and cut throughput 20%. [Kafka producer config]

Tune by symptom

[Kafka producer config]

When a Kafka producer is misbehaving, the question is "what is the symptom?" — not "which config knob shall I tweak?" Route by what you measured:

graph TD
    Sym{What is<br/>broken?} -->|Data loss on failover| Loss[acks=all<br/>+ min.insync.replicas=2<br/>+ enable.idempotence=true]
    Sym -->|Low throughput<br/>under 10k msg/s| Tp[linger.ms 5 to 20<br/>+ batch.size 65536<br/>+ compression.type=lz4]
    Sym -->|High p99 latency| Lat[linger.ms back to 0 to 1<br/>+ acks=1 if data-loss tolerable<br/>+ smaller batch.size]
    Sym -->|Duplicates downstream| Dup[enable.idempotence=true<br/>+ transactional.id for exactly-once]
    Sym -->|Out-of-order messages| Ord[max.in.flight.requests=1<br/>or enable.idempotence=true<br/>which keeps order with five in-flight]
    Sym -->|Producer blocks on send| Buf[buffer.memory 256 MiB or higher<br/>+ check broker backpressure]
    Sym -->|Network-blip outages| Net[retries=MAX_INT<br/>+ delivery.timeout.ms 120 seconds<br/>+ enable.idempotence=true]
    style Loss fill:#fdd
    style Dup fill:#ffd
    style Ord fill:#ffd
    style Buf fill:#fdd
    style Net fill:#dfd
    style Tp fill:#dfd
    style Lat fill:#dfd

The diagram is the entire tuning cheat-sheet in one picture[Kafka producer config]: classify the symptom, pick the three knobs that apply, never tune in isolation.

Common Gotchas

  • acks=1 + min.insync.replicas=1: Loses data on failover silently. Use acks=all with min.insync.replicas=2.
  • retries=0: Drops on network blips. Use enable.idempotence=true + retries=Integer.MAX_VALUE.
  • 32 MiB buffer on high-throughput: Fills at 10k msg/s, blocks send(). Raise to 256 MiB+.
  • linger.ms=0 everywhere: Default is 0 for legacy reasons. Set to 5–20ms on non-latency-critical producers.
  • Per-message KafkaProducer: The client is thread-safe. Create one per app, never per-request; causes metadata fetch storms.

Producer Metrics Worth Alerting On

Five JMX/MBean metrics that make production producer issues visible before they become incidents[Apache Kafka Docs]:

MetricThresholdWhat it meansFix
record-error-rate> 0.001 (0.1%)Send failures (after retries) — broker rejecting or unrecoverableCheck broker logs; verify ACL; raise delivery.timeout.ms
record-queue-time-avg> linger.ms × 2Records waiting in producer buffer too longbatch.size too small OR broker under-acking; profile broker
record-send-rate vs record-error-rateerror / send > 0.005Cluster instability or topic mis-configCheck metadata-fetch-rate for storms; verify min.insync.replicas
request-latency-avg> 100 ms p99Network or broker slownesstcpdump between producer and broker; check broker GC pauses
buffer-available-bytes / buffer-total-bytes< 0.10 (10%)Buffer about to fill; send() will blockRaise buffer.memory; check downstream broker backpressure

Wire these into Prometheus via the JMX exporter for JVM producers, or use the kafka-client's built-in metrics() map for Go producers. Burn-rate alerts on record-error-rate and saturation alerts on buffer-available-bytes catch the failure modes before customers see lost messages.

Sizing Worker Threads in Java Apps

The producer is non-blocking by default — send() returns a Future immediately and the I/O thread handles delivery. Application threads should NOT wait on the future synchronously; use a callback or accumulate batches:

// Anti-pattern: blocks application thread per message
producer.send(record).get();  // synchronous — defeats batching
 
// Production pattern: callback + structured error handling
producer.send(record, (metadata, exception) -> {
    if (exception != null) {
        log.error("send failed: topic={}, partition={}", record.topic(), record.partition(), exception);
        deadLetterQueue.offer(record);  // application-side fallback
    }
});

For Go (sarama or franz-go), the equivalent is the async-producer pattern with a result-channel goroutine consuming Successes and Errors. Always handle errors at the channel level — silently dropping send failures is the highest-cost-to-debug bug in the catalog.

Production Tuning Checklist

Apply in this order — each step depends on the prior:

  1. Durability first: acks=all, min.insync.replicas=2 on a 3-replica topic. Confirm with kafka-configs.sh --describe.
  2. Idempotence: enable.idempotence=true. Free in Kafka 3.x — do this before any other tuning.
  3. Throughput second: linger.ms=10, batch.size=65536, compression.type=lz4. Measure record-send-rate before/after.
  4. Buffer + delivery timeouts: buffer.memory=268435456 (256 MiB), delivery.timeout.ms=120000. Prevents send() blocking under broker stalls.
  5. Per-tenant quotas (multi-tenant clusters): Configure broker-side quotas via kafka-configs.sh --add-config 'producer_byte_rate=...'. Producer-side enforcement is fragile.
  6. Observability: Wire the 5 metrics above to Prometheus + alerting. Verify with jconsole or an equivalent for non-JVM producers.
  7. Schema evolution: Use Avro / Protobuf with a schema registry from day one. Adding a tags: List<string> field two years in is a migration if you started with raw JSON.
  8. Dead-letter topics: For consumer-side resilience, configure DLQ topics with retention.ms=2592000000 (30 days). Producer-side, route the application's send() callback failures to a local DLQ so retries are bounded.
  9. Partition key strategy: Pick a high-cardinality key (order_id, not customer_id) to avoid hot partitions. Re-partitioning a topic at scale is non-trivial.
  10. Cluster topology check: 3 brokers minimum for min.insync.replicas=2 to survive 1 broker loss. Cross-AZ replication for cloud deployments.

Idempotent Producers and Transactional Writes

Idempotence and transactions are two different durability layers that often get conflated. Idempotence eliminates duplicates inside a single producer session — the broker assigns a Producer ID (PID) at startup and tracks the highest sequence number per partition, rejecting any retry that would produce a duplicate. Transactions add atomicity across multiple partitions and sessions: either all records in the transaction commit, or none do, and a transactional consumer reading with isolation.level=read_committed will skip aborted batches entirely.

The classic use case for transactions is the consume-process-produce pipeline — read a record from topic A, transform it, write to topic B, and commit the consumer offset and the produced record atomically. Without transactions, a crash between the produce and the offset commit leaves the system in an inconsistent state: either the downstream record is lost (if the offset commits first) or duplicated on restart (if the produce commits first).

Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
props.put("transactional.id", "order-processor-instance-7");
props.put("enable.idempotence", "true");
props.put("acks", "all");
props.put("retries", Integer.MAX_VALUE);
props.put("max.in.flight.requests.per.connection", "5");
props.put("delivery.timeout.ms", "120000");
props.put("transaction.timeout.ms", "60000");
 
KafkaProducer<String, OrderEvent> producer = new KafkaProducer<>(props);
producer.initTransactions();  // fences any prior producer with the same transactional.id
 
while (running) {
    ConsumerRecords<String, RawOrder> batch = consumer.poll(Duration.ofMillis(500));
    if (batch.isEmpty()) continue;
 
    producer.beginTransaction();
    try {
        for (ConsumerRecord<String, RawOrder> in : batch) {
            OrderEvent out = transform(in.value());
            producer.send(new ProducerRecord<>("orders.normalized", out.orderId(), out));
        }
        producer.sendOffsetsToTransaction(
            currentOffsets(batch),
            consumer.groupMetadata()
        );
        producer.commitTransaction();
    } catch (ProducerFencedException | OutOfOrderSequenceException fatal) {
        producer.close();
        throw fatal;  // a newer instance has taken over; do not restart this one
    } catch (KafkaException e) {
        producer.abortTransaction();
        // re-poll the same offsets next iteration — consumer has not advanced
    }
}

The transactional.id must be stable across restarts of the same logical instance. If you derive it from a hostname plus a stable ordinal (StatefulSet pod index, ECS task slot), a restarting pod inherits the same ID and fences any zombie predecessor. Random UUIDs defeat the fencing guarantee — a partitioned-but-still-running zombie producer can keep writing under its old ID until its transaction.timeout.ms elapses.

Throughput cost of transactions is roughly 15–20% on small records and closer to 5% on large records, because the per-batch overhead amortizes across record size. The bigger latency hit is commitTransaction(), which blocks until the transaction coordinator persists the commit marker — typically 5–15 ms. Batch many records into a single transaction (a poll loop's worth) rather than one transaction per record. [Kafka producer config]

Partition Assignment Strategies

Producer partitioning controls where records land; consumer assignment controls who reads them. Both shape ordering, hot-spot risk, and rebalance behaviour. The default producer partitioner since Kafka 2.4 is the sticky partitioner, which writes to a single partition until the batch fills or linger.ms expires, then rotates. Sticky partitioning improves batching efficiency on keyless records by 30–50% versus the legacy round-robin partitioner without sacrificing distribution at the topic level. [Kafka producer config]

When records have keys, the default partitioner hashes key.bytes with murmur2 and modulos by partition count. This is deterministic — the same key always lands on the same partition for a fixed partition count. Adding partitions later breaks this property for the new key space, which is why re-partitioning at scale is non-trivial.

For consumers, the assignment strategy is set via partition.assignment.strategy:

# Cooperative sticky — recommended default since Kafka 2.4
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor
 
# Range — assigns contiguous partition ranges per topic; risks skew with multiple topics
# partition.assignment.strategy=org.apache.kafka.clients.consumer.RangeAssignor
 
# Round-robin — distributes evenly but reshuffles all partitions on every rebalance
# partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
 
# Sticky (legacy, eager) — minimises movement but stops the world during rebalance
# partition.assignment.strategy=org.apache.kafka.clients.consumer.StickyAssignor
 
session.timeout.ms=45000
heartbeat.interval.ms=3000
max.poll.interval.ms=300000
group.instance.id=consumer-pod-2  # static membership — survives short pod restarts without rebalance

CooperativeStickyAssignor does incremental rebalances: only the partitions that need to move actually pause. With the eager protocols, every consumer revokes every partition during a rebalance — a 50-partition consumer group sees a complete consumption stall whenever a single pod restarts. Cooperative rebalancing is the default in Kafka 3.0+ for new consumer groups but old groups need an explicit migration through both protocols simultaneously.

For producers, custom partitioners are rare but useful for tenant-isolation requirements (route customer X to a dedicated partition subset) or hot-key splitting (route the top N hot keys round-robin and everything else by hash). Implement org.apache.kafka.clients.producer.Partitioner and set partitioner.class in producer config — keep the partition() method O(1) and free of locks because it runs on the application thread.

ProducerInterceptor for Auditing

Producer interceptors are the cleanest place to attach cross-cutting concerns — audit logging, schema enforcement, PII redaction, header injection — without polluting business code. The ProducerInterceptor interface has two hot-path methods: onSend() runs synchronously on the application thread before the record enters the accumulator, and onAcknowledgement() runs on the I/O thread after the broker acks (or fails). Both must be fast and non-blocking; a slow interceptor stalls every producer in the JVM.

public class AuditingInterceptor implements ProducerInterceptor<String, byte[]> {
 
    private static final Logger AUDIT = LoggerFactory.getLogger("kafka.audit");
    private final AtomicLong sent = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();
 
    @Override
    public ProducerRecord<String, byte[]> onSend(ProducerRecord<String, byte[]> record) {
        record.headers()
            .add("audit.producer", System.getenv("HOSTNAME").getBytes(StandardCharsets.UTF_8))
            .add("audit.timestamp", Long.toString(System.currentTimeMillis()).getBytes())
            .add("audit.trace-id", currentTraceId().getBytes(StandardCharsets.UTF_8));
        return record;
    }
 
    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        if (exception == null) {
            sent.incrementAndGet();
        } else {
            failed.incrementAndGet();
            AUDIT.warn("send failed: topic={} partition={} cause={}",
                metadata != null ? metadata.topic() : "unknown",
                metadata != null ? metadata.partition() : -1,
                exception.getClass().getSimpleName());
        }
    }
 
    @Override public void close() { /* flush metrics */ }
    @Override public void configure(Map<String, ?> configs) { /* read interceptor config */ }
}

Wire interceptors via interceptor.classes, comma-separated for multiple stages — they execute in declaration order on onSend() and reverse order on onAcknowledgement(), so put validation first and metrics last for symmetric instrumentation. Anything that allocates per-record (logging, serialization to JSON, regex matching) should be benchmarked under load — a 10 microsecond regression in onSend() becomes the producer's bottleneck at 100k records/second.

A common production pattern is wrapping schema-registry validation in an interceptor: parse the record value as Avro, look up the writer schema, fail fast on incompatible writes before the record reaches the broker. The fail-fast behaviour saves the team from discovering schema bugs only at consumer-side deserialization, by which point hundreds of bad records have already been replicated across the cluster.

CMK Encryption with KMS

Kafka brokers persist data on disk in plaintext by default. For compliance regimes that mandate encryption-at-rest with customer-managed keys (HIPAA, PCI-DSS, FedRAMP), the broker-level disk encryption that cloud Kafka services advertise is often insufficient — auditors want envelope encryption with a CMK the customer rotates and audits. The standard pattern is client-side payload encryption before the record leaves the producer, using a data encryption key (DEK) wrapped by a CMK in AWS KMS, GCP Cloud KMS, or HashiCorp Vault Transit.

public class KmsEncryptingSerializer implements Serializer<OrderEvent> {
 
    private final KmsClient kms;
    private final String cmkArn;
    private final Cache<String, SecretKey> dekCache; // PT5M TTL, ~10k entries
    private final Serializer<OrderEvent> delegate; // e.g. Avro
 
    @Override
    public byte[] serialize(String topic, OrderEvent event) {
        byte[] plaintext = delegate.serialize(topic, event);
        DataKey dek = generateOrReuseDek(topic);
 
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        byte[] iv = new byte[12];
        SecureRandom.getInstanceStrong().nextBytes(iv);
        cipher.init(Cipher.ENCRYPT_MODE, dek.plaintext(), new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
 
        // envelope: [version=1][iv_len=12][iv][wrapped_dek_len][wrapped_dek][ciphertext]
        return EncryptedEnvelope.pack(iv, dek.wrappedBytes(), ciphertext);
    }
 
    private DataKey generateOrReuseDek(String topic) {
        return dekCache.get(topic, t -> {
            GenerateDataKeyResponse resp = kms.generateDataKey(req -> req
                .keyId(cmkArn)
                .keySpec(DataKeySpec.AES_256)
                .encryptionContext(Map.of("topic", t, "purpose", "kafka-payload")));
            return new DataKey(resp.plaintext(), resp.ciphertextBlob().asByteArray());
        });
    }
}

Cache DEKs aggressively — generating one per record costs a KMS round-trip (~5–20 ms) plus a per-call API charge that becomes the dominant cost at scale. A 5-minute DEK rotation TTL means roughly one KMS call per topic per producer instance per 5 minutes, which for a 10-topic cluster across 50 instances is ~6,000 KMS calls per hour, well within free-tier limits.

Use the encryptionContext parameter (AWS) or additionalAuthenticatedData (GCP) to bind ciphertext to its topic. An attacker who pulls a ciphertext from one topic and tries to decrypt it as if it came from another topic gets a verification failure — even if they have the wrapped DEK. Consumers reverse the process: unpack the envelope, call kms.decrypt() on the wrapped DEK with the same context, then AES-GCM decrypt the payload. Both sides must agree on the envelope format — version it from day one.

Kafka 4.0 KRaft-Mode Operational Changes

Kafka 4.0 (early 2026) is the first version to ship without ZooKeeper support — the long-standing migration to KRaft (Kafka Raft) is now mandatory. From a producer perspective, the wire protocol is unchanged: existing client code keeps working. The operational surface is what shifts. Controller quorum nodes (typically 3 or 5) replace the ZooKeeper ensemble, broker startup is faster (no ZK session establishment), and metadata propagation latency drops by an order of magnitude on large clusters because the metadata log is now a regular Kafka topic that brokers replicate via the same fetch protocol they use for data.

For producers, two operational changes matter. First, dynamic broker config changes propagate via the metadata log, so delivery.timeout.ms can be tuned tighter on producers because broker config changes converge in milliseconds instead of seconds. Second, the controller failover window shrinks from 5–30 seconds (ZK session timeout) to under a second (Raft leader election), so producers see far fewer NOT_CONTROLLER retries during planned controller restarts.

# kraft-broker.yaml — Kafka 4.0 broker process config
process.roles: broker  # or "controller" or "broker,controller" for combined mode
node.id: 11
controller.quorum.voters: 1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
listeners: PLAINTEXT://:9092
inter.broker.listener.name: PLAINTEXT
log.dirs: /var/kafka/data
metadata.log.dir: /var/kafka/metadata
metadata.log.max.record.bytes.between.snapshots: 20971520
metadata.max.idle.interval.ms: 500
num.partitions: 12
default.replication.factor: 3
min.insync.replicas: 2
unclean.leader.election.enable: false
auto.create.topics.enable: false

Run controllers on dedicated nodes for clusters above ~50 brokers — combined mode (broker + controller on the same JVM) is fine for development and small production clusters but creates head-of-line blocking under heavy data load. Size the controller quorum at 3 nodes for clusters up to a few hundred brokers and 5 nodes only if you need to tolerate two simultaneous controller failures, which is rare. The Raft log is small (megabytes per day for a typical cluster) so the controllers themselves can run on modest hardware — the bottleneck is fsync latency on the metadata log device, so use NVMe or equivalent.

The migration path from ZK-based 3.x clusters runs through kafka.metadata.migration.enable=true in dual-write mode, where the controller writes to both ZK and KRaft simultaneously. After verifying KRaft has caught up, flip clients to the KRaft endpoint, then decommission ZK. Plan for a maintenance window — the migration itself is online but the cutover requires a controller restart.

Frequently Asked Questions

Why enable idempotence?

Before idempotence, retries forced a choice between dropping messages (retries=0) or accepting duplicates. Idempotence adds a per-session sequence number that lets the broker deduplicate retries. The cost is 5 bytes per record — always enable it.

When do I use transactions?

Transactions exist for atomic multi-partition writes and exactly-once consume-process-produce pipelines. They cut throughput by roughly 20%. For single-partition ordering, idempotence plus consistent message keying is enough. [Kafka producer config]

How do I measure whether tuning is working?

Watch record-queue-time-avg (time spent in the producer buffer) and the batch fill ratio (record-size-avg / batch-size-avg). High queue time means linger.ms is too aggressive; a fill ratio below 0.5 means linger.ms is too low and you should raise it.

Keep Reading

BackendBytes Engineering Team
BackendBytes Engineering Team

Engineering Team

Backend engineers writing production-grade references for Go, Java, and distributed systems.

Read Next