Kafka Producer Tuning Cheat Sheet: Throughput, Latency & Durability
Key Takeaways
- acks=1 silently loses data on broker failover; only acks=all + min.insync.replicas=2 survives a broker failure without data loss
- linger.ms (not batch.size) is the biggest throughput lever: raising it from 0 to 20 ms can 4x throughput on small records by batching before sending
- Idempotence (enable.idempotence=true) adds 5 bytes per record and eliminates duplicates on retry with only ~3-5% throughput cost, so always enable it
- One KafkaProducer per application (it is thread-safe); creating one per message triggers metadata fetch storms and exhausts connections, so reuse it
The classic Kafka producer production data-loss incident: a service ships with acks=1, linger.ms=0, retries=0, and enable.idempotence=false, the default-derived config that looks fine in staging. A broker fails over during peak traffic. Roughly 30 seconds of in-flight messages vanish silently because acks=1 confirms only the leader's write, not replication to the followers. We debugged this exact failure mode on multiple production teams.
That config predates Kafka 3.0, which flipped the Java producer to safe-by-default — enable.idempotence=true and acks=all (KIP-679). On a current cluster you reach this failure mode through legacy configs, explicit overrides (someone set acks=1 chasing latency), or non-Java clients that still default weaker — not the out-of-box Java producer. Verify every value below rather than trusting an inherited config.
Kafka Producer Tuning Essentials
Production Kafka producers lose data or throughput when a config ships without anyone understanding its tradeoffs. acks=1 silently drops messages on failover. linger.ms=0 tanks throughput. retries=0 fails the send on the first network blip.
Start with: acks=all, enable.idempotence=true, batch.size=65536 (a recommended increase over the 16384 default), linger.ms=5 (default is 0), compression.type=lz4. Idempotence eliminates duplicates on retry with minimal overhead: Confluent's published benchmarks put the throughput cost in the low single digits on typical workloads. Your numbers will vary with record size and acks setting; measure before and after when it matters. Document every deviation from these values.
- Survive broker failover: acks=all + broker min.insync.replicas=2 (3-replica cluster)
- Biggest throughput lever: linger.ms; raise it from 0 to 5–20ms before touching batch.size
- One KafkaProducer per app; it is thread-safe, so reuse it and never create one per message
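The starting values above can be collected in one place so that every deviation is visible in code review. The following is a minimal sketch using only `java.util.Properties`; the class name `BaselineProducerConfig` and the `localhost:9092` address are illustrative assumptions, and the resulting object would be passed to a `KafkaProducer` constructor in a real application.

```java
import java.util.Properties;

/** Collects the starting-point producer settings as plain string properties. */
final class BaselineProducerConfig {

    static Properties baseline(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("acks", "all");                // wait for the in-sync replicas
        props.setProperty("enable.idempotence", "true"); // broker de-duplicates retries
        props.setProperty("batch.size", "65536");        // 64 KiB, up from the 16384 default
        props.setProperty("linger.ms", "5");             // wait up to 5 ms to fill a batch
        props.setProperty("compression.type", "lz4");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address, not a value from this article.
        baseline("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Serializer settings are left out on purpose because they depend on the record types of the application.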
graph LR
App["Application<br/>producer.send()"] --> RA["RecordAccumulator<br/>buffer per partition"]
RA -->|"batch.size or<br/>linger.ms expires"| ST["Sender Thread"]
ST -->|"max.in.flight<br/>batches"| Broker["Kafka Broker"]
Broker -->|"acks=all"| ST
ST -->|"callback"| App
The Quick Start
| Goal | acks | batch.size | linger.ms | compression | max.in.flight | Trade-off |
|---|---|---|---|---|---|---|
| Durability first (default) | all | 65536 (64K) | 5 | lz4 | 5 | Loses ~5ms latency, gains no data loss |
| Maximum throughput | all | 131072 (128K) | 20 | zstd | 5 | Adds ~20ms latency, 15% compression gain |
| Latency-critical (<10ms p99) | all | 16384 | 0 | none | 1 | Drops to single partition throughput, keeps durability |
| Metrics you can lose | 0 | 65536 | 5 | lz4 | 5 | Fastest; data loss on any broker crash |
Durability: Choosing acks
[Kafka producer config]
acks is the kill switch. acks=0 loses data on restart. acks=1 loses data on leader failover — it looks safe in tests and fails silently in production. Use acks=all everywhere except metrics.
Pair it with broker-side min.insync.replicas=2 on a 3-replica cluster. This is the only combination that survives a broker failure without data loss.
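The arithmetic behind that pairing is short enough to write down. This sketch assumes unclean leader election is disabled and that the producer uses acks=all; the class and method names are hypothetical.

```java
/** Failure-tolerance arithmetic for acks=all with a given min.insync.replicas. */
final class IsrMath {

    /** Replicas that can be down while acks=all writes are still accepted. */
    static int toleratedFailuresForWrites(int replicationFactor, int minInsyncReplicas) {
        return replicationFactor - minInsyncReplicas;
    }

    /**
     * Replicas that can be lost without losing an acknowledged record.
     * An acknowledged record exists on at least minInsyncReplicas replicas.
     */
    static int toleratedFailuresWithoutLoss(int minInsyncReplicas) {
        return minInsyncReplicas - 1;
    }

    public static void main(String[] args) {
        System.out.println("RF=3, minISR=2 -> writes survive "
                + toleratedFailuresForWrites(3, 2) + " failure(s), acked data survives "
                + toleratedFailuresWithoutLoss(2) + " failure(s)");
        System.out.println("RF=3, minISR=1 -> acked data survives "
                + toleratedFailuresWithoutLoss(1) + " failure(s)");
    }
}
```

With min.insync.replicas=1, an acknowledged record may exist only on the leader, which is why that setting gives no protection against losing a single broker.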
Throughput: Batching + Linger
[Apache Kafka Docs] Three knobs move throughput: batch.size, linger.ms, and compression.type. batch.size is a ceiling per partition (default 16384 bytes). linger.ms is the biggest lever most teams miss: raising it from 0 to 5–20ms can 4x throughput on small records. Use lz4 (good ratio/CPU) unless you're network-bound; zstd achieves better compression ratios but costs more CPU.
Measure the batch fill ratio as batch-size-avg divided by the configured batch.size (dividing batch-size-avg by record-size-avg gives the average records per batch instead). A ratio below 0.5 means batches leave mostly empty, so raise linger.ms.
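As a small worked example, the check can be expressed as a function over two numbers: the `batch-size-avg` producer metric and the configured `batch.size`. This is a sketch that treats fill ratio as average sent batch bytes over configured batch bytes; the class name and the sample metric value are assumptions, and with compression enabled the metric reflects compressed bytes, so treat the result as approximate.

```java
/** Batch fill ratio: how much of the configured batch.size the average batch uses. */
final class BatchFill {

    static double fillRatio(double batchSizeAvgBytes, int configuredBatchSizeBytes) {
        if (configuredBatchSizeBytes <= 0) {
            throw new IllegalArgumentException("batch.size must be positive");
        }
        return batchSizeAvgBytes / configuredBatchSizeBytes;
    }

    /** Applies the 0.5 threshold from the text. */
    static boolean shouldRaiseLinger(double fillRatio) {
        return fillRatio < 0.5;
    }

    public static void main(String[] args) {
        // 12_000 is an illustrative batch-size-avg reading, not a measured value.
        double ratio = fillRatio(12_000, 65_536);
        System.out.printf("fill ratio %.2f, raise linger.ms: %b%n", ratio, shouldRaiseLinger(ratio));
    }
}
```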
Buffer + sender thread lifecycle
send() doesn't talk to Kafka directly. It appends to an in-memory RecordAccumulator where records group into per-partition batches. A background Sender thread drains batches when they fill OR when linger.ms expires.
sequenceDiagram
participant App
participant Acc as Accumulator
participant Sender
participant Broker
App->>Acc: send(record)
App->>Acc: send(record)
App->>Acc: send(record)
Note over Acc: batch fills OR<br/>linger.ms expires
Acc->>Sender: flush batch
Sender->>Broker: Produce (acks=all)
Broker-->>Sender: ack (ISR copied)
Sender-->>App: onCompletion
If buffer.memory fills before the Sender can drain it, send() blocks for up to max.block.ms and then throws a TimeoutException. This is the hidden backpressure path: a misconfigured linger.ms plus a slow broker can stall the application thread.
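To get a feel for how quickly that backpressure arrives, the buffer size can be divided by the incoming byte rate. The sketch below assumes the broker stops acknowledging entirely and ignores compression and per-batch overhead; the 1 KiB record size is an illustrative assumption.

```java
/** Rough estimate of how long buffer.memory can absorb a complete broker stall. */
final class BufferHeadroom {

    static double secondsUntilFull(long bufferMemoryBytes, double recordsPerSecond, double avgRecordBytes) {
        return bufferMemoryBytes / (recordsPerSecond * avgRecordBytes);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 10k records/s at an assumed 1 KiB each
        System.out.printf("32 MiB:  %.1f s%n", secondsUntilFull(32 * mib, 10_000, 1024));
        System.out.printf("256 MiB: %.1f s%n", secondsUntilFull(256 * mib, 10_000, 1024));
    }
}
```

Under these assumptions a 32 MiB buffer gives only a few seconds of headroom, so a stall longer than that reaches the application thread through send().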
Idempotence & Ordering
[Kafka producer config] Enable enable.idempotence=true unconditionally. It adds 5 bytes per record and eliminates duplicates on retry. With idempotence, max.in.flight.requests.per.connection can stay at 5 and ordering is still preserved; without it, drop to 1 for ordering.
Order is guaranteed within a partition. Key records by entity ID for strict per-entity ordering:
producer.send(new ProducerRecord<>("orders", order.customerId(), order));
Don't use transactions for ordering: they exist for atomic multi-partition writes and exactly-once semantics, and they cut throughput by roughly 20%. [Kafka producer config]
Tune by symptom
[Kafka producer config] When a Kafka producer is misbehaving, the question is "what is the symptom?", not "which config knob shall I tweak?" Route by what you measured:
graph TD
Sym{What is<br/>broken?} -->|Data loss on failover| Loss[acks=all<br/>+ min.insync.replicas=2<br/>+ enable.idempotence=true]
Sym -->|Low throughput<br/>under 10k msg/s| Tp[linger.ms 5 to 20<br/>+ batch.size 65536<br/>+ compression.type=lz4]
Sym -->|High p99 latency| Lat[linger.ms back to 0 to 1<br/>+ acks=1 if data-loss tolerable<br/>+ smaller batch.size]
Sym -->|Duplicates downstream| Dup[enable.idempotence=true<br/>+ transactional.id for exactly-once]
Sym -->|Out-of-order messages| Ord[max.in.flight.requests.per.connection=1<br/>or enable.idempotence=true<br/>which keeps order with five in-flight]
Sym -->|Producer blocks on send| Buf[buffer.memory 256 MiB or higher<br/>+ check broker backpressure]
Sym -->|Network-blip outages| Net[retries=MAX_INT<br/>+ delivery.timeout.ms 120 seconds<br/>+ enable.idempotence=true]
style Loss fill:#fdd
style Dup fill:#ffd
style Ord fill:#ffd
style Buf fill:#fdd
style Net fill:#dfd
style Tp fill:#dfd
style Lat fill:#dfd
The diagram is the entire tuning cheat-sheet in one picture [Kafka producer config]: classify the symptom, pick the knobs that apply, never tune in isolation.
Common Gotchas
- acks=1 + min.insync.replicas=1: Loses data on failover silently. Use acks=all with min.insync.replicas=2.
- retries=0: Drops records on network blips. Use enable.idempotence=true + retries=Integer.MAX_VALUE.
- 32 MiB buffer on high-throughput producers: Fills at 10k msg/s, then blocks send(). Raise to 256 MiB+.
- linger.ms=0 everywhere: Default is 0 for legacy reasons. Set to 5–20ms on non-latency-critical producers.
- Per-message KafkaProducer: The client is thread-safe. Create one per app, never one per request; per-request creation causes metadata fetch storms.
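Several of these gotchas can be caught mechanically before deployment. The sketch below is a hypothetical lint over a `java.util.Properties` object; it is not part of the Kafka client, and it assumes that an unset acks means the Java client's 3.0+ default of all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-deploy check that flags the risky producer settings listed above. */
final class ProducerConfigLint {

    static List<String> findings(Properties props) {
        List<String> out = new ArrayList<>();
        // Assumption: an unset acks falls back to "all" (Java client, Kafka 3.0+).
        String acks = props.getProperty("acks", "all");
        if (!acks.equals("all") && !acks.equals("-1")) {
            out.add("acks=" + acks + " can lose acknowledged records on leader failover");
        }
        if ("0".equals(props.getProperty("retries"))) {
            out.add("retries=0 drops records on transient network errors");
        }
        if ("false".equals(props.getProperty("enable.idempotence"))) {
            out.add("enable.idempotence=false allows duplicates on retry");
        }
        return out;
    }

    public static void main(String[] args) {
        Properties legacy = new Properties();
        legacy.setProperty("acks", "1");
        legacy.setProperty("retries", "0");
        legacy.setProperty("enable.idempotence", "false");
        findings(legacy).forEach(System.out::println);
    }
}
```

A check like this fits in a unit test or a startup hook, where it fails fast on an inherited config.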
Producer Metrics Worth Alerting On
Five JMX/MBean metrics that make production producer issues visible before they become incidents [Apache Kafka Docs]:
| Metric | Threshold | What it means | Fix |
|---|---|---|---|
| record-error-rate | > 0.001 (0.1%) | Send failures (after retries): broker rejecting or unrecoverable | Check broker logs; verify ACL; raise delivery.timeout.ms |
| record-queue-time-avg | > linger.ms × 2 | Records waiting in producer buffer too long | batch.size too small OR broker under-acking; profile broker |
| record-send-rate vs record-error-rate | error / send > 0.005 | Cluster instability or topic mis-config | Check metadata-fetch-rate for storms; verify min.insync.replicas |
| request-latency-avg | > 100 ms sustained | Network or broker slowness | tcpdump between producer and broker; check broker GC pauses |
| buffer-available-bytes / buffer-total-bytes | < 0.10 (10%) | Buffer about to fill; send() will block | Raise buffer.memory; check downstream broker backpressure |
Wire these into Prometheus via the JMX exporter for JVM producers (the same values are available in-process from KafkaProducer.metrics()); for Go producers, export the equivalent metrics that your client library exposes. Burn-rate alerts on record-error-rate and saturation alerts on buffer-available-bytes catch the failure modes before customers see lost messages.
Sizing Worker Threads in Java Apps
The producer is non-blocking by default — send() returns a Future immediately and the I/O thread handles delivery. Application threads should NOT wait on the future synchronously; use a callback or accumulate batches:
// Anti-pattern: blocks the application thread on every message
producer.send(record).get(); // synchronous; defeats batching
// Production pattern: callback + structured error handling
producer.send(record, (metadata, exception) -> {
    if (exception != null) {
        log.error("send failed: topic={}, partition={}", record.topic(), record.partition(), exception);
        deadLetterQueue.offer(record); // application-side fallback
    }
});
For Go, the equivalent in sarama is the AsyncProducer with a goroutine draining its Successes and Errors channels; franz-go takes a per-record callback instead. Always handle errors where the client reports them: silently dropping send failures is the highest-cost-to-debug bug in the catalog.
Production Tuning Checklist
Apply in this order — each step depends on the prior:
- Durability first: acks=all, min.insync.replicas=2 on a 3-replica topic. Confirm with kafka-configs.sh --describe.
- Idempotence: enable.idempotence=true. Free in Kafka 3.x; do this before any other tuning.
- Throughput second: linger.ms=10, batch.size=65536, compression.type=lz4. Measure record-send-rate before/after.
- Buffer + delivery timeouts: buffer.memory=268435456 (256 MiB), delivery.timeout.ms=120000. Prevents send() blocking under broker stalls.
- Per-tenant quotas (multi-tenant clusters): Configure broker-side quotas via kafka-configs.sh --add-config 'producer_byte_rate=...'. Producer-side enforcement is fragile.
- Observability: Wire the 5 metrics above to Prometheus + alerting. Verify with jconsole or an equivalent for non-JVM producers.
- Schema evolution: Use Avro / Protobuf with a schema registry from day one. Adding a tags: List<string> field two years in is a migration if you started with raw JSON.
- Dead-letter topics: For consumer-side resilience, configure DLQ topics with retention.ms=2592000000 (30 days). Producer-side, route the application's send() callback failures to a local DLQ so retries are bounded.
- Partition key strategy: Pick a high-cardinality key (order_id, not customer_id) to avoid hot partitions. Re-partitioning a topic at scale is non-trivial.
- Cluster topology check: 3 brokers minimum for min.insync.replicas=2 to survive 1 broker loss. Cross-AZ replication for cloud deployments.
Idempotent Producers and Transactional Writes
Idempotence and transactions are two different durability layers that often get conflated. Idempotence eliminates duplicates inside a single producer session — the broker assigns a Producer ID (PID) at startup and tracks the highest sequence number per partition, rejecting any retry that would produce a duplicate. Transactions add atomicity across multiple partitions and sessions: either all records in the transaction commit, or none do, and a transactional consumer reading with isolation.level=read_committed will skip aborted batches entirely.
The classic use case for transactions is the consume-process-produce pipeline — read a record from topic A, transform it, write to topic B, and commit the consumer offset and the produced record atomically. Without transactions, a crash between the produce and the offset commit leaves the system in an inconsistent state: either the downstream record is lost (if the offset commits first) or duplicated on restart (if the produce commits first).
// Fragment: consumer, running, transform() and currentOffsets() are defined elsewhere in the application.
Properties props = new Properties();
props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
props.put("transactional.id", "order-processor-instance-7");
props.put("enable.idempotence", "true");
props.put("acks", "all");
props.put("retries", Integer.MAX_VALUE);
props.put("max.in.flight.requests.per.connection", "5");
props.put("delivery.timeout.ms", "120000");
props.put("transaction.timeout.ms", "60000");
KafkaProducer<String, OrderEvent> producer = new KafkaProducer<>(props);
producer.initTransactions(); // fences any prior producer with the same transactional.id
while (running) {
    ConsumerRecords<String, RawOrder> batch = consumer.poll(Duration.ofMillis(500));
    if (batch.isEmpty()) continue;
    producer.beginTransaction();
    try {
        for (ConsumerRecord<String, RawOrder> in : batch) {
            OrderEvent out = transform(in.value());
            producer.send(new ProducerRecord<>("orders.normalized", out.orderId(), out));
        }
        producer.sendOffsetsToTransaction(
            currentOffsets(batch),
            consumer.groupMetadata()
        );
        producer.commitTransaction();
    } catch (ProducerFencedException | OutOfOrderSequenceException fatal) {
        producer.close();
        throw fatal; // a newer instance has taken over; do not restart this one
    } catch (KafkaException e) {
        producer.abortTransaction();
        // poll() already advanced the consumer's in-memory position, so rewind each
        // partition to the first offset of the aborted batch before polling again.
        for (TopicPartition tp : batch.partitions()) {
            consumer.seek(tp, batch.records(tp).get(0).offset());
        }
    }
}
The transactional.id must be stable across restarts of the same logical instance. If you derive it from a hostname plus a stable ordinal (StatefulSet pod index, ECS task slot), a restarting pod inherits the same ID and fences any zombie predecessor. Random UUIDs defeat the fencing guarantee: a partitioned-but-still-running zombie producer can keep writing under its old ID until its transaction.timeout.ms elapses.
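One way to derive such an ID is from the numeric suffix of a StatefulSet-style pod name. The sketch below is an assumption about naming, not a Kafka API: it expects names shaped like `order-processor-7` and rejects names with a random suffix, since those would silently break fencing.

```java
/** Derives a stable transactional.id from a pod name that ends in a numeric ordinal. */
final class TransactionalIds {

    static String fromPodName(String appName, String podName) {
        int dash = podName.lastIndexOf('-');
        String suffix = dash < 0 ? "" : podName.substring(dash + 1);
        boolean numeric = !suffix.isEmpty() && suffix.chars().allMatch(Character::isDigit);
        if (!numeric) {
            // A random suffix (for example from a Deployment) changes on restart and defeats fencing.
            throw new IllegalArgumentException("pod name has no stable ordinal: " + podName);
        }
        return appName + "-instance-" + suffix;
    }

    public static void main(String[] args) {
        System.out.println(fromPodName("order-processor", "order-processor-7"));
    }
}
```

Failing at startup on an unstable name is deliberate: a producer that starts with an unpredictable ID looks healthy but provides no zombie fencing.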
Throughput cost of transactions is roughly 15–20% on small records and closer to 5% on large records, because the per-batch overhead amortizes across record size. The bigger latency hit is commitTransaction(), which blocks until the transaction coordinator persists the commit marker — typically 5–15 ms. Batch many records into a single transaction (a poll loop's worth) rather than one transaction per record. [Kafka producer config]
Partition Assignment Strategies
Producer partitioning controls where records land; consumer assignment controls who reads them. Both shape ordering, hot-spot risk, and rebalance behaviour. The default producer partitioner since Kafka 2.4 is the sticky partitioner, which writes to a single partition until the batch fills or linger.ms expires, then rotates. Sticky partitioning improves batching efficiency on keyless records by 30–50% versus the legacy round-robin partitioner without sacrificing distribution at the topic level. [Kafka producer config]
When records have keys, the default partitioner hashes the serialized key bytes with murmur2 and takes the result modulo the partition count. This is deterministic: the same key always lands on the same partition for a fixed partition count. Adding partitions later changes the modulus, so existing keys can start mapping to different partitions, which is why re-partitioning at scale is non-trivial.
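The effect of changing the partition count is easy to demonstrate with any hash-then-modulo scheme. The sketch below uses `String.hashCode()` as a stand-in for Kafka's murmur2 over the serialized key, so the partition numbers will not match a real cluster; only the remapping behaviour carries over.

```java
import java.util.ArrayList;
import java.util.List;

/** Hash-then-modulo partitioning with a stand-in hash (not Kafka's murmur2). */
final class KeyPartitioning {

    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so the modulo result is never negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    /** Keys that land on a different partition after the partition count changes. */
    static List<String> movedKeys(List<String> keys, int before, int after) {
        List<String> moved = new ArrayList<>();
        for (String key : keys) {
            if (partitionFor(key, before) != partitionFor(key, after)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (char c = 'a'; c <= 'z'; c++) keys.add(String.valueOf(c));
        System.out.println("moved when growing 6 -> 8 partitions: " + movedKeys(keys, 6, 8));
    }
}
```

For the 26 single-letter keys in this toy example, 18 move when the topic grows from 6 to 8 partitions, so per-key ordering across the change is lost for those keys.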
For consumers, the assignment strategy is set via partition.assignment.strategy:
# Cooperative sticky: available since Kafka 2.4 and the recommended choice
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor
# Range: assigns contiguous partition ranges per topic; risks skew with multiple topics
# partition.assignment.strategy=org.apache.kafka.clients.consumer.RangeAssignor
# Round-robin: distributes evenly but reshuffles all partitions on every rebalance
# partition.assignment.strategy=org.apache.kafka.clients.consumer.RoundRobinAssignor
# Sticky (legacy, eager): minimises movement but stops the world during rebalance
# partition.assignment.strategy=org.apache.kafka.clients.consumer.StickyAssignor
session.timeout.ms=45000
heartbeat.interval.ms=3000
max.poll.interval.ms=300000
# Static membership: survives short pod restarts without a rebalance.
# (Properties files do not support trailing comments, so this stays on its own line.)
group.instance.id=consumer-pod-2
CooperativeStickyAssignor does incremental rebalances: only the partitions that need to move actually pause. With the eager protocols, every consumer revokes every partition during a rebalance, so a 50-partition consumer group sees a complete consumption stall whenever a single pod restarts. Since Kafka 3.0 the default strategy list is RangeAssignor followed by CooperativeStickyAssignor: Range is still the assignor in use, but the group can move to cooperative rebalancing with a single rolling restart that removes Range from the list. Groups pinned to an eager-only assignor need two rolling restarts: the first adds CooperativeStickyAssignor alongside the existing assignor, the second removes the eager one.
For producers, custom partitioners are rare but useful for tenant-isolation requirements (route customer X to a dedicated partition subset) or hot-key splitting (route the top N hot keys round-robin and everything else by hash). Implement org.apache.kafka.clients.producer.Partitioner and set partitioner.class in producer config — keep the partition() method O(1) and free of locks because it runs on the application thread.
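The hot-key splitting idea can be sketched without the Kafka interfaces. The class below only shows the partition-selection logic that a `Partitioner.partition()` implementation might delegate to; the class name, the hot-key set, and the use of `String.hashCode()` in place of murmur2 are all assumptions. Note that spreading a hot key across partitions gives up per-key ordering for that key.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

/** Partition selection that round-robins known hot keys and hashes everything else. */
final class HotKeySplitter {
    private final Set<String> hotKeys;
    private final AtomicInteger counter = new AtomicInteger();

    HotKeySplitter(Set<String> hotKeys) {
        this.hotKeys = Set.copyOf(hotKeys); // immutable snapshot, safe to read without locks
    }

    int choose(String key, int numPartitions) {
        if (hotKeys.contains(key)) {
            // Lock-free round-robin; the mask keeps the value non-negative after overflow.
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // Stand-in hash; a real partitioner would hash the serialized key bytes.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        HotKeySplitter splitter = new HotKeySplitter(Set.of("tenant-big"));
        for (int i = 0; i < 4; i++) {
            System.out.println("tenant-big -> " + splitter.choose("tenant-big", 4));
        }
        System.out.println("tenant-small -> " + splitter.choose("tenant-small", 4));
    }
}
```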
ProducerInterceptor for Auditing
Producer interceptors are the cleanest place to attach cross-cutting concerns (audit logging, schema enforcement, PII redaction, header injection) without polluting business code. The ProducerInterceptor interface has two hot-path methods: onSend() runs synchronously on the application thread before the record enters the accumulator, and onAcknowledgement() runs on the I/O thread after the broker acks (or fails). Both must be fast and non-blocking; a slow onSend() stalls the calling application thread, and a slow onAcknowledgement() stalls that producer's I/O thread.
public class AuditingInterceptor implements ProducerInterceptor<String, byte[]> {
    private static final Logger AUDIT = LoggerFactory.getLogger("kafka.audit");
    private final AtomicLong sent = new AtomicLong();
    private final AtomicLong failed = new AtomicLong();
    private byte[] producerHost = "unknown".getBytes(StandardCharsets.UTF_8);
    @Override
    public void configure(Map<String, ?> configs) {
        // Resolve the hostname once; HOSTNAME may be unset outside containers.
        String host = System.getenv("HOSTNAME");
        if (host != null) producerHost = host.getBytes(StandardCharsets.UTF_8);
    }
    @Override
    public ProducerRecord<String, byte[]> onSend(ProducerRecord<String, byte[]> record) {
        // currentTraceId() is an application-provided helper, not shown here.
        record.headers()
            .add("audit.producer", producerHost)
            .add("audit.timestamp", Long.toString(System.currentTimeMillis()).getBytes(StandardCharsets.UTF_8))
            .add("audit.trace-id", currentTraceId().getBytes(StandardCharsets.UTF_8));
        return record;
    }
    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        if (exception == null) {
            sent.incrementAndGet();
        } else {
            failed.incrementAndGet();
            AUDIT.warn("send failed: topic={} partition={} cause={}",
                metadata != null ? metadata.topic() : "unknown",
                metadata != null ? metadata.partition() : -1,
                exception.getClass().getSimpleName());
        }
    }
    @Override public void close() { /* flush metrics */ }
}
Wire interceptors via interceptor.classes, comma-separated for multiple stages. They run in declaration order for both onSend() and onAcknowledgement(), so put validation first and metrics last. Anything that allocates per record (logging, serialization to JSON, regex matching) should be benchmarked under load: a 10 microsecond regression in onSend() becomes the producer's bottleneck at 100k records/second.
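The 10 microsecond figure is worth working through, because it shows why per-record cost matters so much. This sketch just multiplies per-record time by the record rate to get the fraction of one thread consumed; the class name is hypothetical.

```java
/** Fraction of a single thread consumed by fixed per-record work at a given rate. */
final class InterceptorBudget {

    static double threadUtilization(double microsPerRecord, double recordsPerSecond) {
        return microsPerRecord * recordsPerSecond / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.printf("10 us at 100k rec/s: %.0f%% of one thread%n",
                100 * threadUtilization(10, 100_000));
        System.out.printf(" 2 us at 100k rec/s: %.0f%% of one thread%n",
                100 * threadUtilization(2, 100_000));
    }
}
```

At 10 microseconds and 100k records per second the result is 1.0: an entire thread spent inside the interceptor, leaving nothing for the rest of the send path if all sends come from one thread.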
A common production pattern is wrapping schema-registry validation in an interceptor: parse the record value as Avro, look up the writer schema, fail fast on incompatible writes before the record reaches the broker. The fail-fast behaviour saves the team from discovering schema bugs only at consumer-side deserialization, by which point hundreds of bad records have already been replicated across the cluster.
CMK Encryption with KMS
Kafka brokers persist data on disk in plaintext by default. For compliance regimes that mandate encryption-at-rest with customer-managed keys (HIPAA, PCI-DSS, FedRAMP), the broker-level disk encryption that cloud Kafka services advertise is often insufficient — auditors want envelope encryption with a CMK the customer rotates and audits. The standard pattern is client-side payload encryption before the record leaves the producer, using a data encryption key (DEK) wrapped by a CMK in AWS KMS, GCP Cloud KMS, or HashiCorp Vault Transit.
public class KmsEncryptingSerializer implements Serializer<OrderEvent> {
    // Shared generator; SecureRandom.getInstanceStrong() can block on entropy per call.
    private static final SecureRandom RANDOM = new SecureRandom();
    private final KmsClient kms;
    private final String cmkArn;
    private final Cache<String, DataKey> dekCache; // PT5M TTL, ~10k entries
    private final Serializer<OrderEvent> delegate; // e.g. Avro
    // Constructor omitted; DataKey and EncryptedEnvelope are application-defined helpers.
    @Override
    public byte[] serialize(String topic, OrderEvent event) {
        byte[] plaintext = delegate.serialize(topic, event);
        DataKey dek = generateOrReuseDek(topic);
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            byte[] iv = new byte[12]; // fresh 96-bit IV for every record
            RANDOM.nextBytes(iv);
            cipher.init(Cipher.ENCRYPT_MODE, dek.plaintext(), new GCMParameterSpec(128, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            // envelope: [version=1][iv_len=12][iv][wrapped_dek_len][wrapped_dek][ciphertext]
            return EncryptedEnvelope.pack(iv, dek.wrappedBytes(), ciphertext);
        } catch (GeneralSecurityException e) {
            // serialize() cannot throw checked exceptions, so wrap them.
            throw new SerializationException("payload encryption failed for topic " + topic, e);
        }
    }
    private DataKey generateOrReuseDek(String topic) {
        return dekCache.get(topic, t -> {
            GenerateDataKeyResponse resp = kms.generateDataKey(req -> req
                .keyId(cmkArn)
                .keySpec(DataKeySpec.AES_256)
                .encryptionContext(Map.of("topic", t, "purpose", "kafka-payload")));
            return new DataKey(
                new SecretKeySpec(resp.plaintext().asByteArray(), "AES"),
                resp.ciphertextBlob().asByteArray());
        });
    }
}
Cache DEKs aggressively: generating one per record costs a KMS round-trip (~5–20 ms) plus a per-call API charge that becomes the dominant cost at scale. A 5-minute DEK rotation TTL means roughly one KMS call per topic per producer instance per 5 minutes, which for a 10-topic cluster across 50 instances is ~6,000 KMS calls per hour, or about 4.3 million a month. That is far beyond a small monthly free-tier allowance, but still orders of magnitude cheaper than one call per record.
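The call-volume estimate is simple multiplication, shown here as a sketch so the inputs can be swapped for a real deployment. It assumes one DEK per topic per producer instance and a 30-day month; the class name is hypothetical.

```java
/** Estimates KMS GenerateDataKey call volume for a per-topic DEK cache. */
final class KmsCallEstimate {

    static long callsPerHour(int topics, int producerInstances, int dekTtlMinutes) {
        long refreshesPerHour = (long) Math.ceil(60.0 / dekTtlMinutes);
        return (long) topics * producerInstances * refreshesPerHour;
    }

    static long callsPerMonth(long callsPerHour) {
        return callsPerHour * 24 * 30; // assumes a 30-day month
    }

    public static void main(String[] args) {
        long perHour = callsPerHour(10, 50, 5);
        System.out.println("per hour:  " + perHour);
        System.out.println("per month: " + callsPerMonth(perHour));
    }
}
```

Comparing the monthly figure against the provider's current pricing and quotas is a better guide than a per-hour number alone.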
Use the encryptionContext parameter (AWS) or additionalAuthenticatedData (GCP) to bind ciphertext to its topic. An attacker who pulls a ciphertext from one topic and tries to decrypt it as if it came from another topic gets a verification failure — even if they have the wrapped DEK. Consumers reverse the process: unpack the envelope, call kms.decrypt() on the wrapped DEK with the same context, then AES-GCM decrypt the payload. Both sides must agree on the envelope format — version it from day one.
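A versioned envelope can be as small as a few length-prefixed fields. The sketch below is one possible layout using only `java.nio.ByteBuffer`; the field widths (one byte for the IV length, two bytes for the wrapped-DEK length) are assumptions, since the layout above does not specify them, and `EnvelopeCodec` is a hypothetical name.

```java
import java.nio.ByteBuffer;

/** Versioned envelope: [version:1][iv_len:1][iv][wrapped_dek_len:2][wrapped_dek][ciphertext]. */
final class EnvelopeCodec {
    static final byte VERSION = 1;

    static final class Envelope {
        final byte[] iv;
        final byte[] wrappedDek;
        final byte[] ciphertext;

        Envelope(byte[] iv, byte[] wrappedDek, byte[] ciphertext) {
            this.iv = iv;
            this.wrappedDek = wrappedDek;
            this.ciphertext = ciphertext;
        }
    }

    static byte[] pack(byte[] iv, byte[] wrappedDek, byte[] ciphertext) {
        if (iv.length > 255 || wrappedDek.length > 65_535) {
            throw new IllegalArgumentException("field too long for its length prefix");
        }
        ByteBuffer buf = ByteBuffer.allocate(1 + 1 + iv.length + 2 + wrappedDek.length + ciphertext.length);
        buf.put(VERSION).put((byte) iv.length).put(iv)
           .putShort((short) wrappedDek.length).put(wrappedDek)
           .put(ciphertext);
        return buf.array();
    }

    static Envelope unpack(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte version = buf.get();
        if (version != VERSION) {
            throw new IllegalArgumentException("unsupported envelope version " + version);
        }
        byte[] iv = new byte[buf.get() & 0xff];           // unsigned 1-byte length
        buf.get(iv);
        byte[] wrappedDek = new byte[buf.getShort() & 0xffff]; // unsigned 2-byte length
        buf.get(wrappedDek);
        byte[] ciphertext = new byte[buf.remaining()];    // everything left is payload
        buf.get(ciphertext);
        return new Envelope(iv, wrappedDek, ciphertext);
    }
}
```

Checking the version byte first means a future layout change can be introduced by adding a second branch on the consumer side before any producer starts writing it.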
Kafka 4.0 KRaft-Mode Operational Changes
Kafka 4.0 (released March 2025) is the first version to ship without ZooKeeper support, so the long-standing migration to KRaft (Kafka Raft) is now mandatory. From a producer perspective, the wire protocol is unchanged: existing client code keeps working. The operational surface is what shifts. Controller quorum nodes (typically 3 or 5) replace the ZooKeeper ensemble, broker startup is faster (no ZK session establishment), and metadata propagation latency drops by an order of magnitude on large clusters because the metadata log is now a regular Kafka topic that brokers replicate via the same fetch protocol they use for data.
For producers, two operational changes matter. First, dynamic broker config changes propagate via the metadata log, so delivery.timeout.ms can be tuned tighter on producers because broker config changes converge in milliseconds instead of seconds. Second, the controller failover window shrinks from 5–30 seconds (ZK session timeout) to under a second (Raft leader election), so producers see far fewer NOT_CONTROLLER retries during planned controller restarts.
# kraft-broker.properties: Kafka 4.0 broker process config (Kafka reads Java properties, not YAML)
# process.roles may be "broker", "controller", or "broker,controller" for combined mode
process.roles=broker
node.id=11
controller.quorum.voters=1@controller-1:9093,2@controller-2:9093,3@controller-3:9093
# Required in KRaft mode; the listener name CONTROLLER is an assumed convention here
controller.listener.names=CONTROLLER
listener.security.protocol.map=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
listeners=PLAINTEXT://:9092
inter.broker.listener.name=PLAINTEXT
log.dirs=/var/kafka/data
metadata.log.dir=/var/kafka/metadata
metadata.log.max.record.bytes.between.snapshots=20971520
metadata.max.idle.interval.ms=500
num.partitions=12
default.replication.factor=3
min.insync.replicas=2
unclean.leader.election.enable=false
auto.create.topics.enable=false
Run controllers on dedicated nodes for clusters above ~50 brokers. Combined mode (broker + controller on the same JVM) is fine for development and small production clusters but creates head-of-line blocking under heavy data load. Size the controller quorum at 3 nodes for clusters up to a few hundred brokers and 5 nodes only if you need to tolerate two simultaneous controller failures, which is rare. The Raft log is small (megabytes per day for a typical cluster), so the controllers themselves can run on modest hardware; the bottleneck is fsync latency on the metadata log device, so use NVMe or equivalent.
The migration path from ZK-based clusters runs through zookeeper.metadata.migration.enable=true on a 3.x release, because 4.0 itself cannot run against ZooKeeper. In this dual-write mode the KRaft controller writes metadata to both ZK and KRaft. After verifying that KRaft has caught up, restart the brokers in KRaft mode, finalize the migration on the controllers, then decommission ZK. Producers and consumers keep using the same bootstrap.servers throughout. Plan for a maintenance window: the migration itself is online, but each phase requires rolling restarts.
Frequently Asked Questions
Why enable idempotence?
Before idempotence, retries forced a choice between dropping messages (retries=0) or accepting duplicates. Idempotence adds a per-session sequence number that lets the broker deduplicate retries. The cost is 5 bytes per record — always enable it.
When do I use transactions?
Transactions exist for atomic multi-partition writes and exactly-once consume-process-produce pipelines. They cut throughput by roughly 20%. For single-partition ordering, idempotence plus consistent message keying is enough. [Kafka producer config]
How do I measure whether tuning is working?
Watch record-queue-time-avg (time spent in the producer buffer) and the batch fill ratio (batch-size-avg divided by the configured batch.size). High queue time means linger.ms is too aggressive; a fill ratio below 0.5 means linger.ms is too low and you should raise it.
Keep Reading
- Event-Driven Microservices with Go and Kafka
- Building Resilient Distributed Systems with Go: circuit breakers and retries on the consumer side complement producer idempotence.
- Idempotency Patterns in Distributed Systems: how enable.idempotence=true fits the broader idempotency story.
- Go Concurrency Best Practices: passing context.Context to producer.send() so callers can cancel.
- Microservices Architecture Patterns: when Kafka is the integration layer, producer config sets the default durability of the entire system.
Engineering Team
Backend engineers writing production-grade references for Go, Java, and distributed systems.