Apache Kafka is known for its high-throughput, fault-tolerant publish-subscribe model. A critical part of Kafka’s efficiency lies in how it manages message retention and storage. Kafka supports both time-based retention and log compaction, allowing developers to balance between durability, storage cost, and real-time accessibility.

In this post, we’ll explore Kafka’s message retention and compaction strategies, explain how they work, and share best practices for optimizing Kafka storage in production.


Kafka’s Storage Model

Kafka stores messages in partitioned, append-only logs. Each topic-partition is backed by a log that grows over time and is split into segment files on disk.

Kafka brokers periodically:

  • Roll logs into new segments
  • Delete or compact older segments
  • Retain data according to topic-level configurations

You can control how long and how much Kafka retains by tuning retention settings and enabling log compaction where appropriate.
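For intuition, here is an illustrative on-disk layout for a single topic-partition (the data directory path and the "orders" topic are hypothetical); each segment is a trio of files named after its base offset:

/var/lib/kafka/data/orders-0/
  00000000000000000000.log        # segment data (the messages themselves)
  00000000000000000000.index      # offset index for that segment
  00000000000000000000.timeindex  # timestamp index for that segment
  00000000000005120000.log        # next segment; filename = first offset it contains
  ...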


Message Retention Policies

Kafka supports time-based and size-based retention.

Set via topic configurations:

retention.ms=604800000  # 7 days
retention.bytes=1073741824  # 1 GiB per partition

Setting            Description
retention.ms       Retain messages for this many milliseconds
retention.bytes    Retain messages until the partition log reaches this size in bytes

Retention is enforced at the segment level: once a segment's newest message falls outside these limits, the broker deletes the entire segment. The active (newest) segment is never deleted, so retention only takes effect after a segment rolls.
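If you manage topics programmatically, a minimal sketch with the Java AdminClient looks like this; the broker address, the "events" topic, and its partition/replication counts are assumptions for illustration:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

import java.util.Map;
import java.util.Properties;
import java.util.Set;

public class CreateRetentionTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical "events" topic: 3 partitions, replication factor 1.
            NewTopic topic = new NewTopic("events", 3, (short) 1)
                .configs(Map.of(
                    TopicConfig.RETENTION_MS_CONFIG, "604800000",     // 7 days
                    TopicConfig.RETENTION_BYTES_CONFIG, "1073741824"  // 1 GiB per partition
                ));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}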


Log Compaction

Log compaction ensures Kafka retains the latest value for each key, enabling:

  • Efficient change data capture (CDC)
  • Cache updates and state reconstruction
  • Durable last-known-good-value messages

Enable it per topic:

cleanup.policy=compact

Kafka then:

  • Retains at least the latest message per key
  • Discards older entries with the same key
  • Keeps tombstone messages (null values) for delete tracking
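As a rough sketch, switching an existing topic to compaction with the Java AdminClient might look like this (the "user-profiles" topic and broker address are illustrative assumptions):

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class EnableCompaction {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "user-profiles");
            // Set cleanup.policy=compact on the topic.
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry(TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(op))).all().get();
        }
    }
}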

Combined Policies: Delete + Compact

Kafka supports hybrid retention:

cleanup.policy=compact,delete

Use this when:

  • You want to preserve latest state per key
  • But still want to delete segments beyond a time threshold

This is helpful for topics where some state tracking is needed but full logs aren’t required long-term.
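Reusing the AdminClient setup and imports from the retention sketch above, a hybrid topic could be created like this (the "session-state" topic, its partition counts, and the 30-day window are illustrative assumptions):

// Builds on the AdminClient ("admin") and imports from the retention sketch.
NewTopic topic = new NewTopic("session-state", 6, (short) 3)
    .configs(Map.of(
        TopicConfig.CLEANUP_POLICY_CONFIG,
            TopicConfig.CLEANUP_POLICY_COMPACT + "," + TopicConfig.CLEANUP_POLICY_DELETE,
        TopicConfig.RETENTION_MS_CONFIG, "2592000000" // delete segments older than ~30 days
    ));
admin.createTopics(Set.of(topic)).all().get();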


Example: Compacted vs. Non-Compacted Topics

Non-Compacted (default)

  • Every message is retained for a configured time or size.
  • Use for analytics, auditing, reprocessing.

Compacted

  • Only the latest message for each key is kept.
  • Use for change logs, lookup state, and system events.

Tombstone Messages and Deletes

To delete a key from a compacted topic, produce a null value:

{"key": "user123", "value": null}

Kafka retains this tombstone for delete.retention.ms, after which it’s removed during compaction.
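Concretely, a tombstone is just a producer record with a null value. A minimal sketch (the broker address, topic, and key are illustrative):

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class TombstoneProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The null value marks "user123" for deletion; compaction drops the
            // key entirely once delete.retention.ms has elapsed.
            producer.send(new ProducerRecord<>("user-profiles", "user123", null));
            producer.flush();
        }
    }
}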


Best Practices

  1. Segment Your Topics
    Use separate topics for:
    • Full logs
    • Compacted change logs
    • Retained snapshots
  2. Tune Segment Sizes and Timeouts
    Lower segment.bytes → segments roll sooner and become eligible for cleanup faster
    Higher segment.ms → fewer segment files and better compaction performance
    (see the sketch after this list for setting both)

  3. Monitor Disk Usage
    Use kafka-log-dirs.sh or Prometheus JMX exporters

  4. Test Compaction with Staging Topics
    Validate that your compaction logic retains exactly what’s expected

  5. Keep a Healthy Cleanup Schedule
    Avoid over-retention: stale data increases disk pressure and recovery time
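For practice #2, here is a rough sketch of tightening segment rolling on an existing topic. It reuses the AdminClient setup and imports from the compaction example above; the topic name and the chosen values are assumptions to adapt.

// Builds on the AdminClient ("admin") from the earlier sketches.
ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "user-profiles");
admin.incrementalAlterConfigs(Map.of(topic, List.of(
    new AlterConfigOp(new ConfigEntry(TopicConfig.SEGMENT_BYTES_CONFIG, "268435456"), // 256 MiB segments
        AlterConfigOp.OpType.SET),
    new AlterConfigOp(new ConfigEntry(TopicConfig.SEGMENT_MS_CONFIG, "86400000"),     // roll at least daily
        AlterConfigOp.OpType.SET)
))).all().get();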

Monitoring Retention and Compaction

Key metrics to monitor:

  • LogEndOffset and HighWatermark (per partition)
  • Disk usage per topic/partition
  • Segment count per partition (the NumLogSegments gauge)
  • Compaction lag (LogCleanerManager stats)

Use tools like:

  • Kafka Manager
  • Prometheus + Grafana
  • Confluent Control Center
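
As one concrete check on compaction lag, the log cleaner's max-dirty-percent gauge can be read over JMX. A minimal sketch, assuming the broker was started with JMX exposed on port 9999:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionLagCheck {
    public static void main(String[] args) throws Exception {
        // Assumes JMX is enabled on the broker at port 9999.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // max-dirty-percent reports how dirty the dirtiest log is, a rough
            // measure of how far behind the cleaner is running.
            ObjectName gauge = new ObjectName(
                "kafka.log:type=LogCleanerManager,name=max-dirty-percent");
            System.out.println("max-dirty-percent: " + mbs.getAttribute(gauge, "Value"));
        }
    }
}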

Conclusion

Efficient Kafka storage management starts with the right retention and compaction strategy. By understanding how Kafka retains and removes data, you can:

  • Save disk space
  • Improve recovery time
  • Support real-time and replayable consumers

Whether you’re building a CDC pipeline, a real-time analytics stream, or a state sync layer, Kafka gives you the flexibility to tune storage exactly how your architecture needs it.