In distributed stream processing, ensuring data correctness and consistency is crucial — especially when dealing with financial systems, inventory tracking, or critical business logic. Without proper handling, duplicate processing or message loss can lead to incorrect results or state corruption.

Kafka Streams, a client library for building stream processing applications and event-driven microservices on top of Apache Kafka, offers Exactly-Once Semantics (EOS) to guarantee that the result of processing each message is committed once and only once, even in the face of failures.

In this post, we’ll explore how Exactly-Once Semantics works in Kafka Streams and how to configure and implement it correctly.


Kafka provides three delivery guarantees:

  • At-most-once: Messages may be lost but never duplicated
  • At-least-once: Messages are not lost but may be processed multiple times
  • Exactly-once: No message is lost or processed more than once

EOS ensures that produced output reflects each input message only once, even after retries or crashes.
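
For reference, here is roughly how those three levels map onto plain Java client settings. This is a sketch for orientation only: the broker address and transactional id are placeholders, and when you enable EOS in Kafka Streams it configures the equivalent options for you.

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;

Properties producerProps = new Properties();
producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
// acks=0 with no retries leans toward at-most-once: fast, but sends can be lost.
// acks=all with retries gives at-least-once: nothing is lost, but retries may duplicate.
producerProps.put(ProducerConfig.ACKS_CONFIG, "all");
// Idempotence lets the broker drop duplicates caused by producer retries.
producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
// Transactions extend this to atomic writes across topics and partitions (exactly-once).
producerProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-txn-id"); // placeholder id

Properties consumerProps = new Properties();
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
// Downstream consumers must skip aborted transactions to see exactly-once results.
consumerProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");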


Key EOS Concepts in Kafka Streams

  1. Idempotent Producers
    Prevents duplicate records during producer retries by assigning sequence numbers to messages.

  2. Transactions
    Allow records to be written atomically to multiple Kafka topics and partitions, together with the consumed input offsets, covering both changelog and output topics.

  3. Stateful Operators
    Kafka Streams maintains local state (e.g., for joins, aggregations) that needs to be committed consistently.

  4. Processing Guarantee Configuration
    A single Streams setting switches on idempotence, transactions, and consistent state commits:

props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, "exactly_once_v2");

"exactly_once_v2" (available since Kafka 3.0, and requiring brokers on version 2.5 or newer) is recommended over the older "exactly_once": it uses one transactional producer per stream thread instead of one per task, which scales much better. The sketch below shows the kind of transactional consume-transform-produce loop that Kafka Streams runs for you when this guarantee is enabled.
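
This is a rough sketch using the plain clients, not something you write yourself with Kafka Streams. It assumes producer (a transactional KafkaProducer<String, String>), consumer (a KafkaConsumer<String, String> with isolation.level=read_committed, subscribed to the input topic), and a transform() helper already exist; topic names and helpers are placeholders.

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.TopicPartition;

producer.initTransactions(); // registers the transactional id and fences older ("zombie") instances

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    if (records.isEmpty()) continue;

    producer.beginTransaction();
    try {
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (ConsumerRecord<String, String> record : records) {
            producer.send(new ProducerRecord<>("output-topic", record.key(), transform(record.value())));
            offsets.put(new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1));
        }
        // Output records and input offsets become visible together, or not at all.
        producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
        producer.commitTransaction();
    } catch (KafkaException e) {
        // For fatal errors such as ProducerFencedException you would close the producer instead.
        producer.abortTransaction(); // aborted records are skipped by read_committed consumers
    }
}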


Basic Kafka Streams Configuration for EOS

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-stream-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3); // for fault-tolerant changelogs
props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0); // optional: disable record caching so every update is forwarded downstream
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

This configuration enables:

  • Transactional producer/consumer
  • Atomic state updates
  • Deduplication and retry safety
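
For completeness, here is a minimal sketch of wiring these properties into a running application. It assumes builder is a StreamsBuilder holding your topology (one is built in the next example), and REPLACE_THREAD is just one reasonable error-handling policy.

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.StreamsUncaughtExceptionHandler;

KafkaStreams streams = new KafkaStreams(builder.build(), props);

// Replace a failed stream thread instead of letting the whole instance die;
// with EOS its pending transaction is aborted and the work is retried.
streams.setUncaughtExceptionHandler(exception ->
        StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD);

// Close cleanly on shutdown so open transactions are not left dangling until the timeout.
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));

streams.start();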

Example: Aggregation with EOS Enabled

StreamsBuilder builder = new StreamsBuilder();

KStream<String, String> input = builder.stream("input-topic");

input
    .groupByKey()
    .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5))) // 5-minute tumbling windows
    .count()      // windowed count, backed by a changelogged state store
    .toStream()
    .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count.toString()))
    .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

Even under failure, Kafka Streams guarantees:

  • Each input record affects the output once
  • State store changelogs, output records, and consumed offsets are committed in a single transaction
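
A simple way to observe this from the outside is to read the output topic with a read_committed consumer, which only returns records from committed transactions; records from aborted transactions never show up. A sketch (the group id and topic name are placeholders):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties verifierProps = new Properties();
verifierProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
verifierProps.put(ConsumerConfig.GROUP_ID_CONFIG, "eos-output-verifier"); // placeholder group id
verifierProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
verifierProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
verifierProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
// Skip records that belong to aborted or still-open transactions.
verifierProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

try (KafkaConsumer<String, String> verifier = new KafkaConsumer<>(verifierProps)) {
    verifier.subscribe(List.of("output-topic"));
    while (true) {
        for (ConsumerRecord<String, String> record : verifier.poll(Duration.ofSeconds(1))) {
            System.out.printf("key=%s count=%s%n", record.key(), record.value());
        }
    }
}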

Monitoring and Troubleshooting

Use metrics and logs to verify that EOS is working as expected (see the sketch after this list):

  • commit-latency-avg and commit-rate (stream-thread metrics) show how long transaction commits take and how often they happen
  • the underlying producer's record-error-total should stay at zero
  • watch the application logs for ProducerFencedException, which shows a superseded ("zombie") instance being fenced so it cannot commit stale results
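
Exact metric names and groups can differ between versions, so it is often easier to dump what your running instance actually reports via the public metrics() API. A sketch, where streams is the KafkaStreams instance started earlier:

import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

// Print commit- and error-related metrics currently exposed by this instance.
Map<MetricName, ? extends Metric> metrics = streams.metrics();
for (Map.Entry<MetricName, ? extends Metric> entry : metrics.entrySet()) {
    MetricName name = entry.getKey();
    if (name.name().startsWith("commit-") || name.name().contains("error")) {
        System.out.printf("%s [%s] = %s%n", name.name(), name.group(), entry.getValue().metricValue());
    }
}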

Common misconfigurations and pitfalls:

  • Writing to external sinks (e.g., databases) without idempotent writes, so retries create duplicates outside Kafka (one pattern for handling this is sketched after this list)
  • Relying on external side effects (HTTP calls, emails) that Kafka transactions cannot roll back; keep them idempotent or use connectors with transactional support
  • Changelog topics with a low replication factor, where losing a single broker can mean losing state
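
One common way to deal with the first pitfall is to make the external write idempotent, so a replayed record simply overwrites the same row. Below is a sketch using plain JDBC against a hypothetical PostgreSQL table word_counts(word TEXT PRIMARY KEY, cnt BIGINT); Kafka's transaction does not cover the database, but keyed upserts make redelivery harmless.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Upsert keyed by the record key: re-processing the same (word, count) pair rewrites the same row.
void upsertCount(Connection connection, String word, long count) throws SQLException {
    String sql = "INSERT INTO word_counts (word, cnt) VALUES (?, ?) "
               + "ON CONFLICT (word) DO UPDATE SET cnt = EXCLUDED.cnt";
    try (PreparedStatement stmt = connection.prepareStatement(sql)) {
        stmt.setString(1, word);
        stmt.setLong(2, count);
        stmt.executeUpdate();
    }
}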

Best Practices

  • Prefer EXACTLY_ONCE_V2 for new applications (available since Kafka 3.0; requires brokers on 2.5 or newer)
  • Set a replication factor ≥ 3 for durability
  • Keep processing idempotent where possible
  • Avoid using manual commits — Kafka Streams handles commits atomically
  • Regularly test failure scenarios to validate end-to-end EOS guarantees

Conclusion

Kafka Streams makes it practical to implement exactly-once semantics in production-grade streaming applications without adding external complexity. With proper configuration and a transactional mindset, you can build fault-tolerant, consistent, and high-performance pipelines that handle both operational failures and business-critical logic.

EOS is no longer just a theoretical goal — it’s a practical foundation for modern, reliable event-driven systems.