Event Streaming

Event streaming is the practice of modeling data as a continuous, ordered flow of events rather than as a set of static records to be queried. The Apache Kafka introduction defines it as “the practice of capturing data in real-time from event sources like databases, sensors, mobile devices, cloud services, and software applications in the form of streams of events,” storing those streams durably, and processing them either as they arrive or after the fact.

The structure underneath event streaming is the log. Jay Kreps, in “The Log,” describes it as “an append-only, totally-ordered sequence of records ordered by time” and argues it is a unifying abstraction across data systems. Because the log is durable and ordered, a single stream can be consumed by many independent readers, each tracking its own position, and replayed from the earliest retained record when a new consumer or a corrected computation needs the history.
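The idea of many readers sharing one log, each at its own position, can be illustrated with a minimal in-memory sketch. The names EventLog, Consumer, poll, and seek are hypothetical and chosen for illustration only; they are not the API of Kafka or any other platform, and the sketch ignores durability, partitioning, and concurrency.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class EventLog:
    """Hypothetical in-memory append-only log.

    A record's offset is simply its index in the underlying list.
    """

    _records: list[Any] = field(default_factory=list)

    def append(self, event: Any) -> int:
        """Add an event at the end of the log and return its offset."""
        self._records.append(event)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 100) -> list[Any]:
        """Return up to max_records events starting at offset, without removing them."""
        return self._records[offset:offset + max_records]


@dataclass
class Consumer:
    """Hypothetical reader that tracks its own position in a shared log."""

    log: EventLog
    offset: int = 0

    def poll(self, max_records: int = 100) -> list[Any]:
        """Read the next batch and advance only this consumer's offset."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        """Move this consumer's position, for example back to 0 for a replay."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        self.offset = offset


if __name__ == "__main__":
    log = EventLog()
    for name in ["order_created", "order_paid", "order_shipped"]:
        log.append(name)

    billing = Consumer(log)
    audit = Consumer(log)

    print(billing.poll(max_records=2))  # billing reads part of the log
    print(audit.poll())                 # audit independently reads all of it
    audit.seek(0)                       # replay from the start
    print(audit.poll())
```

Because reading never mutates the log, the position lives with the consumer rather than with the log, which is what lets a new or corrected consumer start again from an earlier offset without affecting the others.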

This model differs from a traditional message queue, where a message is typically removed once consumed. In an event stream the events persist according to a retention policy regardless of whether they have been read, so the stream itself can serve as a system of record and a shared source of truth rather than a transient transport. Producers append events without knowing who will read them, and consumers subscribe without coordinating with producers, which keeps the two sides loosely coupled.
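The contrast between the two models can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular broker works: drain_queue, RetainedLog, and Event are hypothetical names, retention is purely time-based, and timestamps are plain numbers in seconds supplied by the caller.

```python
from collections import deque
from typing import NamedTuple


class Event(NamedTuple):
    offset: int
    timestamp: float
    payload: str


def drain_queue(queue: deque) -> list:
    """Queue semantics: each message is removed as it is consumed."""
    consumed = []
    while queue:
        consumed.append(queue.popleft())
    return consumed


class RetainedLog:
    """Hypothetical log whose events persist until they exceed a retention window."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._events: deque = deque()
        self._next_offset = 0

    def append(self, timestamp: float, payload: str) -> int:
        """Append an event; offsets keep increasing even after old events expire."""
        offset = self._next_offset
        self._events.append(Event(offset, timestamp, payload))
        self._next_offset += 1
        return offset

    def read(self, from_offset: int = 0) -> list:
        """Log semantics: reading returns events but leaves them in place."""
        return [e for e in self._events if e.offset >= from_offset]

    def expire(self, now: float) -> None:
        """Apply the retention policy: drop events older than the window."""
        cutoff = now - self.retention_seconds
        while self._events and self._events[0].timestamp < cutoff:
            self._events.popleft()


if __name__ == "__main__":
    queue = deque(["a", "b"])
    print(drain_queue(queue), drain_queue(queue))  # second read finds nothing

    log = RetainedLog(retention_seconds=60)
    log.append(0, "a")
    log.append(30, "b")
    log.append(90, "c")
    print(log.read(), log.read())  # both reads see the same events
    log.expire(now=100)            # only the retention policy removes events
    print(log.read())
```

In the queue, consumption is what deletes a message; in the log, only the retention policy does, so what a reader can see depends on the age of the events rather than on what other readers have already done.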

Event streaming is the conceptual foundation of platforms like Apache Kafka and of stream processors such as Apache Flink and Apache Spark (through its Structured Streaming engine). By treating the unbounded stream as the primary object, these systems support real-time pipelines, event-driven applications, and analytics in which fresh data is acted on shortly after it is produced rather than in a later batch.