What is a Stream?
A Stream is not a transport method (like a pipe). A Stream is a data structure (like a table or a list).
Specifically, a Stream is an unbounded, append-only, immutable log of events.
Here is the breakdown of those three scary words.
The Physical Concept: "The Infinite Tape"
Imagine a magnetic cassette tape (or a DVR recording of a security camera).
Unbounded: The recording never stops. It runs 24/7 forever. There is no "end of the file."
Append-Only: You can only record new footage at the end. You cannot insert a clip in the middle of yesterday's footage.
Immutable: Once something is recorded, you cannot change it. You can't edit the past.
In a database, you can UPDATE row #4. In a stream, you cannot update. You can only append a new event that says "Row #4 has changed."
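To make the three properties concrete, here is a minimal in-memory sketch. The names `Event` and `AppendOnlyLog` are made up for illustration; a real stream (Kafka, for example) is durable and distributed, which this toy is not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    offset: int
    payload: str


class AppendOnlyLog:
    """A toy in-memory log with no update or delete operations."""

    def __init__(self) -> None:
        self._events = []

    def append(self, payload: str) -> int:
        """Add an event at the end and return its offset."""
        offset = len(self._events)
        self._events.append(Event(offset, payload))
        return offset

    def read(self, offset: int) -> Event:
        """Return the event stored at the given offset."""
        return self._events[offset]

    def __len__(self) -> int:
        return len(self._events)


log = AppendOnlyLog()
log.append("Row #4 created with value A")
# There is no update method: a change is recorded as a new event.
log.append("Row #4 has changed to value B")
```

The class exposes only `append` and `read`, so the only way to express a change is to add another event at the end.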
The Mental Model: Stream vs. Queue
This is where most people get stuck.
A Queue is a "To-Do List"
Goal: Empty the list.
Action: When you process an item, you delete it.
Analogy: A text message. Once you read it, the notification disappears.
A Stream is a "History Book"
Goal: Record what happened.
Action: When you process an item, you just turn the page. The previous page stays there.
Analogy: A Twitter/X feed or a Diary. Just because you read a tweet doesn't mean it gets deleted for everyone else. It stays in the history.
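The difference between the two can be sketched in a few lines. This is a simplified illustration using plain Python containers, not how any particular queue or stream product is implemented; the message names are invented.

```python
from collections import deque

messages = ["order-1", "order-2", "order-3"]

# Queue: processing an item removes it.
queue = deque(messages)
first = queue.popleft()        # "order-1" is now gone from the queue
queue_remaining = list(queue)

# Stream: processing only advances the reader's position.
stream = list(messages)
position = 0
current = stream[position]     # read "order-1"
position += 1                  # "turn the page"; the stream itself is unchanged
```

After both reads, the queue holds two items while the stream still holds all three.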
The "Offset" (Your Bookmark)
Since reading from the Stream never deletes data (the log just keeps growing), how does the system know what you have already read?
It uses an Offset.
The Offset is simply a number that marks a position in the log (like a page number). It is a sequential position, not a clock time.
Consumer A (Real-time Dashboard) is reading at Offset 10,000 (Live data).
Consumer B (Backup Service) is reading at Offset 5,000 (Yesterday's data).
Because the data persists, different consumers can read the same stream from different positions and at different speeds.
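Here is a rough sketch of two consumers sharing one log, each with its own bookmark. The `Consumer` class and its `poll` method are hypothetical names for this example, and the offsets are scaled down from the ones above to keep it small.

```python
class Consumer:
    """Hypothetical consumer that tracks its own position in a shared log."""

    def __init__(self, name: str, log: list, offset: int = 0) -> None:
        self.name = name
        self.log = log
        self.offset = offset  # the bookmark: index of the next event to read

    def poll(self, max_events: int = 1) -> list:
        """Return up to max_events events and advance the bookmark."""
        batch = self.log[self.offset:self.offset + max_events]
        self.offset += len(batch)
        return batch


shared_log = [f"event-{i}" for i in range(10)]

dashboard = Consumer("dashboard", shared_log, offset=8)  # near the live end
backup = Consumer("backup", shared_log, offset=2)        # far behind

dashboard_batch = dashboard.poll(5)  # only two events are left to read
backup_batch = backup.poll(3)
```

Neither consumer affects the other: the log is never modified, and each bookmark moves independently.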
Stream vs. Table (The Duality)
This is a concept known as "Stream-Table Duality," popularized by the creators of Kafka.
The Stream: The Change Log. A list of actions.
[ "Alice deposited $100", "Alice bought shoes for $20", "Alice deposited $50" ]
The Table: The State. The result of those actions at a point in time.
Alice's Balance: $130
As a Data Engineer, you often store the Stream (in Kafka) so that you can rebuild the Table (in a Database/Warehouse) whenever you want. If your Table gets corrupted, you can replay the Stream from the beginning to rebuild it.
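Replaying the Stream to get the Table is just a fold over the events. The sketch below uses Alice's three events from above, encoded as signed amounts; that encoding and the `rebuild_table` function are assumptions made for this example.

```python
from collections import defaultdict

# Stream: the change log, here as (account, amount) pairs.
stream = [
    ("Alice", 100),   # Alice deposited $100
    ("Alice", -20),   # Alice bought shoes for $20
    ("Alice", 50),    # Alice deposited $50
]


def rebuild_table(events):
    """Replay every event in order to derive the current state."""
    balances = defaultdict(int)
    for account, amount in events:
        balances[account] += amount
    return dict(balances)


table = rebuild_table(stream)  # {"Alice": 130}
```

Replaying only a prefix of the stream, such as `rebuild_table(stream[:2])`, gives the Table as it stood at that earlier point in time.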
How some stream concepts relate to each other
Event Bus: The highway infrastructure.
Stream: The infinite line of cars (data) driving on that highway.
Stream Processing: Standing on a bridge counting the cars as they pass underneath.
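The "counting cars from a bridge" idea can be sketched with a generator that updates a count as each event passes, without ever storing the events. The `cars` source and `running_counts` function are invented for this illustration; real stream processors add things like windowing and fault tolerance.

```python
import itertools
from collections import Counter
from typing import Iterable, Iterator


def cars() -> Iterator[str]:
    """Hypothetical unbounded source: cycles through car colors forever."""
    return itertools.cycle(["red", "blue", "red", "green"])


def running_counts(events: Iterable[str]) -> Iterator[dict]:
    """Yield an updated count after each event, without storing the events."""
    counts = Counter()
    for event in events:
        counts[event] += 1
        yield dict(counts)


# Observe only the first six cars; the source itself never ends.
snapshots = list(itertools.islice(running_counts(cars()), 6))
latest = snapshots[-1]
```

Because the source never ends, the processor emits a result after every event instead of waiting for a final answer.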