TTL
TTL in Streaming/Databases
Time To Live in streaming systems and databases:
Purpose: Limits how long state/data is retained to manage memory and storage
Mechanism:
Associate a timestamp with each piece of state
Periodically clean up state older than the TTL threshold
Can be based on event time, processing time, or ingestion time
Units: Actual time duration (seconds, minutes, hours, days)
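The mechanism above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine implements it: the `TtlStore` class, its method names, and the injectable `clock` parameter are all hypothetical. The clock defaults to processing time (`time.time`); passing a different callable is one way to model event time or ingestion time.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlStore:
    """Hypothetical in-memory keyed state that stores a timestamp with each value."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # assumption: processing time unless the caller supplies another clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, key: str, value: Any) -> None:
        # Associate the current timestamp with the piece of state.
        self._entries[key] = (self.clock(), value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        written_at, value = entry
        # Treat an expired entry as absent even if cleanup has not run yet.
        if self.clock() - written_at > self.ttl_seconds:
            del self._entries[key]
            return None
        return value

    def cleanup(self) -> int:
        """Drop every entry older than the TTL threshold; return how many were removed."""
        now = self.clock()
        expired = [k for k, (ts, _) in self._entries.items() if now - ts > self.ttl_seconds]
        for k in expired:
            del self._entries[k]
        return len(expired)

    def __len__(self) -> int:
        return len(self._entries)
```

In this sketch `cleanup()` would be called on a timer, while `get()` also expires lazily on read so that callers never see stale state between sweeps.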
Why TTL Matters in Streaming
Without TTL:
State grows unbounded → OOM (Out of Memory) errors
Old, irrelevant data wastes resources
Join buffers fill up with ancient events
With TTL:
Bounded memory usage
Automatic cleanup of stale data
Clear semantics: "only join events that occur within N time units of each other"
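To make the join semantics concrete, here is a small sketch of a two-sided join buffer that evicts by TTL. The `TtlJoin` class and the `(key, event_time, payload)` event shape are assumptions made for illustration; real engines typically also sweep idle keys on a timer, which this sketch omits.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

Event = Tuple[str, float, str]  # (join key, event time in seconds, payload)


class TtlJoin:
    """Hypothetical stream-stream join that only matches events within ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self.buffers: Dict[str, DefaultDict[str, List[Tuple[float, str]]]] = {
            "left": defaultdict(list),
            "right": defaultdict(list),
        }

    def on_event(self, side: str, event: Event) -> List[Tuple[str, str]]:
        key, ts, payload = event
        other = "right" if side == "left" else "left"

        # Evict buffered events on the other side that are older than the TTL
        # relative to this event. Only the touched key is cleaned here.
        live = [(t, p) for (t, p) in self.buffers[other][key] if ts - t <= self.ttl_seconds]
        self.buffers[other][key] = live

        # Buffer this event so later arrivals on the other side can match it.
        self.buffers[side][key].append((ts, payload))

        # Emit (left payload, right payload) pairs within the TTL of each other.
        matches = []
        for t, p in live:
            if abs(ts - t) <= self.ttl_seconds:
                matches.append((payload, p) if side == "left" else (p, payload))
        return matches
```

With a 60-second TTL, a `"left"` event at time 0 joins a `"right"` event for the same key at time 30, but not one at time 100, because the left event has been evicted by then.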
The Trade-off
Longer TTL:
✓ More complete results (catch late-arriving data)
✓ Fewer missed matches in joins
✗ Higher memory usage
✗ Slower state access and checkpoints (more state to read and snapshot)
Shorter TTL:
✓ Lower memory footprint
✓ Faster operations
✗ Miss late-arriving events
✗ Incomplete results for slow sources
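The trade-off can be illustrated with a toy model. The function name `ttl_tradeoff`, the arrival times, and the delays below are made up for illustration: each left event arrives at a known time, its matching right event arrives some delay later, and the left event stays buffered for exactly one TTL.

```python
from typing import List, Tuple


def ttl_tradeoff(left_times: List[float], delays: List[float], ttl: float) -> Tuple[float, int]:
    """Return (fraction of pairs matched, peak number of buffered left events)."""
    # A pair matches only if the right event arrives before the left one expires.
    matched = sum(1 for d in delays if d <= ttl)

    # The buffer is largest right after an arrival, so sample at each arrival time
    # and count the left events that are still within their TTL.
    peak = max(
        sum(1 for s in left_times if 0 <= t - s <= ttl)
        for t in left_times
    )
    return matched / len(delays), peak


# Hypothetical workload: one left event per second, with varied match delays.
left_times = [float(i) for i in range(10)]
delays = [1.0, 2.0, 2.0, 3.0, 5.0, 8.0, 1.0, 13.0, 2.0, 21.0]

for ttl in (2.0, 5.0, 30.0):
    rate, peak = ttl_tradeoff(left_times, delays, ttl)
    print(f"ttl={ttl:>4}s  matched={rate:.0%}  peak buffered={peak}")
```

On this data a 2-second TTL matches half of the pairs while buffering at most 3 events, whereas a 30-second TTL matches every pair but keeps all 10 events in memory. In steady state the buffer size is roughly the arrival rate multiplied by the TTL.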