TTL

TTL in Streaming/Databases

Time To Live (TTL) in streaming systems and databases:

Purpose: Limits how long state/data is retained to manage memory and storage

Mechanism:

  • Associate a timestamp with each piece of state

  • Remove state older than the TTL threshold, either in a periodic sweep or lazily when the state is next accessed

  • Can be based on event time, processing time, or ingestion time

Units: A time duration (seconds, minutes, hours, days)
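
The mechanism above can be sketched as a small keyed store. This is a minimal illustration, not the API of any particular engine: the class name `TtlStore` and its methods are hypothetical. The caller passes `now` explicitly, so the same code works whether the clock is event time, processing time, or ingestion time.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class TtlStore:
    """Hypothetical keyed state store with a per-entry TTL (illustrative only)."""

    ttl_seconds: float
    _entries: Dict[str, Tuple[float, Any]] = field(default_factory=dict)

    def put(self, key: str, value: Any, now: float) -> None:
        # Associate a timestamp with each piece of state.
        self._entries[key] = (now, value)

    def get(self, key: str, now: float) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        written_at, value = entry
        if now - written_at >= self.ttl_seconds:
            # Lazy cleanup: expired state is dropped when it is read.
            del self._entries[key]
            return None
        return value

    def sweep(self, now: float) -> int:
        # Periodic cleanup: remove every entry older than the TTL threshold.
        expired = [key for key, (written_at, _) in self._entries.items()
                   if now - written_at >= self.ttl_seconds]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def __len__(self) -> int:
        return len(self._entries)


store = TtlStore(ttl_seconds=60)
store.put("user-1", {"clicks": 3}, now=0)
store.put("user-2", {"clicks": 1}, now=50)
removed = store.sweep(now=70)  # only "user-1" is older than 60 seconds
```

Combining a lazy check on read with a periodic sweep is a design choice: the read check guarantees expired values are never returned, while the sweep reclaims memory for keys that are never read again.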


Why TTL Matters in Streaming

Without TTL:

  • State grows without bound → OOM (out-of-memory) errors

  • Old, irrelevant data wastes resources

  • Join buffers fill up with ancient events

With TTL:

  • Bounded memory usage

  • Automatic cleanup of stale data

  • Clear semantics: "only join events that occur within N time units of each other"
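
A join with TTL-bounded buffers might look like the sketch below. It is a simplified, single-process illustration under stated assumptions: the class `TtlJoin`, the tuple layout of `Event`, and the use of the highest event time seen so far as the eviction clock are all hypothetical choices, not the behavior of a specific streaming engine.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Event = Tuple[str, float, str]  # (key, event_time_seconds, payload)


class TtlJoin:
    """Hypothetical two-sided stream join whose buffers are bounded by a TTL."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        # One buffer per side: key -> list of (event_time, payload).
        self.buffers: Dict[str, Dict[str, List[Tuple[float, str]]]] = {
            "left": defaultdict(list),
            "right": defaultdict(list),
        }
        self.max_seen = float("-inf")  # highest event time observed so far

    def on_event(self, side: str, event: Event) -> List[Tuple[str, str, str]]:
        key, ts, payload = event
        other = "right" if side == "left" else "left"
        self.max_seen = max(self.max_seen, ts)
        self._evict()

        # Match against buffered events on the other side that lie
        # within the TTL of this event.
        matches = []
        for other_ts, other_payload in self.buffers[other].get(key, []):
            if abs(ts - other_ts) <= self.ttl:
                if side == "left":
                    matches.append((key, payload, other_payload))
                else:
                    matches.append((key, other_payload, payload))

        # Buffer this event unless it is already older than the TTL horizon.
        if ts >= self.max_seen - self.ttl:
            self.buffers[side][key].append((ts, payload))
        return matches

    def _evict(self) -> None:
        # Drop every buffered event older than (max event time - TTL).
        horizon = self.max_seen - self.ttl
        for side_buffer in self.buffers.values():
            for key in list(side_buffer):
                kept = [(ts, p) for ts, p in side_buffer[key] if ts >= horizon]
                if kept:
                    side_buffer[key] = kept
                else:
                    del side_buffer[key]

    def buffered(self) -> int:
        return sum(len(v) for b in self.buffers.values() for v in b.values())
```

Because eviction runs on every event, the buffers never hold anything older than the TTL horizon, which is what keeps memory bounded. The cost is that an event arriving after its partner has been evicted produces no match.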

The Trade-off

Longer TTL:

  • ✓ More complete results (catches late-arriving data)

  • ✓ Fewer missed matches in joins

  • ✗ Higher memory usage

  • ✗ Slower state access and checkpoints (more data to store, scan, and snapshot)

Shorter TTL:

  • ✓ Lower memory footprint

  • ✓ Faster operations

  • ✗ Misses late-arriving events

  • ✗ Incomplete results for slow sources
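
The trade-off can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, not a measurement: the function `ttl_tradeoff` is hypothetical, the delay values are made up for illustration, and the buffer estimate assumes a steady arrival rate, so that a buffer holding each event for `ttl_seconds` contains roughly rate × TTL events.

```python
from typing import List, Tuple


def ttl_tradeoff(delays_seconds: List[float],
                 events_per_second: float,
                 ttl_seconds: float) -> Tuple[float, float]:
    """Return (fraction of pairs matched, approximate events held in the buffer)."""
    # A pair is matched only if its second event arrives within the TTL.
    matched = sum(1 for delay in delays_seconds if delay <= ttl_seconds)
    match_rate = matched / len(delays_seconds)
    # Steady-state estimate: each event stays buffered for ttl_seconds.
    buffered = events_per_second * ttl_seconds
    return match_rate, buffered


# Illustrative delays (seconds) between the two events of each matching pair.
sample_delays = [1, 2, 5, 30, 120, 600]
for ttl in (10, 60, 600):
    rate, size = ttl_tradeoff(sample_delays, events_per_second=1000, ttl_seconds=ttl)
    print(f"TTL={ttl}s  matched={rate:.0%}  buffered events={size:,.0f}")
```

Running the same calculation against the observed delay distribution of a real source is one way to pick a TTL: choose the smallest value whose match rate is acceptable, then check that the implied buffer size fits in memory.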

