Self/Past dependencies
Self/Past Dependencies - Explained in Depth
What Are Self/Past Dependencies?
BAD PATTERN (Self-dependency):
Day 1: dim_user (empty) → load → dim_user_day1
Day 2: dim_user_day1 → changes → dim_user_day2
Day 3: dim_user_day2 → changes → dim_user_day3
Day 4: dim_user_day3 → changes → dim_user_day4
...
Day 1000: dim_user_day999 → changes → dim_user_day1000
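The chain above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: `build_day`, the `Change` tuple, and the sample change feed are all hypothetical names and data invented for the example. The point is only that each day's table is built from the previous day's table, so the days have to run strictly in order.

```python
from typing import Dict, List, Tuple

# Hypothetical change feed: one list of (user_id, name) upserts per day.
Change = Tuple[int, str]


def build_day(previous: Dict[int, str], changes: List[Change]) -> Dict[int, str]:
    """Build today's dim_user from yesterday's dim_user plus today's changes."""
    today = dict(previous)  # copy, so yesterday's partition is left untouched
    for user_id, name in changes:
        today[user_id] = name
    return today


daily_changes: List[List[Change]] = [
    [(1, "ann"), (2, "bob")],  # day 1
    [(2, "bobby")],            # day 2
    [(3, "cy")],               # day 3
]

dim_user: Dict[int, str] = {}  # "dim_user (empty)" before day 1
for changes in daily_changes:  # day N cannot start until day N-1 has finished
    dim_user = build_day(dim_user, changes)

print(dim_user)
```

Skipping or reordering any iteration of the loop changes the final table, which is exactly the coupling the diagram describes.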
Each day depends on the previous day!
The Problem: Complexity Score
COMPLEXITY SCORE EXPLOSION:
To compute dim_user_day1000:
┌─────────────────────────────────────┐
│ Need: dim_user_day999               │
│ Which needs: dim_user_day998        │
│ Which needs: dim_user_day997        │
│ ...                                 │
│ Which needs: dim_user_day1          │
│                                     │
│ Complexity Score: 1000!             │
└─────────────────────────────────────┘
vs.
GOOD PATTERN (No self-dependency):
To compute dim_user_day1000:
┌─────────────────────────────────────┐
│ Need: raw_user_data_day1000         │
│ Need: reference_data (maybe)        │
│                                     │
│ Complexity Score: 2!                │
└─────────────────────────────────────┘
Concrete Example
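One way to make the two scores concrete is to count every dataset that must exist, directly or transitively, before the target can be built. The sketch below does that under stated assumptions: `upstream_count`, `self_dependent_deps`, and `independent_deps` are hypothetical helpers, the dependency rules are inferred from the diagrams above, and the empty seed table (named `dim_user_empty` here) is counted as the last link in the chain.

```python
from typing import Callable, List, Set


def upstream_count(target: str, deps_of: Callable[[str], List[str]]) -> int:
    """Count every distinct dataset that must exist before `target` can be built."""
    seen: Set[str] = set()
    stack = list(deps_of(target))
    while stack:  # iterative walk, so a 1000-deep chain cannot hit the recursion limit
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps_of(node))
    return len(seen)


def self_dependent_deps(name: str) -> List[str]:
    """Bad pattern: each dim_user_dayN is built from dim_user_day(N-1)."""
    prefix = "dim_user_day"
    if not name.startswith(prefix):
        return []  # the empty seed table has no upstream
    day = int(name[len(prefix):])
    return [f"{prefix}{day - 1}"] if day > 1 else ["dim_user_empty"]


def independent_deps(name: str) -> List[str]:
    """Good pattern: each dim_user_dayN is built only from that day's inputs."""
    prefix = "dim_user_day"
    if not name.startswith(prefix):
        return []  # raw and reference inputs are treated as leaves
    day = name[len(prefix):]
    return [f"raw_user_data_day{day}", "reference_data"]


bad_score = upstream_count("dim_user_day1000", self_dependent_deps)
good_score = upstream_count("dim_user_day1000", independent_deps)
print(bad_score, good_score)  # 1000 2
```

With these rules the self-dependent score grows by one for every additional day of history, while the independent score stays at 2 no matter which day is requested.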
Visual Representation of the Problem
Why This Matters for Backfills
The Common Culprit: Cumulative Metrics
Solutions
Real-World Impact
Key Takeaway