Window functions


Another explanationarrow-up-right


Window Functions in SQL (Complete Guide)

Window functions allow you to perform calculations across a set of related rows without collapsing them into a single output row (unlike GROUP BY).

Think of window functions as “running calculations” over a window, i.e., a defined set of rows relative to the current row.


Basic Anatomy of a Window Function

Every window function has the following pattern:

function_name(expression) OVER (
    PARTITION BY ...
    ORDER BY ...
    ROWS or RANGE ...
)

Three optional components of the window:

  • PARTITION BY — splits data into groups (like GROUP BY but non-destructive). In this context, "non-destructive" means that PARTITION BY divides your data into groups without reducing the number of rows in your result set.

  • ORDER BY — defines row ordering within partitions.

  • Frame clause (ROWS or RANGE) — defines which rows are visible to each calculation.

Example:


Categories of Window Functions

There are four main categories:

  1. Aggregate window functions

  2. Ranking window functions

  3. Value window functions (LAG/LEAD, FIRST_VALUE, LAST_VALUE)

  4. Distribution window functions (PERCENT_RANK, NTILE, CUME_DIST)

Below is each one in detail.


Aggregate Window Functions

These apply aggregate functions but do not collapse rows.

Common functions:

  • SUM()

  • COUNT()

  • AVG()

  • MIN()

  • MAX()

Example: Running total per customer

Example: Moving average (3-row sliding window)

Example: Total per partition


Ranking Window Functions

Used for analytics/OLAP, leaderboards, top-N filtering, etc.

Ranking functions:

  • ROW_NUMBER()

  • RANK()

  • DENSE_RANK()

  • PERCENT_RANK()


Example: ROW_NUMBER (no ties)

Example: RANK vs DENSE_RANK

Difference:

  • RANK() skips numbers on ties (1,2,2,4,…)

  • DENSE_RANK() does not skip (1,2,2,3,…)


Value Functions (LAG/LEAD, FIRST_VALUE, LAST_VALUE)

These are essential for change detection, period comparisons, time-series analytics, and before/after analysis.

LAG / LEAD

  • LAG() → looks at previous row

  • LEAD() → looks at next row

Example: Month-over-month difference

Example: Compare an event to next event


FIRST_VALUE / LAST_VALUE

Example: Price relative to first purchase

Important: LAST_VALUE() often requires a frame clause: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

Example: Last order value per customer


Distribution Window Functions

These help quantify relative position within a partition — extremely useful for scoring, percentiles, and statistical ranking.

Functions:

  • NTILE(n) — splits into N buckets

  • CUME_DIST() — cumulative distribution

  • PERCENT_RANK() — relative rank from 0 to 1


Example: NTILE (quartiles)

Example: CUME_DIST

Example: PERCENT_RANK


Frame Clauses (ROWS vs RANGE)

Probably the most misunderstood part of window functions.

ROWS: Physical row-based

Example:

  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW → previous 2 rows, regardless of values

RANGE: Value-based

Example:

  • RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW → all rows within the last 7 days, regardless of count

Example Comparison


Correlated Window Functions

You can combine window functions with filtering to express advanced analytics patterns.

Example: Top 3 orders per customer


Practical Analytics Use Cases

1. Churn Analysis

2. Running totals with resets

3. Gaps and Islands

Detecting breaks in sequences:

4. Top performers per category


Summary Table

Category
Functions
Purpose

Aggregate

SUM, AVG, MIN, MAX, COUNT

Running totals, moving averages

Ranking

ROW_NUMBER, RANK, DENSE_RANK

Top-N, leaderboards

Value

LAG, LEAD, FIRST_VALUE, LAST_VALUE

Period comparisons, time-series

Distribution

NTILE, CUME_DIST, PERCENT_RANK

Percentiles, bucketization

Frame-based

ROWS, RANGE

Window sizing logic


Last updated