DAG Versioning in Airflow 3.x


DAG Versioning

DAG Versioning is one of the "headline features" of Airflow 3. It solves the biggest operational risk in previous versions: The Code Mismatch Problem.

In Airflow 2, if you updated a DAG file while a task was running, the Worker would blindly pick up the new code for the old run. If you renamed a function or changed the logic mid-run, the running task could crash or produce corrupt data.

Airflow 3 fixes this by introducing DAG Bundles and Version History.

The Core Concept: Immutable Snapshots

When Airflow 3 parses your DAGs, it doesn't just look at the file on disk; it creates a specific Version (snapshot) of that code.

  • Run A starts at 9:00 AM. It is locked to Version 1.

  • Run B starts at 10:00 AM. It is locked to Version 1.

  • You push a code change at 10:30 AM.

  • Run C starts at 11:00 AM. It is locked to Version 2.

Crucially, if Run A is a long job that is still running at 11:00 AM, it will continue using Version 1. It essentially ignores your new code until it finishes.
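The timeline above can be sketched as a tiny in-memory model. This is an illustration of the "pin the version at run start" semantics only, not Airflow's internal API; the class and method names are invented for the example:

```python
# Minimal sketch of "lock each run to the version current at start time".
# Not Airflow internals -- just the concept.

class VersionedDag:
    def __init__(self):
        self.current_version = 1   # latest parsed snapshot
        self.run_versions = {}     # run_id -> version pinned at start

    def push_code_change(self):
        # Parsing a changed file creates a new snapshot (a new version).
        self.current_version += 1

    def start_run(self, run_id):
        # A run is locked to whatever version exists when it starts.
        self.run_versions[run_id] = self.current_version

    def version_for(self, run_id):
        return self.run_versions[run_id]

dag = VersionedDag()
dag.start_run("run_a")   # 9:00 AM  -> locked to Version 1
dag.start_run("run_b")   # 10:00 AM -> locked to Version 1
dag.push_code_change()   # 10:30 AM -> Version 2 now exists
dag.start_run("run_c")   # 11:00 AM -> locked to Version 2

# Run A keeps using Version 1 even though Version 2 is now current.
print(dag.version_for("run_a"))  # -> 1
print(dag.version_for("run_c"))  # -> 2
```

The key design point: the version is resolved once, at run start, and never re-read while the run is in flight.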

How it works: The "Bundle" System

Airflow 3 introduces a new abstraction called DAG Bundles.

  • The Bundle: A collection of DAG files (your repository).

  • The Version: A specific commit or snapshot of that bundle.

When a Task needs to run, the Worker doesn't just read the local dags/ folder. Instead, it asks: "Which bundle version does this specific DAG Run belong to?" It then ensures it executes that exact version of the code.
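As a rough illustration, bundles are declared in Airflow configuration. The fragment below is a sketch based on Airflow 3's `dag_bundle_config_list` option; treat the exact class path and kwargs as assumptions and verify them against the configuration reference for your release:

```ini
# airflow.cfg -- declaring a DAG bundle (illustrative sketch; check the
# option name and classpath against your Airflow 3 version's docs)
[dag_processor]
dag_bundle_config_list = [
    {
      "name": "dags-folder",
      "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle",
      "kwargs": {"path": "/opt/airflow/dags"}
    }
  ]
```

Each named bundle is then versioned independently, and every DAG Run records which bundle version it belongs to.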

Key Benefits for You

  1. Safe Deployments: You can deploy breaking changes to a DAG without worrying about crashing pipelines that are currently running.

  2. Auditing: You can look at a failed run from last week and see exactly what the code looked like at that moment, not what it looks like today.

  3. Rollbacks: Since previous versions are stored and tracked, reverting to older logic is much cleaner.

In the UI

You will see a new "Version" column or tab in the Airflow 3 UI.

  • You can click on a specific DAG Run and see which version hash it used.

  • You can potentially inspect the source code for that specific version directly in the UI.
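The same version metadata is exposed programmatically. As a hedged sketch, the helper below builds the URL for the DAG-versions listing; the endpoint path is an assumption based on Airflow 3's public REST API (v2), so verify it against your instance's API docs page before relying on it:

```python
# Sketch: building the URL for the (assumed) DAG-versions endpoint of the
# Airflow 3 REST API. Verify the path against your instance's API docs.
from urllib.parse import quote


def dag_versions_url(base_url: str, dag_id: str) -> str:
    """Return the URL that lists stored versions of one DAG."""
    return f"{base_url.rstrip('/')}/api/v2/dags/{quote(dag_id)}/dagVersions"


url = dag_versions_url("http://localhost:8080", "etl_daily")
print(url)  # -> http://localhost:8080/api/v2/dags/etl_daily/dagVersions
# You would then GET this URL with your usual auth (e.g. a bearer token).
```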

Implementation Detail

This versioning usually requires a storage backend (such as S3, GCS, or a database blob) to store the serialized DAG snapshots if you are running in a distributed setup (Kubernetes/Celery). If you are running a local Docker Compose setup, Airflow handles this locally by tracking file changes and hashes, effectively creating a local version history.
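Tracking versions by content hash can be sketched in a few lines. This is an illustration of the idea, not Airflow's actual implementation; the function name is invented for the example:

```python
# Sketch: identifying a version of a DAG file by hashing its contents.
# Illustrative only -- not Airflow's real versioning code.
import hashlib


def file_version_hash(source: str) -> str:
    """Return a short content hash identifying this version of the code."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()[:12]


v1 = file_version_hash("def extract():\n    return load_csv()\n")
v2 = file_version_hash("def extract():\n    return load_parquet()\n")

# Any change to the file yields a different hash, i.e. a new version
# identity, while identical content always maps to the same version.
assert v1 != v2
assert v1 == file_version_hash("def extract():\n    return load_csv()\n")
```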
