Make and Makefiles

GNU manual: https://www.gnu.org/software/make/manual/make.html

Makefile tutorial: https://makefiletutorial.com/


General info

Make is a build automation tool that automatically builds executable programs and libraries from source code by reading files called Makefiles. It determines which pieces of a program need to be recompiled and issues commands to recompile them.

A Makefile contains:

  • Targets: What you want to build (files or actions)

  • Dependencies: What's needed before building the target

  • Commands: How to build the target

Basic syntax:

target: dependencies
	command

How to Use

  1. Save as Makefile in your project root

  2. Run commands like:

   make install      # Install dependencies
   make test         # Run tests
   make lint         # Check code quality
   make format       # Auto-format code
   make clean        # Clean up

Whatever target name you define becomes the command you run:
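A minimal sketch of how target names map to commands. The recipes are hypothetical placeholders (including the assumed requirements.txt); only the target names come from the text.

```makefile
# Each target name below becomes a `make <name>` command.
.PHONY: install hello-world my-custom-task

# Assumes a requirements.txt exists in the project root.
install:
	pip install -r requirements.txt

hello-world:
	@echo "Hello, world!"

my-custom-task:
	@echo "Running my custom task"
```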

Then you run: make install, make hello-world, make my-custom-task

It is possible to run commands like this: make install hello-world my-custom-task

They're going to be run in the order you specify.


Important note: Make spawns a new shell for each line in a recipe by default.

Here's what happens:
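A minimal sketch of the situation, assuming a hypothetical target named show-dir:

```makefile
.PHONY: show-dir
show-dir:
	cd /tmp
	pwd
```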

This will not print /tmp — it will print your current directory. Each command runs in its own shell:

  • Shell 1: cd /tmp (then exits)

  • Shell 2: pwd (starts fresh in the original directory)

Another example
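The recipe line under discussion might look like the sketch below. SRC_DIR's value, the install target name, and requirements.txt are assumptions; only the activate path appears in the text.

```makefile
# Assumed location of the project that contains .venv
SRC_DIR := .

.PHONY: install
install:
	. $(SRC_DIR)/.venv/bin/activate; pip install -r requirements.txt
```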

This line does two things:

  1. Activates the virtual environment with . $(SRC_DIR)/.venv/bin/activate

  2. Runs pip install in that activated environment

Why the semicolon is necessary:

By default, Make runs each line in a separate shell. If you wrote it like this instead:
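For illustration, a two-line form of such a recipe could look like this (SRC_DIR and requirements.txt are assumed names):

```makefile
SRC_DIR := .

.PHONY: install
install:
	. $(SRC_DIR)/.venv/bin/activate
	pip install -r requirements.txt
```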

The activation would happen in one shell, then that shell would exit, and pip would run in a completely new shell where the virtual environment isn't activated. You'd end up installing packages to your system Python instead of the venv.

The semicolon keeps both commands in the same shell, so the pip install runs in the context of the activated virtual environment.

How to keep commands in the same shell

  1. Use semicolons (as in the previous example)

  2. Use && (stops if the first command fails)

  3. Use backslash continuation (makes it one logical line)

  4. Use the .ONESHELL: special target

With .ONESHELL:, all lines of a recipe run in a single shell invocation (one shell per recipe, not one shell shared by all recipes). It is less commonly used because it changes behavior for every recipe in the Makefile.
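The four approaches can be sketched side by side. The target names are hypothetical, and /tmp is used only because it exists on most Unix systems.

```makefile
# Requires GNU Make 3.82 or newer; applies to every recipe in this file.
# Remove this line and `multi-line` prints the original directory again.
.ONESHELL:

.PHONY: with-semicolon with-and with-backslash multi-line

# 1. Semicolon: pwd runs even if cd fails.
with-semicolon:
	cd /tmp; pwd

# 2. &&: pwd runs only if cd succeeds.
with-and:
	cd /tmp && pwd

# 3. Backslash: two physical lines, one logical line.
with-backslash:
	cd /tmp && \
	pwd

# 4. .ONESHELL: separate lines still share one shell.
multi-line:
	cd /tmp
	pwd
```

The first three targets work with or without .ONESHELL:, so they are the safer choice when a Makefile has to run on older Make versions.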


About the @ Symbol

Recipe lines do not have to begin with @. The prefix is optional and controls whether Make echoes the command before running it:

  • Without @ (default behavior)

    • Make prints both the command as it's being run and its output.

  • With @ (suppresses command echo):

    • Only the output is shown, not the commands themselves.
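The difference is easy to see with two hypothetical targets that run the same command:

```makefile
.PHONY: loud quiet

# Without @: Make prints the command line, then the command's output.
loud:
	echo "building"

# With @: only the command's output appears.
quiet:
	@echo "building"
```

Running make loud shows the echo command followed by building, while make quiet shows only building.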

Use @ for:

  • Clean, user-friendly output

  • Help messages

  • Status messages

Don't use @ for:

  • Debugging (you want to see what's running)

  • Complex commands (transparency is helpful)

  • CI/CD logs (full visibility preferred)


Variables

$() vs $$

The difference in notation.

$(VAR) = Make variable (defined in Makefile)

$$VAR = Shell variable (defined in shell commands)

Key difference:

  • $(...) → evaluated by Make

  • $$... → evaluated by the shell ($$ is Make's escape for a literal $, so the shell receives $...)

Example showing both:
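A minimal sketch, assuming a hypothetical scripts/ directory and run-all target:

```makefile
PYTHON := python3

.PHONY: run-all
run-all:
	for file in scripts/*.py; do \
		$(PYTHON) $$file; \
	done
```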

  • $(PYTHON) → Make replaces with python3

  • $$file → Shell variable from the loop


Make variables

They're not quite the same as environment variables, but similar. Examples:
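A short sketch with hypothetical names and values:

```makefile
PYTHON := python3
SRC_DIR := src
# A variable can be built from another variable.
VENV := $(SRC_DIR)/.venv

.PHONY: show
show:
	@echo "Interpreter: $(PYTHON)"
	@echo "Virtualenv:  $(VENV)"
```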

Note: := vs =

  • := is simple (immediate) assignment: the value is expanded once, when the variable is defined

  • = is recursive assignment: the value is re-expanded every time the variable is used

Most people use := for predictability.
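The difference shows up when a referenced variable changes later in the file. A small sketch with made-up variable names:

```makefile
NAME := first

# Expanded right here, so it captures "first".
IMMEDIATE := $(NAME)
# Expanded when used, so it sees the final value of NAME.
RECURSIVE = $(NAME)

NAME := second

.PHONY: show
show:
	@echo "IMMEDIATE=$(IMMEDIATE)"
	@echo "RECURSIVE=$(RECURSIVE)"
```

Running make show prints IMMEDIATE=first and RECURSIVE=second.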


Using environment variables

Load from an .env file

The most common method involves using the include directive within your Makefile to load variables from a .env file. This approach makes the variables available within the Makefile's scope.

Variables loaded from .env are typically Makefile variables. To make them available as environment variables within the shell commands executed by your recipes, you need to export them.
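A minimal sketch of this approach, assuming the .env file contains plain KEY=value lines such as a DATABASE_URL entry:

```makefile
# Load KEY=value pairs as Make variables; the leading "-" skips a missing file.
-include .env

# A bare `export` passes all Make variables to recipe shells.
export

.PHONY: show-env
show-env:
	@echo "DATABASE_URL is $$DATABASE_URL"
```

Make parses .env as Makefile syntax, so quotes around a value are kept literally and become part of the value.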

Export Specific Variables
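To limit what reaches the recipe environment, name the variables explicitly. DATABASE_URL and API_KEY are used here as example names:

```makefile
-include .env

# Only these two are exported to recipe commands.
export DATABASE_URL
export API_KEY

.PHONY: check-env
check-env:
	@echo "DATABASE_URL=$${DATABASE_URL:-<unset>}"
```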

Use Python-dotenv (load environment variables from a script)

In your Python code:


Automatic Variables: $< $@ $^

These are Make variables. They're shortcuts for dependencies and targets:

  • $@ = target name (the thing being built)

  • $< = first dependency

  • $^ = all dependencies

  • $* = stem in pattern rules

Examples:
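A small sketch with hypothetical file names:

```makefile
# Here $@ = report.txt, $< = header.txt, $^ = header.txt body.txt
report.txt: header.txt body.txt
	cat $^ > $@

# In a pattern rule, $* is the part matched by %.
%.upper: %.txt
	tr a-z A-Z < $< > $@
	@echo "stem is $*"
```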

Why use them?

  • Less typing

  • Works with pattern rules (explained below)

Pattern matching
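A pattern rule consistent with the walkthrough below could look like this sketch. It assumes the data/processed directory already exists and that clean.py takes an input path and an output path:

```makefile
data/processed/%.csv: data/raw/%.csv
	python clean.py $< $@
```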

If you run make data/processed/users.csv

  • % matches users

  • $< becomes data/raw/users.csv (input)

  • $@ becomes data/processed/users.csv (output)

Command becomes: python clean.py data/raw/users.csv data/processed/users.csv

How Python Receives It

Your Python script clean.py receives two command-line arguments:

What happens:

  1. Make expands: python clean.py data/raw/users.csv data/processed/users.csv

  2. Shell executes the command

  3. Python receives: sys.argv = ['clean.py', 'data/raw/users.csv', 'data/processed/users.csv']

  4. Script reads from sys.argv[1], writes to sys.argv[2]

You may also use argparse.


?= - Assign only if not set

Usage:
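A minimal sketch, assuming a variable named ENV that is not already set in your environment:

```makefile
# Used only when ENV is not set on the command line or in the environment.
ENV ?= dev

.PHONY: show-env
show-env:
	@echo "ENV=$(ENV)"
```

Under that assumption, make show-env prints ENV=dev, and make show-env ENV=prod prints ENV=prod.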


Phony targets

.PHONY tells Make that a target doesn't produce a file with that name.

Why it matters:

Imagine you have this Makefile:
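For instance, a one-target Makefile like this sketch (pytest is an assumed test runner):

```makefile
test:
	python -m pytest
```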

If someone creates a file named test in your directory, Make will see it and think "the target test already exists, nothing to do!" and won't run your command.

Solution:
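Declaring the target phony is a one-line change. A sketch, again assuming pytest as the test runner:

```makefile
.PHONY: test
test:
	python -m pytest
```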

Now Make knows test is a phony target (an action, not a file) and will always run it.

Common Phony Targets:
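A typical declaration, using the target names from the usage section above. Only clean is given a recipe here, and its paths are assumptions:

```makefile
.PHONY: install test lint format clean

# Recipes for the other targets are omitted in this sketch.
clean:
	rm -rf build dist .pytest_cache
```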

These are all actions, not files, so they should be marked .PHONY. You can declare them all in a single .PHONY line at the top of the Makefile.


Automatic dependency resolution

Make's killer feature - automatic dependency resolution.

The Concept

When you specify a dependency, Make:

  1. Checks if the dependency exists - if not, builds it first

  2. Compares timestamps - only rebuilds if dependency is newer than target

  3. Chains dependencies - recursively builds the entire tree

Examples:

When a target depends on another target (not a file):
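One way to sketch this is a chain of phony targets. The echo recipes are placeholders for real commands:

```makefile
.PHONY: test build deploy

test:
	@echo "running tests"

# build requires test to finish first.
build: test
	@echo "building"

# deploy requires build, and therefore test as well.
deploy: build
	@echo "deploying"
```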

What happens:

  1. Run make deploy

  2. Make first runs test target

  3. Then runs build target

  4. Finally runs deploy target

Key points:

  • Enforces order - ensures test and build complete before deploy

  • Always runs - because they're .PHONY (no timestamp checking)

  • Chains actions - like "do this, then this, then that"

When a target depends on files:
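A file-based rule might look like this sketch, assuming a clean.py script that takes an input path and an output path:

```makefile
# Rebuilt only when the raw file or the script is newer than the output.
data/processed/users.csv: data/raw/users.csv clean.py
	python clean.py $< $@
```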

Which is great for file transformations (for example in ETL pipelines)

❌ Don't use dependencies for:

  1. Always-run tasks (use .PHONY instead). Conversely, don't make expensive file transformations phony, or they will rerun even when their input files are unchanged.

  2. Dynamic dependencies (determined at runtime)


The order of execution

Multiple Dependencies
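A sketch of a rule with several prerequisites. It assumes merge.py takes the input files as arguments and writes the merged result to standard output:

```makefile
merged.csv: users.csv orders.csv products.csv merge.py
	python merge.py users.csv orders.csv products.csv > $@
```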

Make rebuilds merged.csv if it is missing or if ANY of these prerequisites is newer than it:

  • users.csv

  • orders.csv

  • products.csv

  • merge.py

Parallel Dependencies
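Three independent outputs are a natural fit for parallel builds. A sketch, assuming a clean.py script and an existing data/processed directory:

```makefile
.PHONY: all
all: data/processed/users.csv data/processed/orders.csv data/processed/products.csv

# Each output depends only on its own raw file, so the three can run at once.
data/processed/%.csv: data/raw/%.csv
	python clean.py $< $@
```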

Run with: make -j3 all (processes all 3 in parallel!)


Important flags in Make

Parallel Execution

Use multiple CPU cores to run specified targets.

When Parallel Helps

Independent tasks (no dependencies between them)

Dry Run / Debug

Ignore Errors

Directory & File Options

Variable Override

What-If
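The flags behind these headings can be tried against a small Makefile. The comments list standard GNU Make options; the file names, the subdir directory, and the GREETING variable are made up for the sketch.

```makefile
# Invocations to try from a shell:
#   make -j4 all            Parallel Execution: up to 4 jobs at once
#   make -n all             Dry Run: print commands without running them
#   make -d all             Debug: verbose trace of Make's decisions
#   make -k all             Ignore Errors: keep going with unaffected targets
#   make -i all             Ignore Errors: ignore failing recipe commands
#   make -C subdir all      Directory: change into subdir before reading its Makefile
#   make -f other.mk all    File: read other.mk instead of Makefile
#   make all GREETING=hi    Variable Override: the command line beats the file
#   make -n -W a.txt all    What-If: pretend a.txt was just modified

GREETING := hello

.PHONY: all
all: a.out b.out

# Prepend the greeting to each input file.
%.out: %.txt
	echo "$(GREETING)" | cat - $< > $@
```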


Using conditional directives

Summary Table

Directive      Purpose                   Example
ifeq (a,b)     If equal                  ifeq ($(ENV),prod)
ifneq (a,b)    If not equal              ifneq ($(DEBUG),)
ifdef VAR      If defined (non-empty)    ifdef API_KEY
ifndef VAR     If not defined (empty)    ifndef DATABASE_URL
else           Else clause               else
else ifeq      Else-if                   else ifeq ($(OS),Linux)
endif          End conditional           endif

Data Engineering example:
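A minimal sketch of environment-dependent settings. The warehouse names, the --strict flag, and the load target are hypothetical:

```makefile
ENV ?= dev

# Pick settings at parse time based on ENV.
ifeq ($(ENV),prod)
WAREHOUSE := analytics_prod
LOAD_FLAGS := --strict
else
WAREHOUSE := analytics_dev
LOAD_FLAGS :=
endif

.PHONY: load
load:
	@echo "Loading into $(WAREHOUSE) with flags: $(LOAD_FLAGS)"
```

Running make load ENV=prod selects the first branch; any other value falls through to the else branch.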


Loops

Run shell loops inside recipes (the commands under targets):
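A short sketch, assuming a hypothetical data/raw directory of CSV files:

```makefile
.PHONY: list-raw
list-raw:
	for f in data/raw/*.csv; do \
		echo "Found $$f"; \
	done
```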

Key points:

  • Must escape $ as $$ (so shell sees it, not Make)

  • Use \ to continue lines

  • Runs when the target executes


Targets that start with .

For Python projects, .PHONY and .DEFAULT_GOAL cover most use cases.

.DEFAULT_GOAL - Set default target

With .DEFAULT_GOAL := help, typing make without a target runs help instead of the first target in the file.
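A minimal sketch of this setting, with hypothetical install and help targets:

```makefile
.DEFAULT_GOAL := help

.PHONY: install help

# First target in the file, but no longer the default.
install:
	pip install -r requirements.txt

help:
	@echo "Targets: install, help"
```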

.IGNORE - Continue on errors

Normally Make stops on first error. .IGNORE makes it continue.
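Listing targets after .IGNORE limits the effect to those targets. A sketch with a hypothetical clean target and log file:

```makefile
# Errors in the recipe for `clean` are ignored; other targets still stop on error.
.IGNORE: clean

.PHONY: clean
clean:
	rm build.log
	@echo "clean finished"
```

A leading - on a single recipe line has the same effect for just that line.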

.EXPORT_ALL_VARIABLES - Export all Make variables

.NOTPARALLEL - Disable parallel execution
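Both special targets are written as bare declarations. A sketch with a made-up STAGE variable:

```makefile
# Export every Make variable to the environment of recipe commands.
.EXPORT_ALL_VARIABLES:

# Run this Makefile's targets serially, even under `make -j`.
.NOTPARALLEL:

STAGE := dev

.PHONY: show
show:
	@echo "STAGE seen by the shell: $$STAGE"
```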


Using Make with Docker

Note: you may specify image name and tag as a variable.

Basic Docker Commands in Make
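A minimal sketch, assuming a Dockerfile in the current directory. The image name and tag are placeholders:

```makefile
IMAGE := my-app
TAG := latest

.PHONY: docker-build docker-run

docker-build:
	docker build -t $(IMAGE):$(TAG) .

docker-run:
	docker run --rm $(IMAGE):$(TAG)
```

Because the tag is a variable, make docker-build TAG=1.2.0 builds a differently tagged image without editing the file.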

Docker Compose Integration
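A sketch of thin wrappers around Compose, assuming the Compose v2 plugin (docker compose) and a compose file in the current directory:

```makefile
.PHONY: up down logs

up:
	docker compose up -d

down:
	docker compose down

logs:
	docker compose logs -f
```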

Real-World Data Engineering Example

Docker Network Management
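A sketch with a hypothetical network name. The inspect check keeps network-create from failing when the network already exists:

```makefile
NETWORK := etl-net

.PHONY: network-create network-rm

network-create:
	docker network inspect $(NETWORK) >/dev/null 2>&1 || docker network create $(NETWORK)

network-rm:
	docker network rm $(NETWORK)
```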


Using Make with Python

While Make is traditionally used for compiled languages like C/C++, it's incredibly useful for Python projects to automate common tasks:
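A sketch of such a Makefile, using the target names from the usage section. The tools (pytest, flake8, black) and the requirements.txt file are assumptions; substitute whatever the project uses.

```makefile
PYTHON := python3

.PHONY: install test lint format clean

install:
	$(PYTHON) -m pip install -r requirements.txt

test:
	$(PYTHON) -m pytest

lint:
	$(PYTHON) -m flake8 .

format:
	$(PYTHON) -m black .

clean:
	rm -rf .pytest_cache build dist
```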

This article explains what you need to know: https://earthly.dev/blog/python-makefile/


Using Make with Golang

Golang has its own build automation tools such as:

https://www.youtube.com/watch?v=XlobWOgcK7Y&pp=ygU1TWFrZWZpbGVzIGFuZCBHbzogU2ltcGxpZnkgYW5kIGF1dG9tYXRlIHlvdXIgd29ya2Zsb3c%3D

