Stored Procedures

In PostgreSQL, Stored Procedures are objects that contain a collection of SQL statements and procedural logic (like loops and if-else branches) stored directly on the database server.

While they are very similar to Functions, they were introduced in PostgreSQL 11 to solve a specific limitation: the ability to manage transactions.

Procedure vs. Function

The most important distinction for a Data Engineer is how they handle "Units of Work."

Functions (CREATE FUNCTION): Run inside a single transaction. You cannot "Commit" or "Rollback" inside a function. They are usually designed to return a value (like a calculation).
Stored Procedures (CREATE PROCEDURE): Can open and close transactions. This makes them ideal for long-running ETL tasks where you want to commit data in "batches" rather than all at once.

Feature

Function

Stored Procedure

Execution

SELECT my_func();

CALL my_proc();

Transactions

Cannot COMMIT/ROLLBACK.

Can COMMIT/ROLLBACK.

Return Value

Must return a value (or void).

Does not return a value (uses INOUT parameters).

Use Case

Calculations, Data Formatting.

Bulk ETL, Maintenance, Batching.

Basic Syntax

Procedures are typically written in PL/pgSQL.

CREATE OR REPLACE PROCEDURE archive_old_orders(cutoff_date DATE)
LANGUAGE plpgsql
AS $$
BEGIN
    -- Move data to archive
    INSERT INTO orders_archive 
    SELECT * FROM orders WHERE order_date < cutoff_date;

    -- Delete from main table
    DELETE FROM orders WHERE order_date < cutoff_date;

    -- We can commit here to free up locks!
    COMMIT; 
END;
$$;

To run it, you use the CALL keyword:

CALL archive_old_orders('2025-01-01');

Key Procedural Logic (PL/pgSQL)

Inside a procedure, you aren't limited to standard SQL. You have access to:

Variables: DECLARE my_var INT := 10;
Conditionals: IF ... THEN ... ELSE ... END IF;
Loops: FOR ... IN ... LOOP ... END LOOP;
Error Handling: EXCEPTION WHEN OTHERS THEN ...

Why Data Engineers Use Procedures

Reduced Network Overhead: Instead of sending 50 individual SQL commands from a Python script to the database, you send one CALL command. The database executes the logic locally.
Batch Processing: In a massive update, you can commit every 10,000 rows to prevent the Write-Ahead Log (WAL) from growing too large and to avoid holding locks for hours.
Security: You can give a user permission to CALL a procedure without giving them direct SELECT or DELETE access to the underlying tables (encapsulation).

Visualizing Procedure Execution

Summary

CALL is for Procedures; SELECT is for Functions.
Use Procedures when you need Transaction Control (COMMIT/ROLLBACK).
Stored logic keeps your ETL "close to the data," improving performance and maintainability.

PreviousTriggers NextAnti-patterns

Last updated 16 days ago

hashtagProcedure vs. Function

hashtagBasic Syntax

hashtagKey Procedural Logic (PL/pgSQL)

hashtagWhy Data Engineers Use Procedures

hashtagVisualizing Procedure Execution

hashtagSummary