VACUUM
In PostgreSQL, VACUUM is the essential process for managing storage and maintaining performance. Because of how Postgres handles data updates (using a system called MVCC), "deleted" data isn't actually removed from the disk immediately.
Why VACUUM is Necessary: MVCC & Bloat
Postgres uses Multi-Version Concurrency Control (MVCC). When you DELETE a row, the database simply marks it as "invisible" to future transactions. When you UPDATE a row, Postgres marks the old version as invisible and inserts a brand-new version (a new tuple).
Dead Tuples: These invisible, old versions of rows are called "dead tuples."
Bloat: If dead tuples aren't cleaned up, your table files grow larger and larger, even if the actual amount of "live" data stays the same. This makes queries slower because the engine has to scan through wasted space.
Types of VACUUM
There are three main "levels" of vacuuming, ranging from standard maintenance to a total rebuild.
Standard VACUUM
VACUUMThis is the most common form. It scans the table and marks the space occupied by dead tuples as available for new data.
Locking: It does not lock the table. You can continue to read and write data while it runs.
Disk Space: It usually does not return space to the Operating System. It just keeps the space inside the database file for future use.
VACUUM ANALYZE
VACUUM ANALYZEThis performs a standard vacuum and then immediately runs ANALYZE.
Purpose: It updates the statistics about the table (how many rows, common values, etc.).
Importance: The Query Planner uses these statistics to decide whether to use an Index Scan or a Table Scan. Without regular
ANALYZE, the planner makes bad, slow decisions.
VACUUM FULL
VACUUM FULLThis is the "nuclear option." It creates a brand-new, compacted version of the table on disk and deletes the old, bloated version.
Locking: It requires an Access Exclusive Lock. The table is completely unusable (no reads or writes) while this runs.
Disk Space: It does return all reclaimed space to the Operating System.
Warning: Only use this during maintenance windows when the table is extremely bloated (e.g., 50% or more dead space).
The Visibility Map
To make standard vacuuming faster, Postgres uses a Visibility Map. This is a small file that tracks which "pages" of the table only contain live tuples. VACUUM will skip those pages entirely, significantly reducing the amount of work the database has to do.
Practical Commands
Command
Action
Impact on Performance
VACUUM;
Reclaims space for internal reuse.
Minimal (can run during the day).
VACUUM ANALYZE my_table;
Reclaims space + updates optimizer stats.
Recommended after large data loads.
VACUUM FULL my_table;
Rebuilds the entire table file.
High (Table is locked until finished).
VACUUM (VERBOSE, ANALYZE);
Shows detailed logs of what was cleaned.
Useful for debugging bloat issues.
Summary: The Lifecycle of a Row
Insert: A live tuple is created.
Update/Delete: The tuple is marked "dead" but stays on the disk.
Vacuum: The engine sees the dead tuple and marks that spot on the disk as "Available."
New Insert: The engine places a new live tuple in that "Available" spot instead of growing the file.
Last updated