How to create read replicas


While postgres_fdw is primarily a tool for Data Federation (querying external tables), it is a popular way to create a "Poor Man’s Read Replica" or a "Partial Replica" when you don't want to set up full binary streaming replication.

Here is a guide on how it works, how to set it up, and the vital trade-offs.


postgres_fdw as a Read Replica

Using postgres_fdw for replication means you are creating a "Virtual Replica." Instead of the database engine copying bytes at the storage level, your "Replica" database simply reaches out to the "Primary" database via SQL every time a query is run.

Why use this instead of standard Replication?

  • Granular Control: You can replicate just one table or one schema rather than the entire database.

  • Cross-Version/Cross-Platform: You can link a Postgres 12 database to a Postgres 16 database, or a local DB to a Cloud DB.

  • Writable Replicas: Unlike a standard read replica, you can actually create indexes or local tables on the "Replica" server that don't exist on the Primary.


Implementation Steps

To set this up, you run these commands on the Replica (Foreign) server:

Step 1: Install and Create Server

CREATE EXTENSION postgres_fdw;

-- Define the connection to the Primary (Source) database
CREATE SERVER primary_db_link
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'primary-db-address', dbname 'production_db', port '5432');

Step 2: User Mapping

You must tell the Replica which credentials to use when logging into the Primary.

Step 3: Import the Schema

Instead of creating tables one-by-one, you can import an entire schema from the Primary.


Performance & Optimization (Critical)

Because postgres_fdw is a network-based link, it is much slower than a local table. You must use these optimizations to make it usable as a "Read Replica":

  • use_remote_estimate: By default, the Replica doesn't know the stats of the Primary table. Setting this to true allows the Replica to ask the Primary for the best query plan.

    SQL

  • fetch_size: Controls how many rows are fetched per network trip. Increasing this (default is 100) helps with large analytical queries.

  • Predicate Pushdown: postgres_fdw is excellent at pushing WHERE clauses to the Primary. If you filter by id, only that one row travels over the network.


Foreign Tables vs. Binary Streaming Replication

Feature

postgres_fdw (Virtual)

Streaming Replication (Physical)

Setup Complexity

Very Low (SQL only)

Medium (Config files + Restart)

Latency

High (Network per query)

Very Low (Millisecond lag)

Scope

Selected Tables/Schemas

Entire Instance

Data Integrity

Dependent on Network

Guaranteed bit-for-bit copy

Use Case

Data Integration / Light Reporting

High Availability / Disaster Recovery


"Materialized" Read Replicas

A common Data Engineering pattern is to use postgres_fdw to create the link, and then create a Materialized View on top of it.

  1. The FOREIGN TABLE provides the link.

  2. The MATERIALIZED VIEW caches the data locally.

  3. A cron job runs REFRESH MATERIALIZED VIEW every hour.

This gives you local-speed performance with automated updates, which is the "sweet spot" for many reporting tasks.


Last updated