# How to create read replicas

***

While `postgres_fdw` is primarily a tool for **Data Federation** (querying external tables), it is a popular way to create a "Poor Man’s Read Replica" or a "Partial Replica" when you don't want to set up full binary streaming replication.

Here is a guide to how it works, how to set it up, and the key trade-offs involved.

***

### `postgres_fdw` as a Read Replica

Using `postgres_fdw` for replication means you are creating a "Virtual Replica." Instead of the database engine copying bytes at the storage level, your "Replica" database simply reaches out to the "Primary" database via SQL every time a query is run.

#### Why use this instead of standard Replication?

* **Granular Control**: You can replicate just one table or one schema rather than the entire database.
* **Cross-Version/Cross-Platform**: You can link a Postgres 12 database to a Postgres 16 database, or a local DB to a Cloud DB.
* **Writable Replicas**: Unlike a standard read replica, the "Replica" server stays fully writable: you can create local tables, or build indexed materialized views over the foreign tables, none of which need to exist on the Primary.
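
For example, once a foreign table has been imported (setup below), purely local objects can live alongside it. A sketch — the `orders` table and its columns are illustrative, not from the original:

```sql
-- Local-only table; does not exist on the Primary
CREATE TABLE report_annotations (
    order_id bigint PRIMARY KEY,
    note     text
);

-- Join local data against the foreign (remote) table
SELECT o.id, o.total, a.note
FROM orders o                          -- foreign table, backed by the Primary
JOIN report_annotations a ON a.order_id = o.id;
```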

***

### Implementation Steps

To set this up, you run these commands on the **Replica (Foreign)** server:

#### Step 1: Install and Create Server

```sql
CREATE EXTENSION postgres_fdw;

-- Define the connection to the Primary (Source) database
CREATE SERVER primary_db_link
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'primary-db-address', dbname 'production_db', port '5432');
```

#### Step 2: User Mapping

You must tell the Replica which credentials to use when logging into the Primary.

```sql
CREATE USER MAPPING FOR current_user
SERVER primary_db_link
OPTIONS (user 'replica_user', password 'secure_password');
```
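
Note that `current_user` only covers whoever ran the command; any other role needs its own mapping plus `USAGE` on the server. A sketch for a hypothetical `reporting` role:

```sql
-- Allow the reporting role to use the server at all
GRANT USAGE ON FOREIGN SERVER primary_db_link TO reporting;

-- Give it its own credentials on the Primary
CREATE USER MAPPING FOR reporting
SERVER primary_db_link
OPTIONS (user 'replica_user', password 'secure_password');
```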

#### Step 3: Import the Schema

Instead of creating tables one-by-one, you can import an entire schema from the Primary.

```sql
IMPORT FOREIGN SCHEMA public 
FROM SERVER primary_db_link 
INTO public;
```
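
If you only want a partial replica, `IMPORT FOREIGN SCHEMA` also supports `LIMIT TO` (and `EXCEPT`) clauses. A sketch with illustrative table names:

```sql
-- Import only the tables you need, into a dedicated local schema
CREATE SCHEMA IF NOT EXISTS replica;

IMPORT FOREIGN SCHEMA public
LIMIT TO (orders, customers)
FROM SERVER primary_db_link
INTO replica;
```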

***

### Performance & Optimization (Critical)

Because `postgres_fdw` is a network-based link, it is much **slower** than a local table. You must use these optimizations to make it usable as a "Read Replica":

* `use_remote_estimate`: By default, the Replica knows nothing about the statistics of the Primary's tables. Setting this to `true` makes the Replica issue `EXPLAIN` against the Primary during planning, so it gets accurate row counts and costs.

  ```sql
  ALTER SERVER primary_db_link OPTIONS (use_remote_estimate 'true');
  ```
* `fetch_size`: Controls how many rows are fetched per network trip. Increasing this (default is 100) helps with large analytical queries.
* **Predicate Pushdown:** `postgres_fdw` is excellent at pushing `WHERE` clauses to the Primary. If you filter by `id`, only that one row travels over the network.
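
A sketch of the `fetch_size` tuning, plus a quick way to check pushdown with `EXPLAIN (VERBOSE)` — the `orders` table is illustrative:

```sql
-- Fetch larger batches per network round-trip (default is 100)
ALTER SERVER primary_db_link OPTIONS (ADD fetch_size '10000');

-- Verify the WHERE clause is pushed to the Primary:
-- the plan's "Remote SQL" line should contain the id filter.
EXPLAIN (VERBOSE) SELECT * FROM orders WHERE id = 42;
```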

***

### Foreign Tables vs. Binary Streaming Replication

| **Feature**          | **postgres\_fdw (Virtual)**        | **Streaming Replication (Physical)**  |
| -------------------- | ---------------------------------- | ------------------------------------- |
| **Setup Complexity** | Very Low (SQL only)                | Medium (Config files + Restart)       |
| **Latency**          | High (Network per query)           | Very Low (Millisecond lag)            |
| **Scope**            | Selected Tables/Schemas            | Entire Instance                       |
| **Data Integrity**   | Dependent on Network               | Guaranteed bit-for-bit copy           |
| **Use Case**         | Data Integration / Light Reporting | High Availability / Disaster Recovery |

***

### "Materialized" Read Replicas

A common Data Engineering pattern is to use `postgres_fdw` to create the link, and then create a **Materialized View** on top of it.

1. The `FOREIGN TABLE` provides the link.
2. The `MATERIALIZED VIEW` caches the data locally.
3. A `cron` job runs `REFRESH MATERIALIZED VIEW` every hour.

This gives you local-speed performance with automated updates, which is the "sweet spot" for many reporting tasks.
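
A sketch of the pattern, assuming a foreign table `orders` has been imported as above (names illustrative):

```sql
-- 1. Cache the remote data locally
CREATE MATERIALIZED VIEW orders_cache AS
SELECT * FROM orders;          -- 'orders' is the foreign table

-- 2. A unique index allows CONCURRENTLY refreshes (reads aren't blocked)
CREATE UNIQUE INDEX ON orders_cache (id);

-- 3. Run hourly from cron (or pg_cron):
REFRESH MATERIALIZED VIEW CONCURRENTLY orders_cache;
```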

***
