# Inmon

***

Relevant books:

* Building the Data Warehouse, 4th edition (Wiley) by W. H. Inmon
* Corporate Information Factory (Wiley) by W. H. Inmon, Claudia Imhoff, and Ryan Sousa
* The Unified Star Schema (Technics Publications) by Bill Inmon and Francesco Puppini

***

## **Inmon Data Modeling (Corporate Information Factory)**

*Bill Inmon’s “Top-Down” Data Warehouse Architecture*

Bill Inmon—often called the **father of the data warehouse**—defined one of the earliest and most influential approaches to enterprise analytics architecture. His method is typically summarized as **top-down**, **enterprise-wide**, and **normalized**.

In the late 1980s, many organizations ran analytical queries directly against production OLTP systems, which slowed those systems down. Inmon’s data warehouse addressed this by separating analytical workloads from operational systems.

***

### **Inmon’s Definition of a Data Warehouse**

A data warehouse is:

> **Subject-oriented, integrated, nonvolatile, and time-variant data that supports management decision-making.**

Let’s break down each term and its modeling implications.

***

#### **1. Subject-Oriented**

* The warehouse is built around **business subjects** (e.g., Sales, Finance, Marketing), not around applications such as SAP, Shopify, or HubSpot.
* Logical models focus on one subject domain at a time.
* Each domain contains business keys, attributes, and relationships across the enterprise.

**Implication:**\
Your data model must capture **all enterprise facts about a single business area**, not just data from one application.

***

#### **2. Integrated**

This is the most important Inmon principle.

* Data is ingested from *multiple disparate source systems*.
* During ETL, it is:
  * standardized
  * conformed
  * cleaned
  * deduplicated
  * sequenced and normalized

When it lands in the warehouse, it has a **single unified corporate representation**.

**Integration creates the “Single Source of Truth.”**

**Implication:**\
All data must be aligned (domains, data types, business keys). The warehouse becomes the closest thing to a unified enterprise data model.
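
The integration steps above (standardize, conform, deduplicate on a business key) can be sketched in a few lines of Python. This is a minimal illustration, not a production ETL job: the two source systems, their field names, the `COUNTRY_CODES` lookup, and the "first source wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical raw records from two source systems; field names are illustrative.
CRM_ROWS = [
    {"email": " Ana@Example.com ", "name": "Ana Ruiz", "country": "ES"},
    {"email": "bo@example.com", "name": "Bo Chen", "country": "US"},
]
SHOP_ROWS = [
    {"customer_email": "ana@example.com", "full_name": "ANA RUIZ", "country_name": "Spain"},
    {"customer_email": "cy@example.com", "full_name": "Cy Obi", "country_name": "Nigeria"},
]

# Assumed lookup used to conform country names to two-letter codes.
COUNTRY_CODES = {"Spain": "ES", "Nigeria": "NG", "United States": "US"}


@dataclass(frozen=True)
class Customer:
    email: str         # business key: trimmed and lower-cased
    name: str          # standardized to title case
    country_code: str  # conformed two-letter country code


def from_crm(row):
    """Map a CRM record to the unified corporate representation."""
    return Customer(row["email"].strip().lower(), row["name"].strip().title(), row["country"].upper())


def from_shop(row):
    """Map a shop record to the same representation, conforming the country."""
    return Customer(
        row["customer_email"].strip().lower(),
        row["full_name"].strip().title(),
        COUNTRY_CODES[row["country_name"]],
    )


def integrate(crm_rows, shop_rows):
    """Return one record per business key; the first source seen wins."""
    unified = {}
    for customer in [from_crm(r) for r in crm_rows] + [from_shop(r) for r in shop_rows]:
        unified.setdefault(customer.email, customer)  # deduplicate on the business key
    return sorted(unified.values(), key=lambda c: c.email)


customers = integrate(CRM_ROWS, SHOP_ROWS)
for customer in customers:
    print(customer)
```

Ana appears in both sources with different casing and country formats, yet ends up as a single record. Real survivorship rules (which source wins for which attribute) are usually more involved than the `setdefault` shortcut used here.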

***

#### **3. Nonvolatile**

* Data, once stored, does *not change* (except for appending new history).
* No transactional updates, deletes, or overwrites.

**Implication:**\
Design models expecting immutable data:

* Surrogate keys
* Audit fields
* Load dates
* Business-effective dates
* Snapshot tables if needed

You build a stable layer where older facts remain intact for long-term analytics.
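
One way to picture a nonvolatile table is an append-only history with a surrogate key, a business key, an effective date, and audit columns. The sketch below uses Python's built-in `sqlite3` module; the `customer_history` table, its columns, and the triggers that reject updates and deletes are illustrative assumptions, not a prescribed Inmon design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer_history (
        customer_sk    INTEGER PRIMARY KEY AUTOINCREMENT,        -- surrogate key
        customer_id    TEXT NOT NULL,                            -- business key
        city           TEXT NOT NULL,
        effective_date TEXT NOT NULL,                            -- business-effective date
        load_ts        TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,  -- audit: load time
        source_system  TEXT NOT NULL                             -- audit: origin
    );

    -- Enforce immutability: any UPDATE or DELETE is aborted.
    CREATE TRIGGER customer_history_no_update
    BEFORE UPDATE ON customer_history
    BEGIN
        SELECT RAISE(ABORT, 'customer_history is append-only');
    END;

    CREATE TRIGGER customer_history_no_delete
    BEFORE DELETE ON customer_history
    BEGIN
        SELECT RAISE(ABORT, 'customer_history is append-only');
    END;
    """
)


def record_version(customer_id, city, effective_date, source_system):
    """Append a new version; existing rows are never updated or deleted."""
    conn.execute(
        "INSERT INTO customer_history (customer_id, city, effective_date, source_system) "
        "VALUES (?, ?, ?, ?)",
        (customer_id, city, effective_date, source_system),
    )


record_version("C-1", "Madrid", "2023-01-01", "crm")
record_version("C-1", "Lisbon", "2024-06-01", "crm")  # a move is a new row, not an UPDATE

history = conn.execute(
    "SELECT customer_sk, city, effective_date FROM customer_history "
    "WHERE customer_id = ? ORDER BY effective_date",
    ("C-1",),
).fetchall()
print(history)
```

Both versions of the customer remain queryable, so older facts can still be joined to the address that was valid when they occurred.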

***

#### **4. Time-Variant**

* Every record includes time context (load time, effective time).
* Queries can span long historical ranges.

**Implication:**\
Modeling must include:

* slowly changing dimensions (SCDs)
* historical fact tables
* lineage tracking
* temporal data structures

The warehouse becomes a place to ask questions about **the past**, not just the present.
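
A common way to support such questions is to store a validity interval on each row and query "as of" a date. The following sketch assumes a hypothetical `product_price` table with an inclusive `valid_from`, an exclusive `valid_to`, and `9999-12-31` as the marker for the current row; other conventions (for example a nullable end date) work equally well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_price (
        product_id  TEXT NOT NULL,
        price_cents INTEGER NOT NULL,
        valid_from  TEXT NOT NULL,  -- inclusive, ISO-8601 date
        valid_to    TEXT NOT NULL   -- exclusive; '9999-12-31' marks the current row
    )
    """
)
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?, ?)",
    [
        ("P-1", 1000, "2023-01-01", "2023-07-01"),
        ("P-1", 1200, "2023-07-01", "2024-03-01"),
        ("P-1", 1100, "2024-03-01", "9999-12-31"),
    ],
)


def price_as_of(product_id, as_of_date):
    """Return the price valid on as_of_date, or None if no row covers that date."""
    row = conn.execute(
        "SELECT price_cents FROM product_price "
        "WHERE product_id = ? AND valid_from <= ? AND ? < valid_to",
        (product_id, as_of_date, as_of_date),
    ).fetchone()
    return row[0] if row else None


print(price_as_of("P-1", "2023-06-30"))  # price before the first change
print(price_as_of("P-1", "2025-01-01"))  # current price
```

ISO-8601 date strings sort lexicographically in date order, which is why plain string comparison is enough in this sketch.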

***

### **What the Inmon Warehouse Looks Like**

**Core Characteristics**

* Built using **3rd Normal Form (3NF)** entity-relationship modeling.
* Highly normalized → minimal redundancy → maximum consistency.
* Mirrors business truth, often resembling normalized OLTP systems in structure (but not purpose).
* Data arrives via heavy ETL: standardization, validation, integration.
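
To make the 3NF idea concrete, here is a small sketch of a normalized sales model using Python's built-in `sqlite3` module. The table and column names (`customer`, `product`, `sales_order`, `order_line`) are hypothetical; the point is that each fact is stored once and related through keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE product (
        product_id       INTEGER PRIMARY KEY,
        name             TEXT NOT NULL,
        list_price_cents INTEGER NOT NULL
    );
    CREATE TABLE sales_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        ordered_on  TEXT NOT NULL
    );
    CREATE TABLE order_line (
        order_id         INTEGER NOT NULL REFERENCES sales_order(order_id),
        product_id       INTEGER NOT NULL REFERENCES product(product_id),
        quantity         INTEGER NOT NULL,
        unit_price_cents INTEGER NOT NULL,  -- price actually charged on this order
        PRIMARY KEY (order_id, product_id)
    );
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'ana@example.com')")
conn.execute("INSERT INTO product VALUES (10, 'Kettle', 2500)")
conn.execute("INSERT INTO sales_order VALUES (100, 1, '2024-05-01')")
conn.execute("INSERT INTO order_line VALUES (100, 10, 2, 2500)")

# The customer's email lives in exactly one row; queries reach it through joins.
order_total = conn.execute(
    "SELECT c.email, SUM(l.quantity * l.unit_price_cents) "
    "FROM sales_order o "
    "JOIN customer c ON c.customer_id = o.customer_id "
    "JOIN order_line l ON l.order_id = o.order_id "
    "WHERE o.order_id = 100 "
    "GROUP BY c.email"
).fetchone()
print(order_total)
```

The joins needed for even a simple total illustrate the trade-off: consistency is high, but consumption-friendly structures are usually built downstream.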

***

#### **Architecture: Top-Down**

Below is a conceptual diagram showing how data flows in a classic **Inmon (top-down) architecture**:

```
             ┌──────────────────────────────────────┐
             │          Source Systems               │
             │  (ERP, CRM, Orders, Inventory, etc.) │
             └──────────────────────────────────────┘
                             │
                             ▼
                ┌──────────────────────────┐
                │        ETL Layer         │
                │  Cleansing, Standardizing│
                │  Deduplication, Keys     │
                │  Integration, Validation │
                └──────────────────────────┘
                             │
                             ▼
         ┌────────────────────────────────────────────┐
         │   Enterprise Data Warehouse (3NF, EDW)      │
         │  - Integrated enterprise model              │
         │  - Subject-oriented                         │
         │  - Nonvolatile + time-variant               │
         │  - Atomic data in normalized entities       │
         └────────────────────────────────────────────┘
                             │
                             ▼
                ┌──────────────────────────┐
                │      ETL / Data Marts    │
                │   (Summaries, Joins,     │
                │    Dimensionalization)   │
                └──────────────────────────┘
                             │
                             ▼
 ┌────────────────────────────────────────────────────────────────────┐
 │                     Departmental Data Marts (Star Schema)          │
 │  e.g.:                                                             │
 │   - Sales Mart (FactSales + DimCustomer + DimProduct + DimDate)   │
 │   - Inventory Mart                                                 │
 │   - Marketing Mart                                                 │
 └────────────────────────────────────────────────────────────────────┘
                             │
                             ▼
           ┌────────────────────────────────────┐
           │       BI, Dashboards, Reports      │
           └────────────────────────────────────┘
```

This is the classic Inmon flow:\
**Sources → ETL → 3NF EDW → ETL → Star Schemas → BI.**

**Key points:**

* **EDW is the authoritative source** for analytics.
* **Data marts depend on the EDW**, not on operational systems.
* Data marts may be:
  * star schema (common)
  * snowflake schema
  * or any structure optimized for consumption.
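
The second ETL step, from EDW to data mart, can be sketched as plain SQL that reads only EDW tables. The example below is a simplified assumption: the EDW stand-in tables, the `dim_*` and `fact_sales` names, and the reuse of natural keys as dimension keys are illustrative, and a real mart would typically add a date dimension and its own surrogate keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Minimal stand-in for the 3NF EDW (hypothetical tables and data).
    CREATE TABLE customer    (customer_id INTEGER PRIMARY KEY, email TEXT, country_code TEXT);
    CREATE TABLE product     (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE sales_order (order_id INTEGER PRIMARY KEY, customer_id INTEGER, ordered_on TEXT);
    CREATE TABLE order_line  (order_id INTEGER, product_id INTEGER,
                              quantity INTEGER, unit_price_cents INTEGER);

    INSERT INTO customer VALUES (1, 'ana@example.com', 'ES'), (2, 'bo@example.com', 'US');
    INSERT INTO product VALUES (10, 'Kettle', 'Kitchen'), (11, 'Lamp', 'Lighting');
    INSERT INTO sales_order VALUES (100, 1, '2024-05-01'), (101, 2, '2024-05-02');
    INSERT INTO order_line VALUES (100, 10, 2, 2500), (100, 11, 1, 4000), (101, 10, 1, 2500);

    -- Downstream ETL: dimensions and a fact table derived only from EDW tables.
    CREATE TABLE dim_customer AS
        SELECT customer_id AS customer_key, email, country_code FROM customer;
    CREATE TABLE dim_product AS
        SELECT product_id AS product_key, name, category FROM product;
    CREATE TABLE fact_sales AS
        SELECT o.ordered_on  AS date_key,
               o.customer_id AS customer_key,
               l.product_id  AS product_key,
               l.quantity,
               l.quantity * l.unit_price_cents AS revenue_cents
        FROM order_line l
        JOIN sales_order o ON o.order_id = l.order_id;
    """
)

# A typical mart query: one join from the fact table to a dimension.
revenue_by_category = conn.execute(
    "SELECT p.category, SUM(f.revenue_cents) "
    "FROM fact_sales f JOIN dim_product p ON p.product_key = f.product_key "
    "GROUP BY p.category ORDER BY p.category"
).fetchall()
print(revenue_by_category)
```

Because the mart is rebuilt from the EDW rather than from operational systems, every mart inherits the same integrated definitions of customer and product.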

***

#### **Example: E-commerce Business**

**Source systems:**

* Orders
* Inventory
* Marketing

These feed into the **Enterprise Data Warehouse**, where they become:

* normalized
* integrated
* historical
* immutable

Once in the EDW, downstream ETL builds subject-specific **data marts**:

* **Sales data mart**
* **Marketing data mart**
* **Purchasing data mart**

Each may use a star schema for performance.

<figure><img src="https://2332658533-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FG5fhKjYnbaQlTPTcaO85%2Fuploads%2F5zRTaH9kEk4Q8MtvKYL3%2Fdata_warehouse_diagram.svg?alt=media&#x26;token=27c8cf86-0c54-4186-96e1-26c1c0af466d" alt=""><figcaption></figcaption></figure>

***

### **Why Inmon?**

**Pros**

* Strong enterprise consistency (integration).
* High data quality.
* Long-term historical accuracy.
* Excellent for large organizations with complex domains.

**Cons**

* Slow to implement: the enterprise model is designed before data marts deliver value, which can take years in large companies.
* Heavy ETL maintenance.
* Requires strict data governance.
* Normalized structures require many joins, so analysts often find them awkward to query directly.

***

#### **How It Compares to Kimball**

* **Inmon:**\
  Top-down → EDW (3NF) → Data marts → Reports\
  *Enterprise first.*
* **Kimball:**\
  Bottom-up → Data marts (star schemas) → Virtual EDW\
  *Analytics first.*

Modern architectures often blend both.

***

#### **Summary of Inmon Modeling**

* Build a **centralized, enterprise-wide** data warehouse.
* Store **granular**, **normalized (3NF)** data.
* Maintain **immutable**, **time-variant** history.
* Use **data marts** for department-specific access.
* Prioritize **integration consistency** above all else.

***
