Comparison of Kimball architecture to its alternatives


In "The Data Warehouse Toolkit," Kimball devotes significant time to contrasting his methodology with others. He is particularly critical of the "Independent Data Mart" approach (which he calls a disaster) and respectful but critical of the Inmon approach (which he finds overly complex and slow to deliver value).

Here is the breakdown of these architectures according to Kimball.


1. Independent Data Mart Architecture

AKA: "The Data Silo Disaster"

This is not a "designed" architecture; it is usually what happens by accident. Different departments (Sales, Marketing, HR) get tired of waiting for IT, so they buy their own tools and build their own mini-warehouses.

  • How it works:

    • The Sales team grabs data from the CRM and puts it into a SQL server for their own reports.

    • The Marketing team grabs some of the same data but applies different rules and puts it into their own separate SQL server.

    • There is no shared logic.

  • Kimball’s Critique:

    • Chaos: If the CEO asks "How many customers do we have?", Sales says 10,000 and Marketing says 12,000. No one knows who is right.

    • Redundant Work: Every department re-writes the same cleaning logic (or worse, writes it differently).

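    • Kimball's verdict: he strongly discourages this approach. It looks fast and cheap at first, but the inconsistent, incompatible views of the business it produces undo those savings.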
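
The "10,000 versus 12,000 customers" problem can be reproduced in a few lines. The sketch below is illustrative only: the CRM rows and both departmental rules (`sales_customer_count`, `marketing_customer_count`) are hypothetical, made up to show how two silos reading the same source can disagree.

```python
# Hypothetical CRM extract that both departments copy into their own marts.
crm_rows = [
    {"customer_id": 1, "email": "a@example.com", "status": "active"},
    {"customer_id": 2, "email": "b@example.com", "status": "active"},
    {"customer_id": 3, "email": "b@example.com", "status": "active"},  # same email as id 2
    {"customer_id": 4, "email": "c@example.com", "status": "prospect"},
    {"customer_id": 5, "email": "d@example.com", "status": "churned"},
]


def sales_customer_count(rows):
    # Assumed Sales rule: only active accounts count, deduplicated by email.
    return len({r["email"] for r in rows if r["status"] == "active"})


def marketing_customer_count(rows):
    # Assumed Marketing rule: everyone who has not churned counts, one per customer_id.
    return len({r["customer_id"] for r in rows if r["status"] != "churned"})


sales_total = sales_customer_count(crm_rows)
marketing_total = marketing_customer_count(crm_rows)
print(sales_total, marketing_total)  # two different answers from the same source data
```

Neither function is wrong on its own terms. The problem is that nothing forces the two marts to agree on what "customer" means.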


2. Hub-and-Spoke (The Inmon Corporate Information Factory)

AKA: "The Top-Down Approach"

This is the brainchild of Bill Inmon, the "Father of Data Warehousing." His philosophy is that you must build a perfect, normalized database for the entire enterprise before you let anyone run reports.

  • How it works:

    1. ETL: Extract data from all source systems.

    2. The Hub (EDW): Load data into a massive third normal form (3NF) database. This is an "atomic" database (lowest level of detail), but it is not optimized for reporting. It is optimized for data integrity.

    3. The Spokes (Data Marts): You then move data out of the Hub into specific Data Marts (Star Schemas) for each department to actually use.

  • Kimball’s Critique:

    • Too Slow: It often takes years to build the "Hub" before business users get a single report. Projects often get cancelled before they finish.

    • Redundant Storage: You store the data twice (once in 3NF, once in the Data Marts).

    • Complexity: Querying the 3NF Hub is too difficult for analysts; they are forced to wait for the "Spokes" to be built.
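
The hub-to-spoke step can be sketched with Python's built-in `sqlite3` module. Everything here is an assumption for illustration: the table names (`region`, `city`, `customer`, `sale`, `dim_customer`, `fact_sales`) and the sample rows are invented, and the natural `customer_id` is reused as the dimension key to keep the example short, where a real dimensional model would normally assign a surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hub: normalized (3NF) tables; each attribute is stored in one place.
    CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE city     (city_id INTEGER PRIMARY KEY, city_name TEXT,
                           region_id INTEGER REFERENCES region(region_id));
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                           city_id INTEGER REFERENCES city(city_id));
    CREATE TABLE sale     (sale_id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customer(customer_id),
                           amount REAL);

    INSERT INTO region   VALUES (1, 'West'), (2, 'East');
    INSERT INTO city     VALUES (10, 'Seattle', 1), (20, 'Boston', 2);
    INSERT INTO customer VALUES (100, 'Acme', 10), (200, 'Globex', 20);
    INSERT INTO sale     VALUES (1, 100, 50.0), (2, 100, 25.0), (3, 200, 10.0);

    -- Spoke: a star schema derived from the hub. The customer/city/region
    -- chain is flattened into one wide dimension table.
    CREATE TABLE dim_customer AS
        SELECT c.customer_id AS customer_key, c.customer_name,
               ci.city_name, r.region_name
        FROM customer c
        JOIN city ci  ON ci.city_id = c.city_id
        JOIN region r ON r.region_id = ci.region_id;

    CREATE TABLE fact_sales AS
        SELECT sale_id, customer_id AS customer_key, amount FROM sale;
""")

# Analysts query the spoke with a single join instead of walking the 3NF chain.
sales_by_region = conn.execute("""
    SELECT d.region_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_key = f.customer_key
    GROUP BY d.region_name
    ORDER BY d.region_name
""").fetchall()
print(sales_by_region)
```

The same data now lives in both the hub tables and the mart tables, which is the redundant storage the critique refers to.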


3. Hybrid Hub-and-Spoke and Kimball Architecture

AKA: "The Compromise"

This is often seen in large banks or insurance companies with massive budgets. It attempts to please both the "Data Purists" (Inmon) and the "Business Users" (Kimball).

  • How it works:

    1. You build the Inmon 3NF Hub first to ensure total data consistency and compliance.

    2. However, instead of treating the Data Marts as an afterthought, you strictly apply Kimball's Bus Matrix to the "Spoke" layer.

    3. The 3NF Hub acts as the "Source System" for the Kimball Data Warehouse.

  • Kimball’s View:

    • He admits this works and is technically sound.

    • However, he argues it is overkill for most companies. He believes you can achieve the same data consistency using Conformed Dimensions without the need for the massive, expensive 3NF Hub in the middle.

    • Cost: It requires double the storage and double the ETL maintenance.
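
To make the Conformed Dimensions argument concrete, here is a minimal sketch of a "drill-across" query: two fact tables from different business processes are summarized separately and then lined up on attributes of one shared dimension. The data and all names (`dim_customer`, `fact_sales`, `fact_support_tickets`, `total_by_segment`) are hypothetical.

```python
# Conformed dimension (hypothetical): a single shared definition of "customer".
dim_customer = {
    1: {"customer_name": "Acme", "segment": "Enterprise"},
    2: {"customer_name": "Globex", "segment": "SMB"},
    3: {"customer_name": "Initech", "segment": "SMB"},
}

# Two fact tables from different business processes, both keyed by customer_key.
fact_sales = [
    {"customer_key": 1, "amount": 500.0},
    {"customer_key": 2, "amount": 120.0},
    {"customer_key": 3, "amount": 80.0},
]
fact_support_tickets = [
    {"customer_key": 1, "tickets": 2},
    {"customer_key": 3, "tickets": 5},
]


def total_by_segment(fact_rows, measure):
    # Summarize one fact table by an attribute of the shared dimension.
    totals = {}
    for row in fact_rows:
        segment = dim_customer[row["customer_key"]]["segment"]
        totals[segment] = totals.get(segment, 0) + row[measure]
    return totals


sales = total_by_segment(fact_sales, "amount")
tickets = total_by_segment(fact_support_tickets, "tickets")

# Drill across: the results line up because both use the same segment labels.
drill_across = {
    seg: {"sales": sales.get(seg, 0), "tickets": tickets.get(seg, 0)}
    for seg in sorted(set(sales) | set(tickets))
}
print(drill_across)
```

Because both fact tables resolve `customer_key` against the same dimension, the segment totals are comparable without a 3NF hub sitting between the sources and the marts.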


Summary Comparison Table

| Architecture | Speed to Deliver | Data Consistency | Maintenance Cost | Kimball's Verdict |
|---|---|---|---|---|
| Independent Marts | Fast (initially) | None (High Risk) | High (Redundant work) | 🛑 Fail |
| Inmon (Hub-and-Spoke) | Slow (Years) | High | High (Complex Joins) | ⚠️ Too Complex |
| Hybrid | Slowest | Very High | Very High (Double ETL) | 🆗 Acceptable but Expensive |
| Kimball (Bus Architecture) | Fast (Iterative) | High (via Conformed Dims) | Low | ✅ Recommended |
