Principles of good data architectures


These principles are described in

  • the book "Fundamentals of Data Engineering" written by Joe Reis and Matt Housley

  • the course "Data Engineering" from DeepLearning.AI, hosted on Coursera and taught by the authors of the book above

According to the authors, they drew inspiration from many sources when formulating these principles, two of which are the AWS Well-Architected Framework and Google Cloud’s Five Principles for Cloud-Native Architecture.


The Nine Principles (per Reis & Housley)

  1. Choose Common Components Wisely

    • Use shared building blocks (e.g., object storage, orchestration, observability) across teams.

    • Avoid forcing a one-size-fits-all solution, though: pick components that suit each domain and use case.

  2. Plan for Failure

    • Architect with the assumption that systems will fail.

    • Define RTO (Recovery Time Objective, the maximum acceptable outage duration) and RPO (Recovery Point Objective, the maximum acceptable data loss) so that SLAs reflect business needs.

  3. Architect for Scalability

    • Make systems elastic — able to scale up when load increases, but also scale down to save cost.

    • Scalability covers both handling higher data volume and throughput, and elastically adding or releasing resources as demand changes.

  4. Architecture Is Leadership

    • Data architects should take a leadership role: mentor teams, make informed decisions, guide technology adoption.

    • The role goes beyond design: it includes championing best practices across the organization.

  5. Always Be Architecting

    • Architecture is never “done”: continuously revisit and evolve the architecture as business and tech change.

    • Maintain an ongoing roadmap: know where you are, where you want to go, and how to get there.

  6. Build Loosely Coupled Systems

    • Design systems so that components (services, data stores) can evolve independently.

    • Use abstraction (APIs, messaging) to hide internal implementation details.

    • This reduces dependencies and helps with independent deployment / changes.

  7. Make Reversible Decisions

    • Favor decisions that you can roll back or replace easily.

    • Since the data landscape evolves fast, what works today might be suboptimal tomorrow — so avoid lock-in.

  8. Prioritize Security

    • Embed security from the start (not as an afterthought).

    • Use principles like zero trust and understand the shared responsibility model (especially in cloud).

  9. Embrace FinOps

    • Integrate financial operations (FinOps) into architecture decisions.

    • Weigh cost alongside performance, for example spot instances, reserved capacity, and the trade-off between pay-per-query and reserved pricing.
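
The pay-per-query versus reserved trade-off in principle 9 can be made concrete with a break-even calculation. The sketch below is illustrative only: the class, the function names, and the prices are hypothetical and do not come from the book or from any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingAssumptions:
    """Hypothetical prices; replace with figures from your own provider."""
    on_demand_usd_per_tb_scanned: float  # pay-per-query price
    reserved_usd_per_month: float        # flat fee for reserved capacity


def monthly_on_demand_cost(tb_scanned: float, pricing: PricingAssumptions) -> float:
    """Cost of a month of queries billed per TB scanned."""
    return tb_scanned * pricing.on_demand_usd_per_tb_scanned


def break_even_tb(pricing: PricingAssumptions) -> float:
    """Monthly scan volume at which both pricing models cost the same."""
    return pricing.reserved_usd_per_month / pricing.on_demand_usd_per_tb_scanned


def cheaper_option(tb_scanned: float, pricing: PricingAssumptions) -> str:
    """Return the cheaper pricing model for the given monthly scan volume."""
    if monthly_on_demand_cost(tb_scanned, pricing) < pricing.reserved_usd_per_month:
        return "pay-per-query"
    return "reserved"


# Made-up numbers, for illustration only.
pricing = PricingAssumptions(on_demand_usd_per_tb_scanned=5.0,
                             reserved_usd_per_month=2000.0)
print(break_even_tb(pricing))          # scan volume where the two models meet
print(cheaper_option(100.0, pricing))  # light usage
print(cheaper_option(500.0, pricing))  # heavy usage
```

A real comparison would also account for commitment length and for how predictable the workload is, which ties back to principle 7 (Make Reversible Decisions).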


Why These Principles Matter

  • Flexibility & Agility: The authors emphasize that good architecture must adapt. Business needs and technology change quickly, so the architecture should not be rigid.

  • Cost-Awareness: By including FinOps, the authors acknowledge that data systems aren’t just technical — financial cost is a first-class concern, especially in the cloud.

  • Reliability & Resilience: Planning for failure ensures that data systems remain robust — the book encourages defining realistic recovery objectives.

  • Leadership & Culture: Architecture isn’t just about designing systems — it's about influencing teams, knowledge sharing, and guiding the organization’s data strategy.
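
The recovery objectives from principle 2 (Plan for Failure) can be checked against a concrete backup setup. The following is a minimal sketch that assumes periodic backups and a measured or estimated restore time; the class and function names are hypothetical, and the figures are invented for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rto: timedelta  # maximum acceptable time to restore service
    rpo: timedelta  # maximum acceptable window of lost data


def meets_rpo(backup_interval: timedelta, objectives: RecoveryObjectives) -> bool:
    # Worst case: the failure happens just before the next backup,
    # so up to one full backup interval of data is lost.
    return backup_interval <= objectives.rpo


def meets_rto(estimated_restore_time: timedelta, objectives: RecoveryObjectives) -> bool:
    # The service must be back within the agreed outage window.
    return estimated_restore_time <= objectives.rto


# Invented targets: at most 1 hour of downtime and 15 minutes of data loss.
objectives = RecoveryObjectives(rto=timedelta(hours=1), rpo=timedelta(minutes=15))

print(meets_rpo(timedelta(hours=24), objectives))    # nightly backups
print(meets_rpo(timedelta(minutes=5), objectives))   # 5-minute snapshots
print(meets_rto(timedelta(minutes=40), objectives))  # 40-minute restore
```

With these invented targets, nightly backups miss the RPO while 5-minute snapshots and a 40-minute restore satisfy both objectives.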


