Types of NoSQL databases


⭐ Overview: What NoSQL Means Today

“NoSQL” is usually expanded as “Not Only SQL,” although the name originally signaled simply “non-SQL.” It now refers to database systems that:

  • use non-relational data models

  • scale horizontally

  • handle semi-structured or unstructured data

  • often trade strict ACID properties for performance, availability, and flexibility

There is no single NoSQL model—there are multiple categories, each optimized for a different kind of problem.


  • Key-Value Stores - Store data as key-value pairs, similar to a dictionary or hash table (e.g., Redis, DynamoDB)

  • Document Stores - Store data as documents, typically in JSON or XML format, allowing for flexible schemas (e.g., MongoDB, CouchDB)

  • Column-Family Stores (aka Wide Column stores) - Store rows whose columns are grouped into column families and can differ from row to row, optimized for high write throughput and queries over large datasets (e.g., Cassandra, HBase)

  • Graph Databases - Store data as nodes and edges, designed for managing highly connected data and relationships (e.g., Neo4j, Amazon Neptune)

  • Vector Databases - Store and query high-dimensional vectors, optimized for similarity search and commonly used in AI/ML applications (e.g., Pinecone, Weaviate, Milvus)

  • Time-Series Databases - Optimized for storing and querying time-stamped data, such as metrics, events, or IoT sensor data (e.g., InfluxDB, TimescaleDB)


🔑 Key-Value Stores

Model

The simplest NoSQL model: each key maps to exactly one value (key → value).

Keys are unique; values are opaque blobs (string, JSON, binary, etc.). The database does not interpret the structure of the value.
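As a minimal sketch of this model (not any particular product's API), the toy in-memory store below treats values as opaque bytes and supports lookup by exact key only. The class name `KeyValueStore` and the `session:42` key format are assumptions made for illustration.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store; values are opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Keys are unique, so writing an existing key replaces its value.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup is by exact key; the store never inspects the value.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("session:42", b'{"user_id": 7, "theme": "dark"}')
store.put("session:42", b'{"user_id": 7, "theme": "light"}')  # overwrite
session = store.get("session:42")
missing = store.get("session:99")
```

Because the value is just bytes to the store, any filtering on `theme` or `user_id` would have to happen in application code after the value is fetched and decoded.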

Strengths

  • Extremely fast lookups (O(1) on average with hash-based indexes)

  • High write throughput

  • Easy horizontal scaling (sharding)

  • Great for caching and session storage
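One reason sharding is straightforward for key-value data is that the key alone can decide where a record lives. The sketch below shows simple hash-modulo routing; the shard names and the `shard_for` helper are hypothetical, and real systems often use consistent hashing or similar schemes instead so that adding a shard moves fewer keys.

```python
import hashlib
from typing import Sequence

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names


def shard_for(key: str, shards: Sequence[str] = SHARDS) -> str:
    """Route a key to a shard using a stable hash of the key."""
    # A cryptographic digest is stable across processes, unlike Python's
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


routes = {key: shard_for(key) for key in ["user:1", "user:2", "cart:77"]}
```

Every client that applies the same function to the same shard list sends a given key to the same shard, so no central lookup table is needed.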

Weaknesses

  • No complex querying

  • No secondary indexes (except in some systems)

  • No joins

  • You must know the key to retrieve data

Use Cases

  • Caching (e.g., Redis, Memcached)

  • User sessions

  • Feature flags

  • Configuration stores

  • Shopping cart data

  • Token/billing counters

Examples

  • Redis

  • Memcached

  • Amazon DynamoDB (key-value + document features)

  • Riak

  • Aerospike


📄 Document Databases

Model

Data is stored as self-contained documents, typically JSON or BSON.

Example document:
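As a rough illustration, a user-profile document might look like the Python dictionary below, serialized to JSON. All field names and values are hypothetical; the point is that nested objects and arrays travel together in one self-contained record.

```python
import json

# Hypothetical user-profile document; field names are illustrative only.
user_doc = {
    "_id": "user-1001",
    "name": "Ada Example",
    "email": "ada@example.com",
    "address": {"city": "Berlin", "country": "DE"},  # nested object
    "tags": ["admin", "beta-tester"],                # array
}

serialized = json.dumps(user_doc, indent=2)  # what would be sent to the store
restored = json.loads(serialized)            # what a read would give back
```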

Strengths

  • Flexible schema (schema-on-read or schema-free)

  • Nested objects and arrays

  • Easy to evolve structure over time

  • Rich query language (on fields, arrays, nested data)

  • Horizontal scaling via sharding
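To show what querying on fields, arrays, and nested data means in practice, here is a small pure-Python sketch that matches documents by a dotted field path. The helpers `get_path` and `find` are hypothetical and perform a linear scan; an actual document database would typically answer such queries through indexes and its own query language.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel: distinguishes "field absent" from None


def get_path(doc: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def find(docs: List[Dict[str, Any]], path: str, expected: Any) -> List[Dict[str, Any]]:
    """Return documents whose field equals `expected` or, for arrays, contains it."""
    results = []
    for doc in docs:
        value = get_path(doc, path)
        if value is _MISSING:
            continue  # documents may lack the field entirely (flexible schema)
        if value == expected or (isinstance(value, list) and expected in value):
            results.append(doc)
    return results


docs = [
    {"_id": 1, "name": "Ada", "address": {"city": "Berlin"}, "tags": ["admin"]},
    {"_id": 2, "name": "Lin", "address": {"city": "Oslo"}},
    {"_id": 3, "name": "Sam", "tags": ["beta", "admin"]},
]
in_berlin = find(docs, "address.city", "Berlin")
admins = find(docs, "tags", "admin")
```

Note that the three documents do not share the same fields, and the query simply skips documents where the path is absent.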

Weaknesses

  • Complex joins are difficult or absent

  • Risk of data duplication (denormalization)

  • Harder to enforce integrity constraints

  • Query performance can degrade without careful index design

Use Cases

  • Content management

  • User profiles

  • Product catalogs

  • Event logging

  • Microservices storing heterogeneous data

Examples

  • MongoDB

  • Couchbase

  • CouchDB

  • Firebase Firestore

  • Amazon DocumentDB


🕸 Graph Databases

Model

Data is stored as nodes and edges, representing relationships directly.

  • Nodes = entities

  • Edges = relationships

  • Edges can have attributes (e.g., weight, timestamp, type)

Two main graph models:

  • Property graph (Neo4j)

  • RDF triple stores (Blazegraph, Amazon Neptune)
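The difference between the two models can be sketched with plain Python data structures. In the property-graph form, nodes and edges carry their own key-value properties; in the RDF form, every fact is a (subject, predicate, object) triple. The identifiers (`n1`, `ex:alice`, `ex:knows`) and the `match` helper are assumptions for illustration, not the syntax of Cypher or SPARQL.

```python
# Property graph: nodes and edges each carry properties.
property_graph = {
    "nodes": {
        "n1": {"label": "Person", "name": "Alice"},
        "n2": {"label": "Person", "name": "Bob"},
    },
    "edges": [
        {"from": "n1", "to": "n2", "type": "KNOWS", "since": 2020},
    ],
}

# RDF-style: the same facts as (subject, predicate, object) triples.
triples = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:alice", "ex:knows", "ex:bob"),
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


known_by_alice = {o for (_, _, o) in match(triples, s="ex:alice", p="ex:knows")}
people = {s for (s, _, _) in match(triples, p="rdf:type", o="ex:Person")}
```

The `since` property on the KNOWS edge has no direct counterpart in the plain triples above; attaching attributes to a relationship in RDF usually takes extra modeling, which is one practical difference between the two approaches.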

Strengths

  • Designed for relationship-heavy data

  • Efficient traversal of complex networks

  • Queries like “shortest path” or “friends of friends” are concise to express and efficient to run

  • Relationships stored natively, not inferred via joins
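The traversal queries mentioned above can be sketched over an in-memory adjacency list, where each node points directly at its neighbors and the edge attributes. This is only an illustration of the access pattern (following stored relationships instead of joining tables); the people, the `FRIEND` edge type, and both helper functions are hypothetical, and the shortest path uses plain breadth-first search on an unweighted graph.

```python
from collections import deque

# Adjacency list: node -> {neighbor: edge attributes}.
graph = {
    "alice": {"bob": {"type": "FRIEND"}, "carol": {"type": "FRIEND"}},
    "bob": {"alice": {"type": "FRIEND"}, "dave": {"type": "FRIEND"}},
    "carol": {"alice": {"type": "FRIEND"}, "dave": {"type": "FRIEND"}},
    "dave": {"bob": {"type": "FRIEND"}, "carol": {"type": "FRIEND"},
             "erin": {"type": "FRIEND"}},
    "erin": {"dave": {"type": "FRIEND"}},
}


def friends_of_friends(graph, person):
    """People exactly two hops away who are not already direct friends."""
    direct = set(graph.get(person, {}))
    two_hops = set()
    for friend in direct:
        two_hops.update(graph.get(friend, {}))
    return two_hops - direct - {person}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, {}):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


fof = friends_of_friends(graph, "alice")
path = shortest_path(graph, "alice", "erin")
```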

Weaknesses

  • Scaling horizontally can be challenging

  • Not ideal for analytics across large datasets (OLAP)

  • Requires very different thinking compared to relational or document stores

Use Cases

  • Social networks

  • Fraud detection

  • Recommendation systems

  • Knowledge graphs

  • Identity & access modeling (RBAC graphs)

  • Supply chain optimization

Examples

  • Neo4j

  • Amazon Neptune

  • ArangoDB (multi-model)

  • JanusGraph

  • TigerGraph


🧠 Vector Databases (Modern NoSQL)

Vector databases are a modern category designed for AI, embeddings, and similarity search.

Model

Data is stored as high-dimensional vectors:

These vectors come from:

  • LLM embeddings

  • image encoders

  • audio embeddings

  • recommendation models

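The database uses approximate nearest neighbor (ANN) indexes to find similar vectors efficiently, such as:

  • HNSW (graph-based)

  • IVF (inverted file, cluster-based)

  • DiskANN (graph-based, disk-resident)

Libraries such as FAISS implement index types like IVF and HNSW.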
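To make similarity search concrete, the sketch below does exact brute-force nearest-neighbor search with cosine similarity. This is the baseline that ANN indexes approximate: it compares the query against every stored vector, which is fine for a handful of items but too slow at large scale. The three-dimensional vectors and their labels are made up for illustration; real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=2):
    """Exact top-k search: score every stored vector, keep the best k."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Toy "embeddings"; values are invented for the example.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}
top = nearest([1.0, 0.0, 0.0], vectors, k=2)
```

An ANN index trades a small amount of recall for speed by examining only a fraction of the stored vectors per query.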

Strengths

  • Handles semantic search

  • Scales to millions/billions of vectors

  • Millisecond-level similarity results

  • Supports hybrid queries (filters + vector search)
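A hybrid query can be sketched as a metadata filter followed by vector ranking. The example below uses pre-filtering (filter first, then rank the survivors by cosine similarity); actual vector databases differ in whether and how they filter before, during, or after the index search. The item fields (`id`, `lang`, `vector`) and the `hybrid_search` helper are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


items = [
    {"id": "doc-1", "lang": "en", "vector": [0.9, 0.1]},
    {"id": "doc-2", "lang": "de", "vector": [1.0, 0.0]},
    {"id": "doc-3", "lang": "en", "vector": [0.2, 0.8]},
]


def hybrid_search(items, query, filters, k=1):
    """Keep items whose metadata matches all filters, then rank by similarity."""
    candidates = [
        item for item in items
        if all(item.get(field) == value for field, value in filters.items())
    ]
    candidates.sort(key=lambda item: cosine(query, item["vector"]), reverse=True)
    return [item["id"] for item in candidates[:k]]


hits = hybrid_search(items, [1.0, 0.0], {"lang": "en"}, k=1)
unfiltered = hybrid_search(items, [1.0, 0.0], {}, k=1)
```

Here `doc-2` is the closest vector overall, but the `lang` filter excludes it, so the filtered query returns `doc-1` instead.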

Weaknesses

  • Most do not provide full ACID transactions the way relational DBs do

  • Consistency models vary

  • Requires understanding of embeddings

  • Index tuning can be complex

Use Cases

  • RAG (Retrieval-Augmented Generation)

  • Semantic search

  • Recommender systems

  • Image/audio search

  • Deduplication (finding near duplicates)

  • Anomaly detection

Examples

  • Pinecone

  • Weaviate

  • Milvus

  • Qdrant

  • Chroma

  • Elasticsearch/OpenSearch (vector support added)

  • Postgres pgvector (extension)


🧭 Summary Table

| Category  | Model         | Strengths                     | Weaknesses      | Best For           | Examples           |
|-----------|---------------|-------------------------------|-----------------|--------------------|--------------------|
| Key-Value | key → value   | Fast, simple, scalable        | No querying     | caching, sessions  | Redis, DynamoDB    |
| Document  | JSON docs     | flexible schema, rich queries | no joins        | profiles, catalogs | MongoDB, Firestore |
| Graph     | nodes + edges | relationship queries          | hard to scale   | social, fraud      | Neo4j, Neptune     |
| Vector    | embeddings    | similarity search             | needs AI models | RAG, search        | Pinecone, Milvus   |


Modern architectures often combine multiple NoSQL types:

  • Redis for caching

  • MongoDB for flexible app data

  • Neo4j for relationships

  • Milvus/pgvector for AI/RAG

  • DynamoDB for scalable key-value workloads
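One common way two of these stores work together is the cache-aside pattern: the application checks a key-value cache first and falls back to the primary database on a miss. The sketch below uses in-memory stand-ins for both (a small TTL cache in place of something like Redis, a dict in place of a document database), so the class and function names are hypothetical and no real client library is involved.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a key-value cache with expiry."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (time.monotonic() + self._ttl, value)


# Stand-in for the primary document store.
primary_store = {"user-1001": {"name": "Ada Example"}}
cache = TTLCache(ttl_seconds=60)
stats = {"db_reads": 0}


def load_user(user_id):
    """Cache-aside read: try the cache, then the primary store, then populate."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    stats["db_reads"] += 1
    doc = primary_store.get(user_id)
    if doc is not None:
        cache.set(user_id, doc)
    return doc


first = load_user("user-1001")   # miss: reads the primary store
second = load_user("user-1001")  # hit: served from the cache
```

The second call never touches the primary store, which is the effect a cache in front of a slower database is meant to achieve; writes would additionally need to update or invalidate the cached entry.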

“NoSQL” is not one thing — it's a collection of data models optimized for different access patterns.

