MongoDB Architecture
Topics:
- Data Model
- MongoDB Instance
- MongoDB Server Components
- Storage Engine
- Replication
1. DATA MODEL
- Documents
  - The basic unit of data in MongoDB.
  - Equivalent to a row in a relational database, but more flexible.
  - Stored in BSON format, which supports complex and nested structures.
- Collections
  - A group of MongoDB documents.
  - Equivalent to a table in a relational database.
  - No schema is enforced by default, so documents in the same collection can have different structures.
- Databases
  - Top-level container for collections.
  - A single MongoDB instance can host multiple databases.
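The hierarchy above can be pictured with plain Python dictionaries and lists. This is only a toy model, not how MongoDB stores data, and the names used (shop, customers, orders, analytics) are hypothetical.

```python
# Toy model only: plain dicts and lists standing in for MongoDB structures.
# All names (shop, customers, orders, analytics) are hypothetical.
instance = {
    "shop": {                      # a database
        "customers": [             # a collection
            {"_id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
            {"_id": 2, "name": "Lin", "tags": ["vip", "beta"]},  # different fields
        ],
        "orders": [                # another collection
            {"_id": 100, "customer_id": 1, "items": [{"sku": "A1", "qty": 2}]},
        ],
    },
    "analytics": {"events": []},   # a second database on the same instance
}

# Documents in one collection need not share the same set of fields.
field_sets = [set(doc) for doc in instance["shop"]["customers"]]
print(field_sets[0] != field_sets[1])

# Nested structures are reached by walking into the document.
print(instance["shop"]["customers"][0]["address"]["city"])
```

The two customer documents sit in the same collection even though one has an address and the other has tags, which is the flexibility the data model section describes.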
MONGODB INSTANCE
In MongoDB, an instance refers to a running copy of the MongoDB server software. This is typically a process (mongod) running on a machine, whether locally, on a server, or in the cloud.
- MongoDB instance: The server process (mongod) running on a machine.
MongoDB Instance (mongod)
├── Database1
│   ├── CollectionA
│   │   ├── Document1
│   │   └── Document2
│   └── CollectionB
└── Database2
    └── CollectionC
2. MongoDB Server Components
mongod (MongoDB Daemon)
mongod is the core server process of MongoDB.
- It’s what you start to actually run the MongoDB database on your machine or server.
- When you say “run MongoDB,” you’re really starting mongod.
mongod is responsible for:
- Handling CRUD Operations
  - Create: Insert documents into collections.
  - Read: Query data using filters and projections.
  - Update: Modify documents.
  - Delete: Remove documents or collections.
- Memory Management
  - Caches frequently accessed data in memory (RAM) for fast reads.
  - With WiredTiger, uses the engine's internal cache together with the operating system's filesystem cache (memory-mapped files belonged to the older MMAPv1 engine).
- Data Storage
  - Manages reading/writing data to disk.
  - Handles journaling and storage engines (like WiredTiger).
- Replication
  - Keeps multiple copies of your data across servers (replica sets).
  - Provides high availability and automatic failover.
- Sharding
  - Distributes data across multiple machines (shards) for horizontal scalability.
  - Allows MongoDB to handle large datasets and high throughput.
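To make the four CRUD operations listed above concrete, here is a minimal in-memory sketch. It is not the MongoDB API or any driver: the helper names (insert_one, find, update_many, delete_many) are hypothetical and only loosely echo the real operation names, and filters are limited to simple equality.

```python
# Toy in-memory collection; NOT the MongoDB API or a driver.
# All function names here are hypothetical helpers for illustration.

def matches(doc, flt):
    """True if every key in the filter equals the document's value."""
    return all(doc.get(k) == v for k, v in flt.items())

def insert_one(coll, doc):
    """Create: add a copy of the document to the collection."""
    coll.append(dict(doc))

def find(coll, flt, projection=None):
    """Read: return matching documents, optionally keeping only projected fields."""
    out = []
    for doc in coll:
        if matches(doc, flt):
            if projection is None:
                out.append(dict(doc))
            else:
                out.append({k: doc[k] for k in projection if k in doc})
    return out

def update_many(coll, flt, changes):
    """Update: set fields on every matching document; return how many matched."""
    n = 0
    for doc in coll:
        if matches(doc, flt):
            doc.update(changes)
            n += 1
    return n

def delete_many(coll, flt):
    """Delete: remove matching documents in place; return how many were removed."""
    before = len(coll)
    coll[:] = [d for d in coll if not matches(d, flt)]
    return before - len(coll)

users = []
insert_one(users, {"_id": 1, "name": "Ada", "role": "admin"})
insert_one(users, {"_id": 2, "name": "Lin", "role": "dev"})
insert_one(users, {"_id": 3, "name": "Sam", "role": "dev"})

devs = find(users, {"role": "dev"}, projection=["name"])
updated = update_many(users, {"role": "dev"}, {"active": True})
removed = delete_many(users, {"_id": 1})
print(devs, updated, removed)
```

A real application would send these operations to mongod through a driver; the sketch only shows what a filter and a projection do to a set of documents.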
mongo Shell / Compass / Drivers
- mongo shell: JavaScript-based CLI to interact with mongod. In current releases it has been replaced by mongosh.
- Compass: GUI for querying and analyzing documents.
- Drivers: Language-specific interfaces (Node.js, Python, Java, etc.) that allow applications to connect to MongoDB and perform operations programmatically.
mongos
mongos is the query router used in sharded MongoDB clusters.
- It's not a database itself; rather, it acts as an intermediary between clients and the shards.
- Query Routing:
  - Directs client requests to the correct shard(s) based on the shard key.
  - Hides the complexity of the sharded architecture from the client.
- Load Balancing:
  - Distributes queries across shards for performance and scalability.
- Aggregation of Results:
  - If a query spans multiple shards, mongos collects and merges the results before sending them back to the client.
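The routing and merging behaviour described above can be sketched with a toy router. This is a simplified illustration under stated assumptions: the shard names, the user_id shard key, and the hash-modulo placement are all hypothetical. A real cluster maps ranges of shard-key values (chunks) to shards using metadata held on the config servers.

```python
import hashlib

# Toy router; shard names, the shard key, and the hashing scheme are
# assumptions for illustration, not how mongos is implemented.
SHARDS = {"shard1": [], "shard2": [], "shard3": []}
SHARD_NAMES = sorted(SHARDS)

def shard_for(key_value):
    """Pick a shard deterministically from the shard-key value."""
    digest = hashlib.sha256(str(key_value).encode()).hexdigest()
    return SHARD_NAMES[int(digest, 16) % len(SHARD_NAMES)]

def insert(doc, shard_key="user_id"):
    """Store the document on the shard chosen by its shard-key value."""
    SHARDS[shard_for(doc[shard_key])].append(doc)

def find(flt, shard_key="user_id"):
    """Targeted query if the filter contains the shard key, else scatter-gather."""
    if shard_key in flt:
        targets = [shard_for(flt[shard_key])]   # one shard is enough
    else:
        targets = SHARD_NAMES                   # ask every shard
    results = []
    for name in targets:                        # merge the partial results
        results.extend(d for d in SHARDS[name]
                       if all(d.get(k) == v for k, v in flt.items()))
    return targets, results

for uid in range(1, 21):
    insert({"user_id": uid, "country": "IN" if uid % 2 else "US"})

targeted_shards, one_user = find({"user_id": 7})
all_shards, indians = find({"country": "IN"})
print(targeted_shards, len(all_shards), len(indians))
```

The query on user_id touches a single shard, while the query on country has to visit all three and merge what each returns, which is why the choice of shard key matters for query cost.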
Sharded Cluster Architecture:
Client
|
v
[mongos] ← Query router
|
v
[Shards] — shard1, shard2, shard3... (each is a replica set)
|
v
[Config Servers] — store metadata about the cluster (e.g., which data is on which shard)
Storage Engine in MongoDB
A Storage Engine is the low-level component in MongoDB that manages how data is stored, updated, and retrieved from the disk.
Think of it like the “brain” behind data storage — choosing where and how data lives on your hard drive or SSD.
MongoDB Storage Engines
1. WiredTiger (Default Engine)
- Default since MongoDB 3.2.
- Modern and high-performance.
Feature | Description |
---|---|
Document-level concurrency | Multiple documents in the same collection can be read/written at the same time. Efficient for high-load apps. |
Compression | Reduces disk usage with Snappy (fast) or zlib (better compression). |
Caching | Frequently used data is kept in memory for faster access. |
Journaling | Write-ahead logs ensure data safety in case of crashes. |
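The journaling row describes write-ahead logging: a change is recorded durably in a log before it is applied, so the state can be rebuilt after a crash. The sketch below shows that idea only. The JSON-lines file format and the function names are hypothetical and far simpler than WiredTiger's actual journal.

```python
import json
import os
import tempfile

# Toy write-ahead log; the file layout and function names are hypothetical
# and much simpler than WiredTiger's real journal.

def journaled_write(journal_path, data, key, value):
    """Append the change to the journal and flush it before applying it."""
    with open(journal_path, "a", encoding="utf-8") as j:
        j.write(json.dumps({"key": key, "value": value}) + "\n")
        j.flush()
        os.fsync(j.fileno())   # make sure the record reaches disk first
    data[key] = value          # only then change the in-memory state

def recover(journal_path):
    """Rebuild state after a crash by replaying journal records in order."""
    data = {}
    if os.path.exists(journal_path):
        with open(journal_path, encoding="utf-8") as j:
            for line in j:
                rec = json.loads(line)
                data[rec["key"]] = rec["value"]
    return data

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "journal.log")
    live = {}
    journaled_write(path, live, "a", 1)
    journaled_write(path, live, "b", 2)
    journaled_write(path, live, "a", 3)
    live = None                 # simulate a crash: in-memory state is lost
    restored = recover(path)

print(restored)
```

Because every change hits the log before the in-memory state, replaying the log in order reproduces the last written value for each key.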
2. MMAPv1 (Deprecated)
- Older engine, the default before MongoDB 3.2.
- Deprecated in MongoDB 4.0 and removed in 4.2, so it is not an option for new projects.
- Collection-level locking.
Replication (High Availability)
Replica Set
A Replica Set is a group of mongod instances that maintain the same data set, ensuring high availability and data redundancy.
Key Benefits:
- Automatic failover
- Data redundancy
- Read scalability (with read preferences)
Roles in a Replica Set:
- Primary:
  - Receives all write operations.
  - Also serves read operations unless the read preference is changed.
  - Only one primary exists at a time.
- Secondary:
  - Continuously replicates data from the primary.
  - Can be configured to serve read operations (based on read preference).
  - Eligible for election if the primary fails.
- Arbiter:
  - Does not store data.
  - Participates in the election process to break ties.
  - Useful when you need an odd number of votes but don’t want the overhead of a full replica.
Failover Process:
If the primary node goes down:
- An election process is triggered.
- The replica set automatically promotes one of the secondaries to become the new primary.
- Client drivers will detect the change and reroute operations accordingly.
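The failover steps above can be illustrated with a toy election. This is a heavily simplified sketch, not MongoDB's election protocol, which also involves terms, heartbeats, and member priorities. The member fields (up, arbiter, last_op) and the rule of preferring the most up-to-date member are assumptions made for the example.

```python
# Toy election; member names, fields, and the selection rule are assumptions.
# MongoDB's real protocol involves terms, heartbeats, priorities, and more.

def elect_primary(members):
    """Return the new primary's name, or None if no majority is reachable."""
    voters_up = [m for m in members if m["up"]]
    if len(voters_up) <= len(members) // 2:
        return None                        # no majority of votes: no primary
    candidates = [m for m in voters_up if not m["arbiter"]]
    if not candidates:
        return None                        # arbiters hold no data, cannot lead
    # Prefer the member with the most recent replicated operation.
    return max(candidates, key=lambda m: m["last_op"])["name"]

replica_set = [
    {"name": "node1", "up": False, "arbiter": False, "last_op": 120},  # failed primary
    {"name": "node2", "up": True,  "arbiter": False, "last_op": 118},
    {"name": "node3", "up": True,  "arbiter": False, "last_op": 120},
]
new_primary = elect_primary(replica_set)

# One data node plus one arbiter out of three members still form a majority.
with_arbiter = [
    {"name": "node1", "up": False, "arbiter": False, "last_op": 50},
    {"name": "node2", "up": True,  "arbiter": False, "last_op": 50},
    {"name": "arb",   "up": True,  "arbiter": True,  "last_op": 0},
]
arbiter_case = elect_primary(with_arbiter)

# If two of three members are down, the survivor cannot become primary.
replica_set[1]["up"] = False
no_majority = elect_primary(replica_set)

print(new_primary, arbiter_case, no_majority)
```

The second case shows why an arbiter is useful: it adds a vote so a majority exists, even though it can never become primary itself.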
Indexing in MongoDB
Indexes in MongoDB are like the index in a book – they help MongoDB find data faster.
Without an index, MongoDB has to scan every document in a collection (called a collection scan) – which is slow for large datasets.
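The difference between a collection scan and an indexed lookup can be shown with a small sketch. A Python dict stands in for the index here, which is an assumption for simplicity: MongoDB's indexes are B-tree structures, and the field name and helper functions below are hypothetical.

```python
# Toy comparison; a dict stands in for an index (MongoDB indexes are B-trees).
docs = [{"_id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

def collection_scan(collection, field, value):
    """Check documents one by one; return (match, documents examined)."""
    examined = 0
    for doc in collection:
        examined += 1
        if doc.get(field) == value:
            return doc, examined
    return None, examined

def build_index(collection, field):
    """Map each field value to the position of its document."""
    return {doc[field]: pos for pos, doc in enumerate(collection)}

def index_lookup(collection, index, value):
    """Jump straight to the document through the index."""
    pos = index.get(value)
    return (collection[pos], 1) if pos is not None else (None, 0)

email_index = build_index(docs, "email")

target = "user9999@example.com"
scan_doc, scan_examined = collection_scan(docs, "email", target)
idx_doc, idx_examined = index_lookup(docs, email_index, target)
print(scan_examined, idx_examined)
```

Both approaches return the same document, but the scan examines every document before reaching the last one while the indexed lookup examines one. The trade-off is that the index takes extra space and has to be kept up to date on every write.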