Embedded vs. Referenced Documents in MongoDB

Last Updated : 11 Feb, 2025

When designing a schema in MongoDB, understanding how to model relationships between data is critical for optimizing performance, scalability, and maintainability. MongoDB offers two primary ways to represent relationships: embedded documents and referenced documents. Both have advantages and trade-offs, and selecting the right model depends on the specific use case and requirements of your application.

In this article, we’ll explore the differences between embedded and referenced documents in MongoDB, when to use each approach, and how they impact data management and query performance.

Key Differences Between Embedded and Referenced Documents

The table below summarizes the key differences between embedded and referenced documents in MongoDB. It shows how data is stored and related in each model, along with the advantages and limitations that matter when choosing an approach based on performance, data size, and schema complexity.

| Feature | Embedded Documents | Referenced Documents |
| --- | --- | --- |
| Definition | Documents within documents, creating a hierarchical structure | Documents refer to other documents stored in different collections |
| Atomic Operations | A write to a single document is atomic, so the parent and its embedded data change together | Atomicity applies to individual documents; changes spanning several documents need a multi-document transaction |
| Read Performance | Faster, as all related data is fetched in a single query | Slower, requires multiple queries (or a join) to fetch related data |
| Write Performance | Potentially slower when documents grow large | Potentially faster for large datasets because documents stay small |
| Data Locality | High, as related data is stored together | Low, related data is distributed across collections |
| Document Size | Limited by MongoDB's 16 MB document size limit | Each document stays small, so the limit is rarely a concern; better for large datasets |
| Redundancy | Can lead to data duplication and increased storage requirements | Minimizes data duplication, reducing storage requirements |
| Update Complexity | Simple for atomic updates, but complex for deeply nested updates | Complex, especially when maintaining consistency across documents |
| Schema Flexibility | Less flexible, better for fixed or simple hierarchical structures | More flexible, suitable for complex and evolving schemas |
| Use Cases | One-to-one relationships; one-to-many relationships (small datasets) | Many-to-many relationships; large subdocuments; independent updates |
| Example | { "_id": 1, "name": "John Doe", "address": { "street": "123 Main St", "city": "Anytown", "state": "CA" } } | Users collection: { "_id": 1, "name": "John Doe", "address_id": 1001 } Addresses collection: { "_id": 1001, "street": "123 Main St", "city": "Anytown", "state": "CA" } |

Introduction to MongoDB Relationships

In MongoDB, relationships between documents can be managed either by embedding related data within a single document or by using references that point to other documents in separate collections. Understanding the pros and cons of these methods will help us design a database schema that is both efficient and scalable.

1. Embedded Documents

Embedded documents are documents stored within other documents, forming a nested, hierarchical structure. This approach allows MongoDB to store related data together, making it easy to retrieve the entire set of information in a single query. This method uses MongoDB's support for complex document structures.

Advantages

  • Atomic Operations: Since all related data is in one document, updates to this document are atomic, ensuring data consistency.
  • Read Performance: Fetching all related data in a single query is faster and more efficient, as it avoids the need for multiple queries and joins.
  • Data Locality: Keeping related data together can improve performance, particularly when the data is frequently accessed together.
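
To illustrate the atomicity point, here is a minimal sketch in JavaScript, the language of the MongoDB shell. The `posts` collection, the `comments` array, and the `commentCount` field are assumed names for illustration, and `buildAddCommentUpdate` is a hypothetical helper. It only assembles the filter and update objects; the commented-out line shows how they could be passed to `updateOne` in mongosh.

```javascript
// Hypothetical helper: builds the arguments for one updateOne call that
// appends a comment and bumps a counter on the same post document.
function buildAddCommentUpdate(postId, comment) {
  return {
    filter: { _id: postId },
    update: {
      $push: { comments: comment }, // append to the embedded array
      $inc: { commentCount: 1 },    // keep the counter in step
    },
  };
}

const addComment = buildAddCommentUpdate(1, {
  author: "Jane",
  text: "Nice post!",
});

// In mongosh, assuming a `posts` collection exists:
// db.posts.updateOne(addComment.filter, addComment.update)
console.log(JSON.stringify(addComment.update));
```

Because both operators target the same document, MongoDB applies them as one atomic write: a reader never sees the new comment without the updated counter.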

Disadvantages

  • Document Size: MongoDB has a document size limit of 16 MB. Embedding too much data can lead to large documents that may exceed this limit.
  • Redundancy: When the same data is embedded in many parent documents, it is duplicated, which increases storage requirements and means every copy must be updated when the data changes.
  • Complex Updates: Updating deeply nested structures can become complex and may require extensive manipulation of the document.
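
The "Complex Updates" point can be made concrete with a short sketch. It assumes a post document whose embedded `comments` array holds objects with their own `_id` and `text` fields (assumed names), and `buildEditCommentUpdate` is a hypothetical helper. Changing one comment in place needs MongoDB's filtered positional operator `$[<identifier>]` together with the `arrayFilters` option, rather than a plain field path.

```javascript
// Hypothetical helper: builds the arguments for editing one embedded comment.
function buildEditCommentUpdate(postId, commentId, newText) {
  return {
    filter: { _id: postId },
    // "$[c]" selects the array elements matched by the "c" filter below.
    update: { $set: { "comments.$[c].text": newText } },
    options: { arrayFilters: [{ "c._id": commentId }] },
  };
}

const editComment = buildEditCommentUpdate(1, 42, "Edited text");

// In mongosh, assuming a `posts` collection exists:
// db.posts.updateOne(editComment.filter, editComment.update, editComment.options)
console.log(JSON.stringify(editComment, null, 2));
```

With a referenced model, the same edit would be an ordinary update on a single comment document, which is part of the trade-off described above.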

Use Cases

  • One-to-One Relationships: Where one document directly relates to another (e.g., user profile and user settings).
  • One-to-Many Relationships: For example, storing multiple comments under a single blog post where the comments are not too large.

Example of Embedded Document:

{
  "_id": 1,
  "name": "John Doe",
  "address": {
    "street": "123 Main St",
    "city": "Anytown",
    "state": "CA"
  }
}

[Figure: Post document]

Query:

db.posts.findOne({"_id":ObjectId("665ef4b5b6034dde77877b86")})

Output

[Figure: output of the query above]

2. Referenced Documents

Referenced documents store relationships by including a reference (usually an ObjectId) to another document stored in a different collection. This approach separates related data into distinct documents and collections.

Advantages

  • Flexibility: Allows for more flexible and normalized data models, making it easier to manage complex relationships.
  • Document Size Management: Keeps individual documents smaller, making it much less likely that any of them approaches the 16 MB document size limit.
  • Reduced Redundancy: Reduces data duplication by storing related data in separate documents.
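
The effect on document size can be sketched as follows. This is a rough illustration only: it measures the length of the JSON text, which is not the same as the BSON size MongoDB enforces, and the `comments` data and helper names are made up for the example. On a real server, the `$bsonSize` aggregation operator reports the actual size.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB per-document limit

// Rough proxy only: JSON text length, not the true BSON size.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical sample data: a comment with a 200-character body.
function makeComment(i) {
  return { _id: i, author: "user" + i, text: "x".repeat(200) };
}

// Embedded: the post document grows with every comment.
const embeddedPost = {
  _id: 1,
  title: "Hello",
  comments: Array.from({ length: 1000 }, (_, i) => makeComment(i)),
};

// Referenced: the post stays small; each comment is its own document.
const referencedPost = { _id: 1, title: "Hello" };
const commentDocs = Array.from({ length: 1000 }, (_, i) => ({
  ...makeComment(i),
  post_id: 1,
}));

console.log("embedded post:", approxBytes(embeddedPost), "bytes");
console.log("referenced post:", approxBytes(referencedPost), "bytes");
console.log("largest comment:", Math.max(...commentDocs.map(approxBytes)), "bytes");
console.log(
  "embedded share of limit:",
  ((approxBytes(embeddedPost) / MAX_DOC_BYTES) * 100).toFixed(2) + "%"
);
```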

Disadvantages

  • Join Operations: Requires additional queries, or a $lookup aggregation stage, to fetch related data, which can impact read performance.
  • Consistency: Maintaining consistency across referenced documents can be challenging, especially without transactions.
  • Complex Queries: Queries involving multiple collections can become complex and may require careful indexing.
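
As a sketch of what such a join looks like, the helper below builds an aggregation pipeline using the `$lookup` stage. It assumes `users` and `addresses` collections shaped like the example in this section, and `buildUserWithAddressPipeline` is a hypothetical name. The helper only constructs the pipeline array; the commented-out line shows how it could be run in mongosh.

```javascript
// Hypothetical helper: pipeline that joins one user to its address document.
function buildUserWithAddressPipeline(userId) {
  return [
    { $match: { _id: userId } },
    {
      $lookup: {
        from: "addresses",        // collection to join with
        localField: "address_id", // field in the users document
        foreignField: "_id",      // field in the addresses document
        as: "address",            // name of the output array field
      },
    },
    // $lookup always yields an array; $unwind turns it into a single object.
    // Note: users with no matching address are dropped by this stage.
    { $unwind: "$address" },
  ];
}

const pipeline = buildUserWithAddressPipeline(1);

// In mongosh, assuming both collections exist:
// db.users.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Here the join matches on `_id`, which MongoDB indexes automatically. Joining on any other field is where the "careful indexing" mentioned above comes in.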

Use Cases

  • Many-to-Many Relationships: For example, a student can enroll in multiple courses, and each course can have many students. References are ideal for these relationships.
  • Large Subdocuments: If the data is large or accessed independently, referencing is a better option. For example, storing product reviews in a separate collection.
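
The student and course relationship can be sketched with plain in-memory arrays standing in for two collections. The field name `course_ids` and both helper functions are assumptions made for this illustration; the comments show the mongosh queries they roughly correspond to.

```javascript
// In-memory stand-ins for a `students` and a `courses` collection.
const students = [
  { _id: 1, name: "Ana", course_ids: [101, 102] },
  { _id: 2, name: "Ben", course_ids: [102] },
];
const courses = [
  { _id: 101, title: "Databases" },
  { _id: 102, title: "Algorithms" },
];

// Roughly: db.courses.find({ _id: { $in: student.course_ids } })
function coursesForStudent(student, allCourses) {
  const wanted = new Set(student.course_ids);
  return allCourses.filter((course) => wanted.has(course._id));
}

// Roughly: db.students.find({ course_ids: courseId })
function studentsInCourse(courseId, allStudents) {
  return allStudents.filter((s) => s.course_ids.includes(courseId));
}

console.log(coursesForStudent(students[0], courses).map((c) => c.title));
console.log(studentsInCourse(102, students).map((s) => s.name));
```

Each course is stored once, no matter how many students reference it, which is why references suit many-to-many relationships better than embedding a copy of the course in every student.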

Users Collection:

{
  "_id": 1,
  "name": "John Doe",
  "address_id": 1001
}

Addresses Collection:

{
  "_id": 1001,
  "street": "123 Main St",
  "city": "Anytown",
  "state": "CA"
}

[Figure: Post document]

Query:

db.posts.find({"_id":ObjectId("665ef4b5b6034dde77877b86")})

Output

[Figure: output of the posts query]

Query:

db.comments.find({})

Output

[Figure: output of the comments query]

Choosing the Right Approach for Relationships in MongoDB

When deciding between embedded and referenced documents, consider the following factors:

1. Access Patterns:

  • If related data is frequently accessed together, embedding is more efficient.
  • If the data is accessed independently or only occasionally, referencing is better.

2. Data Size and Growth:

  • For small datasets whose growth is bounded, embedding may be sufficient.
  • For large datasets, or ones that can grow without a clear bound, referencing helps manage document size and complexity.

3. Update Frequency:

  • Embedding is ideal when you need atomic updates to related data.
  • Referencing is better when different parts of the data are updated frequently and independently.

4. Schema Complexity:

  • Simple, hierarchical data models can benefit from embedding.
  • Complex, evolving relationships are better managed using referencing.
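
The four factors above can be folded into a rough rule of thumb. The function below is an illustrative heuristic only, not an official MongoDB guideline: the flag names and the order of the checks are assumptions, and real schema decisions usually weigh these factors against each other rather than applying them in sequence.

```javascript
// Illustrative heuristic only; flag names and check order are assumptions.
function suggestModel({
  accessedTogether,     // is the related data usually read with its parent?
  unboundedGrowth,      // can the related data grow without a clear limit?
  updatedIndependently, // is it updated often, separately from the parent?
  manyToMany,           // is the relationship many-to-many?
}) {
  if (manyToMany) return "reference";
  if (unboundedGrowth) return "reference";
  if (updatedIndependently) return "reference";
  return accessedTogether ? "embed" : "reference";
}

// A user's single address, always read with the user:
console.log(
  suggestModel({
    accessedTogether: true,
    unboundedGrowth: false,
    updatedIndependently: false,
    manyToMany: false,
  })
);

// Students and courses:
console.log(
  suggestModel({
    accessedTogether: false,
    unboundedGrowth: false,
    updatedIndependently: true,
    manyToMany: true,
  })
);
```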

Conclusion

Choosing between embedded and referenced documents in MongoDB involves evaluating the specific needs of our application, including performance, data size, and complexity. Embedded documents offer simplicity and performance benefits for related data accessed together, while referenced documents provide flexibility and scalability for more complex relationships. By understanding the trade-offs and use cases for each approach, we can design efficient and scalable MongoDB schemas tailored to our application's requirements.

