DBMS-Module 5
MongoDB is a NoSQL document database designed for storing, querying, and managing large amounts of data in
a flexible, scalable, and schema-less manner.
Unlike traditional relational databases like MySQL or PostgreSQL, MongoDB stores data in BSON (Binary
JSON) format, which allows for a more dynamic and document-oriented structure.
BSON stands for Binary JSON. It is a binary-encoded serialization format used by MongoDB to store data.
BSON is designed to be lightweight, efficient, and easy to traverse, enabling fast data interchange between the
application and the database.
1. Schema-less: MongoDB allows storing key-value pairs without enforcing a strict schema,
making it easy to adapt to changing data requirements.
2. Embedded Data: Values can include arrays or embedded documents for hierarchical or nested
data storage.
3. Distributed System: MongoDB supports sharding, enabling distributed storage of key-value
pairs across multiple servers.
4. Query Language: MongoDB uses a powerful query language that supports:
   a. Exact Matches: Retrieve a value by its key.
   b. Range Queries: Search for keys within a range.
   c. Partial Match: Find documents with specific fields or subfields in the value.
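The three query styles listed above can be sketched in plain JavaScript, with an in-memory array standing in for a collection. The users data and its field names are hypothetical, and the comments show only the roughly equivalent MongoDB shell queries.

```javascript
// Hypothetical in-memory "collection"; in MongoDB these would be stored documents.
const users = [
  { _id: "u1", name: "Alice", age: 31, address: { city: "Pune" } },
  { _id: "u2", name: "Bob", age: 24, address: { city: "Delhi" } },
  { _id: "u3", name: "Carol", age: 45, address: { city: "Pune" } },
];

// Exact match, roughly: db.users.find({ _id: "u2" })
const exact = users.filter((u) => u._id === "u2");

// Range query, roughly: db.users.find({ age: { $gte: 25, $lte: 40 } })
const inRange = users.filter((u) => u.age >= 25 && u.age <= 40);

// Match on a subfield, roughly: db.users.find({ "address.city": "Pune" })
const inPune = users.filter((u) => u.address && u.address.city === "Pune");

console.log(exact.map((u) => u.name));   // [ 'Bob' ]
console.log(inRange.map((u) => u.name)); // [ 'Alice' ]
console.log(inPune.map((u) => u.name));  // [ 'Alice', 'Carol' ]
```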
Document Databases:
A document database is a type of NoSQL database that stores data in the form of documents. These documents
are JSON-like objects, typically with a flexible schema, making them ideal for handling unstructured or semi-
structured data.
MongoDB is one of the most popular document-oriented databases and is widely used in modern applications.
Databases, collections, and documents are the core building blocks of MongoDB; without them you cannot store
data on a MongoDB server. A database contains one or more collections, a collection contains
documents, and the documents contain the data.
In MongoDB, a database contains the collections of documents. One can create multiple databases on the
MongoDB server.
Collections are like tables in relational databases: they also store data, but in the form of documents. A
single database can hold multiple collections.
In MongoDB, data records are stored as BSON documents. BSON is a binary
representation of JSON documents, although it supports more data types than JSON (for example, dates
and binary data). A document is made up of field-value (key-value) pairs, and the value of a field can be
any BSON type.
Syntax:
{
  field1: value1,
  field2: value2,
  ...
  fieldN: valueN
}
Types of Databases
1. Embedded Databases
Definition:
In an embedded database model, related data is stored within the same document as nested sub-documents
or arrays.
This approach is useful when the related data is frequently accessed together.
Example:
Consider an e-commerce order system where an order contains multiple items. In the embedded model, the
order and its items are stored in a single document:
{
"_id": "order1",
"customer": "Alice",
"date": "2025-01-13",
"items": [
{ "product": "Laptop", "quantity": 1, "price": 50000 },
{ "product": "Mouse", "quantity": 2, "price": 1500 }
]
}
2. Normalized Databases
Definition:
In a normalized database model, related data is stored in separate collections, and relationships are
represented using references (similar to foreign keys in relational databases, although MongoDB does not
enforce them).
This approach reduces data duplication and improves maintainability.
Example:
Orders Collection:
{
"_id": "order1",
"customer": "Alice",
"date": "2025-01-13",
"items": ["item1", "item2"]
}
Items Collection:
[
{ "_id": "item1", "product": "Laptop", "quantity": 1, "price": 50000 },
{ "_id": "item2", "product": "Mouse", "quantity": 2, "price": 1500 }
]
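To make the trade-off between the two models concrete, here is a minimal sketch in plain JavaScript that computes the same order total from both layouts. The helper name orderTotal and the in-memory itemsCollection array are assumptions for illustration; no database is involved.

```javascript
// Embedded model: the order carries its items.
const embeddedOrder = {
  _id: "order1",
  customer: "Alice",
  items: [
    { product: "Laptop", quantity: 1, price: 50000 },
    { product: "Mouse", quantity: 2, price: 1500 },
  ],
};

// Normalized model: the order stores item ids; items live in their own collection.
const normalizedOrder = { _id: "order1", customer: "Alice", items: ["item1", "item2"] };
const itemsCollection = [
  { _id: "item1", product: "Laptop", quantity: 1, price: 50000 },
  { _id: "item2", product: "Mouse", quantity: 2, price: 1500 },
];

// Sum quantity * price over a list of item documents (hypothetical helper).
function orderTotal(items) {
  return items.reduce((sum, item) => sum + item.quantity * item.price, 0);
}

// Embedded: a single read gives everything needed.
const embeddedTotal = orderTotal(embeddedOrder.items);

// Normalized: a second lookup resolves each reference to its item document.
const itemsById = new Map(itemsCollection.map((item) => [item._id, item]));
const resolvedItems = normalizedOrder.items.map((id) => itemsById.get(id));
const normalizedTotal = orderTotal(resolvedItems);

console.log(embeddedTotal, normalizedTotal); // 53000 53000
```

In MongoDB itself, the second lookup would typically be another query or a $lookup stage in an aggregation pipeline.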
Flexible Schema
Many applications today do not have a fixed data format or structure, since new kinds of information in
different data types are constantly being added; emails, social media posts, and customer reviews are all examples
that show the need for a flexible schema. In these cases, each data record may have different elements
such as text, images, hashtags, location data, emojis, etc.
Document databases are flexible enough to accommodate this kind of data and its irregular structure.
Whereas relational databases often end up with many null values for optional columns, in a document database
fields that don't have a value simply do not need to be included in the document.
High Scalability
As your applications grow with higher read/write operations and larger data, scalability becomes an important
factor to consider, since your original setup and resources (CPU, RAM, hard disk, etc.) may not be able to
handle the increased load.
One significant advantage of document databases over traditional relational databases is their ability to scale
horizontally, that is, to add more servers (nodes) to the database cluster to handle increased traffic and storage
needs. Splitting the data across those nodes is known as "sharding". For large workloads this option is often
more cost-effective than vertical scaling.
Both relational and non-relational databases can scale vertically, where you increase the computational
resources of a single server based on your needs. Most of the time, however, the performance and costs of vertical
scaling do not scale linearly: you might reach a point of diminishing returns where more resources do not
necessarily lead to an equal increase in performance. In such cases, you might need to scale horizontally by adding
more servers to your database cluster. Even though it is possible, horizontal scaling is challenging and complex in
relational databases, because related rows in different tables can end up on different nodes and must be joined
across them.
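The idea of spreading documents across nodes can be illustrated with a small hash-based routing sketch. The shard names, the hashKey function, and the modulo placement are all assumptions made for illustration; this is not MongoDB's actual hashing or balancing logic.

```javascript
// Hypothetical shard names; a real cluster is configured, not hard-coded.
const shards = ["shardA", "shardB", "shardC"];

// Simple deterministic string hash (illustrative only, not MongoDB's hash function).
function hashKey(key) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Pick the shard for a document based on its shard key value.
function shardFor(key, shardList) {
  return shardList[hashKey(key) % shardList.length];
}

// Route a few hypothetical order ids to shards.
const placement = {};
for (const id of ["order1", "order2", "order3", "order4"]) {
  placement[id] = shardFor(id, shards);
}
console.log(placement);
```

The same key always maps to the same shard, so reads know where to look. A plain modulo remaps most keys when a shard is added; MongoDB instead distributes data in ranges of shard key values (optionally hashed), which can be moved between shards.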
Performance
The two benefits mentioned above (flexible schema and high scalability) help document databases perform
well, particularly when working with nested objects and documents; you can query and update nested objects
in a single atomic operation. Applications where this can be a big advantage include content management
systems, social media apps, real-time analytics, IoT applications, and any use case where you need to handle
numerous data types and structures.
With horizontal scaling, document databases can handle large amounts of data and high traffic loads by
spreading them across multiple distributed nodes. And since related data is stored in a single document, there
is no need for complex JOIN operations to read it; together with the ability to create indexes on any field, even
in a nested object, this often makes data retrieval faster.
Consistency in the context of document-oriented databases (like MongoDB) refers to the guarantee that the database will
always be in a valid state after any operation. Essentially, it means that after a transaction or update, the data will be accurate,
reliable, and reflect the intended changes correctly.
MongoDB supports consistency through atomic single-document operations, replication across replica sets, and
multi-document ACID (Atomicity, Consistency, Isolation, Durability) transactions.
In MongoDB, a write to a single document is atomic. This means that when you update a document, either all of the
changes in that operation are applied, or nothing changes at all.
Example: Let’s say you're updating an order document to modify the quantity of an item:
db.orders.updateOne(
{ "_id": "order1" },
{ $set: { "items.0.quantity": 2 } }
);
This update ensures that the entire document is consistent. If the operation is successful, the updated quantity is reflected in the
document. If the operation fails, no changes are made, ensuring the document remains in a consistent state.
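The all-or-nothing behaviour described above can be imitated in plain JavaScript by applying changes to a copy and keeping the copy only if every change succeeds. The helpers setPath and atomicSet are hypothetical names. Unlike MongoDB's $set, which creates missing fields, this sketch rejects a missing path on purpose so that a failure can be demonstrated.

```javascript
// Set a value at a dotted path such as "items.0.quantity" on a plain object.
// Assumption: a missing intermediate segment is treated as an error here.
function setPath(target, path, value) {
  const parts = path.split(".");
  let node = target;
  for (const part of parts.slice(0, -1)) {
    if (node[part] === undefined || node[part] === null) {
      throw new Error(`path segment "${part}" does not exist`);
    }
    node = node[part];
  }
  node[parts[parts.length - 1]] = value;
}

// Apply every change to a deep copy; return the copy only if all changes succeed.
function atomicSet(doc, changes) {
  const copy = JSON.parse(JSON.stringify(doc));
  for (const [path, value] of Object.entries(changes)) {
    setPath(copy, path, value); // a throw here discards the copy
  }
  return copy;
}

let order = { _id: "order1", items: [{ product: "Laptop", quantity: 1 }] };

order = atomicSet(order, { "items.0.quantity": 2 });
console.log(order.items[0].quantity); // 2

try {
  // The second change fails, so the first change in this call is not kept either.
  order = atomicSet(order, { "items.0.quantity": 5, "shipping.address.city": "Pune" });
} catch (err) {
  console.log("update rejected:", err.message);
}
console.log(order.items[0].quantity); // still 2
```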
JSON-like Document Format (BSON in MongoDB)
Benefit: Storing data in a JSON-like format allows for easy mapping between the structure of
the database and the data structures used in many programming languages (e.g., JavaScript
objects or Python dictionaries). This results in simpler data management and better integration
with applications that already use JSON for data transfer.
Example:
{
  "_id": 1,
  "author": {
    "email": "[email protected]"
  },
  "published_at": "2025-01-09"
}
Feature: Document databases offer rich indexing and query capabilities. Indexes can be created
on any field, including nested fields within documents. They can support text search, range
queries, and geospatial queries.
Benefit: The ability to index both top-level fields and nested fields ensures fast retrieval of
documents based on specific criteria, making it efficient to query large volumes of data.
Example: If you need to find all blog posts written by a specific author or retrieve all products
with a certain rating, you can create indexes on fields like author.name or reviews.rating, which
makes such queries faster.
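As a rough illustration of why an index on a nested field speeds up lookups, the sketch below builds a lookup table keyed by author.name over an in-memory array. The posts data and the helpers getPath and buildIndex are hypothetical; on a real server the index would be created with db.posts.createIndex({ "author.name": 1 }).

```javascript
// Hypothetical blog posts with a nested author sub-document.
const posts = [
  { _id: 1, title: "Intro to BSON", author: { name: "Alice" } },
  { _id: 2, title: "Sharding basics", author: { name: "Bob" } },
  { _id: 3, title: "Replica sets", author: { name: "Alice" } },
];

// Read a nested value using a dotted path such as "author.name".
function getPath(doc, path) {
  return path.split(".").reduce((node, key) => (node == null ? undefined : node[key]), doc);
}

// Build a lookup table from field value to the documents holding that value.
function buildIndex(docs, path) {
  const index = new Map();
  for (const doc of docs) {
    const key = getPath(doc, path);
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// With the table built once, a lookup no longer scans every document.
const byAuthor = buildIndex(posts, "author.name");
const alicePosts = byAuthor.get("Alice") || [];
console.log(alicePosts.map((p) => p.title)); // [ 'Intro to BSON', 'Replica sets' ]
```

A Map only helps equality lookups. MongoDB's indexes are ordered structures, so they can also serve range queries and sorting.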
Feature: Document databases like MongoDB provide built-in replication features, which
automatically replicate data across multiple nodes or data centers for fault tolerance and high
availability.
Benefit: This ensures that data is always available, even if some servers fail, and helps in load
balancing by distributing read operations across replicas.
Example: In MongoDB, you can set up replica sets, where data is automatically replicated
across multiple servers, ensuring high availability and automatic failover if a primary server goes
down.
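The failover behaviour of a replica set can be pictured with a toy simulation. Everything here (the member objects, write, failover) is a simplified assumption for illustration; real replica sets replicate asynchronously through an operation log and choose a new primary by majority election.

```javascript
// Hypothetical three-member replica set: one primary, two secondaries.
const members = [
  { host: "node1", role: "primary", healthy: true, data: [] },
  { host: "node2", role: "secondary", healthy: true, data: [] },
  { host: "node3", role: "secondary", healthy: true, data: [] },
];

// Writes go to the primary and are then copied to healthy secondaries.
function write(set, doc) {
  const primary = set.find((m) => m.role === "primary" && m.healthy);
  if (!primary) throw new Error("no primary available");
  primary.data.push(doc);
  for (const m of set) {
    if (m.role === "secondary" && m.healthy) m.data.push(doc);
  }
}

// If the primary is down, promote the first healthy member (simplified election).
function failover(set) {
  const primary = set.find((m) => m.role === "primary");
  if (primary && primary.healthy) return primary.host;
  if (primary) primary.role = "secondary";
  const next = set.find((m) => m.healthy);
  if (!next) throw new Error("no healthy member to promote");
  next.role = "primary";
  return next.host;
}

write(members, { _id: "order1" });
members[0].healthy = false;           // simulate the primary failing
const newPrimary = failover(members);
write(members, { _id: "order2" });    // writes continue on the new primary

console.log(newPrimary);              // node2
console.log(members[1].data.length);  // 2
```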
Feature                             Benefit
----------------------------------  ---------------------------------------------------------------
Schema-less/Flexible Schema         Easily accommodates varying content and changing data structures without requiring schema migrations.
JSON-like Document Format           Easy integration with web technologies, no need for transformations when handling structured data.
Nested Data Structures              Supports complex, hierarchical data like arrays or subdocuments, reducing the need for joins.
Indexing & Querying                 Fast and flexible queries with indexing on any document field, including nested fields.
Aggregation Framework               Supports complex data transformations like grouping, filtering, and statistical analysis.
Horizontal Scalability (Sharding)   Enables distributed storage and load balancing for large datasets and high traffic applications.
Atomic Operations                   Ensures consistent updates to documents without affecting the entire database.
Replication & High Availability     Built-in fault tolerance and high availability through data replication.
JSON Compatibility                  Direct compatibility with JSON-based applications, APIs, and frontend frameworks.
{
  author: {
    nationality: "British",
  },
  isbn: "978-0618640157",
  number_of_pages: 1178,
  has_movie_adaptation: true,
  movie_adaptation: {
  },
}
1. Why is MongoDB a good fit for e-commerce applications, especially when
dealing with large catalogs, product variations, and customer data?
MongoDB is an excellent choice for e-commerce applications due to its
ability to handle the complexity and scalability requirements that arise when
dealing with large catalogs, product variations, and customer data. Here are
several key reasons why MongoDB is well-suited for these use cases:
1. Schema Flexibility for Complex Product Data
Product Catalogs: E-commerce platforms typically have large and diverse
catalogs of products, each with varying attributes (e.g., size, color, material,
price, description). MongoDB’s schema-less design allows for flexible storage
of product data, where each product can have different fields depending on its
type (e.g., electronics may have different attributes than clothing).
o Benefit: You don't need to define a rigid schema for all products in
advance. New product categories or attributes can be added easily
without disrupting the existing data structure, making the platform more
adaptable to new products or business needs.
o Example: A clothing item could have attributes like color, size, and
material, while an electronic item might include brand, warranty, and
features. MongoDB allows for products in the same collection to have
varying structures.
2. Handling Product Variations
Product Variations: Many products in an e-commerce store come in multiple
variations, such as different sizes, colors, or configurations (e.g., a T-shirt
available in different sizes and colors, or a smartphone with various storage
options).
o Benefit: MongoDB makes it easy to store these variations as embedded
documents or arrays within a single product document. This reduces the
need for separate tables or complex join operations to retrieve all the
variations of a product.
o Example: For a product like a T-shirt, the document could contain an
array of sub-documents, each representing a variation:
{
"_id": "12345",
"name": "Classic T-shirt",
"price": 20.00,
"variations": [
{"size": "S", "color": "Red", "stock": 100},
{"size": "M", "color": "Blue", "stock": 50},
{"size": "L", "color": "Black", "stock": 30}
]
}
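Because the variations live inside the product document, common questions about them can be answered from that one document. The following sketch works on a plain JavaScript copy of the T-shirt document; findVariation is a hypothetical helper, not a MongoDB function.

```javascript
// The T-shirt document from the example, as a plain JavaScript object.
const tshirt = {
  _id: "12345",
  name: "Classic T-shirt",
  price: 20.0,
  variations: [
    { size: "S", color: "Red", stock: 100 },
    { size: "M", color: "Blue", stock: 50 },
    { size: "L", color: "Black", stock: 30 },
  ],
};

// Total stock across all variations of the product.
const totalStock = tshirt.variations.reduce((sum, v) => sum + v.stock, 0);

// Find one specific variation without reading any other collection.
function findVariation(product, size, color) {
  return product.variations.find((v) => v.size === size && v.color === color) || null;
}

console.log(totalStock);                               // 180
console.log(findVariation(tshirt, "M", "Blue").stock); // 50
console.log(findVariation(tshirt, "XL", "Red"));       // null
```

On the server, a comparable filter would use $elemMatch, for example db.products.find({ variations: { $elemMatch: { size: "M", color: "Blue" } } }).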
Rich Product Data: E-commerce products often include complex data, such
as product specifications, reviews, ratings, and pricing tiers. MongoDB allows
you to store these data points in a nested structure within a single document.
o Benefit: This structure avoids the need for multiple tables or foreign
key relationships, simplifying queries and ensuring better performance
when fetching complex product data in a single call.
{
  "_id": "12345",
  "reviews": [ ... ],
  "specifications": { ... }
}
db.products.find({
"rating": { $gte: 4 },
"category": "electronics"
});
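To show how a filter document like the one above is read, here is a tiny matcher in plain JavaScript. It assumes top-level rating and category fields and supports only equality and $gte; the function matches and the sample products are hypothetical, and real MongoDB query evaluation is far richer.

```javascript
// Hypothetical product documents with top-level rating and category fields.
const products = [
  { name: "Phone", rating: 4.5, category: "electronics" },
  { name: "Cable", rating: 3.2, category: "electronics" },
  { name: "Jacket", rating: 4.8, category: "clothing" },
  { name: "Headphones", rating: 4, category: "electronics" },
];

// Check one document against a filter. Every field in the filter must match
// (an implicit AND). Only plain equality and { $gte: n } are handled here.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    if (condition !== null && typeof condition === "object" && "$gte" in condition) {
      return doc[field] >= condition.$gte;
    }
    return doc[field] === condition;
  });
}

const filter = { rating: { $gte: 4 }, category: "electronics" };
const result = products.filter((p) => matches(p, filter));

console.log(result.map((p) => p.name)); // [ 'Phone', 'Headphones' ]
```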
"_id": "customer123",
"orders": [
],