Unit II NoSQL Database Management
• Aggregate-oriented NoSQL databases generally do not support ACID transactions
that span multiple aggregates, so they give up some of the ACID guarantees;
updates to a single aggregate are typically atomic. Aggregate data models can
also make OLAP (Online Analytical Processing) operations on the database
easier to perform.
Example of Aggregate Data Model:
Aggregation:
• Customer Aggregate: Includes the customer’s details and billing addresses.
• Order Aggregate: Contains details about the order, including the shipping address, order items,
and payments.
Denormalization:
• In the example, the BillingAddress appears multiple times (in the customer and in the payment).
This avoids having to look up the address in a separate place and keeps the address recorded with
each payment as it was at that time, even if the customer's address changes later.
• This is a trade-off in NoSQL. While it may involve some duplication, it reduces the need for
complex joins and improves performance.
No Need for IDs in Aggregates:
• Instead of using IDs to reference addresses and other data, the full address information is
included directly in each aggregate. This simplifies data retrieval, because each aggregate can
be read as a single self-contained unit.
Relationship Between Aggregates:
• The link between a customer and their orders is maintained through the CustomerID in the order
aggregate but is not part of the customer aggregate itself.
• Similarly, the ProductName is included in the order items for simplicity, but the actual product
Alternative model: embed all the objects for the customer and the customer's orders in a single
aggregate, using the above data model.
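The customer and order aggregates described above can be sketched as plain Python dictionaries. This is only an illustration: the field values and the helper functions `orders_for` and `order_total` are assumptions, not part of any particular database.

```python
# Field values are made up for illustration; only the shape follows the text.
customer = {
    "CustomerID": 1,
    "Name": "Asha",
    "BillingAddress": [{"City": "Pune"}],
}

orders = [
    {
        "OrderID": 99,
        "CustomerID": 1,  # the only link to the customer aggregate
        "ShippingAddress": {"City": "Pune"},
        "OrderItems": [
            {"ProductID": 27, "ProductName": "NoSQL textbook", "Price": 30.0},
            {"ProductID": 31, "ProductName": "Notebook", "Price": 5.5},
        ],
        "Payments": [
            # The billing address is copied in, not referenced by ID.
            {"Amount": 35.5, "BillingAddress": {"City": "Pune"}},
        ],
    },
]


def orders_for(customer_id, all_orders):
    """Follow the CustomerID link from orders back to one customer."""
    return [o for o in all_orders if o["CustomerID"] == customer_id]


def order_total(order):
    """Everything needed is inside the order aggregate, so no join is required."""
    return sum(item["Price"] for item in order["OrderItems"])


customer_orders = orders_for(customer["CustomerID"], orders)
total = order_total(customer_orders[0])
```

Note that the order holds the CustomerID while the customer holds no list of orders, so finding a customer's orders means searching the order aggregates.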
Key-Value and Document Data Models
• Key-value and document databases are strongly aggregate-oriented: they are
primarily constructed through aggregates. Both types of database consist of
lots of aggregates, with each aggregate having a key or ID that is used to get at
the data.
• The two models differ in that in a key-value database the aggregate is opaque to the
database, just some big blob of mostly meaningless bits. A document database, in
contrast, can see a structure in the aggregate.
• In practice, the line between key-value and document gets a bit blurry. People often
put an ID field in a document database to do a key-value style lookup. Databases
classified as key-value databases may allow you structures for data beyond just
an opaque aggregate.
• For example, Riak allows you to add metadata to aggregates for indexing and
interaggregate links.
• Redis allows you to break down the aggregate into lists or sets. You can support querying
by integrating search tools such as Solr. As an example, Riak includes a search facility that
uses Solr-like searching on any aggregates that are stored as JSON or XML structures.
• In the key-value model, data is stored in key/value pairs. The model is designed
to handle lots of data and heavy load. Key-value pair storage databases store data
as a hash table where each key is unique, and the value can be JSON, a BLOB (Binary
Large Object), a string, etc. For example, a key-value pair may contain a key like
"Website" associated with a value like "JavaTpoint".
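The difference between an opaque value and a visible structure can be sketched with two toy in-memory stores. The classes `KeyValueStore` and `DocumentStore` and the `kind` field are hypothetical names used only for illustration; they do not mirror the API of any real product.

```python
import json


class KeyValueStore:
    """Toy key-value store: values are opaque bytes, lookup is by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value: bytes):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class DocumentStore:
    """Toy document store: values are dicts whose fields the store can inspect."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc: dict):
        self._docs[doc_id] = doc

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, field, value):
        """Query by a field inside the aggregate, not just by its ID."""
        return [d for d in self._docs.values() if d.get(field) == value]


# Key-value: the store never looks inside the blob; the client decodes it.
kv = KeyValueStore()
kv.put("Website", json.dumps({"name": "JavaTpoint"}).encode("utf-8"))
site = json.loads(kv.get("Website").decode("utf-8"))

# Document: the store itself can filter on fields of the aggregate.
docs = DocumentStore()
docs.put("site:1", {"name": "JavaTpoint", "kind": "tutorial"})
docs.put("site:2", {"name": "Example", "kind": "blog"})
tutorials = docs.find("kind", "tutorial")
```

With the key-value store, any filtering on the contents has to happen in the application after fetching by key; the document store can do it because it understands the structure.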
Sharding
• Sharding involves dividing a large database into smaller, more manageable parts
called shards or partitions.
• Each shard contains a subset of the data and is stored on a separate node or server in
the distributed system.
• The sharding process typically involves selecting a shard key or partitioning key,
which determines how data is distributed across shards.
• The goal of sharding is to evenly distribute data to avoid bottlenecks and enable
horizontal scalability.
• Sharding is commonly used in NoSQL databases to handle large-scale datasets and
achieve better performance and scalability.
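One common way to map a shard key to a shard is to hash the key and take the result modulo the number of shards. The sketch below assumes that approach; `shard_for` and `ShardedStore` are hypothetical names, and each in-memory dictionary stands in for a separate node.

```python
import hashlib


def shard_for(shard_key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy store whose data is split across several dictionaries (shards)."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        # Only the one shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"customer:{i}", {"id": i})

# Number of records that ended up on each shard.
sizes = [len(shard) for shard in store.shards]
```

A hash spreads keys fairly evenly, which serves the goal of avoiding bottlenecks. The simple modulo scheme has a cost, though: changing the number of shards moves most keys to a different shard.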
Replication
• Replication involves maintaining multiple copies (replicas) of data across different
nodes in the distributed database cluster.
• Each replica is an exact copy of the data stored on a separate server.
• Replication enhances data availability, fault tolerance, and read performance by
allowing data to be served from multiple replicas.
• Different replication models include master-slave replication and multi-master
replication.
• In master-slave replication, one node (master) accepts write operations and
asynchronously propagates changes to one or more replica nodes (slaves). Read queries
can be distributed among replicas, reducing the load on the master.
• In multi-master replication, multiple nodes can accept write operations, and changes
are replicated to other nodes. This approach allows for better write scaling and high
availability, but concurrent writes to the same data on different nodes can conflict
and must be reconciled.
Master-slave replication
Master-slave replication is a method of data replication in distributed database systems where
one node, called the master or primary node, serves as the authoritative source for data, and
one or more nodes, known as slave or secondary nodes, replicate and maintain copies of the
master's data.
In a master-slave replication setup, the master node handles write operations (inserts, updates,
deletes) and propagates those changes to the slave nodes. The slave nodes synchronize with
the master to receive and apply the changes, ensuring that they have an up-to-date copy of the
data. Slave nodes are typically read-only, meaning they do not accept write operations directly.
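The flow described above can be sketched as a small in-memory simulation. It assumes asynchronous propagation triggered by an explicit `replicate` call and round-robin reads across the slaves; the classes `Node` and `MasterSlaveCluster` are hypothetical and exist only to show the idea.

```python
class Node:
    """One server holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class MasterSlaveCluster:
    """Toy cluster: writes go to the master, reads are served by slaves."""

    def __init__(self, num_slaves):
        self.master = Node("master")
        self.slaves = [Node(f"slave-{i}") for i in range(num_slaves)]
        self._pending = []   # changes the slaves have not applied yet
        self._next_read = 0  # round-robin counter for reads

    def write(self, key, value):
        # Only the master accepts writes.
        self.master.data[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Propagate queued changes to every slave, in write order.
        for key, value in self._pending:
            for slave in self.slaves:
                slave.data[key] = value
        self._pending.clear()

    def read(self, key):
        # Spread reads across the slaves to take load off the master.
        slave = self.slaves[self._next_read % len(self.slaves)]
        self._next_read += 1
        return slave.data.get(key)


cluster = MasterSlaveCluster(num_slaves=2)
cluster.write("status", "active")
stale = cluster.read("status")  # slaves have not caught up yet
cluster.replicate()
fresh = cluster.read("status")  # slaves now match the master
```

The read issued before `replicate` shows the price of asynchronous propagation: a slave can briefly return older data than the master holds.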
Important Questions
• What is NoSQL, and how does it differ from traditional relational databases?
• Why are NoSQL databases becoming increasingly popular in modern
applications?
• What are the key characteristics of NoSQL databases?
• Explain the concept of aggregate data models in NoSQL.
• Provide examples of aggregate data models in NoSQL.
• Describe the key-value data model and its structure.
• Explain the document data model and its structure.
• What is a graph database, and how does it differ from other NoSQL
databases?
• Describe the basic components of a graph database, including nodes, edges, and
properties.
• Explain the concept of schemaless databases.
• What are the benefits of using schemaless databases?
• What are the challenges of working with schemaless databases?
• What are materialized views in NoSQL databases?
• How do materialized views improve query performance?
• When should you consider using materialized views?
• Why are distribution models important in NoSQL databases?
• Explain different distribution models used in NoSQL databases, such as sharding and
replication.
• What are the factors to consider when choosing a distribution model?
• What is master-slave replication, and how does it work?
• When is master-slave replication suitable, and when might other replication
• Query: Retrieve all documents from a MongoDB collection where the
"status" field is "active".
• Query: Create a collection in MongoDB that aggregates orders by
customer ID.
• Insert a new document into a MongoDB collection representing a blog
post with title, content, and tags.
• Query: Create a materialized view that finds the greater salary.