Unit II No-SQL Db Management

The document provides an overview of NoSQL data management, covering various data models including aggregate, key-value, document, and graph databases. It explains the advantages of NoSQL over traditional relational databases, particularly in handling large volumes of data and real-time applications. Additionally, it discusses distribution models, materialized views, and the concept of schemaless databases, highlighting their significance in modern data management systems.

Uploaded by

daivshaladhepale

UNIT- II NOSQL Data Management

• Introduction to NoSQL, aggregate data models, key-value and document data models,
• relationships, graph databases, schemaless databases, materialized views,
• distribution models, master-slave replication
Introduction to NoSQL
What is NoSQL?
• A NoSQL database is a non-relational data management system that does not
require a fixed schema. It avoids joins and is easy to scale. The major
purpose of a NoSQL database is to serve distributed data stores with very
large data storage needs. NoSQL is used for big data and real-time web
apps. For example, companies like Twitter, Facebook and Google collect
terabytes of user data every single day.
• NoSQL stands for "Not Only SQL" or "Not SQL." Though a better term would
be "NoREL", NoSQL caught on. Carlo Strozzi introduced the term NoSQL in
1998.
• A traditional RDBMS uses SQL syntax to store and retrieve data for further
insights. A NoSQL database system instead encompasses a wide range of
database technologies that can store structured, semi-structured,
unstructured and polymorphic data.
Why NoSQL?
• The concept of NoSQL databases became popular with Internet giants like
Google, Facebook, Amazon, etc., who deal with huge volumes of data. System
response time becomes slow when you use an RDBMS for massive volumes of
data. To resolve this problem, we could "scale up" our systems by upgrading
our existing hardware, but this process is expensive. The alternative is to
distribute the database load across multiple hosts whenever the load
increases. This method is known as "scaling out."
• Brief History of NoSQL Databases
• 1998- Carlo Strozzi uses the term NoSQL for his lightweight, open-source
relational database
• 2000- Graph database Neo4j is launched
• 2004- Google BigTable is launched
• 2005- CouchDB is launched
• 2007- The research paper on Amazon Dynamo is released
• 2008- Facebook open-sources the Cassandra project
• 2009- The term NoSQL is reintroduced
AGGREGATE DATA MODELS
• An aggregate is a collection of related objects that we treat as a unit when
we interact with the data. Aggregates form the boundaries for ACID
operations: a change to a single aggregate is typically applied atomically.
• Aggregate data models make it easier for a database to manage data storage
over a cluster, because each aggregate can reside on any one of the machines.
When an aggregate is retrieved from the database, all of its data comes along
with it.

• Aggregate-oriented NoSQL databases generally do not support ACID transactions
that span multiple aggregates, so some of the ACID guarantees are given up
across aggregates. Aggregate data models can also make some OLAP (Online
Analytical Processing) operations on the database easier to perform.
Example of Aggregate Data Model:
Aggregation:
• Customer Aggregate: Includes the customer’s details and billing addresses.
• Order Aggregate: Contains details about the order, including the shipping address, order items,
and payments.
Denormalization:
• In the example, the BillingAddress appears multiple times (in the customer and in the payment).
This avoids having to look up the address in a separate place, and it keeps the address recorded
with a payment as it was at the time, even if the customer's address changes later.
• This is a trade-off in NoSQL. It involves some duplication, but it reduces the need for
complex joins and improves read performance.
No Need for IDs in Aggregates:
• Instead of using IDs to reference addresses and other data, the full address information is
included directly in each aggregate. This simplifies data retrieval, because one read returns
everything the aggregate needs.
Relationship Between Aggregates:
• The link between a customer and their orders is maintained through the CustomerID in the order
aggregate but is not part of the customer aggregate itself.
• Similarly, the ProductName is included in the order items for simplicity, but the actual product
Embed all the objects for customer and the customer’s orders Using the above data model
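The customer and order aggregates described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory `store`, the `put` and `get` helpers, and all field names and values are hypothetical and do not come from any particular database.

```python
import json

# In-memory stand-in for an aggregate-oriented store (hypothetical):
# (kind, id) -> serialized aggregate.
store = {}


def put(kind, aggregate):
    """Write a whole aggregate as one unit."""
    store[(kind, aggregate["id"])] = json.dumps(aggregate)


def get(kind, aggregate_id):
    """Read a whole aggregate back as one unit."""
    return json.loads(store[(kind, aggregate_id)])


# Customer aggregate: customer details plus billing addresses.
put("customer", {
    "id": 1,
    "name": "Asha",
    "billingAddress": [{"city": "Chicago"}],
})

# Order aggregate: addresses are embedded copies (denormalized);
# only the customer is referenced by ID.
put("order", {
    "id": 99,
    "customerId": 1,
    "orderItems": [{"productId": 27, "productName": "Keyboard", "price": 30}],
    "shippingAddress": {"city": "Chicago"},
    "orderPayment": [{"txnId": "t-1", "billingAddress": {"city": "Chicago"}}],
})

# One read returns the whole order; the customer is reached through customerId.
order = get("order", 99)
customer = get("customer", order["customerId"])
print(order["shippingAddress"]["city"], customer["name"])
```

Note that the customer aggregate holds no list of orders; the link exists only on the order side, as described above.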
Key-value and document data models
• Key-value and document databases are strongly aggregate-oriented: we think of these
databases as primarily constructed through aggregates. Both of these types of databases
consist of lots of aggregates, with each aggregate having a key or ID that is used to get at
the data.
• The two models differ in that in a key-value database, the aggregate is opaque to the
database, just some big blob of mostly meaningless bits. In a document database, by
contrast, the database can see a structure in the aggregate.
• In practice, the line between key-value and document gets a bit blurry. People often
put an ID field in a document database to do a key-value style lookup. Databases
classified as key-value databases may allow you structures for data beyond just
an opaque aggregate.
• For example, Riak allows you to add metadata to aggregates for indexing and
interaggregate links.
• Redis allows you to break down the aggregate into lists or sets. You can support querying
by integrating search tools such as Solr. As an example, Riak includes a search facility that
uses Solr-like searching on any aggregates that are stored as JSON or XML structures.
• Data is stored in key/value pairs. The design is meant to handle lots of
data and heavy load. Key-value stores keep data as a hash table
where each key is unique, and the value can be JSON, a BLOB (Binary Large
Object), a string, etc. For example, a key-value pair may contain a key like
"Website" associated with a value like "JavaTpoint".

• It is one of the most basic types of NoSQL database. This kind of NoSQL
database is used for collections, dictionaries, associative arrays, etc.
Key-value stores help the developer to store schema-less data.
• They work well for shopping cart contents. Redis, Dynamo and Riak are some
examples of key-value stores. Dynamo is described in Amazon's Dynamo paper,
and Riak follows the design from that paper; Redis is an in-memory
data-structure store.
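As a minimal sketch of the key-value model described above, the following Python class keeps opaque byte values under unique keys. The class name `KeyValueStore`, its methods, and the `cart:42` key are hypothetical and chosen only for illustration; real stores such as Redis or Riak have their own APIs.

```python
import json


class KeyValueStore:
    """Toy key-value store: unique keys, values are opaque bytes."""

    def __init__(self):
        self._data = {}  # behaves like the hash table described above

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("value must be bytes; the store does not inspect it")
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)  # None when the key is absent

    def delete(self, key):
        self._data.pop(key, None)


kv = KeyValueStore()

# The application decides how to encode the shopping cart; the store only sees bytes.
cart = {"items": [{"sku": "A1", "qty": 2}]}
kv.put("cart:42", json.dumps(cart).encode("utf-8"))

# The only access path is the key; the application decodes the blob itself.
loaded = json.loads(kv.get("cart:42").decode("utf-8"))
print(loaded["items"][0]["qty"])
```

Because the store never looks inside the value, it cannot answer a question such as "which carts contain item A1" without the application reading every value.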
Document data models
• A document data model differs from other data models because it stores data in
JSON, BSON, or XML documents. In this data model, documents can be nested inside
another document, and particular elements can be indexed to run queries faster.
• Documents are often stored and retrieved in a form that is close to the data
objects used in applications, which means very little translation is required to use
the data in applications. JSON is a format that is often used natively to store and
query data too.
• In the document data model, each document consists of key-value pairs. Below is an
example:
{
"Name" : "Yashodhra",
"Address" : "Near Patel Nagar",
"Email" : "[email protected]",
"Contact" : "12345"
}
• Document-Oriented:
• A document-oriented NoSQL DB stores and retrieves data as a key-value
pair, but the value part is stored as a document. The document is stored in
JSON or XML format. The value is understood by the DB and can be
queried.
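To illustrate the point that a document database understands the value and can index and query it, here is a small in-memory sketch. The `DocumentCollection` class and its method names are hypothetical, not the API of MongoDB or any other product, and the sample documents are invented.

```python
from collections import defaultdict


class DocumentCollection:
    """Toy document collection with field queries and optional field indexes."""

    def __init__(self):
        self._docs = {}     # document ID -> document (a dict)
        self._indexes = {}  # field name -> {field value -> set of document IDs}

    def insert(self, doc_id, doc):
        self._docs[doc_id] = doc
        # Keep every existing index up to date for the new document.
        for field, index in self._indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def create_index(self, field):
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self._indexes[field] = index

    def find(self, field, value):
        if field in self._indexes:
            # Indexed lookup: no scan over all documents.
            ids = self._indexes[field].get(value, set())
            return [self._docs[i] for i in sorted(ids)]
        # Unindexed lookup: scan every document.
        return [d for d in self._docs.values() if d.get(field) == value]


people = DocumentCollection()
people.insert("p1", {"Name": "Yashodhra", "Address": "Near Patel Nagar"})
people.insert("p2", {"Name": "Ravi", "Contact": "12345"})

scan_result = people.find("Name", "Ravi")
people.create_index("Name")
index_result = people.find("Name", "Ravi")
print(scan_result == index_result)
```

The same query works with or without the index; the index only changes how the matching documents are located.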
Graph databases
• A graph database is a type of NoSQL database designed to handle data with
complex relationships and interconnections. In a graph database, data is stored as nodes
and edges, where nodes represent entities and edges represent the relationships between
those entities.
The components are described as follows:
Nodes: represent the objects or instances. A node is roughly equivalent to a row in a
relational database and acts as a vertex in the graph. Nodes are grouped by applying a
label to each member.
Relationships: these are the edges in the graph. Each has a specific direction and type,
and together they form the patterns of the data. They establish the relationships
between nodes.
Properties: the information associated with a node.
• Graph-Based
• A graph database stores entities as well as the relations among those
entities. Each entity is stored as a node and each relationship as an edge. An
edge gives a relationship between nodes. Every node and edge has a
unique identifier.
• Compared to a relational database, where tables are loosely connected, a
graph database is multi-relational in nature.
• Traversing relationships is fast because they are already captured in the DB
and there is no need to calculate them.
• Graph databases are mostly used for social networks, logistics and spatial
data. Neo4J, Infinite Graph, OrientDB and FlockDB are some popular
graph-based databases.
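The node, relationship and property concepts above can be sketched with a small in-memory graph. This is a simplified illustration, not how Neo4j or any other product stores data; the `Graph` class, the `FRIEND` relationship type and the sample people are hypothetical.

```python
from collections import defaultdict, deque


class Graph:
    """Toy property graph: labelled nodes with properties, typed directed edges."""

    def __init__(self):
        self.nodes = {}                # node ID -> {"label": ..., "properties": {...}}
        self._out = defaultdict(list)  # node ID -> [(relationship type, target ID)]

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = {"label": label, "properties": properties}

    def add_edge(self, source, rel_type, target):
        # The relationship is stored with the node, so traversal is a direct lookup.
        self._out[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        return [t for r, t in self._out[node_id] if r == rel_type]

    def reachable(self, start, rel_type):
        """Breadth-first traversal along one relationship type."""
        seen = {start}
        queue = deque([start])
        order = []
        while queue:
            current = queue.popleft()
            for target in self.neighbors(current, rel_type):
                if target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


g = Graph()
for person in ("alice", "bob", "carol", "dave"):
    g.add_node(person, "Person", name=person.title())
g.add_edge("alice", "FRIEND", "bob")
g.add_edge("bob", "FRIEND", "carol")
g.add_edge("alice", "FRIEND", "dave")

direct_friends = g.neighbors("alice", "FRIEND")
network = g.reachable("alice", "FRIEND")
print(direct_friends, network)
```

Finding friends of friends here follows stored edges one hop at a time instead of joining tables, which is the traversal advantage the bullets above describe.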
Schemaless databases
• Schemaless databases, also known as schema-free or schema-less databases,
are a type of database management system (DBMS) that allows for flexible and
dynamic data modeling without rigidly predefined schemas.
• Schemaless databases provide a more agile and adaptable approach for storing
and querying data.
• In schemaless databases, data is typically stored in a format that does not
require a predefined schema to be specified before storing the data.
• Each document or record within the database can have its own structure, and
the database system does not enforce a specific schema on the data.
• This means that different documents within the same collection or table can
have varying structures and fields.
How does a schemaless database work?
• In schemaless databases, information is stored in JSON-style documents,
which can have varying sets of fields with different data types for each
field. So, a collection could look like this:
{ "name": "Joe", "age": 30, "interests": "football" }
{ "name": "Kate", "age": 25 }
What are the benefits of using a schemaless database?
• Greater flexibility over data types: schemaless databases can store, retrieve,
and query a wide range of data types, which suits big data analytics and
similar operations
• No pre-defined database schemas
• No data truncation: a schemaless database makes almost no changes to
your data
• Suitable for real-time analytics functions
• Enhanced scalability and flexibility
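The Joe and Kate collection above can be handled in application code roughly as follows. This is a minimal sketch using plain Python dictionaries; the helper names `fields_in_use` and `find_with_field` are hypothetical. It shows the practical consequence of having no enforced schema: the reader must tolerate missing fields.

```python
# Documents in the same collection with different sets of fields.
people = [
    {"name": "Joe", "age": 30, "interests": "football"},
    {"name": "Kate", "age": 25},
]


def fields_in_use(collection):
    """Collect every field name that appears in at least one document."""
    names = set()
    for doc in collection:
        names.update(doc.keys())
    return sorted(names)


def find_with_field(collection, field):
    """Return only the documents that actually carry the given field."""
    return [doc for doc in collection if field in doc]


all_fields = fields_in_use(people)
with_interests = [doc["name"] for doc in find_with_field(people, "interests")]

# Reading an optional field needs a default, because nothing guarantees it exists.
interests = [doc.get("interests", "unknown") for doc in people]
print(all_fields, with_interests, interests)
```

Adding a new field to one document requires no migration of the others, which is the flexibility listed among the benefits above.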
Materialized views
• Materialized views, called indexed views in some database systems, are
database objects that store the results of a query in a precomputed and
persistent form. They are derived from one or more source tables or views
and are used to improve query performance by providing faster access to
frequently queried or complex data.
• In a traditional database system, queries often involve joining multiple tables
or performing complex calculations, which can be resource-intensive and
time-consuming.
• Materialized views address this issue by precomputing and storing the results
of such queries, allowing subsequent queries to retrieve the data directly
from the materialized view instead of re-executing the original query.
1. Data Storage: Materialized views store the actual result set of a query, typically as a
table-like structure in the database. The data in the materialized view is updated
periodically to reflect changes in the underlying source tables.
2. Query Performance: By storing precomputed results, materialized views eliminate
the need for executing complex queries repeatedly. This improves query performance
by reducing the processing and computation time required for data retrieval.
3. Data Aggregation and Joins: Materialized views are commonly used for aggregating
data or joining multiple tables to simplify and optimize complex queries. They can
store the aggregated or joined results, allowing for faster access to the desired data.
4. Maintenance and Refresh: Materialized views need to be maintained and
refreshed to reflect changes in the underlying data. Depending on the database
system, materialized views can be refreshed on a schedule or triggered by specific
events or updates to the source data.
5. Query Rewrite: Some database systems support automatic query rewrite, where
the optimizer recognizes queries that can be satisfied using a materialized view and
rewrites the query to use the materialized view instead. This further improves
performance by transparently utilizing the materialized view.
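The storage and refresh behaviour listed above can be sketched with Python's built-in `sqlite3` module. SQLite has no native materialized views, so this example emulates one, as an assumption made for illustration, by storing a query result in an ordinary table and rebuilding it on demand. The table names `orders` and `order_totals` and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO orders (customer_id, amount) VALUES (1, 10.0), (1, 15.0), (2, 7.5);
    """
)


def refresh_order_totals(connection):
    """Recompute the stored result set from the source table (a full refresh)."""
    with connection:
        connection.execute("DROP TABLE IF EXISTS order_totals")
        connection.execute(
            """
            CREATE TABLE order_totals AS
            SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
            FROM orders
            GROUP BY customer_id
            """
        )


def read_order_totals(connection):
    """Read the precomputed rows; no aggregation runs at query time."""
    return connection.execute(
        "SELECT customer_id, total, order_count FROM order_totals ORDER BY customer_id"
    ).fetchall()


refresh_order_totals(conn)
first_read = read_order_totals(conn)

# A new order changes the source table but not the stored result.
conn.execute("INSERT INTO orders (customer_id, amount) VALUES (2, 2.5)")
stale_read = read_order_totals(conn)

# Only after a refresh does the stored result reflect the new order.
refresh_order_totals(conn)
fresh_read = read_order_totals(conn)
print(first_read, stale_read, fresh_read)
```

The stale read in the middle is the trade-off of point 4: reads are cheap, but the stored result is only as current as the last refresh.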
Distribution models
• Distribution models in database systems refer to strategies for distributing data across
multiple nodes or servers in a distributed computing environment.
• These models determine how data is partitioned and replicated to ensure availability, fault
tolerance, and efficient query processing.
• Here are some commonly used distribution models:
1. Horizontal partitioning (sharding)
2. Replication, including master-slave replication
The choice of a distribution model depends on factors such as the nature of the data, access
patterns, scalability requirements, fault tolerance goals, and performance considerations. It's
crucial to analyze the characteristics of the application and workload to determine the most
suitable distribution model for a given scenario.
Systems such as MongoDB and Apache Hadoop provide mechanisms to implement these
distribution models.
Sharding

• Sharding involves dividing a large database into smaller, more manageable parts
called shards or partitions.
• Each shard contains a subset of the data and is stored on a separate node or server in
the distributed system.
• The sharding process typically involves selecting a shard key or partitioning key,
which determines how data is distributed across shards.
• The goal of sharding is to evenly distribute data to avoid bottlenecks and enable
horizontal scalability.
• Sharding is commonly used in NoSQL databases to handle large-scale datasets and
achieve better performance and scalability.
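One common way to map a shard key to a shard is to hash it. The sketch below assumes hash-based routing with a fixed number of shards; the names `shard_for` and `ShardedStore` are hypothetical, and real systems may instead use range-based or other partitioning schemes.

```python
import hashlib


def shard_for(shard_key, num_shards):
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy sharded store: each shard is a separate dict standing in for a server."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


sharded = ShardedStore(num_shards=4)
for i in range(100):
    sharded.put(f"user:{i}", {"id": i})

# Every key lives on exactly one shard, and reads are routed the same way as writes.
shard_sizes = [len(shard) for shard in sharded.shards]
print(shard_sizes, sharded.get("user:7"))
```

With this simple modulo scheme, changing the number of shards moves most keys to a different shard, which is one reason the choice of shard key and partitioning scheme matters.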
Replication
• Replication involves maintaining multiple copies (replicas) of data across different
nodes in the distributed database cluster.
• Each replica is an exact copy of the data stored on a separate server.
• Replication enhances data availability, fault tolerance, and read performance by
allowing data to be served from multiple replicas.
• Different replication models include master-slave replication and multi-master
replication.
• In master-slave replication, one node (master) accepts write operations and
asynchronously propagates changes to one or more replica nodes (slaves). Read queries
can be distributed among replicas, reducing the load on the master.
• In multi-master replication, multiple nodes can accept write operations, and changes
are replicated to the other nodes. This approach allows for better write scaling and high
availability.
Master-slave replication
Master-slave replication is a method of data replication in distributed database systems where
one node, called the master or primary node, serves as the authoritative source for data, and
one or more nodes, known as slave or secondary nodes, replicate and maintain copies of the
master's data.

In a master-slave replication setup, the master node handles write operations (inserts, updates,
deletes) and propagates those changes to the slave nodes. The slave nodes synchronize with
the master to receive and apply the changes, ensuring that they have an up-to-date copy of the
data. Slave nodes are typically read-only, meaning they do not accept write operations directly.
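The write path described above can be sketched as follows. This is a simplified single-process model, assuming the master keeps a log of changes and ships unapplied entries to each slave when `replicate` is called; the `Master` and `Slave` classes are hypothetical and ignore failures, networking and failover.

```python
class Slave:
    """Read-only replica that applies changes received from the master."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, entry):
        op, key, value = entry
        if op == "set":
            self.data[key] = value
        elif op == "delete":
            self.data.pop(key, None)
        self.applied += 1

    def read(self, key):
        return self.data.get(key)


class Master:
    """Accepts all writes and records them in a log for the slaves."""

    def __init__(self, slaves):
        self.data = {}
        self.log = []
        self.slaves = slaves

    def write(self, key, value):
        self.data[key] = value
        self.log.append(("set", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append(("delete", key, None))

    def replicate(self):
        """Send each slave the log entries it has not applied yet."""
        for slave in self.slaves:
            for entry in self.log[slave.applied:]:
                slave.apply(entry)


replica_a, replica_b = Slave(), Slave()
master = Master([replica_a, replica_b])

master.write("user:1", "Asha")
before = replica_a.read("user:1")  # replication has not run yet
master.replicate()
after = replica_a.read("user:1")
print(before, after)
```

Because replication runs separately from the write, a read from a slave can briefly return older data than the master holds, which is the usual cost of asynchronous propagation.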
Important Questions
• What is NoSQL, and how does it differ from traditional relational databases?
• Why are NoSQL databases becoming increasingly popular in modern
applications?
• What are the key characteristics of NoSQL databases?
• Explain the concept of aggregate data models in NoSQL.
• Provide examples of aggregate data models in NoSQL.
• Describe the key-value data model and its structure.
• Explain the document data model and its structure.
• What is a graph database, and how does it differ from other NoSQL
databases?
• Describe the basic components of a graph database, including nodes, edges, and
properties.
• Explain the concept of schemaless databases.
• What are the benefits of using schemaless databases?
• What are the challenges of working with schemaless databases?
• What are materialized views in NoSQL databases?
• How do materialized views improve query performance?
• When should you consider using materialized views?
• Why are distribution models important in NoSQL databases?
• Explain different distribution models used in NoSQL databases, such as sharding and
replication.
• What are the factors to consider when choosing a distribution model?
• What is master-slave replication, and how does it work?
• When is master-slave replication suitable, and when might other replication
• Query: Retrieve all documents from a MongoDB collection where the
"status" field is "active".
• Query: Create a collection in MongoDB that aggregates orders by
customer ID.
• Query: Insert a new document into a MongoDB collection representing a blog
post with title, content, and tags.
• Query: Create a materialized view that finds the greater salaries.
