Chapter 3 NoSQL Database (1)
Dr. Pooja K R
Agenda
• Introduction to NoSQL
• What is NoSQL
Introduction to NoSQL Databases
Different SQL Databases
What is NoSQL?
Limitations of Relational databases
•The structure and schema of the data must be defined first; only
then can the data be processed.
Advantages of NoSQL
•High scalability
•High Availability
RDBMS Vs NoSQL
• RDBMS: Stores structured data; it provides more functionality but
gives less performance.
NOSQL DATABASES
What is NoSQL?
• More than rows in tables
• Free of joins
• Schema-free
• Innovative
NoSQL Database Categories
•Document Database
•Graph store
NoSQL Data Architecture Patterns
NOSQL BUSINESS DRIVERS
VOLUME
VELOCITY
VARIABILITY
AGILITY
What is the CAP Theorem?
1. Consistency
2. Availability
3. Partition Tolerance
BASE Properties
Soft state: The state of the system could change over time.
Data Models
NoSQL databases are classified in four major data
models :
Key-value
Simplest NOSQL databases
• Get(key)
• Multi-get(key1, key2, ..., keyN)
• Delete(key)
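The operations listed above can be sketched with a small in-memory store. This is only an illustration under stated assumptions: the class name `KeyValueStore` and the `put` method are hypothetical (the list above does not name a write operation), and a real key-value database would add persistence and distribution.

```python
from typing import Any, Dict, Iterable, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Assumed write operation; the value is opaque to the store.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Get(key): return the value, or None if the key is absent.
        return self._data.get(key)

    def multi_get(self, keys: Iterable[str]) -> Dict[str, Any]:
        # Multi-get(key1, ..., keyN): return only the keys that exist.
        return {k: self._data[k] for k in keys if k in self._data}

    def delete(self, key: str) -> bool:
        # Delete(key): report whether the key existed before removal.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1", {"name": "Asha"})
store.put("user:2", {"name": "Ravi"})
found = store.multi_get(["user:1", "user:2", "user:9"])
```

Every operation addresses data by key only; the store never inspects the stored values.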
KEY VALUE STORE PROS
Consistent
Scalable
Reliable
Limitation: no queries on values; data can be looked up only by key.
Key Value Stores
Document-Based Store NoSQL
•In this type of database, the record and its associated data are stored
in a single document.
DOCUMENT STORES
The central concept of a document-oriented database is the notion
of a document.
Every object, even those of the same class, can look very different.
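As a rough illustration of the point above, the sketch below keeps two documents of the same kind with different fields in one collection. The collection name `users`, the field names, and the helpers `find` and `fields_used` are all hypothetical; plain Python dictionaries stand in for the JSON-like documents that a document database would store.

```python
import json
from typing import Any, Dict, List

# Two "user" documents in the same collection with different shapes.
users: List[Dict[str, Any]] = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {
        "_id": 2,
        "name": "Ravi",
        "phones": ["555-0100", "555-0101"],
        "address": {"city": "Bengaluru", "pin": "560001"},
    },
]


def find(collection: List[Dict[str, Any]], field: str, value: Any) -> List[Dict[str, Any]]:
    """Return documents whose top-level field equals value.

    Documents that lack the field are simply skipped; no schema is enforced.
    """
    return [doc for doc in collection if doc.get(field) == value]


def fields_used(collection: List[Dict[str, Any]]) -> List[str]:
    """List every top-level field name that appears in any document."""
    return sorted({key for doc in collection for key in doc})


# Each document is self-contained and serializes to JSON on its own.
as_json = json.dumps(users[1], sort_keys=True)
```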
Document-Based Store NoSQL
•The document type is mostly used for CMS systems, blogging
platforms, real-time analytics, and e-commerce applications. It should
not be used for complex transactions that require multiple operations
or queries against varying aggregate structures.
Example:
•Examples of databases that use this data model are MongoDB and
Couchbase.
individually.
All data within each column data file has the same type, which makes it ideal for
compression.
Column stores can improve query performance because a query can read only the
specific columns it needs.
High performance on aggregation queries (e.g. COUNT, SUM, AVG, MIN, MAX).
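A minimal sketch of why aggregation suits a column layout, assuming a tiny hypothetical sales table (the field names `id`, `region`, `amount` and the helper `to_columns` are made up for illustration). Once the rows are pivoted into one list per column, an aggregate reads a single list and never touches the other columns.

```python
from typing import Any, Dict, List

# Row-oriented layout: each record holds all of its fields together.
rows: List[Dict[str, Any]] = [
    {"id": 1, "region": "south", "amount": 120.0},
    {"id": 2, "region": "north", "amount": 80.0},
    {"id": 3, "region": "south", "amount": 50.0},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot rows into one list per column (a stand-in for column data files)."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Aggregates read only the "amount" column; "id" and "region" are untouched.
amounts = columns["amount"]
total = sum(amounts)
average = total / len(amounts)
smallest, largest = min(amounts), max(amounts)
```

Because every value in `amounts` has the same type, a real column store could also compress that list far more effectively than a mixed-type row.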
GRAPH DATABASES
A graph database stores data in a graph.
As the number of nodes increases, the cost of a local step (or hop)
remains the same.
Index for lookups.
Examples of graph databases: OrientDB, Neo4j, Titan, etc.
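The idea of a constant-cost hop can be sketched with an adjacency list, where each node stores direct references to its neighbours. The node names and the helpers `neighbors` and `hops` are hypothetical; the dictionary lookup plays the role of the index used to find the starting node, and this is not how any particular graph database is implemented.

```python
from collections import deque
from typing import Dict, List, Optional

# Adjacency list: each node maps directly to the nodes it points to.
graph: Dict[str, List[str]] = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def neighbors(g: Dict[str, List[str]], node: str) -> List[str]:
    # One local step: the cost depends on this node's own edge list,
    # not on how many nodes the whole graph contains.
    return g.get(node, [])


def hops(g: Dict[str, List[str]], start: str, goal: str) -> Optional[int]:
    """Breadth-first search: fewest hops from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in neighbors(g, node):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```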
GRAPH STORES
Analyzing big data with a shared-nothing architecture
•The advantages of a shared-nothing (SN) architecture over a central
entity that controls the network (a controller-based architecture)
include eliminating any single point of failure, allowing self-healing,
and allowing non-disruptive upgrades.
•An SN system can scale almost without limit simply by adding nodes in
the form of inexpensive computers, since there is no single bottleneck
to slow the system down.
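One way to picture a shared-nothing cluster is a set of nodes that each own a private slice of the data, with a hash of the key deciding the owner. The sketch below is an assumption-laden illustration: the `Node` and `Cluster` classes are hypothetical, and simple modulo hashing is used only for brevity.

```python
import hashlib
from typing import Any, Dict, List, Optional


class Node:
    """One machine with its own private storage; nothing is shared."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, Any] = {}


class Cluster:
    """Hypothetical shared-nothing cluster that partitions keys by hash."""

    def __init__(self, names: List[str]) -> None:
        self.nodes = [Node(n) for n in names]

    def _owner(self, key: str) -> Node:
        # A stable hash decides which single node owns the key.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key: str, value: Any) -> None:
        self._owner(key).data[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._owner(key).data.get(key)


cluster = Cluster(["node-a", "node-b", "node-c"])
for i in range(100):
    cluster.put(f"user:{i}", i)
```

With plain modulo hashing, adding a node changes the owner of most keys, so production systems typically use consistent hashing or a similar scheme to limit data movement.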
Master-slave versus peer-to-peer
• In the master-slave model, the master node keeps a database of all the
other nodes in the cluster and the rules for distributing requests to
each node.
• The peer-to-peer model stores all the information about the cluster
on each node in the cluster.
•If any node crashes, the other nodes can take over and processing
can continue.
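The contrast between the two models can be sketched as follows. Everything here is hypothetical: the `Master` and `Peer` classes, the round-robin rule on the master, and the byte-sum rule that peers use to agree on a target are assumptions chosen for brevity, not rules taken from any specific product.

```python
from typing import Iterable, List, Set


class Master:
    """Single node that knows every worker and the distribution rule."""

    def __init__(self, workers: Iterable[str]) -> None:
        self.workers: List[str] = list(workers)
        self._next = 0

    def route(self, request: str) -> str:
        # Assumed rule: plain round-robin over the known workers.
        worker = self.workers[self._next % len(self.workers)]
        self._next += 1
        return worker


class Peer:
    """Every peer keeps its own full copy of the cluster membership."""

    def __init__(self, name: str, members: Iterable[str]) -> None:
        self.name = name
        self.members: Set[str] = set(members)

    def on_crash(self, crashed: str) -> None:
        # Each surviving peer updates its own view; no master is needed.
        self.members.discard(crashed)

    def route(self, request: str) -> str:
        # Deterministic choice so that all peers pick the same target.
        live = sorted(self.members)
        return live[sum(request.encode("utf-8")) % len(live)]


names = ["n1", "n2", "n3"]
master = Master(names)
peers = [Peer(n, names) for n in names]

# Simulate n2 crashing: the surviving peers drop it from their views.
survivors = [p for p in peers if p.name != "n2"]
for p in survivors:
    p.on_crash("n2")
```

If the `Master` object were lost, routing would stop altogether, whereas any surviving `Peer` can still route requests on its own.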
Choosing distribution models: master-slave versus peer-to-peer
• Peer-to-peer systems distribute the responsibility of the master to
each node in the cluster.
• In this situation, testing is much easier since you can remove any
node in the cluster and the other nodes will continue to function.
•The disadvantage of peer-to-peer networks is the increased complexity
and communication overhead required to keep all nodes up to date with
the cluster status.
Master Slave Distribution Model
•With a master-slave distribution model, the cluster is managed by a
single master node.
•This node can run on specialized hardware such as RAID drives to
lower the probability that it crashes.
•The cluster can also be configured with a standby master that’s
continually updated from the master node.
•The challenge with this option is that it’s difficult to test the standby
master without jeopardizing the health of the cluster.
•Failure of the standby master to take over from the master node is a
real concern for high-availability operations.
NoSQL systems to handle big data problems
Case Study:
Thank You!