
NoSQL Database

Prof Jigna Patel,
CSE Department, Institute of Technology
Nirma University

08/07/2024 Prof Jigna Patel


RDBMS
• Atomic

• Consistent

• Isolated

• Durable
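
The atomicity property listed above can be illustrated with a small sketch using Python's built-in sqlite3 module. The accounts table, the names and the balances are made up for this illustration; they are not from the slides. The second transfer violates a CHECK constraint part-way through, so the whole transaction is rolled back.

```python
import sqlite3

# In-memory database; the table and balances are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer the credit to the destination account is executed first and then undone, which is the "all or nothing" behaviour that atomicity describes.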



Distributed Systems - Advantages
• Reliability (fault tolerance)
• Scalability
• Sharing of Resources
• Flexibility
• Speed
• Open system
• Performance



Distributed Systems - Disadvantages
• Troubleshooting
• Software
• Networking
• Security



Big Data Evolution
• Scalability

o Vertical Scaling

o Horizontal Scaling
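
Horizontal scaling is often implemented by spreading keys across several machines. The following is a minimal sketch of one common approach, hash partitioning, and is not taken from the slides. Each "shard" is a plain dictionary standing in for a separate server, and the names shard_for and ShardedStore are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    Python's built-in hash() is salted per process, so sha256 is used here.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Each dict stands in for one server in a horizontally scaled system."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
for i in range(9):
    store.put(f"user:{i}", {"id": i})

print([len(s) for s in store.shards])  # how the 9 keys were distributed
print(store.get("user:4"))
```

Adding capacity here means adding shards, whereas vertical scaling would mean moving a single dictionary to a bigger machine. With plain modulo hashing, changing the shard count remaps most keys, which is why real systems often use consistent hashing instead.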



What is NoSQL
• A class of database management systems (DBMS)
• "Not only SQL"
o Does not use SQL as its query language
o Distributed, fault-tolerant architecture
o No fixed schema (formally described structure)
o No joins (typical in databases operated with SQL)
• A join is an expensive operation that combines records from
two or more tables into one set
o Joins require strong consistency and fixed schemas
o Lack of these makes NoSQL databases more flexible
• It is not a replacement for an RDBMS but complements it
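
The cost of a join, and how a schema-less model avoids it, can be sketched in plain Python. The customers, orders and field names below are invented for this illustration. The first part combines two "tables" with a simple hash join; the second stores the same information as nested documents that need no join.

```python
# Normalized, relational-style data: two "tables" linked by customer_id.
customers = [
    {"customer_id": 1, "name": "Robin"},
    {"customer_id": 2, "name": "Asha"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "book"},
    {"order_id": 11, "customer_id": 1, "item": "pen"},
    {"order_id": 12, "customer_id": 2, "item": "lamp"},
]


def join_orders(customers, orders):
    """Hash join: index one table by key, then probe it for each order row."""
    by_id = {c["customer_id"]: c for c in customers}
    return [
        {"name": by_id[o["customer_id"]]["name"], "item": o["item"]}
        for o in orders
    ]


# Denormalized, document-style data: orders are embedded in the customer,
# so reading one customer's orders is a single lookup.
customer_docs = {
    1: {"name": "Robin", "orders": [{"item": "book"}, {"item": "pen"}]},
    2: {"name": "Asha", "orders": [{"item": "lamp"}]},
}

joined = join_orders(customers, orders)
robin_items = [o["item"] for o in customer_docs[1]["orders"]]
print(joined)
print(robin_items)
```

The trade-off is that the embedded form duplicates data and makes cross-customer queries harder, which is one reason NoSQL complements rather than replaces an RDBMS.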
Why NoSQL?
[Two figure slides; the images are not included in this text]



RDBMS vs NoSQL
RDBMS
- Structured and organized data
- Structured query language (SQL)
- Data and its relationships are stored in separate tables
- Data Manipulation Language, Data Definition Language
- Tight consistency

NoSQL
- Stands for Not Only SQL
- No declarative query language
- No predefined schema
- Key-value pair storage, column store, document store, graph databases
- Eventual consistency rather than ACID properties
- Unstructured and unpredictable data
- CAP theorem
- Prioritizes high performance, high availability and scalability
- BASE transactions



History of NoSQL

• The term NoSQL was coined by Carlo Strozzi in 1998.
• In early 2009, when Last.fm wanted to organize an event on open-source
distributed databases, Eric Evans, a Rackspace employee, reused the term to
refer to databases that are non-relational and distributed and do not conform
to ACID.
• In the same year, NoSQL was discussed and debated extensively at the
"no:sql(east)" conference held in Atlanta, USA.
• After that, the discussion and practice of NoSQL gained momentum, and NoSQL
saw unprecedented growth.



CAP Theorem (Brewer’s Theorem)
• Consistency - The data in the database remains consistent after an
operation executes. For example, after an update operation all clients
see the same data.
• Availability - The system is always on (guaranteed service
availability), with no downtime.
• Partition Tolerance - The system continues to function even if
communication among the servers is unreliable, i.e. the servers may be
partitioned into multiple groups that cannot communicate with one
another.

• Consistency — A guarantee that every node in a distributed cluster
returns the same, most recent, successful write. Consistency refers to
every client having the same view of the data. There are various types
of consistency models. Consistency in CAP (used to prove the
theorem) refers to linearizability or sequential consistency, a very
strong form of consistency.



• Availability - Every non-failing node returns a response for all read
and write requests in a reasonable amount of time. The key word
here is every. To be available, every node (on either side of a network
partition) must be able to respond in a reasonable amount of time.



• Partition Tolerant — The system continues to function and upholds its
consistency guarantees in spite of network partitions. Network
partitions are a fact of life. Distributed systems guaranteeing partition
tolerance can gracefully recover from partitions once the partition
heals.



• CP (Consistent and Partition Tolerant) - At first glance, the CP
category is confusing: it seems to describe a system that is consistent
and partition tolerant but never available. In fact, CP refers to a
category of systems where availability is sacrificed only in the case of
a network partition.



• CA (Consistent and Available) - CA systems are consistent and
available in the absence of any network partition. Single-node DB
servers are often categorized as CA systems because they do not need to
deal with partition tolerance. The only hole in this theory is that
single-node DB systems are not a network of shared data systems and thus
do not fall under the purview of CAP.



• AP (Available and Partition Tolerant) — These are systems that are
available and partition tolerant but cannot guarantee consistency.

CAP Theorem
• CA - Single site cluster, therefore all nodes are always in contact.
When a partition occurs, the system blocks.
• CP - Some data may not be accessible, but the rest is still
consistent/accurate.
• AP - System is still available under partitioning, but some of the data
returned may be inaccurate.
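
The difference between CP and AP behaviour during a partition can be made concrete with a toy simulation. This is only a sketch under simplifying assumptions: two replicas, synchronous replication when the network is healthy, and a boolean flag standing in for a network split. The Cluster and Replica classes are hypothetical and do not model any particular database.

```python
class Replica:
    def __init__(self):
        self.value = None


class Cluster:
    """Two replicas; mode is 'CP' or 'AP'. `partitioned` models a split."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write: stay consistent, give up availability.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; the replicas may now diverge.
            replica.value = value
            return
        # Healthy network: replicate synchronously to both nodes.
        replica.value = value
        other.value = value

    def read(self, replica):
        return replica.value


cp = Cluster("CP")
cp.write(cp.a, "v1")
cp.partitioned = True
try:
    cp.write(cp.a, "v2")
    cp_result = "accepted"
except RuntimeError:
    cp_result = "rejected"
cp_reads = (cp.read(cp.a), cp.read(cp.b))

ap = Cluster("AP")
ap.write(ap.a, "v1")
ap.partitioned = True
ap.write(ap.a, "v2")
ap_reads = (ap.read(ap.a), ap.read(ap.b))

print(cp_result, cp_reads)
print(ap_reads)
```

The CP cluster rejects the write but both replicas still agree, while the AP cluster accepts it and one replica returns stale data until the partition heals.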



NoSQL pros/cons

Advantages :
• High scalability
• Distributed Computing
• Lower cost
• Schema flexibility, semi-structured data
• No complicated Relationships



NoSQL pros/cons

Disadvantages :
• No standardization
• Limited query capabilities (so far)
• Eventual consistency is not intuitive to program for



The BASE

• A BASE system gives up strong consistency in favor of availability.


• Basically Available indicates that the system does guarantee
availability, in terms of the CAP theorem.
• Soft state indicates that the state of the system may change over
time, even without input. This is because of the eventual consistency
model.
• Eventual consistency indicates that the system will become consistent
over time, given that the system doesn't receive input during that
time.
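
Soft state and eventual consistency can be sketched with two replicas that reconcile in the background. The example assumes a last-write-wins rule based on timestamps, which is only one of several possible reconciliation strategies. The Replica class, the sync function and the shopping cart data are made up for illustration.

```python
class Replica:
    """Stores key -> (timestamp, value); newer timestamps win on merge."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, ts):
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(r1, r2):
    """Anti-entropy pass: exchange entries so both keep the newest value."""
    for src, dst in ((r1, r2), (r2, r1)):
        for key, (ts, value) in list(src.data.items()):
            dst.write(key, value, ts)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart", ["book"], ts=1)         # a client writes to replica r1
r2.write("cart", ["book", "pen"], ts=2)  # a later write lands on r2

before = (r1.read("cart"), r2.read("cart"))  # the replicas disagree
sync(r1, r2)                                 # background reconciliation
after = (r1.read("cart"), r2.read("cart"))   # the replicas have converged

print(before)
print(after)
```

The value held by r1 changes during sync without any client writing to r1, which is what "soft state" refers to. Once no new writes arrive, both replicas return the same value.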

NoSQL Databases

• Key-value stores
• Column-oriented
• Graph
• Document oriented



Key-value stores
• Key-value stores are the most basic type of NoSQL database.
• Key-value stores allow developers to store schema-less data.
• The database stores data as a hash table in which each key is unique and
the value can be a string, JSON, a BLOB (Binary Large OBject), etc.
• In some stores (for example Redis), values may be strings, hashes, lists,
sets or sorted sets, each stored against a key.
• For example, a key-value pair might consist of a key like "Name" that is
associated with a value like "Robin".
• Key-value stores can be used as collections, dictionaries, associative
arrays, etc.
• Key-value stores typically favor the 'Availability' and 'Partition
tolerance' aspects of the CAP theorem.
• Key-value stores work well for shopping cart contents, or for individual
values like color schemes, a landing page URI, or a default account number.

Examples of key-value store databases: Redis, Dynamo, Riak, etc.
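
The hash-table model described above can be sketched in a few lines. This is a toy in-memory store, not the API of Redis, Dynamo or Riak. The KeyValueStore class and its method names are hypothetical, and values are serialized to JSON strings to mimic the store treating them as opaque.

```python
import json


class KeyValueStore:
    """A toy in-memory store: unique keys, opaque values, lookup by key."""

    def __init__(self):
        self._table = {}  # Python's dict is a hash table

    def put(self, key, value):
        # The store does not interpret the value; it keeps a serialized copy.
        self._table[key] = json.dumps(value)

    def get(self, key, default=None):
        raw = self._table.get(key)
        return default if raw is None else json.loads(raw)

    def delete(self, key):
        return self._table.pop(key, None) is not None


kv = KeyValueStore()
kv.put("Name", "Robin")
kv.put("cart:42", {"items": ["book", "pen"], "count": 2})
kv.put("Name", "Robyn")  # same key: the old value is overwritten

print(kv.get("Name"))
print(kv.get("cart:42")["items"])
print(kv.get("missing", default="n/a"))
```

Every operation goes through the key. There is no way to ask for "all carts containing a pen" without scanning, which is the main limitation of this model.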



Column-oriented databases
• Column-oriented databases work primarily on columns, and every column is
treated individually.
• Values of a single column are stored contiguously.
• Each column's data is stored in column-specific files.
• In column stores, query processors work on columns too.
• All data within a column data file has the same type, which makes it ideal
for compression.
• Column stores can improve query performance because a query reads only the
columns it needs.
• High performance on aggregation queries (e.g. COUNT, SUM, AVG, MIN, MAX).
• Well suited to data warehouses and business intelligence, customer
relationship management (CRM), library card catalogs, etc.

Examples of column-oriented databases: BigTable, Cassandra, SimpleDB, etc.
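
The points about contiguous storage, aggregation and compression can be illustrated by converting a few rows into a column layout. The sales rows below are invented, and run-length encoding is used only as one simple example of a compression scheme that benefits from same-typed, repetitive column data.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "west", "amount": 120},
    {"id": 2, "region": "east", "amount": 80},
    {"id": 3, "region": "west", "amount": 200},
]

# Column layout: one contiguous, same-typed list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Compress consecutive repeats into [value, count] pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded


# Aggregations touch only the 'amount' column, not whole records.
total = sum(columns["amount"])
average = total / len(columns["amount"])

# A sorted, low-cardinality column compresses well.
rle = run_length_encode(sorted(columns["region"]))

print(total, round(average, 2))
print(rle)
```

In the row layout the same SUM would have to read every field of every record, which is why column stores tend to do well on analytical queries.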



Graph databases
• A graph database stores data in a graph.
• It is capable of elegantly representing many kinds of data in a highly
accessible way.
• A graph database is a collection of nodes and edges.
• Each node represents an entity (such as a student or business) and each
edge represents a connection or relationship between two nodes.
• Every node and edge is identified by a unique identifier.
• Each node knows its adjacent nodes.
• As the number of nodes increases, the cost of a local step (or hop)
remains the same.



Examples of graph databases: OrientDB, Neo4j, Titan, etc.
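
The node-and-edge model can be sketched with an adjacency list, where each node keeps direct references to its neighbours so that a single hop does not depend on the total size of the graph. This is an illustrative in-memory structure, not the API of Neo4j or any other product. The Graph class, node ids and edge labels are hypothetical.

```python
from collections import deque


class Graph:
    def __init__(self):
        self.nodes = {}     # node id -> properties
        self.adjacent = {}  # node id -> list of (edge label, neighbour id)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.adjacent.setdefault(node_id, [])

    def add_edge(self, src, label, dst):
        self.adjacent[src].append((label, dst))  # directed edge

    def neighbors(self, node_id, label=None):
        """One hop: only this node's own adjacency list is read."""
        return [dst for (lbl, dst) in self.adjacent[node_id]
                if label is None or lbl == label]

    def hops(self, start, goal):
        """Breadth-first search; number of hops, or None if unreachable."""
        seen, queue = {start}, deque([(start, 0)])
        while queue:
            node, dist = queue.popleft()
            if node == goal:
                return dist
            for nxt in self.neighbors(node):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None


g = Graph()
g.add_node("s1", kind="student", name="Robin")
g.add_node("s2", kind="student", name="Asha")
g.add_node("s3", kind="student", name="Ken")
g.add_node("c1", kind="course", title="Big Data Systems")
g.add_edge("s1", "FRIEND_OF", "s2")
g.add_edge("s2", "FRIEND_OF", "s3")
g.add_edge("s1", "ENROLLED_IN", "c1")

print(g.neighbors("s1", "ENROLLED_IN"))
print(g.hops("s1", "s3"))
print(g.hops("s3", "s1"))
```

Relationship queries such as "friends of friends" become short traversals here, whereas in a relational model they would typically require one self-join per hop.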



Document Oriented databases

• A document database is a collection of documents.
• Data in this model is stored inside documents.
• A document is a key-value collection in which each key gives access to
its value.
• Documents are not typically forced to have a schema and are therefore
flexible and easy to change.
• Documents are stored in collections in order to group different kinds
of data.
• Documents can contain many different key-value pairs, key-array
pairs, or even nested documents.

Examples of document-oriented databases: MongoDB, CouchDB, etc.
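
A collection of schema-less documents can be sketched with plain dictionaries. This is a toy structure, not the MongoDB or CouchDB API. The Collection class, its insert and find methods, and the student records are all made up for illustration.

```python
class Collection:
    """A toy collection: documents are dicts and need not share a schema."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def insert(self, doc):
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = dict(doc, _id=doc_id)  # copy and add an id
        return doc_id

    def find(self, **criteria):
        """Return documents whose top-level fields match all criteria."""
        return [d for d in self._docs.values()
                if all(d.get(k) == v for k, v in criteria.items())]


students = Collection()
# Key-array pair:
students.insert({"name": "Robin", "dept": "CSE",
                 "courses": ["Big Data", "DBMS"]})
# Nested document:
students.insert({"name": "Asha", "dept": "CSE",
                 "address": {"city": "Ahmedabad"}})
# Fewer fields than the others, which is allowed without a schema:
students.insert({"name": "Ken", "dept": "EE"})

cse_names = [d["name"] for d in students.find(dept="CSE")]
asha_city = students.find(name="Asha")[0]["address"]["city"]
print(cse_names)
print(asha_city)
```

Unlike the key-value model, the store can look inside a document, so queries on fields such as dept are possible without knowing the key in advance.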



In-class Exercise #1
• Identify an appropriate database store for your innovative assignment
definition in the Big Data System subject.



Thank You



