Introduction to NoSQL

NoSQL is a non-relational database designed for managing unstructured data, emphasizing scalability, performance, and agility. It addresses the limitations of traditional relational databases by supporting flexible data structures and real-time processing, making it suitable for modern applications. Key features include distributed architecture, schema-less models, and the ability to handle large volumes of data efficiently.

Uploaded by

sakinabohra0909

Introduction to NoSQL :

NoSQL (Not Only SQL) refers to a class of non-relational databases used to manage unstructured
and semi-structured data. These are typically distributed database systems designed to work in
virtual environments, providing mechanisms for data storage and retrieval with a focus on
scalability, high performance, availability, and agility.

It was developed in response to the need to store a large volume of user-related data. NoSQL
databases are designed to scale easily and to handle products and objects that need to be
frequently accessed, updated, and changed, keeping up with the needs of the modern
industry.

Limitations of Traditional Relational Databases:

Relational databases:

 Are not designed to handle frequent changes or unstructured data.

 Do not take advantage of cheap storage and processing power from commodity hardware.
 Are less agile in handling big data and dynamic applications.

Key Features of NoSQL:

1. Not Only SQL – SQL and other query languages can be used.
2. Non-relational and schema-free – No fixed structure required.
3. No JOINs – Avoids complex join operations.
4. Distributed architecture – Runs on multiple processors/nodes.
5. Horizontally scalable – Add more machines instead of upgrading one.
6. Open-source options – Many available for free.
7. Easy data replication – For better performance and backup.
8. Simple API usage – Easy to implement.
9. Handles huge volumes of data – Efficient at big data processing.
10. Can be run on commodity hardware – Follows shared nothing concept.
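
Features 4, 5, and 10 can be illustrated with a minimal sketch of hash-based data placement, in which each key is owned by exactly one node and no node shares state with another. The node names and the `node_for_key` and `distribute` helpers are hypothetical, and the modulo scheme is chosen for brevity only; it is not how any particular NoSQL product places data.

```python
import hashlib

def node_for_key(key, nodes):
    """Pick the node that owns `key` by hashing the key (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def distribute(keys, nodes):
    """Return a mapping of node name -> list of keys stored on that node."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement

keys = [f"user:{i}" for i in range(1000)]

# Scale out by adding a machine rather than upgrading an existing one.
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for node, owned in four_nodes.items():
    print(node, len(owned))
```

With simple modulo placement, adding a node moves many keys to a different owner. Production systems usually reduce that movement with techniques such as consistent hashing, which this sketch does not attempt.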

Why NoSQL? :

A traditional database model is not suitable for all types of applications, especially those with:

 Unstructured or unpredictable data

 Need for easy scalability
 Real-time processing

NoSQL fits this need perfectly because of its:

 High performance
 Flexible structure
 Scalability
 Capability to handle dynamic data

Although NoSQL may not provide full ACID (Atomicity, Consistency, Isolation, Durability)
properties, NoSQL systems typically follow the BASE model instead:

 Basically Available
 Soft State
 Eventually Consistent

This is achieved through its distributed and fault-tolerant architecture.

CAP Theorem (Brewer’s Theorem) :

CAP Theorem says a distributed system cannot guarantee all three of the following at the
same time:

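 Consistency – Every read sees the most recent write, so all nodes show the same data at the same time.
 Availability – Every request to a working node gets a response, even if it may not reflect the latest write.
 Partition Tolerance – System continues working even when a network failure cuts off communication between nodes.

Because network partitions cannot be ruled out in a distributed system, the practical choice during a
partition is between consistency and availability. NoSQL often compromises consistency in favor of
availability and partition tolerance.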

BASE Transactions (Opposite of ACID) :

BASE stands for:

 Basically Available – System responds to every request, even if the data is not consistent.
 Soft State – System state can change over time even without input (due to eventual
consistency).
 Eventually Consistent – All changes will eventually reflect across all nodes, but not
immediately.

Characteristics of BASE:

 Weak consistency (stale data is okay)

 Focus on availability
 Best effort system
 Approximate answers are acceptable
 Optimistic in design
 Simpler and faster than ACID systems

BASE Case Scenarios:

 If data is consistent and available with no partition, then data is replicated and available in
both servers (A and B).
 If data is available and partitioned, then it's not consistent. Example: Server A has new
data, B has old.
 If data is consistent and partitioned, then it may not be available (B is waiting for update
from A).
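
The three scenarios above can be replayed with a toy two-server store. This is a sketch under simplifying assumptions: the `TwoNodeStore` class, its `mode` labels ("AP" and "CP"), and the rule that all writes go to server A are invented for illustration and do not describe any real database.

```python
class TwoNodeStore:
    """Toy store with two replicas, A and B. Writes always arrive at A."""

    def __init__(self, mode):
        self.mode = mode          # "AP" favors availability, "CP" favors consistency
        self.a = {}               # data held by server A
        self.b = {}               # data held by server B
        self.partitioned = False  # True when A and B cannot talk to each other
        self.pending = []         # writes that B has not received yet

    def write(self, key, value):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: cannot replicate to B")
        self.a[key] = value
        if self.partitioned:
            self.pending.append((key, value))  # replicate later
        else:
            self.b[key] = value                # replicate immediately

    def read_from_b(self, key):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: B is waiting for updates from A")
        return self.b.get(key)

    def heal(self):
        """End the partition and deliver the queued writes to B."""
        self.partitioned = False
        for key, value in self.pending:
            self.b[key] = value
        self.pending.clear()

ap = TwoNodeStore("AP")
ap.write("price", 100)            # no partition: both servers have 100
ap.partitioned = True
ap.write("price", 120)            # A has new data, B still has old data
stale = ap.read_from_b("price")   # available, but not consistent
ap.heal()
fresh = ap.read_from_b("price")   # eventually consistent
print(stale, fresh)
```

In "CP" mode the same sequence raises an error during the partition instead of returning stale data, which corresponds to the third scenario.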

Examples of NoSQL Implementations :

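There are around 150 NoSQL databases in the market. Some popular NoSQL databases and related big data technologies (Hadoop and MapReduce are processing frameworks rather than databases) include: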

 Google BigTable
 Apache Hadoop
 MapReduce
 SimpleDB
 MemcacheDB

NoSQL Business drivers :

Today’s businesses need fast, scalable, and always-available data storage systems. Traditional
relational database systems (RDBMS), which are typically scaled up on a single server, often fail to
keep up with the increasing demands of data processing, speed, and variety of data. This is where
NoSQL databases come in.

Businesses today need to:

 Handle large and variable amounts of data

 Make quick decisions based on real-time data
 Be flexible with changing data types and needs

NoSQL addresses these needs through four major business drivers:

1. Volume :

Organizations now generate huge volumes of data. A single-server RDBMS is eventually limited by
the performance of one CPU. When dealing with large datasets, distributed processing using
clusters of commodity (low-cost) machines becomes necessary.

This has led to the development of distributed systems like:

 Apache Hadoop
 HDFS
 MapR
 HBase

These systems break large data into smaller chunks and process them in parallel.
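
The split-and-process idea can be sketched in a few lines. Here worker threads in a single process stand in for the machines of a cluster, and the function names (`split_into_chunks`, `count_words`, `parallel_word_count`) and the sample log lines are made up for illustration; this is not the Hadoop API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(records, chunk_size):
    """Break the input into fixed-size chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]

def count_words(chunk):
    """Work done independently on each chunk (the 'map' side)."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts

def parallel_word_count(records, chunk_size=2, workers=4):
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:   # combine partial results (the 'reduce' side)
        total.update(partial)
    return total

log_lines = ["read write read", "write write", "read", "delete read", "write"]
totals = parallel_word_count(log_lines)
print(totals)
```

Each chunk is processed without reference to any other chunk, which is what allows the work to be spread over many low-cost machines in a real cluster.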

2. Velocity :

Velocity refers to the speed at which data is generated and processed.

For example:

 E-commerce websites handle thousands of reads and writes per second.
 During sales or discounts, traffic spikes slow down an RDBMS because every write must also
update multiple indexes.

NoSQL systems handle these high-speed real-time operations efficiently and ensure low
response time, even during heavy traffic.

3. Variability :

Data often comes in different formats and structures. In RDBMS, changing the schema (table
design) for new data fields is difficult and can affect the entire system.

Example: If you want to store a special field for a few customers, you need to change the
entire table schema. This creates a sparse matrix (empty fields for others) and affects
performance.

NoSQL systems offer schema-less models, allowing storage of different kinds of data without
any rigid structure.
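
The special-field example can be made concrete with plain Python dictionaries standing in for schema-less records. The customer data, the `loyalty_tier` field, and the `as_relational_rows` helper are hypothetical and exist only to contrast the two layouts.

```python
# Each record carries only the fields it actually has.
customers = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    {"_id": 2, "name": "Ravi", "email": "ravi@example.com", "loyalty_tier": "gold"},
    {"_id": 3, "name": "Meera"},
]

def as_relational_rows(docs):
    """Show the table a fixed schema would need: one column per field seen anywhere."""
    columns = sorted({field for doc in docs for field in doc})
    rows = [[doc.get(column) for column in columns] for doc in docs]
    return columns, rows

columns, rows = as_relational_rows(customers)

# Cells that exist only because the schema is fixed (the sparse matrix).
empty_cells = sum(cell is None for row in rows for cell in row)

print(columns)
for row in rows:
    print(row)
print("empty cells:", empty_cells)
```

Adding `loyalty_tier` for one customer required no change to the other two records, whereas the fixed-schema view has to carry an empty cell for every record that lacks a field.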

4. Agility :

Handling complex queries in RDBMS requires multiple nested queries and object-relational
mapping (ORM) layers built with frameworks such as Hibernate for Java. This slows down
development and updates.

NoSQL simplifies this by:

 Supporting easy data retrieval

 Reducing the need for complex SQL queries
 Adapting quickly to changes in business requirements

Key Business Features of NoSQL :

1. 24x7 Availability

 No single point of failure

 Data and functions are replicated across multiple nodes
 Even if a node fails, others continue operations without data loss
 Dynamic updates can be made without downtime

2. Location Transparency

 Read/write data from any location without knowing the physical location of the node
 Data is synchronized across regions
 Ensures fast local access and global availability

3. Schema-less Data Model

 Accepts structured, semi-structured, and unstructured data

 Handles large volumes of data efficiently
 Suitable for flexible and unpredictable data patterns
 Delivers fast performance for both read and write operations

4. Modern Transaction Analysis

 NoSQL does not require strict ACID transactions

 Treats consistency as a trade-off under the CAP theorem: data can be immediately or eventually
consistent across nodes
 Suitable for customer reviews, branding, strategy planning, etc., where JOINs and foreign
keys are unnecessary

5. Architecture for Big Data

NoSQL databases support modern architectures by offering:

 Scalability
 Data distribution
 Continuous availability
 Support for multi-data centers

Big data architecture includes:

 Huge data source handling (terabytes to petabytes)

 Real-time data streaming instead of batch processing
 Storage using Hadoop, MongoDB, Cassandra, Neo4j, etc.
 Support for various compute methods (MapReduce, streaming, batch)

6. Analytics and Business Intelligence

 NoSQL enables real-time data mining and analytics

 Helps in quick decision-making
 Extracts valuable insights from high-volume, complex datasets
 Provides integrated analytics that traditional RDBMS struggle to offer

NoSQL Data architectural patterns :

NoSQL databases are designed for flexibility, scalability, and high performance. Based on the
data structure they use, there are four main types of NoSQL data stores:

Types of NoSQL Data Stores:

 Key-Value Store
 Column Store
 Document Store
 Graph Store

1. Key-Value Store

A key-value store stores data as a pair of key and value, just like a dictionary.

 The key is unique and is used to find the value.

 The value can be in formats like String, JSON, or Binary (BLOB).
 It is schema-less, meaning no fixed structure is required.

How it works:

 Internally uses a hash table to store data.

 Keys can be system-generated or custom.
 Buckets group keys logically (not physically), so same key names can exist in different
buckets.
 The real key is a combination of bucket + key.

Basic Operations (APIs):

Operation                   Description
Get(key)                    Retrieves value using the key
Put(key, value)             Stores or updates value with the key
Multi-Get(key1, key2...)    Retrieves multiple values
Delete(key)                 Deletes the value for the key
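
A minimal in-memory sketch of these four operations follows, assuming a Python dictionary as the hash table and a (bucket, key) tuple as the real key. The `KeyValueStore` class and its method names are hypothetical; real products expose their own client APIs.

```python
class KeyValueStore:
    """Toy key-value store backed by a hash table (a Python dict)."""

    def __init__(self):
        self._table = {}

    def put(self, bucket, key, value):
        # Stores a new value or overwrites the existing one as a whole.
        self._table[(bucket, key)] = value

    def get(self, bucket, key):
        return self._table.get((bucket, key))

    def multi_get(self, bucket, *keys):
        return [self.get(bucket, key) for key in keys]

    def delete(self, bucket, key):
        self._table.pop((bucket, key), None)

store = KeyValueStore()
store.put("sessions", "u1", '{"cart": ["book"]}')
store.put("sessions", "u2", '{"cart": []}')
store.put("images", "u1", b"\x89PNG...")   # same key name, different bucket

print(store.get("sessions", "u1"))
print(store.multi_get("sessions", "u1", "u2", "u3"))
store.delete("sessions", "u2")
print(store.get("sessions", "u2"))
```

The store never looks inside a value: the session data is an opaque string and the image is an opaque byte string, so lookups are possible only through the key.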

Rules:

1. Distinct Keys: All keys must be unique.

2. No Queries on Values: You cannot search within values.

Weaknesses:

 No partial updates: The value is opaque to the store, so you cannot update part of it; the whole
value must be rewritten.
 No querying: Cannot search based on value.
 As data grows, performance can become difficult to manage.

Use Cases:

 Caching
 Session storage
 Image stores
 Dictionaries (word-definition pairs)

2. Column Store / Wide Column Store

Stores data grouped by columns (column families) instead of as whole rows. It is good for storing large and sparse datasets.

Key Concepts:

 A row key and column name together identify the cell.
 Data is grouped in Column Families, which are like categories of related columns.
 Each cell stores data with a timestamp for versioning.
 Very fast for reading data from specific columns.

Structure Format:
<Row Key, Column Family, Column Name, Timestamp> : Value

How it differs from Key-Value:

 Supports grouping of columns.

 Allows fast reading of selected columns.
 Used in analytical systems (OLAP).
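
The structure format and the selected-column read can be sketched with a small in-memory model. The `WideColumnTable` class, its method names, and the sample rows are assumptions made for illustration; this is not the Cassandra or HBase API, and integer timestamps are used only to keep the example short.

```python
class WideColumnTable:
    """Toy model of <Row Key, Column Family, Column Name, Timestamp> : Value."""

    def __init__(self):
        # (row key, column family, column name) -> {timestamp: value}
        self._cells = {}

    def put(self, row_key, family, column, timestamp, value):
        self._cells.setdefault((row_key, family, column), {})[timestamp] = value

    def get(self, row_key, family, column, timestamp=None):
        versions = self._cells.get((row_key, family, column), {})
        if not versions:
            return None
        if timestamp is None:
            timestamp = max(versions)   # default to the newest version
        return versions.get(timestamp)

    def read_column(self, family, column):
        """Latest value of one column for every row that has it."""
        return {
            row_key: versions[max(versions)]
            for (row_key, fam, col), versions in self._cells.items()
            if fam == family and col == column
        }

table = WideColumnTable()
table.put("user42", "profile", "city", 1, "Pune")
table.put("user42", "profile", "city", 2, "Mumbai")   # newer version of the same cell
table.put("user7", "profile", "city", 1, "Delhi")
table.put("user7", "activity", "last_login", 5, "2024-01-01")  # column added dynamically

print(table.get("user42", "profile", "city"))      # newest version
print(table.get("user42", "profile", "city", 1))   # older version by timestamp
print(table.read_column("profile", "city"))
```

Rows do not need the same columns: `user42` has no `activity` family at all, which is why this layout suits sparse data.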

Cassandra Data Model Highlights:

 Keyspace: Like a database for one application.

 Column Family: Stores data related to a specific topic.
 Row Key: Unique identifier for each row.
 Columns can be added dynamically.

Use Cases:

 Analytics
 Time-series data
 IoT (Internet of Things) systems
 Social media posts

3. Document Store

A document store is like a smart key-value store, where the value is a document (usually in
JSON or XML format).

Features:
 Documents are semi-structured and self-describing.
 Each document has a unique key (ID).
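 Properties inside the document can be indexed for fast search (some stores index every field automatically; others, such as MongoDB, index only the ID by default and other fields on request).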
 Can store nested data (tree structure) directly.

How it works:

1. You can search by any field inside the document.

2. Uses Document Path to access specific nested values.

Example Path: Employee[id='2300']/Address/street/BuildingName
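
A document path like the one above can be resolved against nested data with a short function. This sketch assumes documents are Python dictionaries and supports only the single `Name[field='value']` selector followed by plain steps; the `resolve` function and the sample employee records are hypothetical, not a real query language implementation.

```python
import re

employees = [
    {"id": "2300", "name": "Asha",
     "Address": {"city": "Pune",
                 "street": {"BuildingName": "Sunrise Towers", "Number": "14"}}},
    {"id": "2301", "name": "Ravi", "Address": {"city": "Mumbai"}},
]

def resolve(collections, path):
    """Follow a path such as Employee[id='2300']/Address/street/BuildingName."""
    first, *steps = path.split("/")
    match = re.fullmatch(r"(\w+)\[(\w+)='([^']*)'\]", first)
    if match is None:
        raise ValueError(f"unsupported selector: {first}")
    name, field, wanted = match.groups()
    # Select the document whose field equals the wanted value.
    node = next((doc for doc in collections[name] if doc.get(field) == wanted), None)
    # Walk down the nested structure one step at a time.
    for step in steps:
        if not isinstance(node, dict) or step not in node:
            return None
        node = node[step]
    return node

database = {"Employee": employees}
print(resolve(database, "Employee[id='2300']/Address/street/BuildingName"))
print(resolve(database, "Employee[id='2301']/Address/street/BuildingName"))
```

The second lookup returns `None` because that document has no `street` entry, which shows how documents in the same collection can differ in shape.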

Advantages Over Key-Value Store:

 Allows searching inside documents.

 Supports complex data and hierarchies.
 Supports queries on values.

Use Cases:

 Content management systems

 User profiles
 Ad services (MongoDB sends real-time ads to millions)
 Real-time analytics

4. Graph Store

A graph store uses nodes and relationships to represent and store data.
It is based on graph theory.

Structure:

 Nodes: Entities (e.g., person, product)

 Relationships: Connections between nodes (e.g., follows, friend)
 Properties: Data stored inside nodes or relationships (key-value pairs)

Key Benefits:

 Great for storing and exploring complex relationships.

 No need for complex joins like in RDBMS.
 Fast traversal between connected nodes.
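
Traversal without joins can be sketched with an adjacency list and a breadth-first search. The `Graph` class, the `follows` relationship, and the people in the example are invented for illustration; graph databases such as Neo4j provide their own query languages for this.

```python
from collections import deque

class Graph:
    """Toy property graph: nodes with properties, directed named relationships."""

    def __init__(self):
        self.nodes = {}   # node id -> properties (key-value pairs)
        self.edges = {}   # node id -> list of (relationship, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def relate(self, source, relationship, target):
        self.edges[source].append((relationship, target))

    def reachable(self, start, relationship, max_hops):
        """Breadth-first traversal along one relationship type."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for rel, target in self.edges[node]:
                if rel == relationship and target not in seen:
                    seen.add(target)
                    found.append(target)
                    queue.append((target, hops + 1))
        return found

g = Graph()
for person in ("alice", "bob", "carol", "dave"):
    g.add_node(person, kind="person")
g.relate("alice", "follows", "bob")
g.relate("bob", "follows", "carol")
g.relate("carol", "follows", "dave")

print(g.reachable("alice", "follows", 2))
```

Each hop is a direct lookup of a node's outgoing relationships, so finding friends-of-friends does not require joining tables the way it would in an RDBMS.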

Use Cases:

 Social networks (Facebook, LinkedIn)

 Recommendation systems
 Fraud detection
 Media-sharing platforms (YouTube, Flickr)

Variations of NoSQL architectural patterns :

A NoSQL architectural pattern refers to how a NoSQL database system is structured or
designed to store, manage, and retrieve data efficiently, especially for big data, distributed
systems, and real-time applications.

NoSQL databases are schema-less, distributed, and horizontally scalable. But based on system
needs, the architectural design can vary. Let’s explore those variations:

Major NoSQL Data Models (Core Patterns):

1. Key-Value Store

 Stores data as a pair: Key → Value

 Example: Redis, Riak
 Variation:

Can be distributed across multiple servers for scalability.

Federated architecture allows multiple independent key-value databases to work together.

2. Document Store

 Stores semi-structured data like JSON, XML.

 Example: MongoDB, CouchDB
 Variation:

Can be used in IoT systems, where sensors push data into JSON-like documents.

Data can be temporarily stored or permanently archived.

3. Column Family Store

 Stores data in columns instead of rows.

 Example: Apache Cassandra, HBase
 Variation:

Hash table + content-addressable network to improve distribution and data lookup.

Scalable distributed architecture using shared-nothing design and load balancers.

4. Graph Store

 Stores entities as nodes and relationships as edges.

 Example: Neo4j, Amazon Neptune
 Variation:

Often used in social networks or enterprise collaboration platforms.

Architectural Variations Based on Implementation Style:

1. Distributed Architecture

 Data is split and stored on multiple servers at different locations.

 Benefits: High availability, fault tolerance, scalability.
 Used in:

Global-scale apps (Netflix, Facebook)

Content delivery platforms

2. Federated Architecture

 Manages independent and heterogeneous databases across various sites.

 Each database is autonomous but can work together as one logical system.
 Used in:

Healthcare systems

Academic research platforms
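
The idea of autonomous databases acting as one logical system can be sketched as a thin lookup layer over independent stores. The `FederatedStore` class, the site names, and the use of plain dictionaries as member databases are all assumptions for illustration; a real federation also has to reconcile differing schemas, access rules, and query languages, which this sketch ignores.

```python
class FederatedStore:
    """Toy federation layer: each member stays independent and is only read through."""

    def __init__(self, members):
        self.members = members   # site name -> that site's own store

    def get(self, key):
        """Return (site, value) from the first member that holds the key."""
        for site, store in self.members.items():
            if key in store:
                return site, store[key]
        return None

    def get_all(self, key):
        """Collect the value for `key` from every member that holds it."""
        return {site: store[key] for site, store in self.members.items() if key in store}

# Two autonomous stores, each managed by its own site.
hospital_a = {"patient:17": {"blood_group": "O+"}}
hospital_b = {"patient:17": {"allergies": ["penicillin"]},
              "patient:42": {"blood_group": "A-"}}

federation = FederatedStore({"hospital_a": hospital_a, "hospital_b": hospital_b})

print(federation.get("patient:42"))
print(federation.get_all("patient:17"))
```

The member stores are never merged or copied; the federation layer only routes requests, so each site keeps control of its own data.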

IoT-Centric NoSQL Architecture

With the rise of Internet of Things (IoT):

 Data from multiple sensors needs to be processed as a single stream.

 Middleware (software between database and app) helps:

Integrate streams

Temporarily store or archive data

Enable real-time querying

 Example:

Using a document store to store sensor readings as JSON

Using Pub/Sub model (EventJava) for live updates
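
The middleware role described above can be sketched as a small hub that archives each sensor reading as a JSON document and pushes it to subscribers. The `SensorHub` class, the topic names, and the reading fields are hypothetical, and this is a generic publish/subscribe sketch in Python rather than EventJava.

```python
import json
from collections import defaultdict

class SensorHub:
    """Toy middleware: stores readings as JSON and notifies live subscribers."""

    def __init__(self):
        self.archive = []                      # JSON documents, one per reading
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, reading):
        document = json.dumps({"topic": topic, **reading}, sort_keys=True)
        self.archive.append(document)                 # store or archive
        for callback in self.subscribers[topic]:      # live update
            callback(json.loads(document))

    def query(self, topic):
        """Return all archived readings for one topic."""
        return [doc for doc in map(json.loads, self.archive) if doc["topic"] == topic]

hub = SensorHub()
live_updates = []
hub.subscribe("temperature", live_updates.append)

hub.publish("temperature", {"sensor": "t-1", "celsius": 21.5})
hub.publish("humidity", {"sensor": "h-1", "percent": 40})
hub.publish("temperature", {"sensor": "t-2", "celsius": 22.0, "battery": "low"})

print(live_updates)
print(hub.query("humidity"))
```

Readings from different sensors carry different fields (`battery` appears only once), which is why a document store is a natural fit for this kind of stream.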

Scalable and Flexible NoSQL Patterns:

System Requirement-Based Variations:

Using NoSQL to manage Big data:

Introduction to MongoDB:

Datatypes in MongoDB:

MongoDB Query language:
