NoSQL
data pipelines
Based on slides by
Mike Franklin, George Kollios and Jimmy Lin
Some of the slides are adapted from Database System Concepts,
Seventh Edition, by Avi Silberschatz, Henry F. Korth, and S. Sudarshan 1
Motivation
• Very large volumes of data being collected
Driven by growth of web, social media, and more
recently internet-of-things
Web logs were an early source of data
• Analytics on web logs has great value for
advertisements, web site structuring, deciding what posts
to show to a user, etc.
• Big Data: differentiated from data handled by
earlier generation databases
Volume: much larger amounts of data stored
Velocity: much higher rates of
insertions/updates
Variety: many types of data, beyond relational
data
Big Data (some old numbers)
• Facebook:
130TB/day: user logs
200-400TB/day: 83 million pictures
• Structured:
Data of a well-defined data type, format, or structure
Example: Relational database tables and CSV files
• Semi-structured:
Textual data files with a discernable pattern, enabling parsing
Example: XML, JSON files
• Quasi-structured:
Textual data with erratic data formats: can be formatted with effort, tools,
and time
Example: Web clickstream data
• Unstructured:
Data that has no inherent structure
Examples: Text documents, images, and video
4
What is NoSQL?
• An emerging “movement” around
non-relational software for Big Data
Wikipedia: “A NoSQL database provides a mechanism for storage
and retrieval of data that use looser consistency models than
traditional relational databases in order to achieve
horizontal scaling and higher availability. Some authors refer to
them as "Not only SQL" to emphasize that some NoSQL systems do
allow SQL-like query language to be used.”
https://en.wikipedia.org/wiki/NoSQL
NoSQL features
• Scalability is crucial!
load has increased rapidly for many applications
Large servers are expensive
Solution: use clusters of small (cheap) commodity
machines (often cloud based)
• Need to partition the data and use replication (sharding)
• E.g., records with key values from 1 to 100,000 on
database 1, records with key values from 100,001 to
200,000 on database 2, etc. (a minimal routing sketch
follows this slide)
• Application must track which records are on which
database and send queries/updates to that database
• Develop with agility
Suitable for faster and more agile application
development due to its flexibility
6
https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-nosql-database/
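A minimal Python sketch of the range-based routing described above; the shard names and key ranges mirror the example on the slide and are purely illustrative:

# Range-based sharding: route each record to a shard by its key range.
# The shard names and ranges below are illustrative, not from any real system.

SHARD_RANGES = [
    (1, 100_000, "database_1"),        # keys 1 .. 100,000
    (100_001, 200_000, "database_2"),  # keys 100,001 .. 200,000
]

def shard_for_key(key):
    # Return the shard that owns this key; the application (or a router)
    # sends the query or update to that database.
    for low, high, shard in SHARD_RANGES:
        if low <= key <= high:
            return shard
    raise KeyError("no shard covers key %d" % key)

print(shard_for_key(42))       # database_1
print(shard_for_key(150_000))  # database_2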
NoSQL features
• Sometimes there is no well-defined schema
Performance and availability are more important
than the strong consistency provided by an RDBMS
Supports a flexible schema
• Allow for semi-structured data
Handle large, unrelated, indeterminate, or rapidly
changing data
Still need to provide ways to query efficiently
(use of index methods)
Need to express specific types of queries easily
7
NoSQL Example
• Storing information about a user (first name, last name,
cell phone number, city) and their hobbies
(Figure: the same user stored the relational way, as normalized
tables, vs. the NoSQL way, as a single document; see the sketch
below)
8
Image source: https://www.mongodb.com/nosql-explained
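Since the figure is not reproduced here, a small Python sketch of the same contrast, with illustrative field values: the relational way splits the user and their hobbies across normalized tables, while the NoSQL way keeps everything in one document.

# The same user modeled two ways (illustrative values).

# Relational way: normalized rows in two tables, joined on user_id.
users = [
    {"user_id": 1, "first_name": "Leslie", "last_name": "Yepp",
     "cell": "8125552344", "city": "Pawnee"},
]
hobbies = [
    {"user_id": 1, "hobby": "scrapbooking"},
    {"user_id": 1, "hobby": "eating waffles"},
]

# NoSQL way: one self-contained document with the hobbies embedded.
user_doc = {
    "_id": 1,
    "first_name": "Leslie",
    "last_name": "Yepp",
    "cell": "8125552344",
    "city": "Pawnee",
    "hobbies": ["scrapbooking", "eating waffles"],
}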
The Structure Spectrum
https://azure.microsoft.com/en-us/resources/cloud-computing-dictionary/what-is-nosql-database/ 10
Key Value Storage Systems
• Key-value storage systems store large numbers
(billions or even more) of small (KB-MB) sized
records
• Records are partitioned across multiple machines
• Queries are performed on keys and routed by the
system to appropriate machine
• Records are also replicated across multiple machines,
to ensure availability even if a machine fails
Key-value stores ensure that updates are applied
to all replicas, to keep values consistent
https://aws.amazon.com/nosql/key-value/ 12
Key Value Storage Systems
• Key-value stores support
put(key, value): store a value with an associated key
get(key): retrieve the stored value associated with the
specified key
delete(key): remove the key and its associated value
• Some systems also support range queries on key
values
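A toy, in-memory Python sketch of this interface; real key-value stores partition and replicate records across machines behind the same put/get/delete operations, and the class and method names here are only illustrative:

# Toy in-memory key-value store showing the client-facing operations.
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Store a value under the given key (overwrites any existing value).
        self._data[key] = value

    def get(self, key):
        # Retrieve the stored value, or None if the key is absent.
        return self._data.get(key)

    def delete(self, key):
        # Remove the key and its associated value, if present.
        self._data.pop(key, None)

    def range(self, low, high):
        # Optional range query: all (key, value) pairs with low <= key <= high.
        return [(k, v) for k, v in sorted(self._data.items()) if low <= k <= high]

store = KeyValueStore()
store.put("user:22222", {"name": "Albert Einstein", "deptname": "Physics"})
print(store.get("user:22222"))
store.delete("user:22222")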
JSON
• JSON is an alternative data model for
semi-structured data.
• JavaScript Object Notation
14
Data Representation in Key Value
• An example of a JSON object is:
{
    "ID": "22222",
    "name": {
        "firstname": "Albert",
        "lastname": "Einstein"
    },
    "deptname": "Physics",
    "children": [
        { "firstname": "Hans", "lastname": "Einstein" },
        { "firstname": "Eduard", "lastname": "Einstein" }
    ]
}
Document Databases
16
MongoDB (An example of a Document
Database)
- Data are organized in collections. A collection stores
a set of documents.
- A collection is like a table and a document is like a record,
but: each document can have a different set of
attributes, even in the same collection
Semi-structured schema!
- Only requirement: every document should have an
"_id" field
humongous => Mongo
17
Example: MongoDB documents
{ "_id": ObjectId("4efa8d2b7d284dad101e4bc9"),
  "Last Name": "Cousteau",
  "First Name": "Jacques-Yves",
  "Date of Birth": "06-1-1910" },

{ "_id": ObjectId("4efa8d2b7d284dad101e4bc7"),
  "Last Name": "PELLERIN",
  "First Name": "Franck",
  "Date of Birth": "09-19-1983",
  "Address": "1 chemin des Loges",
  "City": "VERSAILLES" }
18
XML Example
<employees>
  <employee>
    <id>4efa8d2b7d284dad101e4bc9</id>
    <LastName>Cousteau</LastName>
    <FirstName>Jacques-Yves</FirstName>
    <DateOfBirth>06-1-1910</DateOfBirth>
  </employee>
  <employee>
    <id>4efa8d2b7d284dad101e4bc7</id>
    <LastName>PELLERIN</LastName>
    <FirstName>Franck</FirstName>
    <DateOfBirth>09-19-1983</DateOfBirth>
    <Address>1 chemin des Loges</Address>
    <City>VERSAILLES</City>
  </employee>
</employees>
19
https://www.w3schools.com/js/js_json_xml.asp
Columnar databases
• stores data tables by columns rather than by rows
• advantageous when querying a subset of columns, since
columns that are not relevant never need to be read
• used in analytics that quickly aggregate the values of a
given column (e.g., adding up the total sales for the year);
see the sketch below
• typically less efficient for inserting new data
21
https://www.mongodb.com/scale/types-of-nosql-databases
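A small Python sketch (with made-up data) of the row vs. column layouts, and why a per-column aggregate only has to touch one array:

# Row layout vs. column layout for the same (made-up) sales table.

# Row store: each record kept together; summing one column still reads whole rows.
rows = [
    {"order_id": 1, "region": "East", "sales": 120.0},
    {"order_id": 2, "region": "West", "sales": 80.0},
    {"order_id": 3, "region": "East", "sales": 200.0},
]
total_row_store = sum(r["sales"] for r in rows)

# Column store: each column is its own array; the aggregate reads only "sales"
# and never touches "order_id" or "region".
columns = {
    "order_id": [1, 2, 3],
    "region": ["East", "West", "East"],
    "sales": [120.0, 80.0, 200.0],
}
total_column_store = sum(columns["sales"])

assert total_row_store == total_column_store == 400.0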
Example Document Database:
MongoDB
Key features include:
• JSON-style documents
• actually uses BSON (JSON's binary format)
• replication for high availability
• auto-sharding for scalability
• document-based queries
• can create an index on any attribute
• for faster reads
22
MongoDB Terminology
relational term   <==>  MongoDB equivalent
----------------------------------------------------------
database          <==>  database
table             <==>  collection
row               <==>  document
attributes        <==>  fields (field-name:value pairs)
primary key       <==>  the _id field, which is the key
                        associated with the document
23
The _id Field
Every MongoDB document must have an _id field.
• its value must be unique within the collection
• acts as the primary key of the collection
• it is the key in the key/value pair
• If you create a document without an _id field:
• MongoDB adds the field for you
• assigns it a unique BSON ObjectID
• example from the MongoDB shell:
> db.test.save({ rating: "PG-13" })
> db.test.find()
{ "_id" : ObjectId("528bf38ce6d3df97b49a0569"), "rating" : "PG-13" }
24
Data Modeling in MongoDB
Need to determine how to map entities and relationships to
collections of documents
• It can make sense to group different types of entities together
• create an aggregate containing data that tends to be accessed
together
• Could instead give each type of entity its own collection:
• each with its own (flexibly formatted) type of document
• documents of the same type stored in the same collection
• related documents in other collections referred to by reference
25
Capturing Relationships in MongoDB
26
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/modeling-data
Capturing Relationships in MongoDB
• store references to other documents using their _id values if
  data grows unbounded (e.g., comments on a post)
  data changes frequently (e.g., stock information)

{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "name": "Tom Hanks",
  "contact": "987654321",
  "dob": "01-01-1991"
}

{
  "_id": ObjectId("52ffc4a5d85242602e000000"),
  "building": "22 A, Indiana Apt",
  "pincode": 123456,
  "city": "Los Angeles",
  "state": "California"
}

{
  "_id": ObjectId("52ffc4a5d85242602e000001"),
  "building": "170 A, Acropolis Apt",
  "pincode": 456789,
  "city": "Chicago",
  "state": "Illinois"
}

{
  "_id": ObjectId("52ffc33cd85242f436000001"),
  "contact": "987654321",
  "dob": "01-01-1991",
  "name": "Tom Benzamin",
  "address_ids": [
    ObjectId("52ffc4a5d85242602e000000"),
    ObjectId("52ffc4a5d85242602e000001")
  ]
}
27
https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/modeling-data
Queries in MongoDB
28
Projection
• Specify the names of the fields that you want in the output with
1 (0 hides the field)
• Example:
>db.movies.find({},{"title":1,_id:0})
(will report the title but not the id)
29
Selection
• You can specify the condition on the corresponding attributes
using the find:
>db.movies.find({ rating: "G", year: 2000 }, {name: 1, runtime: 1 })
• Operators for other types of comparisons:
MongoDB SQL equivalent
$gt, $gte >, >=
$lt, $lte <, <=
$ne !=
Example: find the names of movies with earnings <= 200000
> db.movies.find({ earnings: { $lte: 200000 }})
30
Aggregation
• Recall the aggregate operators in SQL: AVG(), SUM(), etc.
More generally, aggregation involves computing a result
from a collection of data.
• db.collection.count(<selection>)
returns the number of documents in the collection
that satisfy the specified selection document
Example: how many G-rated movies are shorter than 90 minutes?
>db.movies.count({ rating: "G", runtime: { $lt: 90 }})
• db.collection.distinct(<field>, <selection>)
returns an array with the distinct values of the specified field in
documents that satisfy the specified selection document
Example: which actors have been in one or more of the top 10 grossing movies?
>db.movies.distinct("actors.name", { earnings_rank: { $lte: 10 }})
https://docs.mongodb.com/manual/core/aggregation-pipeline/
32
Aggregation Pipeline example
• Let’s use a pizza orders collection and find the total order
quantity of medium-size pizzas, grouped by pizza name
db.orders.aggregate( [
// Stage 1: Filter pizza order documents by pizza size
{
$match: { size: "medium" }
},
// Stage 2: Group remaining documents by pizza name and calculate total quantity
{
$group: { _id: "$name", totalQuantity: { $sum: "$quantity" } }
}
]) 33
https://docs.mongodb.com/manual/core/aggregation-pipeline/
Aggregation Pipeline example
• Let’s use the pizza orders collection and find total pizza order value and
average order quantity between two dates
db.orders.aggregate( [
// Stage 1: Filter pizza order documents by date range
{
$match:
{
"date": { $gte: new ISODate( "2020-01-30" ), $lt: new ISODate( "2022-01-30" ) }
}},
// Stage 2: Group remaining documents by date and calculate results
{
$group:
{
_id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
totalOrderValue: { $sum: { $multiply: [ "$price", "$quantity" ] } },
averageOrderQuantity: { $avg: "$quantity" }
}
},
// Stage 3: Sort documents by totalOrderValue in descending order
{
$sort: { totalOrderValue: -1 }
}] )
35
What is a Data Pipeline?
• A data pipeline is a process for moving data between a
source system and a target repository
• It involves software that automates the steps involved in
moving data for a specific use case, such as extracting data
from a source system and then loading it into a target repository
https://www.qlik.com/us/etl/etl-pipeline
36
What is Extract, Transform, and Load (ETL)?
• A set of processes to extract data from one system, transform it, and
then load it into a target repository (data warehouse or data lake)
• Transform is the process of converting the format or structure of the
data set to match the target system
Data mapping, applying concatenations or calculations
• The ETL process is most appropriate for smaller data sets that require
complex transformations (a minimal sketch follows this slide)
Transforming larger data sets can take a long time up front, but analysis
can take place immediately once the ETL process is complete
37
https://www.qlik.com/us/etl/etl-pipeline
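A minimal Python sketch of the three ETL steps; the source file "orders.csv", its field names, and the in-memory "warehouse" are hypothetical placeholders for whatever real systems are involved:

# Minimal ETL sketch (hypothetical source file, fields, and target).
import csv

def extract(path):
    # Extract: read raw records from the source system (here, a CSV export).
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(records):
    # Transform: reshape each record to match the target schema
    # (a concatenation and a calculation, as mentioned above).
    out = []
    for r in records:
        out.append({
            "customer": f'{r["first_name"]} {r["last_name"]}',
            "revenue": float(r["price"]) * int(r["quantity"]),
        })
    return out

def load(records, warehouse):
    # Load: write the already-transformed records into the target repository.
    warehouse.extend(records)

warehouse = []                       # stand-in for a warehouse table
load(transform(extract("orders.csv")), warehouse)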
What is Extract, Load, and Transform (ELT)?
• All data is extracted from the source and immediately loaded into the target
system (data warehouse or data lake)
• Data is transformed on an as-needed basis in the target system
raw, unstructured, semi-structured, and structured data
Transformation can slow down querying and analysis if there is
not sufficient processing power
• ELT is more cost effective than ETL; it is appropriate for larger, structured and
unstructured data sets and when timeliness is important (see the sketch below)
Cloud platforms (Amazon Redshift, Snowflake, Azure Synapse, Databricks) offer
much lower costs and a variety of plan options to store and process data
38
https://www.qlik.com/us/etl/etl-vs-elt
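For contrast, a minimal Python sketch of the same pipeline in ELT order: raw records are loaded first and transformed later, inside the (here simulated) target system; "orders.csv" and the dict-based "lake" are again hypothetical.

# Minimal ELT sketch: extract, load the raw records as-is, transform later.
import csv

def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def load_raw(records, lake):
    # Load: land the raw, untransformed records in the target (e.g., a data lake).
    lake["raw_orders"] = records

def transform_in_target(lake):
    # Transform on demand, using the target system's own compute,
    # only when an analysis actually needs the derived values.
    lake["order_revenue"] = [
        float(r["price"]) * int(r["quantity"]) for r in lake["raw_orders"]
    ]

lake = {}                            # stand-in for the target repository
load_raw(extract("orders.csv"), lake)
transform_in_target(lake)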