5.1 Intro Nosql

This document provides an overview of NoSQL databases. It discusses the different eras of databases and limitations of relational databases that led to the rise of NoSQL. It then describes several popular NoSQL database models like key-value, document, column-family and graph databases. Specific NoSQL databases like MongoDB, Cassandra, HBase, Redis and DynamoDB are also overviewed along with their use cases and features.


10/16/2023

NoSQL part 1
Lecturer: Binh-Minh Nguyen
School of Information and Communication Technology

Eras of Databases


Before NoSQL

Star schema

OLTP
OLAP cube

RDBMS: one size fits all needs

ICDE 2005 conference

The last 25 years of commercial DBMS development can be summed up in a single phrase:
"one size fits all". This phrase refers to the fact that the traditional DBMS architecture
(originally designed and optimized for business data processing) has been used to support
many data-centric applications with widely varying characteristics and requirements. In this
paper, we argue that this concept is no longer applicable to the database market, and that the
commercial world will fracture into a collection of independent database engines ...

After NoSQL

NoSQL landscape


How to write a CV

Why NoSQL

• Web applications have different needs
  • Horizontal scalability (lowers cost)
  • Geographically distributed
  • Elasticity
  • Schemaless, flexible schema for semi-structured data
  • Easier for developers
  • Heterogeneous data storage
  • High Availability/Disaster Recovery
• Web applications do not always need
  • Transactions
  • Strong consistency
  • Complex queries


SQL vs NoSQL

SQL | NoSQL
Gigabytes to Terabytes | Petabytes (1k TB) to Exabytes (1k PB) to Zettabytes (1k EB)
Centralized | Distributed
Structured | Semi-structured and unstructured
Structured Query Language | No declarative query language
Stable data model | Schemaless
Complex relationships | Less complex relationships
ACID properties | Eventual consistency
Transaction is priority | High availability, high scalability
Joins tables | Embedded structures

NoSQL use cases

• Massive data volume at scale (Big volume)

• Google, Amazon, Yahoo, Facebook – 10-100K servers
• Extreme query workload (Big velocity)
• High availability
• Flexible, schema evolution


DB engines ranking according to their popularity (2019)

Relational data model revisited

• Data is usually stored row by row (row store)
• Standardized query language (SQL)
• Data model defined before you add data
• Joins merge data from multiple tables
• Results are tables
• Pros: Mature ACID transactions with fine-grained security controls, widely used
• Cons: Requires up-front data modeling, does not scale well
• Examples: Oracle, MySQL, PostgreSQL, Microsoft SQL Server, IBM DB2


Key/value data model

• Simple key/value interface
• GET, PUT, DELETE
• Value can contain any kind of data
• Super fast and easy to scale (no joins)
• Examples
• Berkeley DB, Memcached, DynamoDB, Redis, Riak
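
The GET, PUT, DELETE interface can be sketched in a few lines. This is a minimal illustration that assumes an in-memory Python dictionary as the backing store; the class name `KeyValueStore` and its method names are hypothetical and are not the API of any product listed above.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key/value store (illustrative only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # The value is opaque to the store: any kind of data is accepted.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True only if a key was actually removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "Ada", "tags": ["admin"]})
print(store.get("user:42"))
store.delete("user:42")
print(store.get("user:42"))
```

Every operation touches exactly one key, so there is nothing to join. That is what makes this model easy to partition across many servers.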

Key/value vs. table

• A table with two columns and a simple interface
• Add a key-value
• For this key, give me the value
• Delete a key
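
To make the "table with two columns" view concrete, the sketch below emulates the three operations on top of a relational table using Python's standard `sqlite3` module. The table name `kv`, the column names `k` and `v`, and the helper functions are assumptions chosen for illustration.

```python
import sqlite3

# In-memory database holding a single two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")


def put(key: str, value: bytes) -> None:
    # Add a key-value pair, overwriting any existing value for the key.
    conn.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))


def get(key: str):
    # For this key, give me the value (None if the key is absent).
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    return row[0] if row else None


def delete(key: str) -> None:
    # Delete a key.
    conn.execute("DELETE FROM kv WHERE k = ?", (key,))


put("session:1", b"payload")
print(get("session:1"))
delete("session:1")
print(get("session:1"))
```

The only query shape used is a lookup by primary key. A key/value store drops everything else a relational engine offers in exchange for speed and easy scaling.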


Key/value vs. Relational data model

Memcached

• Open source in-memory key-value caching system

• Make effective use of RAM on many distributed web servers
• Designed to speed up dynamic web applications by alleviating
database load
• Simple interface for highly distributed RAM caches
• 30ms read times typical
• Designed for quick deployment, ease of development
• APIs in many languages


Redis

• Open source in-memory key-value store with optional durability
• Focus on high-speed reads and writes of common data structures to RAM
• Allows simple lists, sets and hashes to be stored within the value and manipulated
• Many features that developers like: expiration, transactions, pub/sub, partitioning
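
The idea of structured values plus expiration can be sketched with a toy in-process store. The method names below echo the Redis commands RPUSH, SADD, HSET and EXPIRE, but the class `ToyStructStore` is hypothetical and its semantics are simplified. For example, it does not check that a key holds the expected type.

```python
import time
from typing import Any, Dict, Optional


class ToyStructStore:
    """Toy store whose values are lists, sets or hashes (illustrative only)."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._deadlines: Dict[str, float] = {}

    def _live(self, key: str) -> Optional[Any]:
        # Drop the key if its deadline has passed, then return what is left.
        deadline = self._deadlines.get(key)
        if deadline is not None and time.monotonic() >= deadline:
            self._values.pop(key, None)
            del self._deadlines[key]
        return self._values.get(key)

    def rpush(self, key: str, item: Any) -> int:
        # Append to the list stored at key; returns the new length.
        items = self._live(key)
        if items is None:
            items = self._values[key] = []
        items.append(item)
        return len(items)

    def sadd(self, key: str, member: Any) -> bool:
        # Add to the set stored at key; returns True if the member was new.
        members = self._live(key)
        if members is None:
            members = self._values[key] = set()
        added = member not in members
        members.add(member)
        return added

    def hset(self, key: str, field: str, value: Any) -> None:
        # Set one field of the hash stored at key.
        fields = self._live(key)
        if fields is None:
            fields = self._values[key] = {}
        fields[field] = value

    def get(self, key: str) -> Optional[Any]:
        return self._live(key)

    def expire(self, key: str, seconds: float) -> bool:
        # Schedule removal of an existing key; returns False if key is absent.
        if self._live(key) is None:
            return False
        self._deadlines[key] = time.monotonic() + seconds
        return True


store = ToyStructStore()
store.rpush("queue", "job-1")
store.sadd("tags", "nosql")
store.hset("user:1", "name", "Ada")
store.expire("queue", 30)
print(store.get("user:1"))
```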

Amazon DynamoDB

• Scalable key-value store

• Fastest growing product in Amazon's history
• Focus on throughput on storage and predictable read and
write times
• Strong integration with S3 and Elastic MapReduce


Riak

• Open source distributed key-value store with support and commercial versions by Basho
• A "Dynamo-inspired" database
• Focus on availability, fault-tolerance, operational simplicity and scalability
• Support for replication and auto-sharding and rebalancing on failures
• Support for MapReduce, full-text search and secondary indexes of value tags
• Written in Erlang

Column family store

• Dynamic schema, column-oriented data model

• Sparse, distributed persistent multi-dimensional sorted map
• (row, column (family), timestamp) -> cell contents
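
The mapping (row, column, timestamp) -> cell contents can be sketched as nested dictionaries. This is a minimal single-process illustration, not how any real column family store lays out data; the `family:qualifier` column names and the helper functions are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# table[row][column] -> list of (timestamp, value), newest first.
Table = Dict[str, Dict[str, List[Tuple[int, str]]]]
table: Table = defaultdict(lambda: defaultdict(list))


def put(row: str, column: str, timestamp: int, value: str) -> None:
    versions = table[row][column]
    versions.append((timestamp, value))
    # Keep versions ordered so the newest one is found first.
    versions.sort(key=lambda tv: tv[0], reverse=True)


def get(row: str, column: str, timestamp: Optional[int] = None) -> Optional[str]:
    # .get() avoids creating empty rows or columns on a read.
    versions = table.get(row, {}).get(column, [])
    for ts, value in versions:
        if timestamp is None or ts <= timestamp:
            return value  # newest version at or before the requested time
    return None


put("com.example", "anchor:home", 1, "Home")
put("com.example", "anchor:home", 3, "Homepage")
put("com.example", "contents:html", 2, "<html>...</html>")

print(get("com.example", "anchor:home"))               # latest version
print(get("com.example", "anchor:home", timestamp=2))  # version as of time 2
print(sorted(table))                                   # rows in sorted key order
```

A row stores only the columns it actually has, which is why the map is sparse. The schema is dynamic because a new column is simply a new key.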


Column families

• Group columns into "Column families"

• Group column families into "Super-Columns"
• Be able to query all columns within a family or super-column
• Similar data grouped together to improve speed

Column family data model vs. relational

• Sparse matrix, preserve table structure

• One row could have millions of columns but can be very sparse
• Hybrid row/column stores
• Number of columns is extendible
• New columns to be inserted without doing an "alter table"


Bigtable

• ACM TOCS 2008

• Fault-tolerant, persistent
• Scalable
• Thousands of servers
• Terabytes of in-memory data
• Petabytes of disk-based data
• Millions of reads/writes per second, efficient scans
• Self-managing
• Servers can be added/removed
dynamically
• Servers adjust to load imbalance

Apache HBase

• Open-source Bigtable implementation, written in Java

• Part of Apache Hadoop project


Apache Cassandra

• Apache open source column family database

• Supported by DataStax
• Peer-to-peer distribution model
• Strong reputation for linear scale out (millions of
writes/second)
• Written in Java and works well with HDFS and MapReduce

Graph data model

• Core abstractions: Nodes, Relationships, Properties on both


Graph database store

• A database that stores data in an explicit graph structure

• Each node knows its adjacent nodes
• Queries are really graph traversals
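
A query expressed as a traversal can be sketched with an adjacency list, where each node holds direct references to its neighbors. The sample graph and the function `reachable_within` are hypothetical. The sketch uses a plain breadth-first search, not any particular graph database's query engine.

```python
from collections import deque
from typing import Dict, List

# Each node "knows" its adjacent nodes: no join table is consulted.
graph: Dict[str, List[str]] = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def reachable_within(start: str, max_hops: int) -> Dict[str, int]:
    """Breadth-first traversal: nodes reachable from start, with hop counts."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # the start node itself is not part of the answer
    return distance


print(reachable_within("alice", 1))
print(reachable_within("alice", 2))
```

The cost of the query depends on how much of the graph is visited, not on the total size of the data set.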

Compared to Relational Databases

Relational databases: optimized for aggregation
Graph databases: optimized for connections


Compared to Key Value Stores

Key-value stores: optimized for simple look-ups
Graph databases: optimized for traversing connected data

Compared to Document Stores

Document stores: optimized for “trees” of data
Graph databases: optimized for seeing the forest and the trees, and the branches, and the trunks


Linking open data

Neo4j

• Graph database designed to be easy to use by Java developers
• Disk-based (not just RAM)
• Full ACID
• High Availability (with Enterprise Edition)
• 32 Billion Nodes, 32 Billion Relationships, 64 Billion Properties
• Embedded Java library
• REST API


Document store

• Documents, not values, not tables

• JSON or XML formats
• Document is identified by ID
• Allow indexing on properties
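
The sketch below shows documents addressed by ID together with a secondary index on a property. `DocumentStore` and its methods are hypothetical names for a toy in-memory model, and it assumes indexed property values are hashable. It is not the API of MongoDB or CouchDB.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set


class DocumentStore:
    """Toy document store: documents by ID plus property indexes."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}
        # property name -> property value -> IDs of matching documents
        self._indexes: Dict[str, Dict[Any, Set[str]]] = {}

    def create_index(self, prop: str) -> None:
        index: Dict[Any, Set[str]] = defaultdict(set)
        for doc_id, doc in self._docs.items():
            if prop in doc:
                index[doc[prop]].add(doc_id)
        self._indexes[prop] = index

    def put(self, doc_id: str, doc: dict) -> None:
        self.delete(doc_id)  # clear index entries of any previous version
        self._docs[doc_id] = doc
        for prop, index in self._indexes.items():
            if prop in doc:
                index[doc[prop]].add(doc_id)

    def delete(self, doc_id: str) -> None:
        old = self._docs.pop(doc_id, None)
        if old is None:
            return
        for prop, index in self._indexes.items():
            if prop in old:
                index[old[prop]].discard(doc_id)

    def get(self, doc_id: str) -> Optional[dict]:
        return self._docs.get(doc_id)

    def find(self, prop: str, value: Any) -> List[dict]:
        # Raises KeyError if no index exists for the property.
        index = self._indexes[prop]
        return [self._docs[i] for i in sorted(index.get(value, ()))]


store = DocumentStore()
store.create_index("city")
store.put("1", {"name": "Ada", "city": "Hanoi"})
store.put("2", {"name": "Bob", "city": "Hue", "phone": "555-0100"})
print(store.find("city", "Hanoi"))
```

The two documents have different fields, and the store accepts both without any schema change.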

Relational data mapping

• T1–HTML into Objects

• T2–Objects into SQL Tables
• T3–Tables into Objects
• T4–Objects into HTML
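
Steps T2 and T3 can be sketched with Python's standard `sqlite3` module: one nested object is split across two tables on the way in and reassembled on the way out. The `Order` and `Item` classes, the table layout and the function names are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    sku: str
    qty: int


@dataclass
class Order:
    order_id: int
    customer: str
    items: List[Item]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")


def save(order: Order) -> None:
    # T2: objects into SQL tables (one object becomes rows in two tables).
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order.order_id, order.customer))
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?)",
        [(order.order_id, item.sku, item.qty) for item in order.items],
    )


def load(order_id: int) -> Order:
    # T3: tables into objects (rows from two tables become one object again).
    row = conn.execute(
        "SELECT customer FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    if row is None:
        raise KeyError(order_id)
    rows = conn.execute(
        "SELECT sku, qty FROM order_items WHERE order_id = ? ORDER BY rowid",
        (order_id,),
    ).fetchall()
    return Order(order_id, row[0], [Item(sku, qty) for sku, qty in rows])


save(Order(1, "Ada", [Item("A-100", 2), Item("B-200", 1)]))
print(load(1))
```

Even this small example needs two tables, two insert paths and a reassembly step. An object-relational mapping layer automates that kind of work.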


Web Service in the middle

• T1 – HTML into Java Objects
• T2 – Java Objects into SQL Tables
• T3 – Tables into Objects
• T4 – Objects into HTML
• T5 – Objects to XML
• T6 – XML to Objects

[Figure: Web Browser, Object Middle Tier, Relational Database and Web Service, linked by transformations T1 to T6]

Discussion

• Object-relational mapping has become one of the most complex components of building applications today
• Java Hibernate Framework
• JPA
• The way to avoid complexity is to keep your architecture very simple


Document mapping

• Documents in the database

• Documents in the application
• No object middle tier
• No "shredding"
• No reassembly
• Simple!

[Figure: documents pass unchanged between the Application Layer and the Database]
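
With document mapping, the nested structure used by the application is stored and read back as a single unit. The sketch below illustrates this with Python's standard `json` module. The dictionary `database` is a stand-in for a document store, and the function names are hypothetical.

```python
import json
from typing import Dict

# Stand-in for a document database: document ID -> JSON text.
database: Dict[str, str] = {}


def save(doc_id: str, document: dict) -> None:
    # The whole document is written as one unit: no "shredding" into tables.
    database[doc_id] = json.dumps(document)


def load(doc_id: str) -> dict:
    # The whole document comes back as one unit: no reassembly from rows.
    return json.loads(database[doc_id])


order = {
    "customer": "Ada",
    "items": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
}
save("order:1", order)
print(load("order:1") == order)
```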

MongoDB

• Open Source JSON data store created by 10gen

• Master-slave scale out model
• Strong developer community
• Sharding built-in, automatic
• Implemented in C++ with many APIs (C++, JavaScript, Java,
Perl, Python etc.)


MongoDB architecture

• Replica set
• Copies of the data on each node
• Data safety
• High availability
• Disaster recovery
• Maintenance
• Read scaling
• Sharding
• “Partitions” of the data
• Horizontal scale
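
The two ideas combine as follows: sharding decides which partition owns a document, and each partition keeps several copies. The sketch below routes documents by hashing their ID. This is one possible partitioning scheme chosen for illustration, not a description of MongoDB's own routing, and all names in it are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional

SHARDS = ["shard-0", "shard-1", "shard-2"]
REPLICAS_PER_SHARD = 3

# cluster[shard] is a list of replicas; each replica maps doc ID -> document.
cluster: Dict[str, List[Dict[str, dict]]] = {
    shard: [{} for _ in range(REPLICAS_PER_SHARD)] for shard in SHARDS
}


def shard_for(doc_id: str) -> str:
    # A stable hash of the ID picks the partition that owns the document.
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def write(doc_id: str, doc: dict) -> None:
    # Every replica of the owning shard receives its own copy of the data.
    for replica in cluster[shard_for(doc_id)]:
        replica[doc_id] = dict(doc)


def read(doc_id: str, replica_index: int = 0) -> Optional[dict]:
    # Any replica can serve the read, which is what enables read scaling.
    return cluster[shard_for(doc_id)][replica_index].get(doc_id)


write("user:1", {"name": "Ada"})
print(shard_for("user:1"), read("user:1", replica_index=2))
```

Adding shards increases total capacity (horizontal scale), while adding replicas increases safety and read throughput for the data each shard already holds.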

Apache CouchDB

• Apache project
• Open source JSON data store
• Written in Erlang
• RESTful JSON API
• B-tree based indexing, shadowing B-tree versioning
• ACID fully supported
• View model
• Data compaction
• Security


Thank you for your attention!

Q&A
