4 NoSql
By Dinesh Amatya
NoSQL
– schema-free
– no joins
– distributed
– horizontally scalable with easy replication support
– eventually consistent
– open source
NoSQL
– Google’s BigTable
– LinkedIn’s Voldemort
– Facebook’s Cassandra
– Yahoo!’s PNUTS
NoSQL
...and when it didn’t work they tried to scale existing relational solutions:
– Simplifying DB schema
– De-normalization
– Introducing numerous query caching layers
– Separating read-only from write-dedicated replicas
– Data partitioning
CAP Theorem
CAP Theorem And NoSQL
– Soft State (data at some node could change without any explicit user
intervention. This follows from eventual consistency)
– Eventually Consistent (NoSQL guarantees consistency only at some
undefined future time)
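The two properties above can be illustrated with a minimal sketch. It assumes two in-memory replicas and a "highest version wins" rule during background synchronisation; the `Replica` class, the `sync` function and the version numbers are hypothetical and not taken from any particular NoSQL product.

```python
class Replica:
    """Hypothetical replica holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        # Keep the entry only if it is newer than what we already hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Background exchange: both replicas end up with the higher version."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("cart", ["book"], version=1)
sync(r1, r2)

r1.write("cart", ["book", "pen"], version=2)  # accepted by r1 only
stale = r2.read("cart")                       # r2 still serves version 1
sync(r1, r2)                                  # no client touched r2 here
fresh = r2.read("cart")                       # r2 has now caught up
print(stale, fresh)
```

The value held by `r2` changes during `sync` without any client writing to it (soft state), and a read between the write and the sync returns the older value (eventual consistency).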
NoSQL Taxonomy
● Key/Value Store
- Amazon’s Dynamo, LinkedIn’s Voldemort, MemCached, Redis . . .
● Document Store
- MongoDB, CouchDB, . . .
● Column Store
- Google’s Bigtable, Apache’s HBase, Facebook’s Cassandra, . . .
● Graph Store
- Neo4J, InfiniteGraph, . . .
RDBMS Data
Key/Value Store
● Global collection of key/value pairs. Every item in the database is stored as an attribute name (key) together with its associated value
● Every key is associated with exactly one value. No duplicates
● The value is simply a binary object. The DB does not associate any structure with stored values
● Designed to handle massive loads of data
● Inspired by Distributed Hash Tables
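As a rough illustration of the points above, the sketch below keeps opaque byte values under unique keys and picks a storage node by hashing the key, in the spirit of a distributed hash table. The `KeyValueStore` class and its three in-memory "nodes" are assumptions made for the example, not the design of any real product.

```python
import hashlib


class KeyValueStore:
    """Toy key/value store spreading keys over several in-memory nodes."""

    def __init__(self, node_count=3):
        # Each node is a plain dict here; a real store would use separate machines.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key to choose a node, as a distributed hash table would.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("values are opaque bytes; the store does not inspect them")
        self._node_for(key)[key] = value  # overwrites: exactly one value per key

    def get(self, key):
        return self._node_for(key).get(key)

    def delete(self, key):
        self._node_for(key).pop(key, None)


store = KeyValueStore()
store.put("user:42", b'{"name": "Ada"}')
store.put("user:42", b'{"name": "Grace"}')  # replaces the earlier value
print(store.get("user:42"))
```

The only access path is the key: the store cannot answer "which users are named Grace?" because it never looks inside the value.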
JSON
● Stands for JavaScript Object Notation
● Syntax for storing and exchanging text information
● Uses JavaScript syntax but is language and platform independent
● Much like XML but smaller, faster and easier to parse than XML (and human readable)
● Basic data types (Number, String, Boolean), plus objects and arrays as data structures
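A short sketch of these ideas using Python's standard `json` module; the `record` contents are made up for illustration.

```python
import json

record = {
    "name": "Ada",                      # String
    "age": 36,                          # Number
    "active": True,                     # Boolean
    "languages": ["en", "fr"],          # array
    "address": {"city": "London"},      # nested object
}

text = json.dumps(record)      # Python objects -> JSON text
restored = json.loads(text)    # JSON text -> Python objects
print(text)
print(restored == record)
```

The same text could be parsed by a JSON library in any other language, which is what makes the format language and platform independent.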
Document Store
● Same as a Key/Value Store, but pairs each key with an arbitrarily complex data structure known as a document.
● Documents may contain many different key-value pairs or key-array pairs or even nested documents (like a JSON object).
● Data in documents can be understood by the DB: data can be queried by means other than the key alone (selection and projection over results are possible).
Document Store
{
  "firstname": "Pramod",
  "citiesvisited": [ "Chicago", "London", "Pune", "Bangalore" ],
  "addresses": [
    { "state": "AK",
      "city": "DILLINGHAM",
      "type": "R"
    },
    { "state": "MH",
      "city": "PUNE",
      "type": "R" }
  ],
  "lastcity": "Chicago"
}

{ "firstname": "Martin",
  "likes": [ "Biking","Photography" ],
  "lastcity": "Boston",
  "lastVisited":
}
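Selection and projection over documents like the two samples above can be sketched as follows. The `find` helper is hypothetical and stands in for a real document store's query API; the `lastVisited` field is left out because its value is missing in the sample.

```python
documents = [
    {
        "firstname": "Pramod",
        "citiesvisited": ["Chicago", "London", "Pune", "Bangalore"],
        "addresses": [
            {"state": "AK", "city": "DILLINGHAM", "type": "R"},
            {"state": "MH", "city": "PUNE", "type": "R"},
        ],
        "lastcity": "Chicago",
    },
    {
        "firstname": "Martin",
        "likes": ["Biking", "Photography"],
        "lastcity": "Boston",
    },
]


def find(collection, predicate, fields):
    """Select documents matching predicate, then project the given fields."""
    return [
        {f: doc[f] for f in fields if f in doc}
        for doc in collection
        if predicate(doc)
    ]


# Selection on a nested array field, projection onto two fields.
in_mh = find(
    documents,
    lambda d: any(a.get("state") == "MH" for a in d.get("addresses", [])),
    ["firstname", "lastcity"],
)
print(in_mh)
```

The two documents do not share a schema (only one has `addresses`, only the other has `likes`), so the query has to tolerate missing fields.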
Column Store
● “A sparse, distributed multi-dimensional sorted map”
● Stores rows of data in a similar fashion to typical RDBMSs
● Rows are contained within Column Families. Column Families can be considered the counterpart of tables in RDBMSs
● Unlike a table in an RDBMS, a Column Family can have different columns for each row it contains
● Each row is identified by a key that is unique within a single Column Family. The same key can, however, be reused in other Column Families, so it is possible to store unrelated data about the same key in different Column Families
● Usually data from the same Column Family is stored contiguously on disk (and consequently on the same node of the network)
● Each column is simply a key/value pair
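The column-family model described above can be sketched as a map from row key to a small map of columns. The `ColumnFamily` class and the sample rows are hypothetical; real column stores add timestamps, on-disk layout and distribution, which are omitted here.

```python
from collections import defaultdict


class ColumnFamily:
    """Toy column family: row key -> {column name: value}."""

    def __init__(self, name):
        self.name = name
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        # Each column is just a key/value pair stored under the row key.
        self.rows[row_key][column] = value

    def get_row(self, row_key):
        return dict(self.rows.get(row_key, {}))

    def scan(self):
        # Yield rows in sorted key order, mimicking the "sorted map" view.
        for key in sorted(self.rows):
            yield key, dict(self.rows[key])


profile = ColumnFamily("profile")
orders = ColumnFamily("orders")

profile.put("user2", "name", "Martin")
profile.put("user1", "name", "Pramod")
profile.put("user1", "city", "Pune")       # user2 has no 'city' column (sparse)
orders.put("user1", "last_order", "book")  # same key reused in another family

for key, columns in profile.scan():
    print(key, columns)
```

Rows in the same family carry different columns, and the key `user1` appears independently in both families.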
Graph Store
● Uses graph structures with nodes, edges and properties to store pieces of data and the relations between them
● Every element contains direct pointers to its adjacent elements.
● Computing answers to queries over the DB corresponds to finding suitable paths on the graph structure
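The idea of answering a query by finding a path can be sketched with an adjacency list and a breadth-first search. The node names, relation labels and the `shortest_path` helper are invented for the example and do not reflect the query language of any particular graph database.

```python
from collections import deque

# node -> list of (relation, neighbour); each node points directly at its neighbours
edges = {
    "Anna": [("FRIEND", "Barbara"), ("LIKES", "NoSQL Distilled")],
    "Barbara": [("FRIEND", "Carol")],
    "Carol": [("LIKES", "NoSQL Distilled")],
    "NoSQL Distilled": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the nodes on a shortest path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# "How is Anna connected to Carol?" becomes a path search.
print(shortest_path(edges, "Anna", "Carol"))
```

Because every node holds direct references to its neighbours, each step of the search is a local lookup and needs no join.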