NoSQL Database

The document discusses NoSQL databases and MongoDB. It introduces NoSQL and its categories including key-value stores, document stores, wide-column stores and graph stores. It then focuses on MongoDB as an example of a document store, explaining its data model using JSON-like documents and collections.

Uploaded by

chloegao
Lecture 9

NoSQL Database

IIMT 3601 Database Management


HKU Business School
Instructor: Dr. Shengjun Mao
Agenda
• Introduction to NoSQL
• Categories of NoSQL Databases
• Example: MongoDB
Summary of RDBMS
• RDBMSs have been around for decades
• Data stored in tables
• Schema-based, i.e., structured tables
• Each row (data item) in a table has a primary key that is unique within that table
• Relationships between entities are realized by primary-foreign keys
• Queried using SQL; hence sometimes called SQL databases
Popular RDBMS
Requirements of Today’s Workloads
• Speed and volume
• Large and unstructured data
• Incremental scalability
• Adaptation to changes in data structure

• In particular, web-based applications cause spikes in load
• Storage costs have fallen rapidly
• Hooking an RDBMS up to web-based applications becomes troublesome
NoSQL Data Model
• NoSQL: short for “Not Only SQL”
• A category of data storage and retrieval technologies that are not based on the relational model.
• NoSQL DBMSs provide opportunities for “schema on read” instead of “schema on write”.
• Schema on write: the data model is predefined
• Schema on read: the organization of the data for reporting and analysis is determined at the time the data is used
• NoSQL DBMSs allow “scaling out” instead of “scaling up”
• Scale up = grow capacity by replacing machines with more powerful ones
• Scale out = incrementally grow cluster capacity by adding more machines
• Most come from open-source communities.
• Designed for big data
NoSQL Data Model
• Schema on write vs Schema on read
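The contrast can be sketched in plain JavaScript with no database involved. This is only an illustration: the record shapes and the helper names insertWithSchema and readWithSchema are hypothetical and do not come from any NoSQL product.

```javascript
// Schema on write: the data model is fixed up front and checked at insert time.
const requiredFields = ["id", "name"]; // hypothetical predefined schema
const table = [];

function insertWithSchema(row) {
  for (const field of requiredFields) {
    if (!(field in row)) {
      throw new Error(`missing field: ${field}`);
    }
  }
  table.push(row);
}

// Schema on read: anything is stored; structure is imposed when the data is used.
const store = [];
store.push({ id: 1, name: "Ann" });
store.push({ id: 2, nickname: "Bob", tags: ["new"] }); // different shape, still accepted

function readWithSchema(records) {
  // The reader decides how to interpret each record at query time.
  return records.map((r) => ({ id: r.id, label: r.name ?? r.nickname ?? "(unknown)" }));
}

insertWithSchema({ id: 1, name: "Ann" });
const report = readWithSchema(store);
console.log(report);
```

The trade-off shown here is that the write path stays simple under schema on read, while every reader has to cope with records of different shapes.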
NoSQL Data Model
• Key features:
• Non-relational
• Do not require schema
• Horizontally scalable
• Data are replicated to multiple nodes and can be partitioned
• Down nodes are easily replaced
• No single point of failure
• Cheap, easy to implement (open-source)
• Massive write performance
• Fast key-value access
• …
Benefits of NoSQL
• Elastic Scaling
• RDBMS scale up: bigger load, bigger server
• NoSQL scale out: distribute data across multiple hosts seamlessly
• DBA Specialists
• RDBMSs require highly trained experts to monitor the database
• NoSQL requires less management, with automatic repair and simpler data models
• Big Data
• Huge increase in data volumes; RDBMS capacity and constraints are at their limits
• NoSQL is designed for big data
Benefits of NoSQL
• Flexible data models
• Schema changes in an RDBMS have to be carefully managed
• NoSQL databases are more relaxed about the structure of data
• Database schema changes do not have to be managed as one complicated change unit
• Applications are already written to handle an amorphous schema
• Economics
• RDBMSs rely on expensive proprietary servers to manage data
• NoSQL: clusters of cheap commodity servers manage the data and transaction volumes
• Cost per gigabyte or per transaction/second for NoSQL can be lower than for an RDBMS
Drawbacks of NoSQL
• Support
• RDBMS vendors provide a high level of support to clients
• Stellar reputation
• NoSQL databases are open-source projects with startups supporting them
• Reputation not yet established
• Maturity
• RDBMSs are mature products: stable and dependable
• Also means old: no longer cutting edge or interesting
• NoSQL databases are still implementing their basic feature sets
Drawbacks of NoSQL
• Administration
• RDBMS administrator is a well-defined role
• NoSQL’s goal: no administrator necessary; however, NoSQL still requires effort to maintain
• Lack of Expertise
• Whole workforce of trained and seasoned RDBMS developers
• Still recruiting developers to the NoSQL camp
• Analytics and Business Intelligence
• RDBMSs are designed to support decision-making
• NoSQL is designed to meet the needs of Web 2.0 applications, not for ad hoc queries of the data
• Tools are being developed to address this need
• More flexible to include new and unstructured data
NoSQL Databases

See more at https://db-engines.com/en/ranking


Who are using them?
NoSQL categories
• Key-value stores
• Examples: Redis, Amazon DynamoDB, Microsoft Azure Cosmos DB
• Document stores
• Examples: MongoDB, Amazon DynamoDB
• Wide-column stores
• Examples: Cassandra, HBase, Google Bigtable
• Graph stores
• Examples: Neo4j, Microsoft Azure Cosmos DB
Key-value Stores
• NoSQL databases generally rely on a key-value store.
• Format: key: value
• Examples:
• Business: Key → Value
• twitter.com: tweet id → information about tweet
• amazon.com: item number → information about it
• facebook.com: user id → user profile, photos, etc.
• kayak.com: flight number → information about flight, e.g., availability
• yourbank.com: account number → account balances, transaction histories
Key-value stores
• Data model: collection of Key-value pairs

Example: Redis
• Standard key-value stores
• Values can be strings, lists, sets, hashes etc.
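As a rough illustration of the data model only, a key-value store can be reduced to an in-memory Map. This sketch does not use Redis or its commands; the key names and values are hypothetical.

```javascript
// A key-value store reduced to its essentials: an in-memory Map.
const kv = new Map();

// Values are opaque to the store; here a string, a list, and a hash-like object.
kv.set("user:1001:name", "Ada");
kv.set("user:1001:recent", ["item-7", "item-9"]);
kv.set("flight:CX888", { seatsLeft: 12, gate: "B4" });

// Access is always by key.
const flight = kv.get("flight:CX888");
const missing = kv.get("flight:XX000"); // undefined: no such key

console.log(flight.seatsLeft, missing);
```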
Document-based
• Can model more complex objects
• Data model: collection of documents

• Document: a structured set of data formatted using a standard such as JSON.


• A document has its own structure.
• Contents can be accessed and modified based on that structure
• The document itself is accessed via a “key”
Document-based
• Example: (MongoDB) document
{
  Name: "Jaroslav",
  Address: "Malostranske nám. 25, 118 00 Praha 1",
  Grandchildren: {Claire: "7", Barbara: "6", Magda: "3",
                  Kirsten: "1", Otis: "3", Richard: "1"},
  Phones: ["123-456-7890", "234-567-8963"]
}
Wide column stores
• Tables similar to an RDBMS, but handle semi-structured data
• Each row can have a different column structure
• Data model:
• Collection of Column Families
• Column family = (key, value), where value = set of related columns
• One column family can have variable numbers of columns
• Example: Cassandra
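The data model above can be sketched as a Map of Maps. This only illustrates the "(key, set of related columns)" idea; it is not Cassandra's API, and the family name userProfiles and helper putColumn are hypothetical.

```javascript
// One column family: row key -> set of related columns (a Map of Maps).
const userProfiles = new Map();

function putColumn(family, rowKey, column, value) {
  if (!family.has(rowKey)) family.set(rowKey, new Map());
  family.get(rowKey).set(column, value);
}

putColumn(userProfiles, "u1", "name", "Ann");
putColumn(userProfiles, "u1", "email", "ann@example.com");
putColumn(userProfiles, "u2", "name", "Bob"); // u2 has no email column

// Rows in the same family carry different numbers of columns.
const columnsPerRow = [...userProfiles].map(([key, cols]) => [key, [...cols.keys()]]);
console.log(columnsPerRow);
```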
Graph-based
• Designed for modeling connected data
• Based on graph theory (vertices and edges)
• Data model: (property graph) nodes and relationships
• Nodes have “names” (labels)
• Relationships have “names” (types), with properties
• Nodes and relationships are associated with properties, which are key-value pairs
• Collections of properties associated with each node may vary
• Example: Neo4j
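A property graph can be sketched with two plain arrays. This is an illustration of the data model, not Neo4j's API or query language; the labels, relationship types, and the helper neighbors are hypothetical.

```javascript
// Nodes carry a label and properties; relationships carry a type and properties.
const nodes = [
  { id: 1, label: "Person", props: { name: "Ann" } },
  { id: 2, label: "Person", props: { name: "Bob", age: 30 } }, // property sets may differ
  { id: 3, label: "Company", props: { name: "Acme" } },
];
const relationships = [
  { from: 1, to: 2, type: "KNOWS", props: { since: 2020 } },
  { from: 1, to: 3, type: "WORKS_AT", props: {} },
];

// Follow outgoing relationships of a given type from one node.
function neighbors(nodeId, type) {
  return relationships
    .filter((r) => r.from === nodeId && r.type === type)
    .map((r) => nodes.find((n) => n.id === r.to));
}

console.log(neighbors(1, "KNOWS").map((n) => n.props.name));
```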
Example: MongoDB
• Developed by 10gen
• Founded in 2007
• Document-based, NoSQL database
• Hash-based, schema-less database
• No Data Definition Language
• In practice, this means you can store hashes with any keys and values that you choose
• Keys are a basic data type but in reality stored as strings
• A document identifier (_id) is created for each document; the field name _id is reserved by the system
• Uses BSON format
• Written in C++
• Supports APIs (drivers) in many computer languages
• JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++, Haskell, Erlang

For more details, refer to https://www.mongodb.com/docs/manual/
MongoDB: Hierarchical Objects
• A MongoDB instance may have zero or more ‘databases’
• A database may have zero or more ‘collections’
• A collection may have zero or more ‘documents’
• A document may have one or more ‘fields’
[Figure: hierarchy diagram. A database contains collections; each collection contains documents; each document contains fields.]
MongoDB: Hierarchical Objects
• Document
• BSON format: Binary-encoded object notation (Binary JSON), JSON-like documents
• Enclosed in a pair of curly brackets {}
• Stored as key-value pairs
• Fields are separated from each other by “,”
• An array is stored in square brackets []
• “_id” field is required in the database
• Can have embedded documents

{
  name: "travis",
  salary: 30000,
  designation: "Computer Scientist",
  teams: [ "front-end", "database" ]
}
MongoDB: Hierarchical Objects
• Collection
• A set of documents that are intended to be stored together
• Each document in the collection may have different structures.
• The only requirement is that _id is unique within the collection

{
  _id : <ObjectId2>,
  username : "John Backus",
  birth : ISODate("1924-12-03T05:00:00Z")
}
MongoDB Concepts to RDBMS

RDBMS         MongoDB
Database      Database
Table, View   Collection
Row           Document (BSON)
Column        Field
Index         Index
Join          Embedded Document

• A collection is not strict about what it stores (schema-less)
• Hierarchy is evident in the design (embedded documents)
MongoDB Processes and Configuration
• mongod – Database instance
• Replica set consists of multiple mongod servers
• Replica set members are mirrors of each other
• One is primary
• Others are secondary
• mongos – Sharding processes
• Analogous to a database router (or work allocator)
• Processes all requests
• Decides how many and which mongods should receive the query
• mongos collates the results and sends them back to the client
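The router role described above can be sketched as follows. This is a toy model, not MongoDB's actual routing: the three arrays stand in for mongod shards, and the placement rule (userId modulo shard count) and the names shardFor and routeFind are assumptions made for illustration.

```javascript
// Three stand-in "mongod" shards, each holding part of the collection.
const shards = [[], [], []];

// Hypothetical placement rule: shard chosen by userId modulo shard count.
const shardFor = (userId) => userId % shards.length;

function insert(doc) {
  shards[shardFor(doc.userId)].push(doc);
}

// The router: target one shard when the shard key is known, otherwise ask all.
function routeFind(query) {
  const targets = query.userId !== undefined
    ? [shards[shardFor(query.userId)]]
    : shards;
  const matches = (doc) => Object.entries(query).every(([k, v]) => doc[k] === v);
  // Collate the partial results into one answer for the client.
  return { shardsAsked: targets.length, docs: targets.flatMap((s) => s.filter(matches)) };
}

[{ userId: 1, city: "HK" }, { userId: 2, city: "HK" }, { userId: 4, city: "SZ" }].forEach(insert);

console.log(routeFind({ userId: 4 }));  // targeted query
console.log(routeFind({ city: "HK" })); // broadcast query
```

A query that includes the placement key touches one shard; a query without it has to be sent to every shard and merged.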

MongoDB Processes and Configuration
• mongosh – MongoDB Shell, an interactive shell (a client)
• Fully functional JavaScript and Node.js environment for interacting with a MongoDB deployment

Querying MongoDB
• Basic CRUD Operations
• Create
• db.collection.insertOne(<document>)
• db.collection.insertMany(<documents>)
• Read
• db.collection.find(<query>, <projection>)
• Update
• db.collection.updateOne(<query>, <update>, <options>)
• db.collection.updateMany(<query>, <update>, <options>)
• Delete
• db.collection.deleteMany(<query>)
Create Operations
• db.collection specifies the collection or the ‘table’ to store the document
• db.collection.insertOne( <document> )
• db.collection.insertMany( [ <document1>, <document2>, ... ] )
• Omit the _id field to have MongoDB generate a unique key
• Example:
• db.example.insertOne( {type: "screwdriver", quantity: 15 })
• db.example.insertOne({_id:10, type: "hammer", quantity: 1 })
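The _id behaviour can be imitated in plain JavaScript without a server. This is a simplified stand-in, not the MongoDB driver: the local insertOne function and the counter-based "auto-N" ids are assumptions for illustration (MongoDB itself generates ObjectId values).

```javascript
// Stand-in collection keyed by _id; counter-based ids are a simplification
// (MongoDB itself generates ObjectId values).
const collection = new Map();
let nextId = 1;

function insertOne(doc) {
  const _id = doc._id ?? `auto-${nextId++}`;
  if (collection.has(_id)) throw new Error(`duplicate _id: ${_id}`);
  collection.set(_id, { ...doc, _id });
  return { insertedId: _id };
}

const a = insertOne({ type: "screwdriver", quantity: 15 });   // _id generated
const b = insertOne({ _id: 10, type: "hammer", quantity: 1 }); // _id supplied
console.log(a.insertedId, b.insertedId);
```

Inserting a second document with _id: 10 into this stand-in throws, which mirrors the uniqueness requirement on _id within a collection.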

Create Operations
• Create and insert data in collection “inventory”
db.inventory.insertMany([
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }
]);
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Provides functionality similar to the SELECT command
• <query> corresponds to the WHERE condition; <projection> corresponds to the fields in the SELECT list
• Empty doc parameters: db.collection.find({}) returns all data (SELECT * FROM collection)
• Query document
• Equality condition example { <field1>: <value1>, ... }
• Example: SELECT * FROM inventory WHERE status = 'D'
• db.inventory.find( { status: "D" } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify conditions: {<field1>:{<operator1>: <value1> }, ... }
• Example: SELECT * FROM inventory WHERE status IN ("A", "D")
• db.inventory.find( { status: { $in: [ "A", "D" ] } } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify AND Conditions {<field1>:<value1>, <field2>:<value2>... }
• Example: SELECT * FROM inventory WHERE status = "A" AND qty < 30
• db.inventory.find( { status: "A", qty: { $lt: 30 } } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Query Operators
• Specify OR Conditions { $or: [ <condition1>, <condition2>, ... ] }
• Example: SELECT * FROM inventory WHERE status = "A" OR qty < 30
• db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• Project Fields: <projection doc>
• <field>: 1 to include a field in the returned documents
• <field>: 0 to exclude a field in the returned documents
• Example: SELECT item, qty, size FROM inventory WHERE status = 'D';
• db.inventory.find( { status: "D" }, { item: 1, qty: 1, size: 1 } );
• Note: _id is returned by default. To exclude it from the results, set _id: 0 in the projection doc.
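The inclusion rule for projections, including the default treatment of _id, can be sketched in plain JavaScript. The helper project is hypothetical and handles only inclusion-style projections (fields marked 1, plus an optional _id: 0); it is not the server's implementation.

```javascript
// Inclusion-style projection: keep fields marked 1, plus _id unless it is set to 0.
function project(doc, projection) {
  const out = {};
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const doc = { _id: 3, item: "paper", qty: 100, status: "D" };
console.log(project(doc, { item: 1, qty: 1 }));         // _id kept by default
console.log(project(doc, { item: 1, qty: 1, _id: 0 })); // _id suppressed
```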
Read Operations
• db.collection.find( <query doc>, <projection doc> )
• sort(<order doc>) is chained onto find() to sort the result
• <field>: 1 sorts the result in ascending order
• <field>: -1 sorts the result in descending order
• Example: SELECT _id, item, qty, size FROM inventory WHERE status = 'D' ORDER BY qty;
• db.inventory.find( { status: "D" }, { item: 1, qty: 1, size: 1 } ).sort( { qty: 1 } );
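The meaning of the order document can be sketched with an ordinary array sort. The helper sortDocs is hypothetical; it compares plain numbers or strings only and ignores MongoDB's rules for comparing mixed types.

```javascript
// Sort documents by the fields in an order document: 1 ascending, -1 descending.
function sortDocs(docs, order) {
  const keys = Object.entries(order);
  return [...docs].sort((a, b) => {
    for (const [field, dir] of keys) {
      if (a[field] < b[field]) return -dir;
      if (a[field] > b[field]) return dir;
    }
    return 0;
  });
}

const rows = [
  { item: "paper", qty: 100 },
  { item: "planner", qty: 75 },
];
console.log(sortDocs(rows, { qty: 1 }).map((d) => d.item));
console.log(sortDocs(rows, { qty: -1 }).map((d) => d.item));
```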
Query Operators
Name        Description
$eq         Matches values that are equal to a specified value
$gt, $gte   Matches values that are greater than (or equal to) a specified value
$lt, $lte   Matches values that are less than (or equal to) a specified value
$ne         Matches values that are not equal to a specified value
$in         Matches any of the values specified in an array
$nin        Matches none of the values specified in an array
$or         Joins query clauses with a logical OR; returns all documents that match either clause
$and        Joins query clauses with a logical AND
$not        Inverts the effect of a query expression
$exists     Matches documents that have a specified field
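To make the operator semantics concrete, here is a small matcher in plain JavaScript that evaluates a query document against in-memory objects. It is a sketch under stated assumptions, not how MongoDB evaluates queries: matches, find, and fieldOps are hypothetical names, $not is left out, and array fields, nested paths, and type ordering are ignored. The sample data is a trimmed copy of the inventory collection.

```javascript
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "planner", qty: 75, status: "D" },
  { item: "postcard", qty: 45, status: "A" },
];

// Comparison operators applied to a single field value.
const fieldOps = {
  $eq: (v, arg) => v === arg,
  $ne: (v, arg) => v !== arg,
  $gt: (v, arg) => v > arg,
  $gte: (v, arg) => v >= arg,
  $lt: (v, arg) => v < arg,
  $lte: (v, arg) => v <= arg,
  $in: (v, arg) => arg.includes(v),
  $nin: (v, arg) => !arg.includes(v),
  $exists: (v, arg) => (v !== undefined) === arg,
};

// True for conditions such as { $lt: 30 }, false for plain values.
const isOperatorDoc = (c) =>
  c !== null && typeof c === "object" && !Array.isArray(c) &&
  Object.keys(c).length > 0 && Object.keys(c).every((k) => k.startsWith("$"));

function matches(doc, query) {
  // Top-level entries are implicitly ANDed together.
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$or") return cond.some((q) => matches(doc, q));
    if (key === "$and") return cond.every((q) => matches(doc, q));
    if (isOperatorDoc(cond)) {
      return Object.entries(cond).every(([op, arg]) => fieldOps[op](doc[key], arg));
    }
    return doc[key] === cond; // plain equality
  });
}

const find = (docs, query) => docs.filter((d) => matches(d, query));

console.log(find(inventory, { status: "A", qty: { $lt: 30 } }).map((d) => d.item));
console.log(find(inventory, { $or: [{ status: "A" }, { qty: { $lt: 30 } }] }).length);
```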

Update Operations
• db.collection.updateOne( <query>, <update>, <options> )
• Updates the first document that satisfies the query conditions
• <query doc> is the same as the query doc in find
• <update doc>
• The $set operator updates the values of the specified fields.

Update Operations
• Example: for the first “paper” item, set size.uom to “cm” and status to “P”, and add a lastModified field to record when it was updated.
• $currentDate operator sets the value of a field to the current date
db.inventory.updateOne(
{ item: "paper" },
{
$set: { "size.uom": "cm", status: "P" },
$currentDate: { lastModified: true }
}
)
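What this update does to a single document can be sketched in plain JavaScript. The helpers setPath and applyUpdate are hypothetical and cover only $set with dotted field names and $currentDate with the value true; they are not the server's implementation.

```javascript
// Set a value at a dotted path such as "size.uom", creating nested objects as needed.
function setPath(doc, path, value) {
  const parts = path.split(".");
  let target = doc;
  for (const part of parts.slice(0, -1)) {
    if (typeof target[part] !== "object" || target[part] === null) target[part] = {};
    target = target[part];
  }
  target[parts[parts.length - 1]] = value;
}

// Apply a reduced update document: only $set and $currentDate: { field: true }.
function applyUpdate(doc, update) {
  for (const [path, value] of Object.entries(update.$set ?? {})) setPath(doc, path, value);
  for (const [path, flag] of Object.entries(update.$currentDate ?? {})) {
    if (flag === true) setPath(doc, path, new Date());
  }
  return doc;
}

const paper = { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" };
applyUpdate(paper, {
  $set: { "size.uom": "cm", status: "P" },
  $currentDate: { lastModified: true },
});
console.log(paper.size, paper.status, paper.lastModified instanceof Date);
```

Note that "size.uom" changes one field inside the embedded size document and leaves h and w untouched.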

Update Operations
• db.collection.updateMany( <query>, <update>, <options> )
• Updates all documents that satisfy the query conditions
• Example: for items with fewer than 50 in stock, set size.uom to “in” and status to “P”
db.inventory.updateMany(
{ "qty": { $lt: 50 } },
{
$set: { "size.uom": "in", status: "P" },
$currentDate: { lastModified: true }
}
)

Update Operations
• db.collection.updateMany( <query>, <update>, <options> )
•Example
• UPDATE inventory SET status = 'P' WHERE qty < 50;
db.inventory.updateMany(
{ "qty": { $lt: 50 } },
{
$set: {status: "P" }
}
)

Delete Operations
• db.collection.deleteMany( <query> )
• Removes all documents that match the filter from a collection
• <query> is the same as find query
• Example: DELETE FROM inventory WHERE status = 'A';
db.inventory.deleteMany({ status : "A" })

• Example: DELETE FROM inventory;


db.inventory.deleteMany({})
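The effect of the two calls above can be imitated on an in-memory array. The local deleteMany function is a hypothetical stand-in that supports only flat equality filters; the deletedCount field mirrors the count reported by the real operation.

```javascript
// Remove every document matching a flat equality filter; return how many were removed.
function deleteMany(docs, filter) {
  const isMatch = (d) => Object.entries(filter).every(([k, v]) => d[k] === v);
  const kept = docs.filter((d) => !isMatch(d));
  const deletedCount = docs.length - kept.length;
  docs.length = 0;
  docs.push(...kept);
  return { deletedCount };
}

const items = [
  { item: "journal", status: "A" },
  { item: "paper", status: "D" },
  { item: "postcard", status: "A" },
];
const first = deleteMany(items, { status: "A" }); // like DELETE ... WHERE status = 'A'
const second = deleteMany(items, {});             // empty filter matches everything left
console.log(first, second, items.length);
```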

Announcement
• Final Exam
• Date, time & venue: next lecture session
• 9:30 – 11:30am, April 26
• MBG07, this lecture room
• Written exam
• Closed book, closed notes
• Pencils, erasers, pens
• Assignment 2
• Due today, April 19, 11:59pm.
• SFTL: Student Feedback on Teaching and Learning
• Please complete the SFTL online form at http://sftl.hku.hk/ before May 5.
• Submit a screenshot to Moodle when finished.
• It counts once toward class participation.
• The End
• Thanks