02 - Document-Based and MongoDB
MongoDB automatically
Takes care of balancing data and load across the cluster
Redistributes documents
Routes reads/writes to the correct machines
MongoDB – rich with features
File storage
An easy-to-use protocol for storing large files and metadata
MongoDB – does not sacrifice speed
Document
The basic unit of data
Roughly equivalent to a row in an RDBMS
Collection
Multiple documents compose a collection
Can be thought of as a table (but with a dynamic schema)
Database
Multiple collections compose a database
Instance
A single MongoDB instance can host multiple databases
MongoDB - documents
JSON document
The Identifier _id
Data is stored in documents
Documents are made up of key-value pairs
A key can be compared to a column in an RDBMS
A key is used for querying data from documents
We also need a key to identify a document within a collection: the _id field
If you do not specify its value, MongoDB will generate one for you
This key is immutable
It can be of any data type except an array
Capped Collections
A capped collection is a fixed-size collection that stores documents in their insertion order
When the collection reaches its limit, the oldest documents are removed first (FIFO)
Good for log files
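The FIFO behavior of a capped collection can be modeled in a few lines of plain JavaScript. This is only a sketch of the idea, not how MongoDB implements it; the CappedList class and its max parameter are hypothetical names introduced for illustration.

```javascript
// Hypothetical in-memory model of a capped collection (illustration only).
class CappedList {
  constructor(max) {
    this.max = max;   // maximum number of documents kept
    this.docs = [];   // documents in insertion order
  }
  insert(doc) {
    this.docs.push(doc);
    // Once the limit is exceeded, drop the oldest document (FIFO).
    while (this.docs.length > this.max) this.docs.shift();
  }
}

const log = new CappedList(3);
for (let i = 1; i <= 5; i++) log.insert({ line: i });
console.log(log.docs.map(d => d.line)); // [ 3, 4, 5 ]
```

Only the three most recent entries survive, which is why this layout suits log data.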
Polymorphic Schema
A schema where a collection has documents of different types/schemas
Perform queries on common fields
Perform queries on specific fields
Installation and Configuration
MongoDB is a cross-platform database
A list of all available packages can be found on the MongoDB downloads page: www.mongodb.org/downloads
Choose type of deployment (local, community server)
Example 2
db.createCollection("mycollection", { capped: true, size: 5234182, max: 100 })
The Company Example
Showing Collections
Use 'show collections' to list the collections of the active database
Example
MongoDB CRUD Operations
Create/insert operations
insertOne
insertMany
Read operations
find
Update operations
updateOne
updateMany
replaceOne
Delete operations
deleteOne
deleteMany
Insert a Single Document
Syntax
db.collection.insertOne(doc, { writeConcern: <document> })
insertOne parameters
Parameter      Type      Description
doc            document  A document to insert into the collection
writeConcern   document  (optional) The level of acknowledgment requested from MongoDB for write operations
Return value
Field          Type      Description
acknowledged   boolean   true if the operation ran with write concern, false if write concern was disabled
insertedId     any       The _id value of the inserted document (an ObjectId when generated by MongoDB)
Insert a Single Document
Behavior
If the collection does not exist, it will be created
If the _id field is not specified, it will be added
On error, it throws an exception (writeError, writeConcernError)
Atomicity
All write operations are atomic on the level of a single document
Insert a Single Document
db.products.insertOne({
_id: 10,
"item" : "packing peanuts",
"qty" : 200
})
Insert a Single Document
db.products.insertOne(
{ "item": "envelopes", "qty": 100, type: "Self-Sealing" },
{ writeConcern: { w : "majority", wtimeout : 100 } }
)
Insert Multiple Documents
Syntax
db.collection.insertMany([doc1, doc2, …], { writeConcern: <document>, ordered: <boolean> })
parameters
Parameter        Type                Description
[doc1, doc2, …]  array of documents  An array of documents to be inserted
writeConcern     document            (optional) The level of acknowledgment requested from MongoDB for write operations
ordered          boolean             (optional) true if ordered insertion is required (default: true)
db.products.insertMany( [
{ _id: 10, item: "large box", qty: 20 },
{ _id: 11, item: "small box", qty: 55 },
{ _id: 12, item: "medium box", qty: 30 }
])
Insert Multiple Documents
db.products.insertMany( [
{ _id: 13, item: "envelopes", qty: 60 },
{ _id: 13, item: "stamps", qty: 110 },
{ _id: 14, item: "packing tape", qty: 38 }
])
The duplicate _id raises the following exception:
BulkWriteError({
"writeErrors" : [
{
"index" : 0,
"code" : 11000,
"errmsg" : "E11000 duplicate key error collection: inventory.products index: _id_ dup key: { : 13.0 }",
"op" : {
"_id" : 13,
"item" : "stamps",
"qty" : 110
}
}
],
"writeConcernErrors" : [ ],
"nInserted" : 1,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
Insert Multiple Documents
db.products.insertMany( [
{ _id: 10, item: "large box", qty: 20 },
{ _id: 11, item: "small box", qty: 55 },
{ _id: 11, item: "medium box", qty: 30 },
{ _id: 12, item: "envelope", qty: 100},
{ _id: 13, item: "stamps", qty: 125 },
{ _id: 13, item: "tape", qty: 20},
{ _id: 14, item: "bubble wrap", qty: 30}
], { ordered: false } )
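The difference between ordered and unordered insertion can be sketched without a database. The function below is a hypothetical model (insertManySketch is not a MongoDB or driver API), and it assumes the collection starts out empty: an ordered insert stops at the first duplicate _id, while an unordered insert records the error and continues with the remaining documents.

```javascript
// Hypothetical model of insertMany duplicate-key handling (not the driver API).
function insertManySketch(store, docs, { ordered = true } = {}) {
  const insertedIds = [];
  const errorIndexes = [];
  for (let i = 0; i < docs.length; i++) {
    if (store.has(docs[i]._id)) {
      errorIndexes.push(i);      // duplicate _id: record the failing index
      if (ordered) break;        // ordered: stop at the first error
      continue;                  // unordered: keep going with the rest
    }
    store.set(docs[i]._id, docs[i]);
    insertedIds.push(docs[i]._id);
  }
  return { insertedIds, errorIndexes };
}

const docs = [
  { _id: 10, item: "large box" }, { _id: 11, item: "small box" },
  { _id: 11, item: "medium box" }, { _id: 12, item: "envelope" },
  { _id: 13, item: "stamps" }, { _id: 13, item: "tape" },
  { _id: 14, item: "bubble wrap" }
];
const orderedResult = insertManySketch(new Map(), docs);
const unorderedResult = insertManySketch(new Map(), docs, { ordered: false });
console.log(orderedResult);   // insertedIds [10, 11], errorIndexes [2]
console.log(unorderedResult); // insertedIds [10, 11, 12, 13, 14], errorIndexes [2, 5]
```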
Insert Document(s)
Syntax
db.collection.insert(doc or array of docs, { writeConcern: <document>, ordered: <boolean> })
The Company Example – option 1
The Company Example – option 2
Worker references are embedded in the project document
The Company Example – option 3
Workers are embedded in the project document.
No need for the worker collection
Read Operations
Read operations retrieve documents from a collection.
MongoDB provides the following method for querying a collection for documents
db.collection.find()
You can specify query filters that identify the documents to return
Read Operations - querying
Syntax
db.collection.find(query, projection)
find parameters
Parameter    Type      Description
query        document  (optional) Specifies the selection filter; pass {} or omit it to return all documents
projection   document  (optional) Specifies the fields to return; by default all fields are returned
Nested form
{ field: { nestedfield: <value> } }
Read Operations – the bios collection
You can
Project new fields
Project existing fields with new values
It is done with
Aggregation expressions
Literals
Aggregation variables
Read Operations – specify values of projected fields
Read Operations – aggregation expressions
db.bios.find(
{ },
{
_id: 0,
name: {
$concat: [
{ $ifNull: [ "$name.aka", "$name.first" ] },
" ",
"$name.last"
]
},
birth: 1,
contribs: 1,
awards: { $cond: { if: { $isArray: "$awards" }, then: { $size: "$awards" }, else: 0 } },
…
}
)
db.inventory.insertMany([
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }
]);
Uses the $in operator to return documents where _id equals either 5 or ObjectId(…)
Query – using operators
Name   Description
$eq    Matches values that are equal to a specified value.
$gt    Matches values that are greater than a specified value.
Query – logical operators
Name   Description
$and   Joins query clauses with a logical AND; returns all documents that match the conditions of both clauses.
$not   Inverts the effect of a query expression; returns documents that do not match the query expression.
$nor   Joins query clauses with a logical NOR; returns all documents that fail to match both clauses.
$or    Joins query clauses with a logical OR; returns all documents that match the conditions of either clause.
Query – greater than operator
Query – AND operator
db.inventory.find(
{ $or: [ { status: "A" }, { qty: { $lt: 30 } } ] }
)
db.inventory.find( {
status: "A",
$or: [ { qty: { $lt: 30 } }, { item: /^p/ } ]
})
Match an array
{ <field>: <value> }
Where
<field> is the name of the array
<value> is the exact array to match
db.inventory.insertMany([
{ item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
{ item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
{ item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
{ item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
{ item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }
])
Exact match: queries for all documents where the tags field is an array with exactly two elements, in the specified order
$all operator: finds arrays that contain both elements, without regard to order or other elements in the array
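The two kinds of array match can be compared with a small plain-JavaScript sketch over the tags values listed above. The helper names matchesExactly and containsAll are hypothetical, and the target array ["red", "blank"] is an assumption, since the slides do not show the actual query.

```javascript
// Illustration-only helpers; not part of any MongoDB API.
const tagsOf = {
  journal: ["blank", "red"],
  notebook: ["red", "blank"],
  paper: ["red", "blank", "plain"],
  planner: ["blank", "red"],
  postcard: ["blue"]
};

// Exact match: same elements, same order, same length.
const matchesExactly = (arr, wanted) =>
  arr.length === wanted.length && arr.every((v, i) => v === wanted[i]);

// $all-style match: every wanted element is present, order and extras ignored.
const containsAll = (arr, wanted) => wanted.every(v => arr.includes(v));

const wanted = ["red", "blank"]; // assumed target array
const exact = Object.keys(tagsOf).filter(k => matchesExactly(tagsOf[k], wanted));
const all = Object.keys(tagsOf).filter(k => containsAll(tagsOf[k], wanted));
console.log(exact); // [ 'notebook' ]
console.log(all);   // [ 'journal', 'notebook', 'paper', 'planner' ]
```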
db.inventory.insertMany( [
{ item: "journal", instock: [ { warehouse: "A", qty: 5 }, { warehouse: "C", qty: 15 } ] },
{ item: "notebook", instock: [ { warehouse: "C", qty: 5 } ] },
{ item: "paper", instock: [ { warehouse: "A", qty: 60 }, { warehouse: "B", qty: 15 } ] },
{ item: "planner", instock: [ { warehouse: "A", qty: 40 }, { warehouse: "B", qty: 5 } ] },
{ item: "postcard", instock: [ { warehouse: "B", qty: 15 }, { warehouse: "C", qty: 35 } ] }
]);
Query for a document nested in an array
Selects all documents where an element in the instock array matches the
specified document
db.inventory.find( { "instock": { warehouse: "A", qty: 5 } } )
Query on a field embedded in the array of documents
Selects all documents where the instock array has at least one
embedded document that contains the field qty whose value is less
than or equal to 20
db.inventory.find( { 'instock.qty': { $lte: 20 } } )
Query on a field embedded in the array of documents
Selects all documents where the first element of the instock array is a document that contains the field qty whose value is less than or equal to 20
Returns documents where the awards array contains at least one element
with award=“turing award” and year > 1980
Selects documents where the instock array has at least one embedded
document that contains the qty equal to 5 and warehouse equal to A:
db.inventory.find(
{ "instock": { $elemMatch: { qty: 5, warehouse: "A" } } }
)
Queries for documents where the instock array has at least one embedded document whose qty is greater than 10 and less than or equal to 20
db.inventory.find(
{ "instock": { $elemMatch: { qty: { $gt: 10, $lte: 20 } } } } )
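What $elemMatch adds is that a single array element has to satisfy all conditions at once. The sketch below models this in plain JavaScript over the instock data above; elemMatch is a hypothetical helper, and the second query shape (each condition checked against any element) is shown only for contrast, on the assumption that this is how separate dotted-field conditions behave.

```javascript
const inventory = [
  { item: "journal", instock: [ { warehouse: "A", qty: 5 }, { warehouse: "C", qty: 15 } ] },
  { item: "notebook", instock: [ { warehouse: "C", qty: 5 } ] },
  { item: "paper", instock: [ { warehouse: "A", qty: 60 }, { warehouse: "B", qty: 15 } ] },
  { item: "planner", instock: [ { warehouse: "A", qty: 40 }, { warehouse: "B", qty: 5 } ] },
  { item: "postcard", instock: [ { warehouse: "B", qty: 15 }, { warehouse: "C", qty: 35 } ] }
];

// $elemMatch-style check: a single array element must pass the whole predicate.
const elemMatch = (arr, pred) => arr.some(pred);

const sameElement = inventory
  .filter(d => elemMatch(d.instock, e => e.qty === 5 && e.warehouse === "A"))
  .map(d => d.item);

// Without $elemMatch, each condition may be satisfied by a different element.
const anyElements = inventory
  .filter(d => d.instock.some(e => e.qty === 5) &&
               d.instock.some(e => e.warehouse === "A"))
  .map(d => d.item);

console.log(sameElement); // [ 'journal' ]
console.log(anyElements); // [ 'journal', 'planner' ]
```

The planner document appears only in the second result: it has qty 5 and warehouse A, but in two different array elements.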
The Query Returned Cursor
Sorts the documents first by the age field in descending order and then by the posts field in ascending order
db.users.find({ }).sort( { age : -1, posts: 1 } )
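The ordering produced by a sort specification such as { age: -1, posts: 1 } can be imitated with an ordinary comparator. The sample users below are hypothetical, since the users collection is not shown in the slides.

```javascript
// Hypothetical sample data; the users collection is not shown in the source.
const users = [
  { name: "a", age: 30, posts: 7 },
  { name: "b", age: 41, posts: 2 },
  { name: "c", age: 30, posts: 3 },
  { name: "d", age: 41, posts: 9 }
];

// Compare by age descending (-1), then by posts ascending (1).
const sorted = [...users].sort((x, y) => (y.age - x.age) || (x.posts - y.posts));
console.log(sorted.map(u => u.name)); // [ 'b', 'd', 'c', 'a' ]
```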
Limit the Number of Documents to Return
Update Operations
Update operations modify documents in a collection
MongoDB provides the following methods for updating
updateOne
updateMany
replaceOne
{
operator1: {field1: value1},
operator2: {field2: value2},
…
}
Update Operators – fields
Field operators
Name     Format                                 Description
$inc     { $inc: { <f1>: <v1>, ... } }          f1 = f1 + v1
$min     { $min: { <f1>: <v1>, ... } }          if v1 < f1 then f1 = v1
$max     { $max: { <f1>: <v1>, ... } }          if v1 > f1 then f1 = v1
$set     { $set: { <f1>: <v1>, ... } }          f1 = v1
$mul     { $mul: { <f1>: <v1>, ... } }          f1 = f1 * v1
$unset   { $unset: { <f1>: "", ... } }          deletes f1 from the document
$rename  { $rename: { <f1>: <newname>, ... } }  renames field f1 to <newname>
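To make the semantics in the table concrete, here is a minimal plain-JavaScript sketch that applies these operators to an ordinary object. applyUpdate is a hypothetical helper, it handles top-level fields only, and its treatment of missing fields reflects an assumption about the server's behavior rather than a guarantee.

```javascript
// Illustration-only: applies a subset of field update operators to a plain object.
function applyUpdate(doc, update) {
  const out = { ...doc };
  for (const [op, fields] of Object.entries(update)) {
    for (const [f, v] of Object.entries(fields)) {
      switch (op) {
        case "$inc": out[f] = (out[f] ?? 0) + v; break;
        case "$mul": out[f] = (out[f] ?? 0) * v; break;
        case "$min": if (!(f in out) || v < out[f]) out[f] = v; break;
        case "$max": if (!(f in out) || v > out[f]) out[f] = v; break;
        case "$set": out[f] = v; break;
        case "$unset": delete out[f]; break;
        case "$rename": if (f in out) { out[v] = out[f]; delete out[f]; } break;
        default: throw new Error("unsupported operator: " + op);
      }
    }
  }
  return out;
}

const before = { name: "Central Perk Cafe", violations: 3, score: 10, tmp: 1 };
const after = applyUpdate(before, {
  $inc: { violations: 1 },
  $min: { score: 8 },
  $unset: { tmp: "" },
  $rename: { name: "title" }
});
console.log(after); // { violations: 4, score: 8, title: 'Central Perk Cafe' }
```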
Update Operators – fields
db.restaurant.updateOne(
{ "name" : "Central Perk Cafe" },
{ $set: { "violations" : 3 } }
)
Update Operators – arrays – $ operator
Refers to the first array element that matches the update filter parameter
Syntax
db.collection.updateOne(
{ <array>: value ... },
{ <update operator>: { "<array>.$" : value } }
)
Values in an array
db.students.insert([
{ "_id" : 1, "grades" : [ 85, 80, 80 ] },
{ "_id" : 2, "grades" : [ 88, 90, 92 ] },
{ "_id" : 3, "grades" : [ 85, 100, 90 ] }
])
db.students.updateOne(
{ _id: 1, grades: 80 },
{ $set: { "grades.$" : 82 } }
)
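The effect of the positional $ operator in the update above can be modeled on a plain array. setFirstMatch is a hypothetical helper used for illustration: it changes only the first element equal to the matched value.

```javascript
// Illustration-only model of { $set: { "grades.$": 82 } } with filter { grades: 80 }.
function setFirstMatch(arr, matchValue, newValue) {
  const i = arr.indexOf(matchValue);   // index of the first matching element
  if (i === -1) return arr.slice();    // no match: nothing to update
  const copy = arr.slice();
  copy[i] = newValue;                  // only that one element changes
  return copy;
}

const updatedGrades = setFirstMatch([85, 80, 80], 80, 82);
console.log(updatedGrades); // [ 85, 82, 80 ]
```

The second 80 is left untouched, which is the behavior to expect for the document with _id: 1.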
Update Operators – arrays – $ operator
Documents in an array
{
_id: 4,
grades: [
{ grade: 80, mean: 75, std: 8 },
{ grade: 85, mean: 90, std: 5 },
{ grade: 85, mean: 85, std: 8 }
]
}
db.students.updateOne(
{ _id: 4, "grades.grade": 85 },
{ $set: { "grades.$.std" : 6 } }
)
Update Operators – arrays – $ operator
Embedded Documents Using Multiple Field Matches
{
_id: 4,
grades: [
{ grade: 80, mean: 75, std: 8 },
{ grade: 85, mean: 90, std: 5 },
{ grade: 85, mean: 85, std: 8 }
]
}
db.students.updateOne(
{
_id: 4,
grades: { $elemMatch: { grade: { $lte: 90 }, mean: { $gt: 80 } } }
},
{ $set: { "grades.$.std" : 6 } }
)
Update Operators – arrays – $[ ] operator
Updates all elements of the specified array, in every document that matches the query
Syntax
db.collection.updateMany(
{ <query> },
{ <update operator>: { "<array>.$[]" : value } }
)
Update Operators – arrays – $[ ] operator
db.students.insert([
{ "_id" : 1, "grades" : [ 85, 80, 80 ] },
{ "_id" : 2, "grades" : [ 88, 90, 92 ] },
{ "_id" : 3, "grades" : [ 85, 100, 90 ] }
])
db.students.update(
{ },
{ $inc: { "grades.$[]": 10 } },
{ multi: true }
)
Update Operators – arrays – $[ ] operator
db.results.updateMany(
{ "grades" : { $ne: 100 } },
{ $inc: { "grades.$[]": 10 } }
)
Update Operators – arrays – $[identifier] operator
Updates the array elements that match the arrayFilters conditions
Syntax
db.collection.updateMany(
{ <query conditions> },
{ <update operator>: { "<array>.$[<identifier>]" : value } },
{ arrayFilters: [ { <identifier>: <condition> } ] }
)
{ "_id" : 1, "grades" : [ 95, 92, 90 ] }
{ "_id" : 2, "grades" : [ 98, 100, 102 ] }
{ "_id" : 3, "grades" : [ 95, 110, 100 ] }
db.students.update(
{ },
{ $set: { "grades.$[element]" : 100 } },
{ multi: true,
arrayFilters: [ { "element": { $gte: 100 } } ]
}
)
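The result of this filtered update can be previewed with a plain-JavaScript sketch over the three documents listed above. It only models the outcome (each element that is at least 100 is set to 100) and does not use any MongoDB API.

```javascript
const students = [
  { _id: 1, grades: [95, 92, 90] },
  { _id: 2, grades: [98, 100, 102] },
  { _id: 3, grades: [95, 110, 100] }
];

// Model of $set on "grades.$[element]" with arrayFilters [{ element: { $gte: 100 } }]:
// every element that passes the filter is replaced, the others are left alone.
const capped = students.map(s => ({
  ...s,
  grades: s.grades.map(g => (g >= 100 ? 100 : g))
}));

console.log(capped.map(s => s.grades));
// [ [ 95, 92, 90 ], [ 98, 100, 100 ], [ 95, 100, 100 ] ]
```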
Name       Description
$each      Modifies the $push and $addToSet operators to append multiple items for array updates.
$position  Modifies the $push operator to specify the position in the array to add elements.
$slice     Modifies the $push operator to limit the size of updated arrays.
$sort      Modifies the $push operator to reorder documents stored in an array.
Update Operators – arrays – modifiers
db.students.update(
{ name: "joe" },
{ $push: { scores: { $each: [ 90, 92, 85 ] } } }
)
db.inventory.update(
{ _id: 2 },
{ $addToSet: { tags: { $each: [ "camera", "electronics", "accessories" ] } } }
)
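The difference between $push and $addToSet when combined with $each can be sketched on plain arrays. The helper names and the starting arrays are hypothetical, and the sketch compares primitive values only.

```javascript
// $push with $each: append every value, duplicates included.
const pushEach = (arr, values) => [...arr, ...values];

// $addToSet with $each: append only values not already present.
const addToSetEach = (arr, values) => {
  const out = [...arr];
  for (const v of values) if (!out.includes(v)) out.push(v);
  return out;
};

const pushed = pushEach([90, 70], [90, 92, 85]);
const added = addToSetEach(["electronics", "supplies"], ["camera", "electronics", "accessories"]);
console.log(pushed); // [ 90, 70, 90, 92, 85 ]
console.log(added);  // [ 'electronics', 'supplies', 'camera', 'accessories' ]
```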
Update with an Aggregation Pipeline
Aggregation operations process data records and return computed
results
They group values from multiple documents
Perform a variety of operations on the grouped data
Return a single result
MongoDB provides three ways to perform aggregations:
Aggregation pipeline
Map-reduce function
Single purpose aggregation methods
Aggregation Pipeline
MongoDB’s aggregation framework is modeled on the concept of
data processing pipelines
Aggregation pipeline consists of stages
Each stage transforms the documents as they pass through the pipeline
Stages do not need to produce one output document for every input
document
Some stages may generate new documents and others may filter out
documents
MongoDB provides the aggregate() shell method to run an aggregation pipeline
An aggregation pipeline can also be used in update operations
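The pipeline idea, where each stage transforms the documents it receives and passes the result on, can be sketched in plain JavaScript. The names match, group and runPipeline are hypothetical stand-ins written for this illustration (they are not MongoDB's $match and $group), and the orders data is invented.

```javascript
// Each stage is a function from an array of documents to an array of documents.
const match = pred => docs => docs.filter(pred);
const group = (keyFn, field) => docs => {
  const totals = new Map();
  for (const d of docs) totals.set(keyFn(d), (totals.get(keyFn(d)) ?? 0) + d[field]);
  return [...totals].map(([_id, total]) => ({ _id, total }));
};
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

// Hypothetical sample data.
const orders = [
  { cust: "A", status: "done", amount: 50 },
  { cust: "A", status: "done", amount: 25 },
  { cust: "B", status: "open", amount: 40 },
  { cust: "B", status: "done", amount: 10 }
];

const result = runPipeline(orders, [
  match(d => d.status === "done"),   // filters documents out
  group(d => d.cust, "amount")       // emits one new document per group
]);
console.log(result); // [ { _id: 'A', total: 75 }, { _id: 'B', total: 10 } ]
```

Four input documents become two output documents, which illustrates that a stage need not emit one document per input.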
Using Aggregation Pipeline in Update
Update method syntax
db.collection.updateOne(filter, aggregation pipeline, options)
db.collection.updateOne(filter, [stage1, stage2,…], options)
{ "_id" : 2, "member" : "xyz123", "status" : "A", "points" : 60, comments: [ "reminder: ping me at 100pts", "Some random comment" ], "lastUpdate" : ISODate("2019-01-01T00:00:00Z") }
])
db.members.updateOne(
{ _id: 1 },
[
{ $set: { status: "Modified", comments: [ "$misc1", "$misc2" ], lastUpdate: "$$NOW" } },
{ $unset: [ "misc1", "misc2" ] }
]
)
Using Aggregation Pipeline in Update
The third document _id: 3 is missing the average and grade fields. Using an
aggregation pipeline, you can update the document with the calculated grade
average and letter grade.
db.students3.insert([
{ "_id" : 1, "tests" : [ 95, 92, 90 ], "average" : 92, "grade" : "A", "lastUpdate" : ISODate("2020-01-23T05:18:40.013Z") },
{ "_id" : 2, "tests" : [ 94, 88, 90 ], "average" : 91, "grade" : "A", "lastUpdate" : ISODate("2020-01-23T05:18:40.013Z") },
{ "_id" : 3, "tests" : [ 70, 75, 82 ], "lastUpdate" : ISODate("2019-01-01T00:00:00Z") }
]);
Using Aggregation Pipeline in Update
db.students3.updateOne(
{ _id: 3 },
[
{ $set: { average: { $trunc: [ { $avg: "$tests" }, 0 ] }, lastUpdate: "$$NOW" } },
…
]
)
db.restaurant.replaceOne(
{ "name" : "Central Perk Cafe" },
{ "name" : "Central Pork Cafe", "Borough" : "Manhattan" }
)
Replace a Single Document
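Attempts to replace the document with name: "Pizza Rat's Pizzaria" using upsert: true
Since no document in the collection matches the filter, the replacement document is inserted as a new document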
{ "_id" : 1, "name" : "Central Perk Cafe", "Borough" : "Manhattan", "violations" : 3 },
{ "_id" : 2, "name" : "Rock A Feller Bar and Grill", "Borough" : "Queens", "violations" : 2 },
{ "_id" : 3, "name" : "Empire State Pub", "Borough" : "Brooklyn", "violations" : 0 }
db.restaurant.replaceOne(
{ "name" : "Pizza Rat's Pizzaria" },
{ "_id": 4, "name" : "Pizza Rat's Pizzaria", "Borough" : "Manhattan", "violations" : 8 },
{ upsert: true }
)
Delete Operations
Delete operations remove documents from a collection
MongoDB provides the following methods for deleting
deleteOne
deleteMany
Delete operations target a single collection
All write operations are atomic on the level of a single document
Delete a Single Document
Syntax
db.collection.deleteOne(filter, options)
deleteOne options parameter
Field         Type      Description
writeConcern  document  (optional) A document expressing the write concern
collation     document  (optional) Allows users to specify language-specific rules for string comparison (letter case, …)
db.orders.deleteOne( { "_id" : ObjectId("563237a41a4d68582c2509da") } )
Delete a Single Document
Deletes the first document with expiryts > ISODate("2015-11-01T12:40:15Z")
{
_id: ObjectId("563237a41a4d68582c2509da"),
stock: "Brent Crude Futures",
qty: 250,
type: "buy-limit",
limit: 48.90,
creationts: ISODate("2015-11-01T12:30:15Z"),
expiryts: ISODate("2015-11-01T12:35:15Z"),
client: "Crude Traders Inc."
}