0% found this document useful (0 votes)
15 views

advanced-developer-student-workbook

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views

advanced-developer-student-workbook

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 30

MongoDB Advanced Developer

MongoDB Advanced Developer


Training
Release 3.4

MongoDB, Inc.

Jun 28, 2017

Contents
1 Advanced Schema Design 2
1.1 Case Study: Time Series Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Case Study: Content Management System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Case Study: Social Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Case Study: Shopping Cart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Lab: Data Model for an E-Commerce Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2 Application Engineering 22
2.1 MongoMart Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2 Java Driver Labs (MongoMart) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Python Driver Labs (MongoMart) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

1
1 Advanced Schema Design

Case Study: Time Series Data (page 2) Case Study: Time Series Data
Case Study: Content Management System (page 5) Case Study: CMS
Case Study: Social Network (page 10) Case Study: Social Network
Case Study: Shopping Cart (page 15) Case Study: Shopping Cart
Lab: Data Model for an E-Commerce Site (page 20) Schema design group exercise

1.1 Case Study: Time Series Data

Learning Objectives

Upon completing this module, students should understand:


• Various methods for effectively storing time series data in MongoDB
• Trade-offs in methods to store time series data

Time Series Use Cases

• Atlas/Cloud Manager/Ops Manager pre-record a lot of stats in time series fields

Building a Database Monitoring Tool

• Monitor hundreds of thousands of database servers


• Ingest metrics every 1-2 seconds
• Scale the system as new database servers are added
• Provide real-time graphs and charts to users

Potential Relational Design

RDBMS row for client “1234”, recording 50k database operations, at 2015-05-29 (23:06:37):

"clientid" (integer): 1234


"metric (varchar): "op_counter"
"value" (double): 50000
"timestamp" (datetime): 2015-05-29T23:06:37.000Z

2
Translating the Relational Design to MongoDB Documents

RDBMS Row for client “1234”, recording 50k database operations, at 2015-05-29 (23:06:37):

{
"clientid": 1234,
"metric": "op_counter",
"value": 50000,
"timestamp": ISODate("2015-05-29T23:06:37.000Z")
}

Problems With This Design

• Aggregations become slower over time, as database becomes larger


• Asynchronous aggregation jobs won’t provide real-time data
• We aren’t taking advantage of other MongoDB data types

A Better Design for a Document Database

Storing one document per hour (1 minute granularity):

{
"clientid" : 1234,
"timestamp": ISODate("2015-05-29T23:06:00.000Z"),
"metric": "op_counter",
"values": {
0: 0,
...
37: 50000,
...
59: 2000000
}
}

Performing Updates

Update the exact minute in the hour where the op_counter was recorded:

> db.metrics_by_minute.updateOne( {
"clientid" : 1234,
"timestamp": ISODate("2015-05-29T23:06:00.000Z"),
"metric": "op_counter"},
{ $set : { "values.37" : 50000 } })

3
Performing Updates By Incrementing Counters

Increment the counter for the exact minute in the hour where the op_counter metric was recorded:

> db.metrics_by_minute.updateOne( {
"clientid" : 1234,
"timestamp": ISODate("2015-05-29T23:06:00.000Z"),
"metric": "insert"},
{ $inc : { "values.37" : 50000 } })

Displaying Real-time Charts

Metrics with 1 minute granularity for the past 24 hours (24 documents):

> db.metrics_by_minute.find( {
"clientid" : 1234,
"metric": "insert"})
.sort ({ "timestamp" : -1 })
.limit(24)

Condensing a Day’s Worth of Metric Data Into a Single Document

With one minute granularity, we can record a day’s worth of data and update it efficiently with the following structure
(values.<HOUR_IN_DAY>.<MINUTE_IN_HOUR>):

{
"clientid" : 1234,
"timestamp": ISODate("2015-05-29T00:00:00.000Z"),
"metric": "insert",
"values": {
"0": { 0: 123, 1: 345, ..., 59: 123},
...
"23": { 0: 123, 1: 345, ..., 59: 123}
}
}

Considerations

• Document structure depends on the use case


• Arrays can be used in place of embedded documents
• Avoid growing documents (and document moves) by pre-allocating blank values

4
Class Exercise

Look through some charts in MongoDB’s Cloud Manager, how would you represent the schema for those charts,
considering:
• 1 minute granularity for 48 hours
• 5 minute granularity for 48 hours
• 1 hour granularity for 2 months
• 1 day granularity forever
• Expiring data
• Rolling up data
• Queries for charts

1.2 Case Study: Content Management System

Learning Objectives

Upon completing this module, students should understand:


• Various methods for effectively designing the schema for a Content Management System (CMS) in MongoDB
• Optimizations to the schema, and their tradeoffs

Building a CMS with MongoDB

• CMS stands for Content Management System.


• nytimes.com1 , cnn.com2 , and huffingtonpost.com3 are good examples to explore.
• For the purposes of this case study, let’s use any article page from huffingtonpost.com4 .

Building a CMS in a Relational Database

There are many tables for this example, with multiple queries required for every page load.
Potential tables
• article
• author
• comment
• tag
• link_article_tag
• link_article_article (related articles)
• etc.
1 http://nytimes.com
2 http://cnn.com
3 http://huffingtonpost.com
4 http://huffingtonpost.com

5
Building a CMS in MongoDB

{
"_id" : 334456,
"slug" : "/apple-reports-second-quarter-revenue",
"headline" : "Apple Reported Second Quarter Revenue Today",
"date" : ISODate("2015-03-24T22:35:21.908Z"),
"author" : {
"name" : "Bob Walker",
"title" : "Lead Business Editor"
},
"copy" : "Apple beat Wall St expectations by reporting ...",
"tags" : [
"AAPL", "Earnings", "Cupertino"
],
"comments" : [
{ "name" : "Frank", "comment" : "Great Story", "date" : ISODate(...) },
{ "name" : "Wendy", "comment" : "+1", "date" : ISODate(...) }
]
}

Benefits of the Relational Design

With Normalized Data:


• Updates to author information are inexpensive
• Updates to tag names are inexpensive

Benefits of the Design with MongoDB

• Much faster reads


• One query to load a page
• The relational model would require multiple queries and/or many joins.

Every System has Tradeoffs

• Relational design will provide more efficient writes for some data.
• MongoDB design will provide efficient reads for common query patterns.
• A typical CMS may see 1000 reads (or more) for every article created (write).

6
Optimizations

• Optimizing comments
– What happens when an article has one million comments?
• Include more information associated with each tag
• Include stock price information with each article
• Fields specific to an article type

Optimizing Comments Option 1

Changes:
• Include only the last N comments in the “main” document.
• Put all other comments into a separate collection
– One document per comment
Considerations:
• How many comments are shown on the first page of an article?
– This example assumes 10.
• What percentage of users click to read more comments?

Optimizing Comments Option 1: Example

{
"_id" : 334456,
"slug" : "/apple-reports-second-quarter-revenue",
"headline" : "Apple Reported Second Quarter Revenue Today",
...
"last_10_comments" : [
{ "name" : "Frank", "comment" : "Great Story", "date" : ISODate() },
{ "name" : "Wendy",
"comment" : "When can I buy an Apple Watch?",
"date" : ISODate() }
]
}

7
Optimizing Comments Option 1

Considerations:
• Adding a new comment requires writing to two collections
• If the 2nd write fails, that’s a problem.

> db.blog.updateOne(
{ "_id" : 334456 },
{ $push: {
"comments": {
$each: [ {
"name" : "Frank",
"comment" : "Great Story",
"date" : ISODate()
} ],
$sort: { date: -1 },
$slice: 10 } } } )
> db.comments.insertOne( { "article_id" : 334456, name" : "Frank",
"comment" : "Great Story", "date" : ISODate() })

Optimizing Comments Option 2

Changes:
• Use a separate collection for comments, one document per comment.
Considerations:
• Now every page load will require at least 2 queries
• But adding new comments is less expensive than for Option 1.
– And adding a new comment is an atomic operation

> db.comments.insertOne( { "article_id" : 334456, name" : "Frank",


"comment" : "Great Story", "date" : ISODate() })

Include More Information With Each Tag

Changes:
• Make each tag a document with multiple fields.

{
"_id" : "/apple-reports-second-quarter-revenue",
...
"tags" : [
{ "type" : "ticker", "label" : "AAPL" },
{ "type" : "financials", "label" : "Earnings" },
{ "type" : "location", "label" : "Cupertino" }
]
}

8
Include More Information With Each Tag

Considerations:
• $elemMatch is now important for queries

> db.article.find( {
"tags" : {
"$elemMatch" : {
"type" : "financials",
"label" : "Earnings"
}
}
} )

Include Stock Price Information With Each Article

• Maintain the latest stock price in a separate collection


General Rule:
• Don’t de-normalize data that changes frequently!

Fields Specific to an Article Type

Change:
• Fields specific to an article are added to the document.

{
"_id" : 334456,
...
"executive_profile" : {
"name" : "Tim Cook",
"age" : 54,
"hometown" : {
"city" : "Mobile",
"state" : "AL"
},
"photo_url" : "http://..."
}
}

9
Class Exercise 1

Design a CMS similar to the above example, but with the following additional requirements:
• Articles may be in one of three states: “draft”, “copy edit”, “final”
• History of articles as they move between states must be captured, as well as comments by the person moving
the article to a different state
• Within each state, every article must be versioned. If there is a problem, the editor can quickly revert to a
previous version.

Class Exercise 2

• Consult NYTimes, CNN, and huff post for some ideas about other types of views we might want to support.
• How would we support these views?
• Would we require other document types?

Class Exercise 3 (Sharding)

• Consider a production deployment of our CMS.


• First, what should our shard key be?
• Second, assuming our Primary servers are distributed across multiple data centers in different regions, how
might we shard our articles collection to reduce latency?

1.3 Case Study: Social Network

Learning Objectives

Upon completing this module, students should understand:


• Design considerations for building a social network with MongoDB
• Maintaining relationships between users
• Creating a feed service (similar to Facebook’s newsfeed)

10
Design Considerations

• User relationships (followers, followees)


• Newsfeed requirements

User Relationships

What are the problems with the following approach?

db.users.find()
{
"_id" : "bigbird",
"fullname" : "Big Bird",
"followers" : [ "oscar", "elmo"],
"following" : [ "elmo", "bert"],
...
}

User Relationships

Relationships must be split into separate documents:


• This will provide performance benefits.
• Other motivations:
– Some users (e.g., celebrities) will have millions of followers.

* Embedding a “followers” array would literally break the app: documents are limited to 16 MB.
– Different types of relationships may have different fields and requirements.

User Relationships

> db.followers.find()
{ "_id" : ObjectId(), "user" : "bigbird", "following" : "elmo" }
{ "_id" : ObjectId(), "user" : "bigbird", "following" : "bert" }
{ "_id" : ObjectId(), "user" : "oscar", "following" : "bigbird" }
{ "_id" : ObjectId(), "user" : "elmo”", "following" : "bigbird" }

11
Improving User Relationships

Now meta-data about the relationship can be added:

> db.followers.find()
{
"_id" : ObjectId(),
"user" : "bigbird",
"following" : "elmo",
"group" : "work",
"follow_start_date" : ISODate("2015-05-19T06:01:17.171Z")
}

Counting User Relationships

• Counts across a large number of documents may be slow


– Option: maintain an active count in the user profile
• An active count of followers and folowees will be more expensive for creating relationships
– Requires an update to both user documents (plus a relationship document) each time a relationship is
changed
– For a read-heavy system, this cost may be worth paying

Counting User Relationships

> db.users.find()
{
"_id" : "bigbird",
"fullname" : "Big Bird",
"followers" : 2,
"following" : 2,
...
}

User Relationship Traversal

• Index needed on (followers.user, followers.following)


• For reverse lookups, index needed on (followers.following, followers.user)
• Covered queries should be used in graph lookups (via projection)
• May also want to maintain two separate collections: followers, followees

12
User Relationships

• We’ve created a simple, scalable model for storing user relationships

Building a Feed Service

• Newsfeed similar to Facebook


• Show latest posts by followed users
• Newsfeed queries must be extremely fast

Feed Service Design Considerations

Two options:
• Fanout on Read
• Fanout on Write

Fanout on Read

• Newsfeed is generated in real-time, when page is loaded


• Simple to implement
• Space efficient
• Reads can be very expensive (e.g. if you follow 1 million users)

When to Use Fanout on Read

• Newsfeed is viewed less often than posts are made


• Small scale system, users follow few people
• Historic timeline information is commonly viewed

Fanout on Write

• Modify every users timeline when a new post or activity is created by a person they follow
• Extremely fast page loads
• Optimized for case where there are far less posts than feed views
• Scales better for large systems than fanout on read
• Feed updates can be performed asynchronously

13
Fanout on Write

> db.feed.find({"user : "bigbird"}).sort({"date" : -1})


{
"_id" : ObjectId(),
"user" : "bigbird",
"date" : ISODate("2015-05-19T06:01:17.171Z"),
"content" : {
"user" : "cookiemonster",
"post" : "I love cookies!"
}
}

Fanout on Write

• What happens when Cookie Monster creates a new post for his 1 million followers?
• What happens when posts are edited or updated?

Fanout on Write (Non-embedded content)

> db.feed.find({"user : "bigbird"}).sort({"date" : -1})


{
"_id" : ObjectId(),
"user" : "bigbird",
"date" : ISODate("2015-05-19T06:01:17.171Z"),
"content_id" : ObjectId("...de1")
}

> db.content.find({"_id" : ObjectId("...de1")"})

14
Fanout on Write Considerations

• Content can be embedded or referenced


• Feed items may be organized in buckets per user per day
• Feed items can also be bucketed in batches (such as 100 posts per document)

Fanout on Write

• When the following are true:


– The number of newsfeed views are greater than content posts
– The number of users to the system is large
• Fanout on write provides an efficient way to maintain fast performance as the system becomes large.

Class Exercise

Look through a Twitter timeline. E.g. http://twitter.com/mongodb


• Create an example document for a user’s tweets
• Design a partial schema just for for a Twitter user’s newsfeed (including retweets, favorites, and replies)
• Build the queries for the user’s newsfeed
– Initial query
– Later query if they scroll down to see more
• What indexes would the user need?
• Don’t worry about creating the newsfeed documents; assume an application is creating entries, and just worry
about displaying it.

1.4 Case Study: Shopping Cart

Learning Objectives

Upon completing this module, students should understand:


• Creating and working with a shopping cart data model in MongoDB
• Trade offs in shopping cart data models

15
Shopping Cart Requirements

• Shopping cart size will stay relatively small (less than 100 items in most cases)
• Expire the shopping cart after 20 minutes of inactivity

Advantages of using MongoDB for a Shopping Cart

• One simple document per cart (note: optimization for large carts below)
• Sharding to partition workloads during high traffic periods
• Dynamic schema for specific styles/values of an item in a cart (e.g. “Red Sweater”, “17 Inch MacBook Pro
20GB RAM”)

Modeling the Shopping Cart

{
"_id": ObjectId("55932ef370c32e23e6552ced"),
"userid": 1234,
"last_activity": ISODate(...),
"status" : "active",
"items" : [
{
"itemid": 4567,
"title": "Milk",
"price": 5.00,
"quantity": 1,
"img_url": "milk.jpg"
},
{
"itemid": 8910,
"title": "Eggs",
"price": 3.00,
"quantity": 1,
"img_url": "eggs.jpg"
} ]
}

Modeling the Shopping Cart

• Denormalize item information we need for displaying the cart: item name, image, price, etc.
• Denormalizing item information saves an additional query to the item collection
• Use the “last_activity” field for determining when to expire carts
• All operations to the “cart” document are atomic, e.g. adding/removing items, or changing the cart status to
“processing”

16
Add an Item to a User’s Cart

db.cart.updateOne({
"_id": ObjectId("55932ef370c32e23e6552ced")
}, {
$push : {
"items" : {
"itemid": 1357,
"title": "Bread",
"price": 2.00,
"quantity": 1,
"img_url": "bread.jpg"
}
},
$set : {
"last_activity" : ISODate()
}
}
)

Updating an Item in a Cart

• Change the number of eggs in a user’s cart to 5


• The positional $ operator identifies an element in an array to update without explicitly specifying the position
of the element in the array
• Make sure to update the “last_activity” field

db.cart.updateOne({
"_id": ObjectId("55932ef370c32e23e6552ced"),
"items.itemid" : 4567
}, {
$set : {
"items.$.quantity" : 5,
"last_activity" : ISODate()
}
})

Remove an Item from a User’s Cart

db.cart.updateOne({
"_id": ObjectId("55932ef370c32e23e6552ced")
}, {
$pull : {
"items" : { "itemid" : 4567 }
},
$set : {
"last_activity" : ISODate()
}
})

17
Tracking Inventory for an Item

• Use a “item” collection to store more detailed item information


• “item” collection will also maintain a “quantity” field
• “item” collection may also maintain a “quantity_in_carts” field
• When an item is added or removed from a user’s cart, the “quantity_in_carts” field should be incremented or
decremented

{
"_id": 8910,
"img_url": "eggs.jpg",
"quantity" : 2000,
"quantity_in_carts" : 3
...
}

Tracking Inventory for a item

Increment “quantity_in_carts”

db.item.updateOne(
{ "_id": 8910 },
{ $inc : { "quantity_in_carts" : 1 } } )

Decrement “quantity_in_carts”

db.item.updateOne(
{ "_id": 8910 },
{ $inc : { "quantity_in_carts" : -1 } } )

Using aggregate() to Determine Number of Items Across User Carts

• Aggregate can be used to query for number of items across all user carts

// Ensure there is an index on items.itemid


db.cart.createIndex({"items.itemid" : 1})

db.cart.aggregate(
{ $match : { "items.itemid" : 8910 } },
{ $unwind : "$items" },
{ $group : {
"_id" : "$items.itemid",
"amount" : { "$sum" : "$items.quantity" }
} }
)

18
Expiring the Shopping Cart

Three options:
• Use a background process to expire items in the cart collection and update the “quantity_in_carts” field.
• Create a TTL index on “last_activity” field in “cart” collection. Remove the “quantity_in_carts” field from the
item document and create a query for determining the number of items currently allocated to user carts
• Create a background process to change the “status” field of expired carts to “inactive”

Shopping Cart Variations

• Efficiently store very large shopping carts (1000+ items per cart)
• Expire items individually

Efficiently Storing Large Shopping Carts

• The array used for the “items” field will lead to performance degradation as the array becomes very large
• Split cart into “cart” and “cart_item” collections

Efficiently Storing Large Shopping Carts: “cart” Collection

• All information for the cart or order (excluding items)

{
"_id": ObjectId("55932ef370c32e23e6552ced"),
"userid": 1234,
"last_activity": ISODate(...),
"status" : "active",
}

Efficiently Storing Large Shopping Carts: “cart_item” Collection

• Include “cartid” reference


• Index required on “cartid” for efficient queries

{
"_id" : ObjectId("55932f6670c32e23f119073c"),
"cartid" : ObjectId("55932ef370c32e23e6552ced"),
"itemid": 1357,
"title": "Bread",
"price": 2.00,
"quantity": 1,
"img_url": "bread.jpg",
"date_added" : ISODate(...)
}

19
Expire Items Individually

• Add a TTL index to the “cart_item” document for the “date_added” field
• Expiration would occur after a certain amount of time from when the item was added to the cart, similar to a
ticketing site, or flash sale site

Class Exercise

Design a shopping cart schema for a concert ticket sales site:


• Traffic will dramatically spike at times (many users may rush to the site at once)
• There are seated and lawn sections, only one ticket can be sold per seat/concert, many tickets for the lawn section
per concert
• The system will be sharded, and appropriate shard keys will need to be selected

1.5 Lab: Data Model for an E-Commerce Site

Introduction

• In this group exercise, we’re going to take what we’ve learned about MongoDB and develop a basic but reason-
able data model for an e-commerce site.
• For users of RDBMSs, the most challenging part of the exercise will be figuring out how to construct a data
model when joins aren’t allowed.
• We’re going to model for several entities and features.

Product Catalog

• Products. Products vary quite a bit. In addition to the standard production attributes, we will allow for variations
of product type and custom attributes. E.g., users may search for blue jackets, 11-inch macbooks, or size 12
shoes. The product catalog will contain millions of products.
• Product pricing. Current prices as well as price histories.
• Product categories. Every e-commerce site includes a category hierarchy. We need to allow for both that
hierarchy and the many-to-many relationship between products and categories.
• Product reviews. Every product has zero or more reviews and each review can receive votes and comments.

20
Product Metrics

• Product views and purchases. Keep track of the number of times each product is viewed and when each
product is purchased.
• Top 10 lists. Create queries for top 10 viewed products, top 10 purchased products.
• Graph historical trends. Create a query to graph how a product is viewed/purchased over the past.
• 30 days with 1 hour granularity. This graph will appear on every product page, the query must be very fast.

Deliverables

Break into groups and work together to create the following deliverables:
• Sample document and schema for each collection
• Queries the application will use
• Index definitions

Solution

All slides from now on should be shown only after a solution is found by the groups & presented.

21
2 Application Engineering

MongoMart Introduction (page 22) MongoMart Introduction


Java Driver Labs (MongoMart) (page 23) Build an e-commerce site backed by MongoDB (Java)
Python Driver Labs (MongoMart) (page 25) Build an e-commerce site backed by MongoDB (Python)

2.1 MongoMart Introduction

What is MongoMart

MongoMart is an on-line store for buying MongoDB merchandise. We’ll use this application to learn more about
interacting with MongoDB through the driver.

MongoMart Demo of Fully Implemented Version

• View Items
• View Items by Category
• Text Search
• View Item Details
• Shopping Cart

View Items

• http://localhost:8080
• Pagination and page numbers
• Click on a category

View Items by Category

• http://localhost:8080/?category=Apparel
• Pagination and page numbers
• “All” is listed as a category, to return to all items listing

22
Text Search

• http://localhost:8080/search?query=shirt
• Search for any word or phrase in item title, description or slogan
• Pagination

View Item Details

• http://localhost:8080/item?id=1
• Star rating based on reviews
• Add a review
• Related items
• Add item to cart

Shopping Cart

• http://localhost:8080/cart
• Adding an item multiple times increments quantity by 1
• Change quantity of any item
• Changing quantity to 0 removes item

2.2 Java Driver Labs (MongoMart)

Introduction

• In this lab, we’ll set up and optimize an application called MongoMart. MongoMart is an on-line store for
buying MongoDB merchandise.

Lab: Setup and Connect to the Database

• Import the “item” collection to a standalone MongoDB server (without replication) as noted in the README.md
file of the /data directory of MongoMart
• Become familiar with the structure of the Java application in /java/src/main/java/mongomart/
• Modify the MongoMart.java class to properly connect to your local database instance

23
Lab: Populate All Necessary Database Queries

• After running the MongoMart.java class, navigate to “localhost:8080” to view the application
• Initially, all data is static and the application does not query the database
• Modify the ItemDao.java and CartDao.java classes to ensure all information comes from the database (do not
modify the method return types or parameters)

Lab: Use a Local Replica Set with a Write Concern

• It is important to use replication for production MongoDB instances, however, Lab 1 advised us to use a stan-
dalone server.
• Convert your local standalone mongod instance to a three node replica set named “shard1”
• Modify MongoMart’s MongoDB connection string to include at least two nodes from the replica set
• Modify your application’s write concern to MAJORITY for all writes to the “cart” collection, any writes to the
“item” collection should continue using the default write concern of W:1

Lab: Improve How Reviews are Stored and Queried

• Currently, all reviews are stored in an “item” document, within a “reviews” array. This is problematic for the
cases when the number of reviews for a product becomes extremely large.
• Create a new collection called “review” and modify the “reviews” array within the “item” collection to only
contain the last 10 reviews.
• Modify the application to update the last 10 reviews for an item, the average number of stars (based on reviews)
for an item, and insert the review into the new “review” collection

Lab: Use Range Based Pagination

• Pagination throughout MongoMart uses the inefficient sort() and limit() method
• Optimize MongoMart to use range based pagination
• You may modify method names and return values for this lab

Lab: Related Items

• Modify each item document to include an array of “related_items” (maximum is 4).


• Related items should include other items within the same category
• Randomized items can be used if there are less than 4.

24
2.3 Python Driver Labs (MongoMart)

Introduction

• In this lab, we’ll set up and optimize an application called MongoMart.


• MongoMart is an on-line store for buying MongoDB merchandise.

Lab: Setup and Connect to the Database

• Import the “item” collection to a standalone MongoDB server (without replication) as noted in the README.md
file of the /data directory of MongoMart
• Become familiar with the structure of the Python application in /
• Start the application by running “python mongomart.py”, stop it by using ctrl-c
• Modify the mongomart.py file to properly connect to your local database instance

Lab: Populate All Necessary Database Queries

• After running “python mongomart.py”, navigate to “localhost:8080” to view the application


• Initially, all data is static and the application does not query the database
• Modify the itemDao.py and cartDao.py files to ensure all information comes from the database (you will not
need to modify parameters for each DAO method)

Lab: Use a Local Replica Set with a Write Concern

• It is important to use replication for production MongoDB instances, however, Lab 1 advised us to use a stan-
dalone server.
• Convert your local standalone mongod instance to a three node replica set named “rs0”
• Modify MongoMart’s MongoDB connection string to include at least two nodes from the replica set
• Modify your application’s write concern to MAJORITY for all writes to the database

Lab: Improve How Reviews are Stored and Queried

• Currently, all reviews are stored in an “item” document, within a “reviews” array. This is problematic for the
cases when the number of reviews for a product becomes extremely large.
• Create a new collection called “review” and modify the “reviews” array within the “item” collection to only
contain the last 10 reviews (sorted by date).
• Modify the application to update the last 10 reviews for an item, the average number of stars (based on reviews)
for an item, and insert the review into the new “review” collection

25
Lab: Use Range Based Pagination

• Pagination throughout MongoMart uses the inefficient sort() and limit() method
• Optimize MongoMart to use range based pagination
• You may modify method names and return values for this lab

Lab: Related Items

• Modify each item document to include an array of “related_items” (maximum is 4).


• Related items should include other items within the same category
• Randomized items can be used if there are less than 4.

26
Find out more Having trouble? Follow us on twitter
mongodb.com | mongodb.org File a JIRA ticket: @MongoDBInc
university.mongodb.com jira.mongodb.org @MongoDB

You might also like