
PERFORMANCE MONGODB

Nguyen Thi Hanh

Table of Contents
˗ Memory size
˗ Schema Design
˗ Indexes
˗ CRUD Optimization
˗ Performance on Clusters


Memory size

Memory size

˗ MongoDB is a high-performance database.
˗ But to operate correctly while supporting your applications, it requires
adequate hardware provisioning.


Memory size
Von Neumann Architecture

Memory size
RAM/Memory
˗ Memory is a quintessential resource.
˗ The increasing availability of RAM and the fall in its production costs
have contributed to the evolution of database architectures.


Memory size
RAM/Memory
˗ RAM is roughly 25 times faster than common SSDs, which makes the
shift from disk-oriented to RAM-oriented designs a strong, appealing
factor for databases to be built around memory usage.
˗ MongoDB has storage engines that are either heavily dependent on
RAM or that run entirely in memory for their data management
operations.

Memory size
RAM/Memory
˗ Ensure your working set fits in RAM:
 Aggregation
 Index Traversing
 Write Operations
 Query Engine
 Connections
˗ Properly sizing the working set holds true whether you run
MongoDB on Atlas or manage MongoDB yourself.
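As a rough, illustrative check (not an exact sizing formula), you can compare the configured WiredTiger cache with the data and index sizes reported by the server in mongosh:

// Compare the configured WiredTiger cache with index and data sizes (values in bytes)
const status = db.serverStatus()
const cacheBytes = status.wiredTiger.cache["maximum bytes configured"]
const stats = db.stats()
print("WiredTiger cache (MB):", cacheBytes / 1024 / 1024)
print("Index size (MB):", stats.indexSize / 1024 / 1024)
print("Data size (MB):", stats.dataSize / 1024 / 1024)
// If the indexes plus the frequently accessed documents do not fit in the cache,
// expect page evictions and extra disk I/O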


Schema Design

Modeling Approach MongoDB
(Diagram: Develop Application and Queries → Define Data Model → Production → New Requirement, and back to Develop Application and Queries)


Strategy of Modelling

Data Model


Data Model Type

Choose Embedded VS Reference


Design Pattern

Key Consideration (Recap too)


˗ Understand your application’s query patterns, design your data
model, and select the appropriate indexes.
˗ MongoDB having a flexible schema does not mean you can ignore
schema design.
˗ Prioritize embedding, unless there is an unavoidable reason not to.
˗ Don’t be afraid of application-level joins: if the index is built
correctly and the returned results are limited by projection,
application-level joins will not be much more expensive than joins
in relational databases.
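For example, an application-level join in mongosh might look like the following sketch (the orders and customers collections and their fields are illustrative):

// Fetch an order, then its customer; the second lookup uses the default _id index
const order = db.orders.findOne({ _id: 123 }, { customerId: 1, total: 1 })
const customer = db.customers.findOne(
  { _id: order.customerId },       // exact-match lookup on the indexed _id
  { name: 1, email: 1 }            // projection limits the returned fields
)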


Key Consideration (Recap too)


˗ Arrays should not grow without bound.
˗ When an array grows unboundedly, index performance on the
array degrades.
˗ Avoid $lookup when it can be avoided.
˗ Avoid a huge number of collections.
˗ Avoid the default _id field where possible: 12 bytes is relatively large
and carries some computational cost.
˗ Optimize key names: every document stores its own schema, so each
document stores its key names, which consumes extra space.

Indexes


Indexes
˗ In the database world, indexes play a vital role in performance, and
MongoDB is no exception.

Indexing strategies

˗ Use the ESR (Equality, Sort, Range) Rule
˗ Create Indexes to Support Your Queries
˗ Use Indexes to Sort Query
˗ Ensure Indexes Fit in RAM
˗ Create Queries that Ensure Selectivity


Follow ESR Rule in Compound Indexes


Equality
 "Equality" refers to an exact match on a single value.
 Example: db.cars.find( { model: "Cordoba" } )
db.cars.find( { model: { $eq: "Cordoba" } } )

 Place fields that require exact matches first in your index.


 An index may have multiple keys for queries with exact matches. The
index keys for equality matches can appear in any order. However, to
satisfy an equality match with the index, all of the index keys for exact
matches must come before any other index fields.
 Exact matches should be selective. To reduce the number of index
keys scanned, ensure equality tests eliminate at least 90% of possible
document matches.

Follow ESR Rule in Compound Indexes


Sort
 "Sort" determines the order for results. Sort follows equality matches
because the equality matches reduce the number of documents that
need to be sorted.
 An index can support sort operations when the query fields are a
subset of the index keys. Sort operations on a subset of the index keys
are only supported if the query includes equality conditions for all of the
prefix keys that precede the sort keys.
 Example: the following queries the cars collection; the output is sorted by model:
db.cars.find( { manufacturer: "GM" } ).sort( { model: 1 } )

 To improve query performance, create an index on the manufacturer
and model fields:
db.cars.createIndex( { manufacturer: 1, model: 1 } )


Follow ESR Rule in Compound Indexes


Sort - Blocking sort
 A blocking sort indicates that MongoDB must consume and process all
input documents to the sort before returning results. Blocking sorts do
not block concurrent operations on the collection or database.
 If MongoDB cannot use an index or indexes to obtain the sort order,
MongoDB must perform a blocking sort operation on the data.
 With allowDiskUse(), MongoDB can use temporary files on disk to store
data exceeding the 100 megabyte system memory limit while processing
a blocking sort operation.
 Sort operations that use an index often have better performance than
blocking sorts.
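One way to check whether a sort uses an index is to inspect the query plan: a SORT stage in the winning plan indicates an in-memory (blocking) sort. A sketch, using the cars collection from the previous slide:

// Without a supporting index the winning plan contains a SORT stage (blocking sort)
db.cars.find( { manufacturer: "GM" } ).sort( { model: 1 } ).explain("executionStats")
// After db.cars.createIndex( { manufacturer: 1, model: 1 } ) the plan uses an
// IXSCAN and the SORT stage disappears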

Follow ESR Rule in Compound Indexes


Range
 "Range" filters scan fields. The scan doesn't require an exact match,
which means range filters are loosely bound to index keys. To improve
query efficiency, make the range bounds as tight as possible and use
equality matches to limit the number of documents that must be
scanned.
 Range filters resemble the following:
db.cars.find( { price: { $gte: 15000} } )
db.cars.find( { age: { $lt: 10 } } )
db.cars.find( { priorAccidents: { $ne: null } } )


Follow ESR Rule in Compound Indexes


˗ For compound indexes, this rule of thumb is helpful in deciding
the order of fields in the index:
 First, add those fields against which Equality queries are run.
 The next fields to be indexed should reflect the Sort order of the query.
 The last fields represent the Range of data to be accessed.

˗ If we put the equality keys first, we limit the amount of data we have
to look at.
˗ Avoid blocking/in-memory sorting.
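As an illustration of ESR ordering (the price field is added here only for the example), a query with an equality match, a sort, and a range filter is best supported by an index whose keys follow the same order:

// Query: Equality on manufacturer, Sort on model, Range on price
db.cars.find( { manufacturer: "GM", price: { $gte: 15000 } } ).sort( { model: 1 } )
// ESR-ordered compound index: Equality first, then Sort, then Range
db.cars.createIndex( { manufacturer: 1, model: 1, price: 1 } )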

Follow ESR Rule in Compound Indexes


Follow ESR Rule in Compound Indexes

B-Tree & Prefix Compression: Query Performance & Disk Usage
˗ In B-Tree indexes, low-cardinality values can actually harm
performance.
˗ For low-cardinality values, prefer a partial index.


B-Tree & Prefix Compression: Query Performance & Disk Usage
Cardinality
 Cardinality is the number of unique elements present in a set.
The lower the cardinality, the more duplicated elements there are.
 So if a set has 5 elements made of Boolean values, then the cardinality
of the set is going to be two. So, all sets made of Booleans will have a
max cardinality of two and a min cardinality of one.

B-Tree & Prefix Compression: Query Performance & Disk Usage
How cardinality impacts indexing
 If a Boolean field is indexed, there is not much the index will improve in
terms of performance.
• With Booleans and a 50/50 split, the index lets you skip 50% of the documents,
but the remaining 50% is still a sequential scan.
• If there is an 80/20 split between true and false, then the index is pretty much
useless when querying for the true part, because you still have to do a sequential
scan of 80% of the documents (but queries looking for false will benefit from
the index).
• This applies to any field with low cardinality: if a field is an enum of five values
with thousands of documents in each category, a similar effect can be observed.
 Indexes must be built carefully in conditions like these. One more side
effect of having an index on such fields is that it impacts writes as well.


B-Tree & Prefix Compression: Query Performance & Disk Usage
Partial Index
 Partial indexes only index the documents in a collection that meet a
specified filter expression. By indexing a subset of the documents in a
collection, partial indexes have lower storage requirements and
reduced performance costs for index creation and maintenance.
 For example, the following operation creates a compound index that
indexes only the documents with a rating field greater than 5.
db.restaurants.createIndex(
{ cuisine: 1, name: 1 },
{ partialFilterExpression: { rating: { $gt: 5 } } }
)
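Note that a query can only use this partial index when its filter implies the partialFilterExpression; for example:

// Can use the partial index: rating >= 8 is a subset of rating > 5
db.restaurants.find( { cuisine: "Italian", rating: { $gte: 8 } } )
// Cannot use the partial index: this query may match documents with rating <= 5
db.restaurants.find( { cuisine: "Italian" } )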

Use Covered Queries When Possible


Covered query
 A covered query is a query that can be satisfied entirely using an index
and does not have to examine any documents. An index covers a
query when all of the following apply:
• all the fields in the query are part of an index, and
• all the fields returned in the results are in the same index.
• no fields in the query are equal to null (i.e. {"field" : null} or {"field" : {$eq : null}} ).
 For example, a collection inventory has the following index on the type
and item fields: db.inventory.createIndex( { type: 1, item: 1 } )
 This index will cover the following operation which queries on the type
and item fields and returns only the item field:
db.inventory.find( { type: "food", item:/^c/ },{ item: 1, _id: 0 })


Use Covered Queries When Possible


Covered query
 For the specified index to cover the query, the projection document
must explicitly specify _id: 0 to exclude the _id field from the result
since the index does not include the _id field.
 For example, consider a collection userdata with documents of the
following form:
{ _id: 1, user: { login: "tester" } }
 The collection has the following index:
{ "user.login": 1 }
 The { "user.login": 1 } index will cover the query below:
db.userdata.find( { "user.login": "tester" }, { "user.login": 1, _id: 0 } )
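Whether a query is actually covered can be verified with explain: a covered query reports totalDocsExamined: 0 in its executionStats. A sketch:

// A covered query reads only index keys, so executionStats.totalDocsExamined is 0
db.userdata.find(
  { "user.login": "tester" },
  { "user.login": 1, _id: 0 }
).explain("executionStats")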

Key Consideration
˗ Index creation in the foreground takes a collection-level lock.
˗ Index creation in the background avoids the locking bottleneck, but
the resulting index can be less efficient to traverse.
˗ Developers are encouraged to write covered queries: queries that can
be satisfied entirely by an index, so zero documents need to be
inspected, which makes the query run a lot faster. All the projected
keys need to be indexed.
˗ Use indexes to sort results and avoid blocking sorts.
˗ Remove duplicate and unused indexes; this also improves disk
throughput and memory usage (see the $indexStats sketch below).
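To find candidates for removal, the $indexStats aggregation stage reports how often each index has been used since the server started; a sketch on a hypothetical orders collection:

// Indexes whose accesses.ops stays at 0 over a long uptime are likely unused
db.orders.aggregate([
  { $indexStats: {} },
  { $project: { name: 1, "accesses.ops": 1, "accesses.since": 1 } }
])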


Performance Considerations in Distributed Systems

What is a Distributed System in MongoDB?


For a high-availability solution:
 Replica Cluster
 Sharded Cluster


Replica Cluster
Replication

Replica Cluster
Replication
˗ Maintain multiple copies of your data
˗ Provides redundancy and increases data availability
 With multiple copies of data on different database servers, it provides a
level of fault tolerance against the loss of a single database server.
˗ In some cases, it can provide increased read capacity as
clients can send read operations to different servers.
 Maintaining copies of data in different data centers can increase data
locality and availability for distributed applications.
 You can also maintain additional copies for dedicated purposes, such
as disaster recovery, reporting, or backup.


Replica Cluster
Replica Set
- is a group of mongod instances that maintain the same data set.
- contains several data-bearing nodes and optionally one arbiter node.
Of the data-bearing nodes, one and only one member is deemed the
primary node, while the other nodes are deemed secondary nodes.
- Although clients cannot write data to secondaries, clients can read
data from secondary members.
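A minimal sketch of initiating a three-member replica set in mongosh (host names and the replica set name are illustrative):

// Run once against one of the mongod instances started with --replSet rs0
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]
})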

Replica Cluster
˗ Automatic Failover
 The replica set failover mechanism is based on voting. A secondary
node will be elected as the new primary node of the replica set.
 For elections to reliably reach a majority, a replica set should have
an odd number of voting members.


Sharded Cluster
Sharding
˗ Sharding is a method for distributing data across multiple
machines. MongoDB uses sharding to support deployments
with very large data sets and high throughput operations.
˗ System growth: vertical and horizontal scaling.
 vertical scaling
 horizontal scaling
˗ MongoDB supports horizontal scaling through sharding.
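A minimal sketch of enabling sharding for a collection (the shop database, orders collection, and customerId shard key are illustrative):

// Run against a mongos router
sh.enableSharding("shop")
sh.shardCollection("shop.orders", { customerId: "hashed" })  // hashed shard key for even distribution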

Sharded Cluster
˗ Sharded Cluster


Considerations Before Sharding


˗ Sharding is a horizontal scaling solution
˗ Have we reached the limits of our vertical scaling?
˗ You need to understand how your data grows and how your
data is accessed
˗ MongoDB uses the shard key to distribute the collection's
documents across shards.
˗ The shard key consists of a field or multiple fields in the
documents.
˗ It’s important to get a good shard key

Working with Distributed Systems


˗ Consider latency
˗ Data is spread across different nodes
˗ Read implications
˗ Write implications


Latency

Read in Distributed Systems


˗ Two types of reads:
 Scatter Gather
 Routed Queries
˗ When:
 If we are not using the shard key, we will be performing scatter-gather
queries.
 If we are using the shard key, we will be performing routed queries.
˗ Routed queries and scatter-gather queries have two different
performance profiles.


Scatter Gather
˗ Ping all nodes of our shard cluster for the information
corresponding to a given query

Routed Queries
˗ Pinpoint exactly which shards contain the information relevant
for our client query.
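Assuming the orders collection from the earlier sketch, sharded on customerId, the two read types look like this:

// Routed query: the filter contains the shard key, so mongos targets only the owning shard(s)
db.orders.find( { customerId: 42, status: "open" } )
// Scatter-gather query: no shard key in the filter, so mongos must ask every shard
db.orders.find( { status: "open" } )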


Sorting
˗ Sorting in a Sharded Cluster involves a few hurdles

Sorting Merge


Limit and Skip

Limit and Skip Merge


Recap
˗ Consideration before sharding
˗ Latency
˗ Scatter-gather and routed queries
˗ Sorting, limit & skip

Reading from Secondaries


Read preference.
˗ By default, clients read from the primary; however, clients can
specify a read preference to send read operations to
secondaries.
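For example, a read preference can be set per operation in mongosh or in the driver connection string (a sketch, not a recommendation to read from secondaries by default):

// Per-query read preference in mongosh
db.orders.find( { status: "open" } ).readPref("secondaryPreferred")
// Or in the connection string used by the application driver:
// mongodb://host1,host2,host3/shop?replicaSet=rs0&readPreference=secondaryPreferred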

Reading from Secondaries

˗ When Reading from a Secondary is a Good Idea
 Analytics queries
 Local reads


Analytics Queries

Reading from Secondaries

˗ When Reading from a Secondary is a Bad Idea


 Providing extra capacity for reads

Providing extra capacity for reads


Recaps
˗ Read preferences associated with performance
˗ When it’s a good idea
 Analytics queries
 Local reads
˗ When it’s a bad idea

Replica Set Nodes with Differing Indexes


Disclaimer
˗ Specific analytics secondary nodes
˗ Reporting on delayed consistency data
˗ Text Search

Secondary Node Considerations

˗ Prevent such a secondary from becoming primary:
 Priority = 0
 Hidden Node
 Delayed Secondary
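A sketch of reconfiguring a replica set so that one member is a hidden, priority-0 node dedicated to analytics (the member index is illustrative; for a delayed secondary, secondaryDelaySecs applies in MongoDB 5.0+):

// Make the third member non-electable and invisible to normal client reads
cfg = rs.conf()
cfg.members[2].priority = 0
cfg.members[2].hidden = true
// For a delayed secondary, additionally: cfg.members[2].secondaryDelaySecs = 3600
rs.reconfig(cfg)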


Replica Set


Aggregation Pipeline on a
Sharded Cluster

˗ How it works
˗ Where operations are completed
˗ Optimization



˗ $out
˗ $facet
˗ $lookup
˗ $graphLookup
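As an illustration (collection names are hypothetical), a pipeline using one of these stages on a sharded cluster cannot be fully parallelized: early stages can run on each shard, but the part of the pipeline containing such a stage is merged and run on a single node.

db.orders.aggregate([
  { $match: { status: "open" } },          // can be pushed down and run on each shard
  { $lookup: {                             // forces the remainder of the pipeline to run on one node
      from: "customers",
      localField: "customerId",
      foreignField: "_id",
      as: "customer"
  } }
])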


Aggregation Optimizations

Aggregation Optimizations


Recaps
˗ How it works
˗ Where operations are completed
˗ Optimizations
