Capacity Planning For MongoDB
http://hci.stanford.edu/jheer/files/zoo/ex/maps/napoleon.html
Why?
What are the consequences of not planning?
What?
Availability Throughput Latency
When?
Before it's too late!
Start
Launch
Version 2
Problems
Capacity
Under Over Just right?
Prediction Models
User/Load System(s) Behavior
Resource Usage
Storage
IOPS Size Data & Loading Patterns
CPU
Speed Cores
Memory
Working Set
Network
Latency Throughput
~ 1,000,000 IOPS
Storage
Active Archival Loading Patterns Integration (BI/DW)
stats()
db.blogs.stats()
{
    "ns" : "test.blogs",
    "count" : 1338330,
    "size" : 46915928,
    "avgObjSize" : 35.05557523181876,
    "storageSize" : 86092032,
    "numExtents" : 12,
    "nindexes" : 2,
    "lastExtentSize" : 20872960,
    "paddingFactor" : 1,
    "flags" : 0,
    "totalIndexSize" : 99860480,
    "indexSizes" : {
        "_id_" : 55877632,
        "name_1" : 43982848
    },
    "ok" : 1
}
Size of data
Average document size Size on Disk Size of all indexes Size of each index
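A minimal sketch of turning the stats() output above into the sizing figures on this slide. The numbers are copied from the slide; the helper name summarizeStats and the derived fields (totalOnDiskBytes, indexToDataRatio) are illustrative assumptions, not part of MongoDB.

```javascript
// Subset of the db.blogs.stats() output shown on the slide.
const stats = {
  ns: 'test.blogs',
  count: 1338330,
  size: 46915928,
  storageSize: 86092032,
  totalIndexSize: 99860480,
  indexSizes: { _id_: 55877632, name_1: 43982848 },
};

// Hypothetical helper: derive planning figures from a stats() document.
function summarizeStats(s) {
  return {
    avgDocBytes: s.size / s.count,                      // average document size
    dataBytes: s.size,                                  // size of data
    storageBytes: s.storageSize,                        // size on disk
    indexBytes: s.totalIndexSize,                       // size of all indexes
    totalOnDiskBytes: s.storageSize + s.totalIndexSize, // data files plus indexes
    indexToDataRatio: s.totalIndexSize / s.size,        // index bytes per data byte
  };
}

const summary = summarizeStats(stats);
console.log(summary);
```

For this collection the indexes are roughly twice the size of the data itself, which matters as much for memory planning as for disk.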
Memory
Sorting Aggregation Connections Working Set
Active Data in Memory Measured Over Periods
Ratio to Storage
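A rough sketch of the "ratio to storage" idea: compare a working set estimate with total storage and with available RAM. The function name workingSetRatios and the example sizes are assumptions for illustration; the working set itself has to be measured over a period, as the slide says.

```javascript
const GiB = 1024 * 1024 * 1024;

// Hypothetical helper: all three inputs are byte counts you supply.
function workingSetRatios(workingSetBytes, storageBytes, ramBytes) {
  return {
    toStorage: workingSetBytes / storageBytes, // share of stored data that is active
    toRam: workingSetBytes / ramBytes,         // share of RAM the active data needs
    fitsInRam: workingSetBytes <= ramBytes,    // false suggests page faults under load
  };
}

// Example values (assumed): 8 GiB active data, 64 GiB stored, 16 GiB RAM.
const ratios = workingSetRatios(8 * GiB, 64 * GiB, 16 * GiB);
console.log(ratios);
```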
[Chart: page faults (PFs) and % Disk Util against MOPS]
Read "user" collection based on supplied credentials
Update "last seen" and "ip" on "user" collection
Insert into "events" collection
Update "activity" collection for time series metrics
[Table: cost per operation; surviving cell values: Maybe*, Maybe**, sizeof(bson), 4k]
* Only if the page has been faulted out by the O/S before this operation executes
** If indexes exist on the attributes updated, an index update will also occur
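One way to turn the per-operation view above into a storage estimate is simple arithmetic on 4k pages. This is a sketch under stated assumptions: each request does one read that only hits disk when the page was faulted out, plus a fixed number of 4k page writes. The function name estimateIo and every input value below are hypothetical; substitute figures measured on your own system.

```javascript
const PAGE_BYTES = 4096; // the "4k" unit used on the slide

// Hypothetical helper.
// readFaultProbability: chance the read has to go to disk (the "Maybe*" case).
// pagesWrittenPerRequest: assumed count of 4k pages written per request.
function estimateIo(requestsPerSec, readFaultProbability, pagesWrittenPerRequest) {
  const pagesPerRequest = readFaultProbability + pagesWrittenPerRequest;
  const iops = requestsPerSec * pagesPerRequest;
  return { pagesPerRequest, iops, bytesPerSec: iops * PAGE_BYTES };
}

// Assumed example: 1,000 requests/s, half the reads fault, 3 pages written each.
const estimate = estimateIo(1000, 0.5, 3);
console.log(estimate);
```

Index updates (the "Maybe**" case) would add further page writes and are not modeled here.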
System load
CPU
Non-indexed Data Sorting Aggregation
Map/Reduce Framework
Data
Fields Nesting Arrays/Embedded-Docs
[Chart: CPU % against MOPs]
Network
Latency
WriteConcern ReadPreference Batching Documents (and Collections)
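A back-of-the-envelope sketch of why batching matters for latency. It assumes every batch costs one acknowledged network round trip and ignores server time and transfer time; the function names and the 2 ms round-trip figure are illustrative assumptions.

```javascript
// Number of round trips needed to send docCount documents in batches.
function roundTrips(docCount, batchSize) {
  return Math.ceil(docCount / batchSize);
}

// Time spent waiting on the network alone, in milliseconds.
function networkWaitMs(docCount, batchSize, rttMs) {
  return roundTrips(docCount, batchSize) * rttMs;
}

// Assumed example: 10,000 documents over a link with a 2 ms round trip.
const oneAtATime = networkWaitMs(10000, 1, 2);  // one document per round trip
const batched = networkWaitMs(10000, 1000, 2);  // 1,000 documents per round trip
console.log({ oneAtATime, batched });
```

The same arithmetic applies to WriteConcern: the more acknowledgement each write waits for, the larger the effective round-trip time per operation.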
Throughput
Update / Write Patterns Reads / Queries SAN / NAS Virtualization
Deployment Types
All of these use the same resources:
Single Instance Multiple Instances (Replica Set) Cluster (Sharding) Data Centers
Monitoring
Storage Memory CPU Network Application Metrics
Tools
MMS (MongoDB Monitoring Service)
MongoDB: mongotop, mongostat
Linux: iostat, vmstat, sar, etc.
Windows: Perfmon
Load testing
Models
Load/Users
Response Time/TTFB
System Performance
Peak Usage Min Usage
Velocity of Change
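A small sketch of the model inputs named above, computed from a series of load samples. The function name loadProfile, the sample values, and the choice of "average absolute change per interval" as the velocity measure are all assumptions for illustration.

```javascript
// samples: load measurements (for example, operations per second) taken at
// equal intervals. Hypothetical helper, needs at least two samples.
function loadProfile(samples) {
  if (samples.length < 2) {
    throw new Error('need at least two samples');
  }
  const peak = Math.max(...samples);
  const min = Math.min(...samples);
  let totalChange = 0;
  for (let i = 1; i < samples.length; i++) {
    totalChange += Math.abs(samples[i] - samples[i - 1]);
  }
  return {
    peak,                                                  // peak usage
    min,                                                   // min usage
    peakToMin: peak / min,                                 // size of the swing
    avgChangePerInterval: totalChange / (samples.length - 1), // velocity of change
  };
}

// Assumed example: five samples of operations per second.
const profile = loadProfile([200, 400, 1000, 600, 200]);
console.log(profile);
```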
Limitations
Data Movement Allocation/Provisioning (servers/mem/disk)
Improvement
Limit Size of Change (if you can) Increase Frequency Practice
Repeat (continuously)
Repeat Testing Repeat Evaluations Repeat Deployment
Starter Questions
What is the working set?
How does that equate to memory?
How much disk access will that require?
How efficient are the queries?
What is the rate of data change?
How big are the highs and lows?
#MongoDBdays
Thank You
Alvin Richards
@jonnyeight Technical Director, 10gen [email protected] alvinonmongodb.com