7 - Basic Cluster Administration
MongoDB
Basic Cluster
Administration
1. Introduction to mongod
2. Introduction to Replication
3. Introduction to Sharding
1. mongod
Learning Objectives
• What mongod is
What mongod is
• mongod is the main daemon process for MongoDB.
• It is the core server of the database: it handles connections and requests and, most importantly,
persists your data.
• A MongoDB deployment may consist of more than one server. Our data may be distributed in a
replica set or across a sharded cluster.
• We run a separate mongod process for each server.
What mongod is
[Figure: sharded cluster diagram with the labels Application, mongos, Replica Set, and three mongod processes.]
What mongod is
• When we launch mongod, we are essentially starting up a new database. We do not interact with the
mongod process directly, however. Instead, we use a database client (mongosh, or the older mongo shell) to
communicate with mongod.
[Figure: a client connected to mongod.]
What mongod is
Default Configuration:
• mongod listens on port 27017 by default.
• The default dbpath is /data/db (this directory must already exist when we run mongod).
• mongod binds to localhost (127.0.0.1) by default.
• Authentication is turned off by default, so clients are not required to authenticate before accessing
the database.
What mongod is
To start up a mongod process: mongod
To shut down mongod from the shell (mongosh):
• use admin
• db.shutdownServer()
• exit (exits mongosh)
Mongod Options
--help: outputs the available mongod options with a description of what each one does.
• mongod --help or mongod -h
--dbpath <directory path>: specifies where all data files of the database are stored.
--port <port number>: specifies the port on which mongod listens for client connections.
• Connect mongosh to a mongod listening on port 27018: mongosh --port 27018
--bind_ip: specifies which IP addresses mongod binds to. mongod accepts client connections only on the
addresses it is bound to.
• mongod --bind_ip localhost --port 27018 --dbpath 'c:\mongoDB\data\db'
• mongod --bind_ip localhost,123.123.123.123 --port 27018 --dbpath 'c:\mongoDB\data\db'
• If using the bind_ip option with external IP addresses, it is recommended to enable auth to ensure that
remote clients connecting to mongod have the proper credentials.
--auth: enables authentication to control which users can access the database. When auth is specified, all
database clients that want to connect to mongod first need to authenticate.
Mongod – Configuration File
• A configuration file organizes the options needed to run the mongod process into an
easy-to-parse YAML (YAML Ain't Markup Language) file.
• Why use a configuration file? Compare a command line such as:
mongod --dbpath /data/db --logpath /data/log/mongod.log --replSet 'M103' --keyFile /data/keyfile --bind_ip '127.0.0.1'
Mongod – Configuration File
Command-Line Option      Configuration File Option
--dbpath                 storage.dbPath
--logpath                systemLog.path and systemLog.destination
--bind_ip                net.bindIp
--port                   net.port
--replSet                replication.replSetName
--keyFile                security.keyFile
…                        …
Mongod – Configuration File
YAML file
Configuration File Option
storage.dbPath
systemLog.path
systemLog.destination
net.bindIp
net.port
security.keyFile
replication.replSetName
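To make the relationship between the command line and the YAML file concrete, here is a minimal sketch in plain JavaScript. The helper `flagsToConfig` and the lookup table are hypothetical (they are not part of MongoDB); they only show how each flag from the long command line turns into a nested configuration-file option. Note that the YAML name for `--bind_ip` is `net.bindIp`, and that setting `systemLog.destination` to `file` alongside `--logpath` is an assumption made here to mirror the table.

```javascript
// Hypothetical lookup table: mongod flag -> dotted configuration-file option.
const FLAG_TO_OPTION = {
  "--dbpath": "storage.dbPath",
  "--logpath": "systemLog.path",
  "--bind_ip": "net.bindIp",
  "--port": "net.port",
  "--replSet": "replication.replSetName",
  "--keyFile": "security.keyFile",
};

// Builds the nested object that mirrors the layout of the YAML file.
function flagsToConfig(flags) {
  const config = {};
  for (const [flag, value] of Object.entries(flags)) {
    const optionPath = FLAG_TO_OPTION[flag];
    if (optionPath === undefined) {
      throw new Error(`No mapping known for ${flag}`);
    }
    const keys = optionPath.split(".");
    let node = config;
    // Walk (and create) every level except the last one.
    for (const key of keys.slice(0, -1)) {
      node[key] = node[key] || {};
      node = node[key];
    }
    node[keys[keys.length - 1]] = value;
    // A log path implies logging to a file rather than to syslog.
    if (flag === "--logpath") {
      config.systemLog.destination = "file";
    }
  }
  return config;
}

// The long command line from the earlier slide, expressed as flags.
const exampleConfig = flagsToConfig({
  "--dbpath": "/data/db",
  "--logpath": "/data/log/mongod.log",
  "--replSet": "M103",
  "--keyFile": "/data/keyfile",
  "--bind_ip": "127.0.0.1",
});

console.log(JSON.stringify(exampleConfig, null, 2));
```

The printed object has the same nesting as the YAML file: one top-level key per section (storage, systemLog, net, replication, security) with the individual options underneath.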
Mongod – Basic Commands
User Management:
• db.createUser()
• db.dropUser()
Collection Management:
• db.<collection>.renameCollection( <target>, <dropTarget> ) [dropTarget: optional]
• db.<collection>.createIndex( <keys>, <options>, <commitQuorum> )
• db.<collection>.drop( <options> )
DB management:
• db.dropDatabase( <writeConcern> ) [removes current database]
• db.createCollection( <name>, <options> )
DB status:
• db.serverStatus()
Mongod – Basic Security
Why do we have to secure the data?
Authentication
• Verifies the identity of a user.
• Answers the question: Who are you?
• SCRAM: the default and most basic form of client authentication (password security).
• X.509: certificate-based authentication, more secure and more complex.
• LDAP and Kerberos: only for MongoDB Enterprise.
Authorization
• Verifies the privileges of a user.
• Answers the question: What do you have access to?
• Each user has one or more roles.
• Each role has one or more privileges.
• A privilege represents a group of actions and the resources that those actions apply to.
Mongod – Basic Security
Authorization: Role Based Access Control
Roles support a high level of responsibility isolation for operational tasks.
Mongod – Basic Security
Example:
Run MongoDB server that enforces authentication (no user created)
mongod -f 'D:\HeQTCSDL_NoSQL\config_files\mongod.conf'
Mongod – Basic Security
Roles in MongoDB
• Built-In Roles: pre-packaged MongoDB roles.
Mongod – Basic Security
Roles Structure
A role is composed of:
• A set of privileges that the role enables.
• All privileges that the role defines are made available to its users.
• A privilege defines the action, or actions, that can be performed on a resource.
• Resources:
  • Database
  • Collection or set of collections
  • Cluster: replica set or sharded cluster
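The role structure above can be sketched as plain JavaScript data. The privilege shape, a `resource` document plus a list of `actions`, follows the form MongoDB uses for role definitions; the role name `reportReader`, the `reporting` database, and the checker `roleAllows` are hypothetical and only illustrate the idea. The matching is deliberately simplified: an empty string is treated as "any", and cluster resources and system collections are ignored.

```javascript
// Hypothetical custom role: read-only access to every collection in "reporting".
const reportReader = {
  role: "reportReader",
  privileges: [
    {
      resource: { db: "reporting", collection: "" }, // "" = all collections
      actions: ["find", "listCollections"],
    },
  ],
};

// Simplified check: does any privilege of the role cover this action on this resource?
function roleAllows(role, db, collection, action) {
  return role.privileges.some((privilege) => {
    const { resource, actions } = privilege;
    const dbMatches = resource.db === "" || resource.db === db;
    const collectionMatches =
      resource.collection === "" || resource.collection === collection;
    return dbMatches && collectionMatches && actions.includes(action);
  });
}

console.log(roleAllows(reportReader, "reporting", "sales", "find"));   // true
console.log(roleAllows(reportReader, "reporting", "sales", "insert")); // false
console.log(roleAllows(reportReader, "admin", "users", "find"));       // false
```

The point of the sketch is the layering: a user holds roles, a role holds privileges, and each privilege pairs a resource with the actions allowed on it.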
Mongod – Basic Security
Built-In Roles
Role Level                 Roles
Database User              read, readWrite
Database Administration    dbAdmin, userAdmin, dbOwner
Cluster Administration     clusterAdmin, clusterManager, clusterMonitor, hostManager
Backup and Restore         backup, restore
Superuser                  root (root is also a role at the all-database level)
All-Database               readAnyDatabase, readWriteAnyDatabase, dbAdminAnyDatabase, userAdminAnyDatabase
(read more Built-In Roles)
Mongod – Basic Security
Built-In Roles: userAdmin
• Allows the user to perform all user-management operations, but nothing related to data
management or data modification.
• Provides the ability to create and modify roles and users on the current database. Since the userAdmin
role allows users to grant any privilege to any user, including themselves, the role also indirectly provides
superuser access to either the database or, if scoped to the admin database, the cluster.
(read more userAdmin role)
Example:
Run mongod with config file:
mongod --config 'D:\mongod.conf'
Run mongosh to connect to MongoDB server with root user:
mongosh admin -u root -p root
Create securityUser and grant it the userAdmin role:
use admin //for simplicity, create all users on the admin database
db.createUser( { user : 'securityUser', pwd : '123', roles : [ { db : 'admin', role : 'userAdmin' } ] } )
Mongod – Basic Security
Built-In Roles: dbAdmin
• Provides the ability to perform administrative tasks such as schema-related tasks, indexing, and gathering
statistics. This role does not grant privileges for user and role management.
• A user with this role can perform everything related to DDL (data definition language).
• A user with this role cannot perform DML (data manipulation language) operations.
(read more dbAdmin role)
Example:
Create the DBAcourse user and grant it the dbAdmin role:
use admin
db.createUser( { user : 'DBAcourse', pwd : '123', roles : [ { db : 'mongoCourse', role : 'dbAdmin' } ] } )
//in this case, the dbAdmin role is granted only on the mongoCourse database.
Roles can vary between databases: a given user can have different roles on a per-database basis.
db.grantRolesToUser( 'DBAcourse', [ { db : 'reporting', role : 'dbOwner' } ] )
dbOwner is a meta role: it combines the privileges granted by the readWrite, dbAdmin, and
userAdmin roles.
2. Replication
Replication
- Replication: maintaining multiple copies of your data. This is really important.
- Why:
▪ We can never assume that all servers will always be available.
▪ If a server goes down, you can still access your data → redundancy and data
availability.
▪ Replication can provide increased read capacity, as clients can send read operations to
different servers.
Replication
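[Figure: two replica set layouts. In the first, the client application talks to a primary that exchanges heartbeats with two secondaries. In the second, the set has a primary, one secondary, and an arbiter (vote only), with heartbeats every 2 seconds.]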
- A secondary can become a primary. If the current primary becomes unavailable, the replica set holds an
election to choose which of the secondaries becomes the new primary.
(read more Replica Set)
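The slides do not spell out the arithmetic behind an election, so the following is a small sketch under one stated assumption: a candidate becomes primary only with the votes of a majority of the voting members. The function names `votesNeeded` and `faultTolerance` are hypothetical helpers written for this illustration.

```javascript
// Majority of voting members required to elect a primary.
function votesNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - votesNeeded(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(
    `${n} voting members: ${votesNeeded(n)} votes needed, ` +
      `tolerates ${faultTolerance(n)} unavailable`
  );
}
```

This is why a fourth data-bearing member adds no fault tolerance on its own, and why an arbiter (vote only) is sometimes added to bring the number of voters to an odd count.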
Replication
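[Figure: replica set diagram with the labels Client Application, Primary, Replication, Heartbeat, and two Secondaries; a second Primary label marks a member that takes over the primary role.]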
Replication
- The replica set cannot process write operations until the election completes successfully. The replica set can
continue to serve read queries if such queries are configured to run on secondaries.
- The median time before a cluster elects a new primary should not typically exceed 12 seconds. (read more
Replica Set Elections)
- You can configure a secondary member for a specific purpose. You can configure a secondary to:
▪ Prevent it from becoming a primary in an election, which allows it to reside in a secondary data center or
to serve as a cold standby. See Priority 0 Replica Set Members.
▪ Prevent applications from reading from it, which allows it to run applications that require separation from
normal traffic. See Hidden Replica Set Members.
▪ Keep a running "historical" snapshot for use in recovery from certain errors, such as unintentionally
deleted databases. See Delayed Replica Set Members.
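The three special-purpose secondaries above correspond to fields of a member document in the replica set configuration. The sketch below uses plain JavaScript objects; the hosts are illustrative, `memberProblems` is a hypothetical checker, and `secondaryDelaySecs` is the field name used from MongoDB 5.0 on (older versions call it `slaveDelay`). The rules it checks, that hidden and delayed members must have priority 0, are the documented constraints; everything else is simplified.

```javascript
// Illustrative member documents (hosts are assumptions, not from a real deployment).
const members = [
  { _id: 0, host: "localhost:27011" },                              // regular member
  { _id: 1, host: "localhost:27012", priority: 0 },                 // never becomes primary
  { _id: 2, host: "localhost:27013", priority: 0, hidden: true },   // invisible to applications
  { _id: 3, host: "localhost:27014", priority: 0, hidden: true,
    secondaryDelaySecs: 3600 },                                     // one hour behind the primary
];

// Returns a list of rule violations for one member document.
function memberProblems(member) {
  const priority = member.priority ?? 1; // priority defaults to 1
  const problems = [];
  if (member.hidden && priority !== 0) {
    problems.push("hidden members must have priority 0");
  }
  if ((member.secondaryDelaySecs ?? 0) > 0 && priority !== 0) {
    problems.push("delayed members must have priority 0");
  }
  return problems;
}

const allProblems = members.flatMap(memberProblems);
console.log(allProblems.length === 0 ? "configuration looks consistent" : allProblems);

// A deliberately wrong member: hidden, but still eligible to become primary.
const badMember = { _id: 9, host: "localhost:27019", hidden: true };
console.log(memberProblems(badMember));
```

In practice the delayed member is usually hidden as well, so that applications never read stale data from it by accident.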
Replication
Setting up a Replica Set
• The mongod processes cannot communicate with each other until we connect them.
[Figure: three unconnected nodes (Node 1, Node 2, Node 3).]
Replication
Setting up a Replica Set:
1. Create a configuration file for each standalone mongod;
2. Start each mongod with its configuration file;
3. Start the mongo shell and connect to one of the mongod instances;
4. Initialize the replica set;
5. Create the root user;
6. Exit the mongo shell, then log back in as the m-admin user;
7. Add the other nodes to the replica set.
Replication
Setting up a Replica Set:
1- Use configuration file for standalone mongod
net:                      net:                      net:
  bindIp: localhost         bindIp: localhost         bindIp: localhost
  port: 27011               port: 27012               port: 27013
(Slide callout: encrypt data exchanged between the client application and MongoDB.)
Replication
Setting up a Replica Set:
2- Start a mongod with configuration file:
mongod -f node1.cfg
mongod -f node2.cfg
mongod -f node3.cfg
3- Start the mongo shell and connect to one of the mongod instances:
mongo --host 127.0.0.1:27011
or: mongo --host localhost:27011
4- Initialize replica set:
rs.initiate()
5- Create root user:
use admin
db.createUser({
  user: 'm-admin',
  pwd: 'm-pass',
  roles: [ { role : 'root', db : 'admin' } ]
})
Replication
Setting up a Replica Set
6- Exit this mongo shell, then log back in as the m-admin user (rep-example is the replica set name):
mongo --host rep-example/localhost:27011 -u m-admin -p m-pass --authenticationDatabase admin
7- Add nodes to the replica set:
rs.add( 'localhost:27012' )
rs.add( 'localhost:27013' )
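As an alternative view of steps 4 and 7, the whole membership can be written down as one configuration document. The sketch below is plain JavaScript and only builds that document; `buildReplSetConfig` is a hypothetical helper, while the set name and hosts are the ones used in the steps above. In the shell, a document of this shape can be passed to rs.initiate() instead of calling rs.initiate() with no arguments and then rs.add() for each node.

```javascript
// Hypothetical helper: builds a replica set configuration document.
function buildReplSetConfig(name, hosts) {
  return {
    _id: name, // must match replication.replSetName in each node's config file
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const replSetConfig = buildReplSetConfig("rep-example", [
  "localhost:27011",
  "localhost:27012",
  "localhost:27013",
]);

console.log(JSON.stringify(replSetConfig, null, 2));
// In the shell this document would be used as: rs.initiate(replSetConfig)
```

Either route ends with the same three-member set; the step-by-step route in the slides makes it easier to see each node join.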
Replication
Replication Configuration Document:
- The replica set configuration document is a simple BSON document that we manage using a
JSON representation
- It can be configured from the shell.
- A set of mongo shell replication helper methods makes it easier to manage:
▪ rs.add() : Adds a member to a replica set.
▪ rs.addArb() : Adds an arbiter to a replica set.
▪ rs.initiate() : Initializes a new replica set.
▪ rs.remove() : Removes a member from a replica set.
▪ rs.reconfig() : Reconfigures a replica set by applying a new replica set configuration object.
▪ … (self-study)
Replication
Reconfiguring a Running Replica Set:
[Figure: target replica set with a primary (node 1), three secondaries (node 2, node 3, node 4), and an arbiter (node 5).]
Replication
Reconfiguring a Running Replica Set:
1- Create config files for the new secondary (node 4) and the arbiter:
node4.cfg:
storage:
  dbPath: d:\db\ReplicaSet\node4
net:
  bindIp: localhost
  port: 27014
security:
  authorization: enabled
  keyFile: d:\db\pki\keyfile
systemLog:
  destination: file
  path: d:\db\ReplicaSet\node4\mongod.log
  logAppend: true
replication:
  replSetName: rep-example
arbiter.cfg:
storage:
  dbPath: d:\db\ReplicaSet\arbiter
net:
  bindIp: localhost
  port: 28000
security:
  authorization: enabled
  keyFile: d:\db\pki\keyfile
systemLog:
  destination: file
  path: d:\db\ReplicaSet\arbiter\mongod.log
  logAppend: true
replication:
  replSetName: rep-example
Replication
2- Start the mongod processes for the fourth node and the arbiter:
mongod --config 'c:\mongoDB\configs\node4.conf'
mongod --config 'c:\mongoDB\configs\arbiter.conf'
[Figure: replica set with secondaries (node 2, node 3, node 4) and an arbiter (node 5).]
Replication
Reconfiguring a Running Replica Set:
- Removing the arbiter from our replica set:
rs.remove( 'localhost:28000' )
- Assigning the current configuration to a shell variable we can edit, in order to reconfigure the replica
set:
cfg = rs.conf()
- Editing our new variable cfg to change topology - specifically, by modifying cfg.members:
cfg.members[3].votes = 0
cfg.members[3].hidden = true
cfg.members[3].priority = 0
- Updating our replica set to use the new configuration cfg:
rs.reconfig( cfg )
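The same edit can be sketched as a pure function on a configuration object, which makes it easy to see exactly what changes. This is plain JavaScript; `withHiddenNonVotingMember` and the sample configuration are hypothetical, and the function returns a modified copy rather than editing the object in place as the shell session above does. In the shell, the result would then be passed to rs.reconfig().

```javascript
// Returns a copy of cfg in which the member at `index` is hidden, non-voting,
// and has priority 0. The input object is left untouched.
function withHiddenNonVotingMember(cfg, index) {
  if (!Number.isInteger(index) || index < 0 || index >= cfg.members.length) {
    throw new RangeError(`No member at index ${index}`);
  }
  const members = cfg.members.map((member, i) =>
    i === index
      ? { ...member, votes: 0, hidden: true, priority: 0 }
      : { ...member }
  );
  return { ...cfg, members };
}

// Illustrative stand-in for what rs.conf() might return (heavily trimmed).
const currentCfg = {
  _id: "rep-example",
  members: [
    { _id: 0, host: "localhost:27011" },
    { _id: 1, host: "localhost:27012" },
    { _id: 2, host: "localhost:27013" },
    { _id: 3, host: "localhost:27014" },
  ],
};

const newCfg = withHiddenNonVotingMember(currentCfg, 3);
console.log(newCfg.members[3]);
```

Setting all three fields together matters: a hidden member must have priority 0, and a member with 0 votes must have priority 0 as well.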
Replication
Failover and Elections:
- The primary node is the first point of contact between the client application and the database.
- If secondaries go down, the client continues communicating with the node acting as
primary until the primary becomes unavailable.
[Figure: client application connected to the primary of a replica set with two secondaries.]
Replication
Failover and Elections:
• What would cause a primary to become unavailable? → a common reason is
maintenance.
[Figure: the same replica set, with a question mark between the client application and the primary.]
Replication
Failover and Elections:
- Suppose we want to perform a rolling upgrade on a three-node replica set.
- A rolling upgrade means upgrading one server at a time, starting with the secondaries and
finishing with the primary.
[Figure: three-node replica set to be upgraded to version 5.0: primary node1 and secondaries node2 and node3, all on v4.2.]
Replication
Failover and Elections:
[Figure: animation frames of the rolling upgrade, in which node2, node3, and finally node1 move from v4.2 to v5.0 and the primary role changes hands.]
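The ordering of a rolling upgrade can be sketched as a small planning function. This is plain JavaScript with hypothetical names (`rollingUpgradePlan`, the `state` field, and the node names from the diagram); it only produces the order of operations and does not talk to a server. The step-down in the plan refers to the rs.stepDown() shell helper, which asks the current primary to give up its role so that an election can promote an already-upgraded secondary.

```javascript
// Produces the order of operations for a rolling upgrade:
// secondaries first, then step the primary down and upgrade it last.
function rollingUpgradePlan(members) {
  const primary = members.find((m) => m.state === "PRIMARY");
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const steps = secondaries.map((m) => `upgrade ${m.name}`);
  if (primary) {
    steps.push(`step down ${primary.name} and wait for a new primary to be elected`);
    steps.push(`upgrade ${primary.name}`);
  }
  return steps;
}

const upgradePlan = rollingUpgradePlan([
  { name: "node1", state: "PRIMARY" },
  { name: "node2", state: "SECONDARY" },
  { name: "node3", state: "SECONDARY" },
]);

upgradePlan.forEach((step, i) => console.log(`${i + 1}. ${step}`));
```

Because only one member is offline at any moment, the set keeps a majority throughout, and clients see at most one brief election when the old primary steps down.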
3. Sharding
Mongod – Sharding
- In a replica set, we have more than one server in our database and each server has to contain the entire
dataset
- What do we do when the data grows, and the servers can't work properly?
- There are two methods for addressing system growth: vertical and horizontal scaling
• Vertical Scaling: involves increasing the capacity of a single server, such as using a more powerful
CPU, adding more RAM, or increasing the amount of storage space.
Mongod – Sharding
50 passengers want to travel from SG to HN at the same time, but there is only one bus, with a
capacity of 25 passengers.
Mongod – Sharding
What is Sharding?
- In MongoDB, scaling is done horizontally.
- The way we distribute data in MongoDB is called Sharding
- Sharding allows us to grow our dataset without worrying about being able to store it all on one server
- To guarantee high availability in our Sharded Cluster, we deploy each shard as a replica set
[Figure: sharded cluster. The client connects to the router (mongos), which works with the config servers (a replica set) and the shards.]
Mongod – Sharding
- The information contained on each shard might change over time.
- mongos queries the config servers often, in case a piece of data has moved.
- Example: if many people in our database have the last name Smith, the third shard is going to contain a
disproportionately large amount of data.
- In that case, the config servers have to make sure that data is evenly distributed across the shards.
Primary Shard
- In the sharded cluster, we have the primary shard.
- Each database will be assigned a primary shard.
- All the non-sharded collections on that database will remain on primary shard (not all the collections in a
sharded cluster need to be distributed).
[Figure: three shards (Shard 1, Shard 2, Shard 3); the primary shard holds both sharded and non-sharded collections.]
Mongod – Sharding
Setting Up a Sharded Cluster:
✓ Build config servers:
1. Create the configuration file for the config servers;
2. Start the config servers;
3. Run the mongo shell and connect to one of the config servers;
4. Initiate the CSRS (config server replica set);
5. Create a super user on the CSRS;
6. Authenticate as the super user;
7. Add the second and third nodes to the CSRS.
✓ Configure and run mongos.
✓ Configure the shards.
✓ Add the shards to the cluster from mongos.
Mongod – Sharding
Setting Up a Sharded Cluster:
✓ Configure and run mongos, using the configuration file mongos.cfg:
security:
  authorization: enabled
  keyFile: d:\db\pki\keyfile
systemLog:
  destination: file
  path: d:\db\mongos.log
  logAppend: true
sharding:
  configDB: rep-example/localhost:26001,localhost:26002,localhost:26003
Mongod – Sharding
Setting Up a Sharded Cluster:
✓ Configure the shard (using replica set rep-example2)
node1.cfg:
storage:
  dbPath: d:\db\ShardCluster\node1
  wiredTiger:
    engineConfig:
      cacheSizeGB: .25
net:
  bindIp: localhost
  port: 27011
security:
  authorization: enabled
  keyFile: d:\db\pki\keyfile
systemLog:
  destination: file
  path: d:\db\ShardCluster\node1\mongod.log
  logAppend: true
replication:
  replSetName: rep-example2
sharding:
  clusterRole: shardsvr
node2.cfg and node3.cfg are identical except for these values:
node2.cfg: dbPath d:\db\ShardCluster\node2, port 27012, log path d:\db\ShardCluster\node2\mongod.log
node3.cfg: dbPath d:\db\ShardCluster\node3, port 27013, log path d:\db\ShardCluster\node3\mongod.log
(read more WiredTiger Storage Engine)
Mongod – Sharding
Setting Up a Sharded Cluster:
✓ Configure the shard (using replica set rep-example2)
Run mongod with the corresponding config files:
mongod --config node1.cfg
mongod --config node2.cfg
mongod --config node3.cfg
✓ Add the new shard to the cluster from mongos:
sh.addShard( 'rep-example2/localhost:27011' ) //if the node on port 27011 is the primary
Check sharding status:
sh.status()
Mongod – Sharding
Shard Keys:
• MongoDB uses the shard key to distribute the collection's documents across shards. The shard key consists
of a field or multiple fields in the documents.
• MongoDB divides the span of shard key values into non-overlapping ranges of shard key values. Each
range is associated with a chunk.
• A collection cannot be unsharded once it has been sharded.
• MongoDB supports two sharding strategies for distributing data across sharded clusters: hashed sharding
and ranged sharding.
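To illustrate how non-overlapping ranges of shard key values map to chunks, here is a minimal sketch in plain JavaScript. The chunk table, the numeric shard key, the boundaries, and the shard names are all made up for the example, and `shardFor` is a hypothetical lookup; a real cluster keeps this metadata on the config servers. Each range includes its lower bound and excludes its upper bound.

```javascript
// Hypothetical chunk table for a numeric shard key (for example, a user id).
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard1" },
  { min: 1000, max: 5000, shard: "shard2" },
  { min: 5000, max: Infinity, shard: "shard3" },
];

// Finds the shard whose chunk range [min, max) contains the shard key value.
function shardFor(shardKeyValue) {
  const chunk = chunks.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) {
    throw new Error(`No chunk covers shard key value ${shardKeyValue}`);
  }
  return chunk.shard;
}

console.log(shardFor(999));  // shard1
console.log(shardFor(1000)); // shard2
console.log(shardFor(5000)); // shard3
```

Because the ranges never overlap and together cover every possible value, each document belongs to exactly one chunk, which is what lets the router send a query for a specific shard key value to a single shard.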
Mongod – Sharding
Shard Keys: How to shard
• Use sh.enableSharding('<database>') to enable sharding for the specified database.
• Use db.<collection>.createIndex( <keys> ) to create an index on the shard key.
• Use sh.shardCollection('<database>.<collection>', { <shard key> }) to shard the collection.
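The arguments to sh.shardCollection() are a namespace string of the form database.collection and a shard key document. The sketch below, in plain JavaScript, builds both for MongoDB's two strategies, ranged and hashed. `shardCollectionArgs` is a hypothetical helper, and the collection name `people` and field `lastName` are illustrative; only the shapes of the two arguments reflect the shell API.

```javascript
// Hypothetical helper: builds the namespace and shard key document.
function shardCollectionArgs(database, collection, field, strategy = "ranged") {
  if (strategy !== "ranged" && strategy !== "hashed") {
    throw new Error(`Unknown sharding strategy: ${strategy}`);
  }
  const namespace = `${database}.${collection}`;
  // Ranged keys use 1; hashed keys use the string "hashed".
  const shardKey = { [field]: strategy === "hashed" ? "hashed" : 1 };
  return [namespace, shardKey];
}

const rangedArgs = shardCollectionArgs("mongoCourse", "people", "lastName");
const hashedArgs = shardCollectionArgs("mongoCourse", "people", "lastName", "hashed");

console.log(rangedArgs); // [ 'mongoCourse.people', { lastName: 1 } ]
console.log(hashedArgs); // [ 'mongoCourse.people', { lastName: 'hashed' } ]
// In the shell: sh.shardCollection(rangedArgs[0], rangedArgs[1])
```

A hashed key spreads documents with similar values, such as many people named Smith, across chunks more evenly, at the cost of efficient range queries on that field.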
Questions?