
Adding and Removing Shards in MongoDB

Last Updated : 27 Feb, 2025

MongoDB, as a highly scalable NoSQL database, is designed to handle large amounts of data. One of its powerful features for scaling is sharding. Sharding allows MongoDB to distribute data across multiple machines, improving performance and capacity.

This article walks through the process of adding and removing shards in MongoDB, covering the steps, considerations, and best practices for managing a sharded cluster effectively.

What is Sharding in MongoDB?

Sharding in MongoDB is the process of distributing data across multiple servers, or "shards," to enable horizontal scaling. This approach is especially useful when the data size exceeds the capacity of a single server. In a sharded cluster, data is divided based on a shard key, which is used to distribute the data across different shards. The cluster also includes two other components:

  • Config servers: Store metadata and configuration settings for the sharded cluster.
  • Mongos routers: Act as the interface between client applications and the sharded cluster, routing client requests to the appropriate shard.
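To make the role of the shard key concrete, here is a toy model of range-based routing. It is only an illustration, not MongoDB's implementation: the chunk boundaries, the shard names, and the `shardForKey` helper are all hypothetical. Each chunk covers a half-open range of shard key values (lower bound inclusive, upper bound exclusive) and is assigned to one shard.

```javascript
// Toy model: each chunk covers [min, max) of shard key values and lives on one shard.
// Chunk boundaries and shard names are made up for illustration.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shardA" },
  { min: 1000, max: 5000, shard: "shardB" },
  { min: 5000, max: Infinity, shard: "shardC" },
];

// Return the name of the shard whose chunk range contains the given key.
function shardForKey(key, chunkTable) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  if (!chunk) throw new Error(`no chunk covers shard key value ${key}`);
  return chunk.shard;
}

console.log(shardForKey(42, chunks));   // shardA
console.log(shardForKey(1000, chunks)); // shardB (lower bound is inclusive)
```

In a real cluster this lookup is performed by mongos using the chunk metadata held on the config servers.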

Adding a Shard to a MongoDB Cluster

Adding a shard to a MongoDB cluster is a straightforward operation, but it must be done with careful planning to ensure the continued stability and performance of the system. Here are the general steps for adding a new shard to a MongoDB cluster.

Step 1: Prepare the New Shard

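Before adding the shard to the cluster, ensure that the new shard is properly configured. Since MongoDB 3.6, every shard must be deployed as a replica set, which also provides redundancy and high availability. Each mongod in the shard's replica set must be started with the --shardsvr option (or the equivalent sharding.clusterRole: shardsvr setting in its configuration file).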

For a replica set, ensure that:

  • The replica set is initialized and configured.
  • The mongod instances are running and accessible from the other members of the cluster.
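As a rough sketch of the initialization step, the helper below builds the configuration document that `rs.initiate()` accepts. The `buildReplicaSetConfig` function, the replica set name, and the host names are assumptions for illustration; adjust them to your own deployment.

```javascript
// Build a replica set configuration document of the shape rs.initiate() expects.
// The helper name and the host names below are hypothetical.
function buildReplicaSetConfig(name, hosts) {
  if (hosts.length === 0) {
    throw new Error("a replica set needs at least one member");
  }
  return {
    _id: name,
    // Each member needs a unique integer _id and a host:port address.
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const config = buildReplicaSetConfig("shardRS", [
  "host1:27017",
  "host2:27017",
  "host3:27017",
]);
console.log(JSON.stringify(config, null, 2));

// In mongosh, connected to one of these members (each started with
// --shardsvr --replSet shardRS), you would then run:
//   rs.initiate(config)
```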

Step 2: Connect to a mongos Router

The next step is to connect to a mongos router of the sharded cluster. Cluster-management commands such as sh.addShard() are issued through mongos, which records the change in the metadata held by the config servers; they are not run directly against a config server.

You can connect to mongos using the mongosh shell (the legacy mongo shell accepts the same options):

mongosh --host <mongos_host> --port <port>

Step 3: Add the Shard to the Cluster

Once connected to mongos, use the sh.addShard() command to add the new shard to the cluster. Specify the shard as the replica set’s name followed by the address of at least one of its members; MongoDB discovers the remaining members from that seed.

For example, to add a replica set named shardRS that has a member at hostname:27017, run the following command:

sh.addShard("shardRS/hostname:27017")

MongoDB will verify that the shard is properly configured and then add it to the cluster. After this, the data in the cluster can be distributed across the new shard.
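The argument to sh.addShard() follows the pattern of a replica set name, a slash, and a comma-separated list of members. A minimal sketch of assembling that string is shown below; the `shardConnectionString` helper and the host names are hypothetical.

```javascript
// Assemble the "<replSetName>/<host1>,<host2>,..." string that sh.addShard() accepts.
// The helper name and host names are made up for illustration.
function shardConnectionString(replSetName, hosts) {
  if (!replSetName || hosts.length === 0) {
    throw new Error("need a replica set name and at least one member");
  }
  return `${replSetName}/${hosts.join(",")}`;
}

const seed = shardConnectionString("shardRS", ["host1:27017", "host2:27017"]);
console.log(seed); // shardRS/host1:27017,host2:27017

// In mongosh, connected to a mongos, either form adds the shard:
//   sh.addShard(seed)
//   db.adminCommand({ addShard: seed })
```

Listing more than one member is optional, but it lets the command succeed even if one of the listed hosts is temporarily unreachable.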

Step 4: Verify the Addition of the Shard

To ensure that the shard has been successfully added, use the sh.status() command to view the status of the sharded cluster. This will show the current shards and their statuses.

sh.status()

You should see the new shard listed among the existing shards.
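If you prefer a check that can be scripted, the listShards admin command returns the shard list as a document. The sketch below inspects such a result; the `findShard` helper and the sample values are assumptions, with the sample shaped like the output of `db.adminCommand({ listShards: 1 })`.

```javascript
// Look up a shard by _id in a listShards result; returns null when it is absent.
function findShard(listShardsResult, shardId) {
  const shards = listShardsResult.shards || [];
  return shards.find((s) => s._id === shardId) || null;
}

// Sample shaped like the output of db.adminCommand({ listShards: 1 });
// the shard names and hosts are made up.
const sample = {
  shards: [
    { _id: "shard0", host: "shard0/hostA:27018", state: 1 },
    { _id: "shardRS", host: "shardRS/host1:27017,host2:27017", state: 1 },
  ],
  ok: 1,
};

const added = findShard(sample, "shardRS");
console.log(added ? `found ${added._id} at ${added.host}` : "shard not found");

// In mongosh, connected to a mongos:
//   findShard(db.adminCommand({ listShards: 1 }), "shardRS")
```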

Step 5: Balancing Data

After adding a shard, MongoDB will start redistributing data across the shards to balance the data load. The balancing process uses the shard key and ensures that each shard stores a roughly equal portion of the data. This can take some time depending on the amount of data in the cluster.

You can check whether the balancer is enabled and whether a balancing round is currently in progress by running the following:

sh.getBalancerState()
sh.isBalancerRunning()
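The balancerStatus admin command reports the same information as a single document, which is convenient for scripts. The following sketch turns such a document into a one-line summary; the `describeBalancer` helper is hypothetical, and the sample objects only mimic the `mode` and `inBalancerRound` fields of the command's output.

```javascript
// Summarize a balancerStatus result in one line.
// The helper name is made up; the field names follow the balancerStatus output.
function describeBalancer(status) {
  if (status.mode === "off") {
    return "balancer disabled";
  }
  return status.inBalancerRound
    ? "balancer enabled, balancing round in progress"
    : "balancer enabled, idle between rounds";
}

console.log(describeBalancer({ mode: "full", inBalancerRound: true, ok: 1 }));
console.log(describeBalancer({ mode: "off", inBalancerRound: false, ok: 1 }));

// In mongosh, connected to a mongos:
//   describeBalancer(db.adminCommand({ balancerStatus: 1 }))
```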

Step 6: Monitor Cluster Health

Once the shard has been added and data is balanced, monitor the cluster health. Ensure that there are no issues with connectivity or data distribution, and check if the new shard is functioning properly.

Removing a Shard from a MongoDB Cluster

Removing a shard from a MongoDB cluster can be necessary for scaling down or decommissioning servers. The process is relatively straightforward, but care must be taken to avoid data loss. MongoDB does not support removing a shard while data is actively being written to it.

Step 1: Check the Balancer and Remaining Capacity

Before removing a shard, ensure that the balancer is enabled and that the remaining shards have enough capacity to absorb the data. MongoDB relies on the balancer to move data away from the shard being removed; if the other shards lack disk space or headroom, the migration can cause performance issues.

You can use the following command to review the current data distribution and the balancer state:

sh.status()

If the balancer is disabled, run the following to enable it:

sh.startBalancer()

Step 2: Start Draining the Shard

When you are ready to remove a shard, connect to a mongos and run the removeShard command against the admin database:

db.adminCommand({ removeShard: "<shard_name>" })

This command does not remove the shard immediately. It marks the shard as draining, and the balancer begins migrating its chunks to the other shards. The shard continues to serve the data it still holds, so no data is lost during the process.

Step 3: Wait for Data Migration

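MongoDB will migrate data from the shard being removed to other shards in the cluster. Depending on the amount of data and the cluster's load, this process may take some time. You can monitor progress by re-running the removeShard command, which reports the draining state (started, ongoing, or completed) and the number of chunks and databases that remain. If the shard is the primary shard for any database, move each of those databases to another shard with the movePrimary command; draining cannot complete until this is done.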
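Polling for draining progress can be scripted. The sketch below repeatedly issues removeShard until the reported state is completed. It assumes a caller-supplied `runAdminCommand` function (hypothetical) that sends a command to the admin database through a mongos and returns a promise of the reply, for example a thin wrapper around the Node.js driver's `client.db("admin").command(...)`. The helper name, polling interval, and poll limit are arbitrary choices.

```javascript
// Poll removeShard until the shard has finished draining.
// runAdminCommand is a hypothetical caller-supplied async function that runs a
// command against the admin database via mongos and resolves to the reply.
async function waitForDrain(runAdminCommand, shardName, intervalMs = 10000, maxPolls = 1000) {
  for (let poll = 0; poll < maxPolls; poll++) {
    const res = await runAdminCommand({ removeShard: shardName });
    if (res.state === "completed") {
      return res;
    }
    const remaining = res.remaining || {};
    console.log(
      `state=${res.state} chunks=${remaining.chunks ?? "?"} dbs=${remaining.dbs ?? "?"}`
    );
    // Databases whose primary shard is the one being removed must be moved by hand.
    if (Array.isArray(res.dbsToMove) && res.dbsToMove.length > 0) {
      console.log(`databases that still need movePrimary: ${res.dbsToMove.join(", ")}`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`shard ${shardName} still draining after ${maxPolls} polls`);
}

// Example wiring with the Node.js driver (connection details are assumptions):
//   const admin = client.db("admin");
//   await waitForDrain((cmd) => admin.command(cmd), "shardRS");
```

The function only observes progress; it does not run movePrimary itself, since choosing the destination shard is a decision for the operator.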

Step 4: Confirm the Shard Removal

Once a removeShard call reports the state completed, data migration is finished and the shard has been removed from the cluster's metadata. Use the sh.status() command again to verify that the shard no longer appears in the list of shards.

sh.status()

Step 5: Shut Down the Shard Server

Once the shard has been removed and data migration is complete, you can safely shut down the shard server. Ensure that the shard server is properly decommissioned and removed from your infrastructure.

Best Practices for Adding and Removing Shards

While adding and removing shards is a straightforward operation, there are several best practices to follow:

  • Plan for Scaling: Adding shards only helps if the shard key distributes data and traffic evenly. A poorly chosen shard key can lead to hotspots, where some shards store much more data or receive far more traffic than others, affecting performance.
  • Monitor Performance: Always monitor the performance of the cluster during and after adding/removing shards. Use MongoDB’s built-in monitoring tools or third-party tools to track key metrics such as query response time, disk usage, and network traffic.
  • Use Replica Sets: For fault tolerance and high availability, deploy every shard as a replica set. If a shard's primary goes down, one of its secondaries can be elected to take over, minimizing downtime.
  • Balance the Load: MongoDB automatically redistributes data when a shard is added or removed. However, an even spread of data does not guarantee an even query load, so also check per-shard data size and traffic, for example with db.collection.getShardDistribution(), to avoid performance bottlenecks.
  • Test Changes in a Staging Environment: Before adding or removing shards in a production environment, test the changes in a staging environment. This allows you to identify potential issues without affecting the live cluster.

Conclusion

Sharding is a critical feature for scaling MongoDB horizontally, and adding or removing shards is an essential operation for managing large, distributed systems. By following the proper steps and best practices outlined in this guide, database administrators can ensure smooth scaling operations while maintaining the stability and performance of their MongoDB clusters. Proper planning and monitoring are key to successful shard management, ensuring that MongoDB can handle growing amounts of data with ease.

