Database Sharding - System Design
Last Updated: 04 Apr, 2025
Database sharding is a technique for horizontal scaling of databases, where the data is split across multiple database instances, or shards, to improve performance and reduce the impact of large amounts of data on a single database.
What is Sharding?
Let's understand sharding with the help of an example:
Imagine a pizza that is cut into slices and shared among friends, so that no one person has to handle the whole pizza. Sharding, which is also known as data partitioning, works on the same principle.
It is basically a database architecture pattern in which we split a large dataset into smaller chunks (logical shards) and we store/distribute these chunks in different machines/database nodes (physical shards).
- Each chunk/partition is known as a "shard" and each shard has the same database schema as the original database.
- We distribute the data in such a way that each row appears in exactly one shard.
- It's a good mechanism to improve the scalability of an application.

Methods of Sharding
1. Key Based Sharding
Key-based sharding is also known as hash-based sharding. Here, we take the value of an attribute such as customer ID, customer email, client IP address, or zip code, and use it as the input to a hash function. The resulting hash value determines which shard stores the data.
- The values passed to the hash function should all come from the same column (the shard key), so that data is placed consistently and can be found again by repeating the same computation.
- Basically, shard keys act like a primary key or a unique identifier for individual rows.
For example:
Suppose you have 3 database servers, and each request carries an application ID that is incremented by 1 every time a new application is registered.
To determine which server should store the data, we take the application ID modulo 3. The remainder identifies the server that stores the data.
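The modulo example above can be sketched in a few lines. This is a minimal illustration, not a production router: the names `NUM_SHARDS`, `shard_for`, and `shard_for_key` are hypothetical, and the second function assumes you want to shard on a string key such as an email address, in which case a stable hash is needed first (Python's built-in `hash()` for strings varies between processes).

```python
import hashlib

NUM_SHARDS = 3  # three database servers, as in the example above


def shard_for(application_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the server that stores this application ID."""
    return application_id % num_shards


def shard_for_key(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash a string shard key with a stable hash, then take the remainder."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Application IDs 1..6 cycle through servers 1, 2, 0, 1, 2, 0.
placement = {app_id: shard_for(app_id) for app_id in range(1, 7)}
print(placement)
```

Because the server index depends on the number of servers, changing `NUM_SHARDS` changes the placement of most existing keys.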

Advantages of Key Based Sharding:
- Predictable Data Distribution:
- Key-based sharding provides a predictable way to distribute data across shards.
- Every distinct key value always maps to the same shard, and a well-chosen hash function spreads keys roughly evenly across shards.
- Fast Lookups by Key:
- A query that specifies the shard key can be routed directly to a single shard by recomputing the hash.
- Range queries over consecutive key values do not get this benefit, because hashing scatters neighboring keys across shards, so such queries usually have to contact every shard.
Disadvantages of Key Based Sharding:
- Uneven Data Distribution: If the sharding key is not well distributed, data may be spread unevenly across shards.
- Limited Scalability with Specific Keys: The scalability of key-based sharding may be limited if certain keys experience high traffic or if the dataset is heavily skewed toward specific key ranges.
- Complex Key Selection: Selecting an appropriate sharding key is crucial for effective key-based sharding.
2. Horizontal or Range Based Sharding
In Horizontal or Range Based Sharding, we divide the data by separating it into different parts based on the range of a specific value within each record. Let's say you have a database of your online customers' names and email information. You can split this information into two shards.
- In one shard you can keep the info of customers whose first name starts with A-P
- In another shard, keep the information of the rest of the customers.
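The A-P split described above can be sketched as a small routing function. The shard names `shard_a_to_p` and `shard_rest` and the function `shard_for_customer` are hypothetical, chosen only for illustration.

```python
def shard_for_customer(first_name: str) -> str:
    """Route a customer to a shard by the first letter of the first name."""
    initial = first_name.strip()[:1].upper()
    if "A" <= initial <= "P":
        return "shard_a_to_p"
    # Everything else (Q-Z, digits, empty names) goes to the second shard.
    return "shard_rest"


for name in ["Alice", "peter", "Quinn", "Zoe"]:
    print(name, "->", shard_for_customer(name))
```

Because neighboring names land on the same shard, a query such as "all customers whose name starts with B" only needs to touch one shard.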

Advantages of Range Based Sharding:
- Scalability: Horizontal or range-based sharding allows for seamless scalability by distributing data across multiple shards, accommodating growing datasets.
- Improved Performance: Data distribution among shards enhances query performance through parallelization, ensuring faster operations with smaller subsets of data handled by each shard.
Disadvantages of Range Based Sharding:
- Complex Querying Across Shards: Coordinating queries involving multiple shards can be challenging.
- Uneven Data Distribution: Poorly managed data distribution may lead to uneven workloads among shards.
3. Vertical Sharding
In Vertical Sharding, we split a table by columns and move groups of columns into separate tables. Each partition is independent of the others and holds its own subset of the columns for an entity. This lets us place different features of an entity on different shards on different machines.
For example:
On Twitter, a user might have a profile, a number of followers, and some tweets they have posted. We can place the user profiles on one shard, followers on a second shard, and tweets on a third shard.
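A minimal sketch of this split, assuming a single wide user record and three hypothetical shard names. The column names and the `split_user` helper are illustrative, not taken from any real schema.

```python
# Hypothetical mapping of each shard to the columns (feature) it stores.
COLUMN_GROUPS = {
    "profiles_shard": ("name", "bio"),
    "followers_shard": ("followers",),
    "tweets_shard": ("tweets",),
}


def split_user(user_id: int, record: dict) -> dict:
    """Split one wide user record into per-shard rows that share user_id."""
    return {
        shard: {"user_id": user_id, **{col: record[col] for col in cols}}
        for shard, cols in COLUMN_GROUPS.items()
    }


user = {"name": "Ada", "bio": "Engineer", "followers": [7, 9], "tweets": ["hello"]}
rows = split_user(42, user)
print(rows["profiles_shard"])
```

Each shard keeps the `user_id` so that the pieces can be joined back together when a request needs more than one feature.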

Advantages of Vertical Sharding:
- Query Performance: Vertical sharding can improve query performance by allowing each shard to focus on a specific subset of columns. This specialization enhances the efficiency of queries that involve only a subset of the available columns.
- Simplified Queries: Queries that require a specific set of columns can be simplified, as they only need to interact with the shard containing the relevant columns.
Disadvantages of Vertical Sharding:
- Potential for Hotspots: Certain shards may become hotspots if they contain highly accessed columns, leading to uneven distribution of workloads.
- Challenges in Schema Changes: Making changes to the schema, such as adding or removing columns, may be more challenging in a vertically sharded system. Changes can impact multiple shards and require careful coordination.
4. Directory-Based Sharding
In Directory-Based Sharding, we create and maintain a lookup service or lookup table alongside the database. The lookup table is keyed by the shard key and maps each entity to the shard that stores it. This way we keep track of which database shards hold which data.

The lookup table holds a static set of information about where specific data can be found. For example, with the delivery zone used as the shard key:
- First, the client application queries the lookup service to find the shard (database partition) on which the data is placed.
- Once the lookup service returns the shard, the client queries or updates that shard.
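The two steps above can be sketched with a dictionary standing in for the lookup service. The zone names, shard names, and helper functions here are all hypothetical, and the shards are plain in-memory dictionaries rather than real databases.

```python
# Hypothetical directory: delivery zone (the shard key) -> shard name.
DIRECTORY = {"north": "shard_1", "south": "shard_2", "east": "shard_1"}

# In-memory dictionaries standing in for the physical shards.
SHARDS = {"shard_1": {}, "shard_2": {}}


def lookup_shard(zone: str) -> str:
    """Step 1: ask the lookup service which shard holds this zone."""
    if zone not in DIRECTORY:
        raise KeyError(f"no shard registered for zone {zone!r}")
    return DIRECTORY[zone]


def save_order(order_id: int, zone: str, order: dict) -> str:
    """Step 2: write the order to the shard returned by the lookup."""
    shard = lookup_shard(zone)
    SHARDS[shard][order_id] = order
    return shard


def load_order(order_id: int, zone: str) -> dict:
    """Read an order back from whichever shard the directory points to."""
    return SHARDS[lookup_shard(zone)][order_id]


save_order(101, "north", {"item": "pizza"})
save_order(102, "south", {"item": "pasta"})
print(load_order(101, "north"))
```

In this sketch, reassigning a zone to another shard means copying its rows and changing a single entry in `DIRECTORY`; the routing code itself does not change.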
Advantages of Directory-Based Sharding:
- Flexible Data Distribution: Directory-based sharding allows for flexible data distribution, where the central directory can dynamically manage and update the mapping of data to shard locations.
- Efficient Query Routing: Queries can be efficiently routed to the appropriate shard using the information stored in the directory. This results in improved query performance.
- Dynamic Scalability: The system can dynamically scale by adding or removing shards without requiring changes to the application logic.
Disadvantages of Directory-Based Sharding:
- Centralized Point of Failure: The central directory represents a single point of failure. If the directory becomes unavailable or experiences issues, it can disrupt the entire system, impacting data access and query routing.
- Increased Latency: Query routing through a central directory introduces an additional layer, potentially leading to increased latency compared to other sharding strategies.
Ways to optimize database sharding for even data distribution
Here are some simple ways to optimize database sharding for even data distribution:
- Use Consistent Hashing: Consistent hashing assigns records to shards by hashing their key values onto a ring. This spreads data across all shards and limits how much data has to move when a shard is added or removed.
- Choose a Good Sharding Key: Picking a well-balanced sharding key is crucial. A key that doesn’t create hotspots ensures that data spreads out evenly across all servers.
- Range-Based Sharding with Caution: If using range-based sharding, make sure the ranges are properly defined so that one shard doesn’t get overloaded with more data than others.
- Regularly Monitor and Rebalance: Keep an eye on data distribution and rebalance shards when necessary to avoid uneven loads as data grows.
- Automate Sharding Logic: Implement automation tools or built-in database features that automatically distribute data and handle sharding to maintain balance across shards.
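The consistent hashing idea from the list above can be sketched as a hash ring with virtual nodes. This is a simplified illustration under stated assumptions: the `HashRing` class, the shard names, and the choice of 100 virtual nodes per shard are all hypothetical, and real systems add replication and failure handling on top.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the shard owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]


keys = [f"user-{i}" for i in range(10_000)]
before = HashRing(["shard-a", "shard-b", "shard-c"])
after = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
moved = sum(before.node_for(k) != after.node_for(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding a fourth shard")
```

Only keys that now belong to the new shard change owner; keys that stay on the original three shards are untouched. With plain modulo hashing, going from 3 to 4 shards would instead reassign most keys.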
Alternatives to database sharding
Below are some of the alternatives to database sharding:
- Vertical Scaling: Instead of splitting the database, you can upgrade your existing server by adding more CPU, memory, or storage to handle more load. However, this has limits as you can only scale a server so much.
- Replication: You can create copies of your database on multiple servers. This helps with load balancing and ensures availability, but can lead to synchronization issues between replicas.
- Partitioning: Instead of sharding across multiple servers, partitioning splits data within the same server. It divides data into smaller sections, improving query performance for large datasets.
- Caching: By storing frequently accessed data in a cache (like Redis or Memcached), you reduce the load on your main database, improving performance without needing to shard.
- CDNs: For read-heavy workloads, using a Content Delivery Network (CDN) can offload some of the data access from your primary database, reducing the need for sharding.
Advantages of Sharding in System Design
Sharding offers many advantages in system design such as:
- Enhances Performance: By distributing the load among several servers, each server can handle less work, which leads to quicker response times and better performance all around.
- Scalability: Sharding makes it easier to scale as your data grows. You can add more servers to manage the increased data load without affecting the system’s performance.
- Improved Resource Utilization: When data is dispersed, the load is spread across servers, reducing the possibility of overloading any one server.
- Fault Isolation: If one shard (or server) fails, it doesn’t take down the entire system, which helps in better fault isolation.
- Cost Efficiency: You can use smaller, cheaper servers instead of investing in a large, expensive one. As the system grows, sharding helps keep costs in control.
Disadvantages of Sharding in System Design
Sharding comes with some disadvantages in system design such as:
- Increased Complexity: Managing and maintaining multiple shards is more complex than working with a single database. It requires careful planning and management.
- Rebalancing Challenges: If data distribution becomes uneven, rebalancing shards (moving data between servers) can be difficult and time-consuming.
- Cross-Shard Queries: Queries that need data from multiple shards can be slower and more complicated to handle, affecting performance.
- Operational Overhead: With sharding, you’ll need more monitoring, backups, and maintenance, which increases operational overhead.
- Potential Data Loss: If a shard fails and isn’t properly backed up, there’s a higher risk of losing the data stored on that shard.
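The cross-shard query cost mentioned above can be illustrated with a scatter-gather sketch: a "top N orders" query has to ask every shard for its local top N and then merge the partial results. The data, the `ORDER_SHARDS` list, and the `top_orders` function are hypothetical.

```python
import heapq

# Hypothetical shards, each holding (order_total, order_id) rows.
ORDER_SHARDS = [
    [(50, "a"), (10, "b")],
    [(70, "c"), (20, "d")],
    [(60, "e")],
]


def top_orders(n: int) -> list:
    """Return the n largest orders across all shards."""
    # Scatter: every shard computes its own local top n.
    partials = [heapq.nlargest(n, shard) for shard in ORDER_SHARDS]
    # Gather: merge the partial results and keep the global top n.
    return heapq.nlargest(n, (row for part in partials for row in part))


print(top_orders(2))
```

A single-database version of this query touches one server; the sharded version touches all of them and needs an extra merge step, which is where the added latency and complexity come from.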
Conclusion
Sharding is a good solution when a single database can no longer handle or store your application's growing volume of data. Sharding helps to scale the database and improve the performance of the application. However, it also adds some complexity to your system. Each of the sharding methods described above comes with its own benefits and drawbacks.