How to Restrict Results to Top N Rows per Group in MySQL
Last Updated :
12 Mar, 2024
When working with MySQL databases, the need to limit results to the top N rows per group is a common requirement. This is particularly useful in scenarios where you aim to retrieve a specific number of records for each distinct group in your dataset. Achieving this involves leveraging the ROW_NUMBER() window function along with the PARTITION BY clause to effectively partition the data into groups.
How to Restrict Results to Top N Rows per Group in MySQL
When dealing with databases, limiting results to the top N rows within specific groups is crucial. We'll explore three approaches in MySQL to efficiently handle this scenario.
- Using Subquery with
ROW_NUMBER()
- Using Common Table Expression (CTE)
- Using
DENSE_RANK()
for Tie-Breaking
Syntax:
SELECT *
FROM (
SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table
) ranked
WHERE
row_num <= N;
Using ROW_NUMBER() and PARTITION BY
The ROW_NUMBER() function, combined with PARTITION BY, allows efficient grouping and filtering based on assigned row numbers.
SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table;
Example 1: Restricting Sales Data to the Top 2 Products per Category
Let's consider a scenario where you have a table named sales with columns product, category, and revenue. The goal is to retrieve the top 2 products based on revenue for each category.
-- Sample Data
CREATE TABLE sales (
product VARCHAR(50),
category VARCHAR(50),
revenue INT
);
INSERT INTO sales VALUES ('Laptop', 'Electronics', 1000);
INSERT INTO sales VALUES ('Smartphone', 'Electronics', 1200);
INSERT INTO sales VALUES ('Refrigerator', 'Appliances', 800);
INSERT INTO sales VALUES ('Washing Machine', 'Appliances', 900);
INSERT INTO sales VALUES ('Tablet', 'Electronics', 500);
INSERT INTO sales VALUES ('Coffee Maker', 'Appliances', 600);
-- Query to Retrieve Top 2 Products per Category
SELECT
product,
category,
revenue
FROM (
SELECT
product,
category,
revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS row_num
FROM
sales
) ranked
WHERE
row_num <= 2;
Output:
product
| category
| revenue
|
---|
Smartphone
| Electronics
| 1200
|
Laptop
| Electronics
| 1000
|
Washing Machine
| Appliances
| 900
|
Refrigerator
| Appliances
| 800
|
Explanation: The query retrieves the top 2 products per category from the 'sales' table based on descending revenue. The output displays columns 'product', 'category', and 'revenue'. It utilizes the ROW_NUMBER() window function to assign row numbers within each category, and filters rows where the row number is 1 or 2.
Example 2: Retrieving Top 3 Employees per Department Based on Salary
Consider a table named employees with columns employee_id, department, and salary. You want to retrieve the top 3 employees based on salary for each department.
-- Sample Data
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
department VARCHAR(50),
salary INT
);
INSERT INTO employees VALUES (1, 'HR', 50000);
INSERT INTO employees VALUES (2, 'IT', 60000);
INSERT INTO employees VALUES (3, 'HR', 55000);
INSERT INTO employees VALUES (4, 'IT', 65000);
INSERT INTO employees VALUES (5, 'Finance', 70000);
-- Query to Retrieve Top 3 Employees per Department
SELECT
employee_id,
department,
salary
FROM (
SELECT
employee_id,
department,
salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM
employees
) ranked
WHERE
row_num <= 3;
Output:
employee_id
| department
| salary
|
---|
2
| IT
|
60000
|
4
| IT
|
65000
|
5
| Finance
|
70000
|
1
| HR
|
50000
|
3
| HR
|
55000
|
Explanation: The output displays the top 3 employees with the highest salaries in each department. It includes their 'employee_id', 'department', and 'salary'. The result is obtained by ranking employees within each department based on salary in descending order using ROW_NUMBER().
Using Common Table Expression (CTE)
Utilizing a Common Table Expression enhances readability and provides a clear structure for the query.
Syntax:
WITH ranked_data AS (
SELECT
your_columns,
ROW_NUMBER() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS row_num
FROM
your_table
)
SELECT *
FROM ranked_data
WHERE row_num <= N;
Example
WITH ranked_data AS (
SELECT
product,
category,
revenue,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS row_num
FROM
sales
)
SELECT *
FROM ranked_data
WHERE row_num <= 2;
Output:
| product | category | revenue | row_num |
|------------------|-------------|---------|---------|
| Smartphone | Electronics | 1200 | 1 |
| Laptop | Electronics | 1000 | 2 |
| Washing Machine | Appliances | 900 | 1 |
| Refrigerator | Appliances | 800 | 2 |
| Coffee Maker | Appliances | 600 | 3 |
Using DENSE_RANK()
Use DENSE_RANK() instead of ROW_NUMBER() for scenarios where tied rows should share the same rank.
Syntax:
SELECT
your_columns,
DENSE_RANK() OVER (PARTITION BY your_grouping_column ORDER BY your_ordering_column DESC) AS rank
FROM
your_table;
Example:
SELECT
product,
category,
revenue,
DENSE_RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rank
FROM
sales;
Output:
| product | category | revenue | rank |
|------------------|-------------|---------|------|
| Smartphone | Electronics | 1200 | 1 |
| Laptop | Electronics | 1000 | 2 |
| Tablet | Electronics | 500 | 3 |
| Washing Machine | Appliances | 900 | 1 |
| Refrigerator | Appliances | 800 | 2 |
| Coffee Maker | Appliances | 600 | 3 |
The result shows the products, their respective categories, revenues, and the calculated ranks. Adjust the column names and table accordingly for your specific use case.
Conclusion
So, overall to Restricting results to the top N rows per group in MySQL is a powerful capability that can be achieved by using the ROW_NUMBER() window function along with PARTITION BY. This technique allows you to efficiently filter and retrieve specific records for each distinct group in your data, providing valuable insights into top-performing items or individuals within each category. Whether it's sales data, employee records, or any other dataset, the flexibility of this approach makes it a valuable tool for data analysis and reporting in MySQL.
Similar Reads
How to Restrict Results to top N Rows per Group in SQLite?
Assume a situation where the data to be retrieved is grouped by specific criteria and the rows are desired to be filtered so that data from the top N rows in each group can be obtained, SQLite databases will be the tools used. This can be particularly important where, for example, items need to be r
4 min read
How to Restrict Results to top N Rows per Group in PL/SQL?
In the world of database administration, retrieving top N rows from every grouping is a frequent but complicated one. Whether you are performing cross-tabulations with large datasets or looking for specific insights within the grouping of data, the ability to restrict the output to top N rows per gr
4 min read
How to Restrict Results to Top N Rows per Group in PostgreSQL?
In data analysis, understanding how to extract meaningful insights from grouped data is crucial. PostgreSQL, a powerful open-source relational database management system, provides robust features to handle such scenarios effectively. One common analytical task is limiting results to the top N rows w
4 min read
How to Filter Query Results in MySQL
MySQL, popular among relational database management systems for its performance, reliability, and ease of use, is a database that does its task very well. Regardless of whether you are an experienced web developer or just at the beginning of your data journey, knowing how to speed up filters in quer
5 min read
How to Update Top N Records in MySQL
MySQL is a free and open-source relational database management system written in C and C++ that is extremely popular among developers. Like other relational database management systems, MySQL provides a variety of rich features to create databases and tables, insert data in them, and further manipul
6 min read
How to SELECT Top N Rows For Each Group in SQL Server
SQL Serverâs ROW_Number() window function is often used to get the top number of rows in a group. In this article, weâll look at how to get the top N Row Counts in SQL Server using this function. Weâll provide step-by-step instructions, syntax examples, and examples to help you get the results you n
4 min read
How to Export Query Result in MySQL?
As a database administrator or developer, it is important to be able to store, manipulate, or analyze data outside of the database environment. Exporting query results from MySQL can be done in several ways, each with its advantages and applications. In this article, we will discuss two methods for
4 min read
PL/SQL Query to List the Last 25% Rows in a Result Set
We sometimes need to fetch some specific rows from our result set. There are many reasons to do so. When we are working on real-time analytics, focusing on recent data is then our need. This helps us in quicker decision-making as it provides us the quicker insight into the latest data. PL/SQL is a p
5 min read
How to SELECT Rows With MAX PARTITION By Another Column in MySQL
MySQL is a widely used relational database management system (RDBMS) that provides a robust and scalable platform for managing and organizing data. MySQL is an open-source software developed by Oracle Corporation, that provides features for creating, modifying, and querying databases. It utilizes St
6 min read
How To Limit The Number Of Rows Returned In SQLite
SQLite is a lightweight and self-contained relational database management system in short RDBMS. It supports standard SQL syntax. It is designed as a simple and easy-to-use database. It requires very less configurations which makes it very easy for developers to integrate it into any application. In
4 min read