Open In App

SQL | GROUP BY

Last Updated : 17 Apr, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

The SQL GROUP BY clause is a powerful tool used to organize data into groups based on shared values in one or more columns. It’s most often used with aggregate functions like SUM, COUNT, AVG, MIN, and MAX to perform summary operations on each group helping us extract meaningful insights from large datasets.

Whether we’re analyzing sales by region, users by age group, or orders by product category, the GROUP BY clause helps transform raw data into structured reports.

What is GROUP BY Clause in SQL?

The GROUP BY statement in SQL is used to arrange identical data into groups based on specified columns. If a particular column has the same values in multiple rows, the GROUP BY clause will group these rows together. It’s commonly used with aggregate functions to calculate totals or averages per group. Key Points About GROUP BY:

  • GROUP BY clause is used with the SELECT statement.
  • In the query, the GROUP BY clause is placed after the WHERE clause.
  • In the query, the GROUP BY clause is placed before the ORDER BY clause if used.
  • In the query, the GROUP BY clause is placed before the Having clause.
  • Place condition in the having clause.

Syntax:

SELECT column1, function_name(column2)

FROM table_name

GROUP BY column1, column2

Key Terms

  1. function_name: Name of the function used for example, SUM() , AVG().
  2. table_name: Name of the table.
  3. condition: Condition used.

Examples of GROUP BY in SQL

Let’s assume that we have two tables Employee and Student Sample Table is as follows after adding two tables we will do some specific operations to learn about GROUP BY. Insert some random data into a table and then we will perform some operations in GROUP BY.

Employee Table

CREATE TABLE emp (
emp_no INT PRIMARY KEY,
name VARCHAR(50),
sal DECIMAL(10,2),
age INT
);

INSERT INTO emp (emp_no, name, sal, age) VALUES
(1, 'Aarav', 50000.00, 25),
(2, 'Aditi', 60000.50, 30),
(3, 'Aarav', 75000.75, 35),
(4, 'Anjali', 45000.25, 28),
(5, 'Chetan', 80000.00, 32),
(6, 'Divya', 65000.00, 27),
(7, 'Gaurav', 55000.50, 29),
(8, 'Divya', 72000.75, 31),
(9, 'Gaurav', 48000.25, 26),
(10, 'Divya', 83000.00, 33);
SELECT * from emp;

Output:

Screenshot-2024-06-11-213404

Emp TABLE

Student Table

CREATE TABLE student (
name VARCHAR(50),
year INT,
subject VARCHAR(50)
);

INSERT INTO student (name, year, subject) VALUES
('Alice', 1, 'Mathematics'),
('Bob', 2, 'English'),
('Charlie', 3, 'Science'),
('David', 1, 'Mathematics'),
('Emily', 2, 'English'),
('Frank', 3, 'Science');

Output:

Screenshot-2024-06-11-214050

Student TABLE

Example 1 : Group By Single Column

Group By single column means, placing all the rows with the same value of only that particular column in one group. Consider the query for Calculating the Total Salary of each Employee by their name in the emp table.

Query:

SELECT name, SUM(sal) FROM emp 
GROUP BY name;

Output:

Screenshot-2024-06-11-213528

Output

Explanations:

As you can see in the above output, the rows with duplicate NAMEs are grouped under the same NAME and their corresponding SALARY is the sum of the SALARY of duplicate rows. The SUM() function of SQL is used here to calculate the sum. The NAMES that are added are Aarav, Divya and Gaurav.

Example 2 : Group By Multiple Columns

Group by multiple columns is say, for example, GROUP BY column1, column2. This means placing all the rows with the same values of columns column 1 and column 2 in one group. This SQL query groups student records by both SUBJECT and YEAR and then counts the number of records (i.e., students) in each of those groups.

Query:

SELECT SUBJECT, YEAR, Count(*)
FROM Student
GROUP BY SUBJECT, YEAR;

Output:

Screenshot-2024-06-11-214212

Output

Explantions:

As we can see in the above output the students with both the same SUBJECT and YEAR are placed in the same group. And those whose only SUBJECT is the same but not YEAR belong to different groups. So here we have grouped the table according to two columns or more than one column. The Grouped subject and years are (English,2) , (Mathematics,1) and (Science,3). The above mentioned all groups and years are repeated twice.

HAVING Clause in GROUP BY Clause

We know that the WHERE clause is used to place conditions on columns but what if we want to place conditions on groups? This is where the HAVING clause comes into use. We can use the HAVING clause to place conditions to decide which group will be part of the final result set. Also, we can not use aggregate functions like SUM(), COUNT(), etc. with the WHERE clause. So we have to use the HAVING clause if we want to use any of these functions in the conditions. 

Syntax:

SELECT column1, function_name(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2
HAVING condition
ORDER BY column1, column2;

Key Terms

  • function_name: Name of the function used for example, SUM() , AVG().
  • table_name: Name of the table.
  • condition: Condition used.

Example:

SELECT NAME, SUM(sal) FROM Emp
GROUP BY name
HAVING SUM(sal)>50000;

Output:

Screenshot-2024-06-11-214759

Output

Explanation:

In the result, only employees whose total salary (SUM(sal)) exceeds 50,000 are displayed. For example, if Anjali has a total salary less than 50,000, she will be excluded from the output.

Conclusion

The GROUP BY function in SQL organizes identical data into groups, enabling aggregate analysis on each group. It is commonly used with aggregate functions like SUM(), COUNT(), AVG(), etc., to summarize data efficiently. The HAVING clause further refines the results by applying conditions to these grouped records. GROUP BY can operate on single or multiple columns, making it a versatile tool for data retrieval and reporting.



Next Article
Article Tags :

Similar Reads