How to Solve Must Appear in the GROUP BY Clause in SQL
Last Updated :
20 Feb, 2024
SQL error “Must Appear in GROUP BY Clause” is one of the most common SQL errors encountered by database developers and analysts alike. This error occurs when we attempt to run queries that include grouping and aggregation without taking proper account of the structure of the SELECT statement.
In this article, we’ll look at the source of the SQL GROUP BY Clause error and three effective ways to resolve it to make sure your SQL queries perform optimally.
How to Resolve the "Must Appear in the GROUP BY Clause" Error?
The GROUP BY statement is used to group the rows based on the columns specified. It is often combined with aggregate functions such as COUNT, SUM, AVG, etc. However, when you select other columns in the SELECT statement that are not included in the GROUP BY statement, SQL must provide explicit guidance on how to deal with these non-grouped columns.
Syntax:
SELECT column1, column2, aggregate_function(column3)
FROM your_table
GROUP BY column1, column2;
In this article, we'll resolve this issue by understanding three methods. I'll also explain the concepts so that you get the points easily with the hands-on examples. The three methods I'll explain here are:
1. Add Non-Aggregated Columns to GROUP BY
2. Use Aggregate Functions
3. Use Subquery
We can create the company table using the following code which defines the table structure with columns such as ‘department,’ ‘salary,’ as Columns.
CREATE DATABASE company;
USE company;
CREATE TABLE employees (id INT PRIMARY KEY,
department VARCHAR(50),
salary DECIMAL(10, 2) NOT NULL);
We have successfully created our 'employees' table now. Let's insert values into our table. To insert values in our table, we'll simply use the 'INSERT INTO' clause and add the values.
INSERT INTO employees VALUES (1, 'HR', 50000.00);
INSERT INTO employees VALUES (2, 'IT', 60000.00);
INSERT INTO employees VALUES (3, 'Finance', 55000.00);
INSERT INTO employees VALUES (4, 'HR', 48000.00);
INSERT INTO employees VALUES (5, 'IT', 65000.00);
Adding Non-Aggregated Columns to GROUP BY
The SQL query calculates the average salary for each distinct department in the "employees" table using the GROUP BY clause. Here we'll add the non-aggregated columns to Group By clause.
SELECT department, AVG(salary)
FROM employees
GROUP BY department;
Output:
Non-Aggregated Columns to GROUP BY
Explanation: We first grouped the data with 'department' column and calculated the average salary for each department by using the 'AVG()' function. Meaning that the 'department' is in both the SELECT statement and the GROUP BY clause. And then I applied the AVG function to 'salary,' because of what it'll resolve the error.
Using Aggregate Functions
The provided SQL query retrieves the maximum salary for each department from the "employees" table. Here's an explanation of the query:
SELECT department, MAX(salary) as max_salary
FROM employees
GROUP BY department;
Output:
Aggregate Functions
Explanation: Here, I have used the 'MAX()' function which is an Aggregate Function, to find the maximum salary in each department. As you can see on the image output, we got the maximum salaries in three departments, i.e. HR, IT and Finance.
Using Subquery
The provided SQL query retrieves the departments and corresponding salaries of employees who have the highest salary in each department. Here's an explanation of the query:
SELECT department, salary
FROM employees
WHERE (department, salary) IN (
SELECT department, MAX(salary)
FROM employees
GROUP BY department
);
Output:
Subquery
Explanation: In this example, I have used a subquery to find the maximum salary for each department. After that, I have selected the corresponding rows.
Bonus Example: (Try Avoiding Errors in SQL)
SELECT department, salary FROM employees GROUP BY department;
aggregate function and GROUP BY ErrorExplanation: In this case, we are getting the error as 'salary' is not part of an aggregate function nor included in the GROUP BY clause. To understand it more clearly, look at the bottom of the image and you'll find the error with the reason.
But Make sure, before you use the queries that I have provided, you check whether you've already created the specific database and the table I've used here in these examples or not. If you don't know how to create the database and the table, you can use the following query before you try the examples.
Also, remember that you don't run the whole schema at once. Run each query one by one to get good results. This is the database I've created for these examples only. You can create different types of databases and different tables and try them by yourself. so in this case, when you create the 'Company' Database and the 'employee' table with the columns inside it, you will be able to find it on the left side of your queries (in the case of MySQL).
MySQL Database with the Table
And if you already have created your table then you can proceed with the examples I have given.
Conclusion
Understanding the GROUP BY clause and aggregate functions in SQL is an essential thing for writing error-free queries. By ensuring that all columns in the SELECT statement are appropriately handled in the GROUP BY clause, you can overcome the pitfalls of this common error message and streamline your database operations. Getting warnings while writing SQL queries is okay, but getting errors every time is a bad practice. So, try avoiding errors while writing SQL queries as much as you can.
Similar Reads
How to Solve Must Appear in the GROUP BY Clause in SQL Server
In SQL when we work with a table many times we want to use the window functions and sometimes SQL Server throws an error like "Column 'Employee. Department' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." This error means that while
4 min read
How to Fix the âmust appear in the GROUP BY clauseâ Error in PostgreSQL?
PostgreSQL is a powerful open-source relational database management system known for its robust features and reliability. However, like any database system, users may encounter errors during query execution. One common error is the "must appear in the GROUP BY clause" error. In this article, We will
3 min read
Group by clause in MS SQL Server
Group by clause will be discussed in detail in this article. There are tons of data present in the database system. Even though the data is arranged in form of a table in a proper order, the user at times wants the data in the query to be grouped for easier access. To arrange the data(columns) in fo
3 min read
How to Select the First Row of Each GROUP BY in SQL?
Selecting the first row of each group in SQL is a common requirement in database queries, especially when working with grouped data. SQL GROUP BY queries are highly effective for organizing data into logical groups, and additional techniques like window functions help to isolate the first row from e
5 min read
Python MySQL - GROUP BY and HAVING Clause
In this article, we will see how to perform groupby() and HAVING() operations on SQL using Python. Here we will consider a college database to perform group by operation on the department with respect to student strength. GROUP BY The GROUP BY statement groups rows that have the same values into sin
2 min read
How To Get Last Record In Each Group In MySQL?
In MySQL, we group the data using the GROUP BY clause. There can be a need to access these groups and get the latest record in each group, for example, tracking the last login timestamp for each user. Knowing how to retrieve the latest record from a group is essential, as it is used in many practica
4 min read
How to SELECT Top N Rows For Each Group in SQL Server
SQL Serverâs ROW_Number() window function is often used to get the top number of rows in a group. In this article, weâll look at how to get the top N Row Counts in SQL Server using this function. Weâll provide step-by-step instructions, syntax examples, and examples to help you get the results you n
4 min read
How to GROUP BY Month and Year in SQLite?
SQLite is a lightweight and server-less relational database management system. It requires very minimal configurations therefore it is easy to integrate into applications. It is also considered as an ideal choice for small mobile and desktop applications. In SQLite, GROUP BY month and year is consid
4 min read
PHP - Mysql GROUP BY HAVING Clause
Problem Statement :In this article, we are going to perform database operations with GROUP BY HAVING operation through PHP through xampp server. In this scenario, we are going to consider the food database. Requirements :xampp server Introduction :PHP is a server side scripting language that can com
3 min read
How to Find the Maximum of Multiple Columns in PL/SQL?
In PL/SQL finding the maximum value of multiple columns is a common requirement for maintaining the database. This operation is important for various applications, from financial analysis to data reporting. In this article, we will learn How to find the maximum of multiple columns in PL/SQL with the
5 min read