Benja's Notes
Then the database engineer links the product table to the color table with the Color ID:
Color ID   Color Name
Color 1    Blue
Color 2    Red
SQL statements
SQL statements, or SQL queries, are valid instructions that relational
database management systems understand. Software developers build SQL
statements by using different SQL language elements. SQL language
elements are components such as identifiers, variables, and search
conditions that form a correct SQL statement.
SQL SELECT Statement
The SELECT statement is used to select data from a database.
Example
Return data from the Customers table:
SELECT CustomerName, City FROM Customers;
Syntax
SELECT column1, column2, ... FROM table_name;
Here, column1, column2... are the field names of the table you want to select data
from. The table_name represents the name of the table you want to select data
from.
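As a minimal sketch of the syntax above, the snippet below runs the same kind of SELECT against an in-memory SQLite database through Python's sqlite3 module. The use of SQLite, the CustomerID column, and the two sample rows are assumptions made only so the query can run; only the table name and the CustomerName and City columns come from the notes.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    "CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, City TEXT)"
)
# Hypothetical rows; only the table and column names come from the notes.
conn.executemany(
    "INSERT INTO Customers (CustomerName, City) VALUES (?, ?)",
    [("Ana Example", "Lisbon"), ("Bo Sample", "Oslo")],
)

# SELECT column1, column2 FROM table_name;
selected = conn.execute("SELECT CustomerName, City FROM Customers").fetchall()

# SELECT * returns every column, including CustomerID.
everything = conn.execute("SELECT * FROM Customers").fetchall()

print(selected)
print(everything)
```

Naming the columns keeps the result limited to what is needed, while the asterisk returns every column of the table.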
Let’s create the table schemas and insert some sample data into them.
Create Sales table
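The sketch below is one possible reconstruction of the sample data, not the original script. The rows are inferred from the query outputs shown in the exercises that follow; the column types, the foreign key, and the choice of SQLite (run through Python's sqlite3 module) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Column types are assumptions; only the column names appear in the notes.
CREATE TABLE Products (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT,
    unit_price   REAL
);
CREATE TABLE Sales (
    sale_id       INTEGER PRIMARY KEY,
    product_id    INTEGER REFERENCES Products(product_id),
    quantity_sold INTEGER,
    sale_date     TEXT,   -- stored as ISO 'YYYY-MM-DD' text
    total_price   REAL
);
-- Rows inferred from the exercise outputs.
INSERT INTO Products VALUES
    (101, 'Laptop',     'Electronics', 500.00),
    (102, 'Smartphone', 'Electronics', 300.00),
    (103, 'Headphones', 'Electronics',  30.00),
    (104, 'Keyboard',   'Electronics',  20.00),
    (105, 'Mouse',      'Electronics',  15.00);
INSERT INTO Sales VALUES
    (1, 101, 5, '2024-01-01', 2500.00),
    (2, 102, 3, '2024-01-02',  900.00),
    (3, 103, 2, '2024-01-02',   60.00),
    (4, 104, 4, '2024-01-03',   80.00),
    (5, 105, 6, '2024-01-03',   90.00);
""")

# Sanity checks against totals quoted in the exercises.
total_revenue = conn.execute("SELECT SUM(total_price) FROM Sales").fetchone()[0]
avg_unit_price = conn.execute("SELECT AVG(unit_price) FROM Products").fetchone()[0]
total_quantity = conn.execute("SELECT SUM(quantity_sold) FROM Sales").fetchone()[0]
print(total_revenue, avg_unit_price, total_quantity)
```

With these rows the three aggregates agree with the totals reported in exercises 8, 9, and 10.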
4. Filter the Sales table to show only sales with a total_price greater than
$100.
Query:
SELECT * FROM Sales WHERE total_price > 100;
Output:
sale_id product_id quantity_sold sale_date total_price
1 101 5 2024-01-01 2500.00
2 102 3 2024-01-02 900.00
Explanation:
This SQL query selects all columns from the Sales table but only returns rows where
the total_price column is greater than 100. It filters out sales with a total_price less
than or equal to $100.
5. Filter the Products table to show only products in the ‘Electronics’
category.
Query:
SELECT * FROM Products WHERE category = 'Electronics';
Output:
product_id product_name category unit_price
101 Laptop Electronics 500.00
102 Smartphone Electronics 300.00
103 Headphones Electronics 30.00
104 Keyboard Electronics 20.00
105 Mouse Electronics 15.00
Explanation:
This SQL query selects all columns from the Products table but only returns rows
where the category column equals ‘Electronics’. It filters out products that do not
belong to the ‘Electronics’ category.
6. Retrieve the sale_id and total_price from the Sales table for sales made
on January 3, 2024.
Query:
SELECT sale_id, total_price
FROM Sales
WHERE sale_date = '2024-01-03';
Output:
sale_id total_price
4 80.00
5 90.00
Explanation:
This SQL query selects the sale_id and total_price columns from the Sales table but
only returns rows where the sale_date is equal to ‘2024-01-03’. It filters out sales
made on any other date.
7. Retrieve the product_id and product_name from the Products table for
products with a unit_price greater than $100.
Query:
SELECT product_id, product_name
FROM Products
WHERE unit_price > 100;
Output:
product_id product_name
101 Laptop
102 Smartphone
Explanation:
This SQL query selects the product_id and product_name columns from the Products
table but only returns rows where the unit_price is greater than $100. It filters out
products with a unit_price less than or equal to $100.
8. Calculate the total revenue generated from all sales in the Sales table.
Query:
SELECT SUM(total_price) AS total_revenue
FROM Sales;
Output:
total_revenue
3630.00
Explanation:
This SQL query calculates the total revenue generated from all sales by summing up
the total_price column in the Sales table using the SUM() function.
9. Calculate the average unit_price of products in the Products table.
Query:
SELECT AVG(unit_price) AS average_unit_price
FROM Products;
Output:
average_unit_price
173
Explanation:
This SQL query calculates the average unit_price of products by averaging the
values in the unit_price column in the Products table using the AVG() function.
10. Calculate the total quantity_sold from the Sales table.
Query:
SELECT SUM(quantity_sold) AS total_quantity_sold
FROM Sales;
Output:
total_quantity_sold
20
Explanation:
This SQL query calculates the total quantity_sold by summing up the quantity_sold
column in the Sales table using the SUM() function.
11. Retrieve the sale_id, product_id, and total_price from the Sales table
for sales with a quantity_sold greater than 4.
Query:
SELECT sale_id, product_id, total_price
FROM Sales
WHERE quantity_sold > 4;
Output:
sale_id product_id total_price
1 101 2500.00
5 105 90.00
Explanation:
This SQL query selects the sale_id, product_id, and total_price columns from the
Sales table but only returns rows where the quantity_sold is greater than 4.
12. Retrieve the product_name and unit_price from the Products table,
ordering the results by unit_price in descending order.
Query:
SELECT product_name, unit_price
FROM Products
ORDER BY unit_price DESC;
Output:
product_name unit_price
Laptop 500.00
Smartphone 300.00
Headphones 30.00
Keyboard 20.00
Mouse 15.00
Explanation:
This SQL query selects the product_name and unit_price columns from the Products
table and orders the results by unit_price in descending order using the ORDER BY
clause with the DESC keyword.
13. Retrieve the total_price of all sales, rounding the values to two
decimal places.
Query:
SELECT ROUND(SUM(total_price), 2) AS total_sales
FROM Sales;
Output:
total_sales
3630.00
Explanation:
This SQL query calculates the total sales revenue by summing up the total_price
column in the Sales table and rounds the result to two decimal places using the
ROUND() function.
14. Calculate the average total_price of sales in the Sales table.
Query:
SELECT AVG(total_price) AS average_total_price
FROM Sales;
Output:
average_total_price
726.000000
Explanation:
This SQL query calculates the average total_price of sales by averaging the values
in the total_price column in the Sales table using the AVG() function.
15. Retrieve the sale_id and sale_date from the Sales table, formatting the
sale_date as ‘YYYY-MM-DD’.
Query:
SELECT sale_id, DATE_FORMAT(sale_date, '%Y-%m-%d') AS formatted_date
FROM Sales;
Output:
sale_id formatted_date
1 2024-01-01
2 2024-01-02
3 2024-01-02
4 2024-01-03
5 2024-01-03
Explanation:
This SQL query selects the sale_id and sale_date columns from the Sales table and
formats the sale_date using the DATE_FORMAT() function to display it in
‘YYYY-MM-DD’ format.
16. Calculate the total revenue generated from sales of products in the
‘Electronics’ category.
Query:
SELECT SUM(Sales.total_price) AS total_revenue
FROM Sales
JOIN Products ON Sales.product_id = Products.product_id
WHERE Products.category = 'Electronics';
Output:
total_revenue
3630.00
Explanation:
This SQL query calculates the total revenue generated from sales of products in the
‘Electronics’ category by joining the Sales table with the Products table on the
product_id column and filtering sales for products in the ‘Electronics’ category.
17. Retrieve the product_name and unit_price from the Products table,
filtering the unit_price to show only values between $20 and $600.
Query:
SELECT product_name, unit_price
FROM Products
WHERE unit_price BETWEEN 20 AND 600;
Output:
product_name unit_price
Laptop 500.00
Smartphone 300.00
Headphones 30.00
Keyboard 20.00
Explanation:
This SQL query selects the product_name and unit_price columns from the Products
table but only returns rows where the unit_price falls within the range of $20 to
$600 using the BETWEEN operator. BETWEEN is inclusive at both ends, which is why
the Keyboard at exactly $20.00 is returned.
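To illustrate the inclusive bounds, the following sketch compares BETWEEN with the equivalent pair of comparisons. It assumes an in-memory SQLite database via Python's sqlite3 module and a cut-down Products table holding only the two columns the query needs; the prices are the ones shown in the outputs above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down Products table (assumption: only the columns this query needs).
conn.execute("CREATE TABLE Products (product_name TEXT, unit_price REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [
        ("Laptop", 500.00),
        ("Smartphone", 300.00),
        ("Headphones", 30.00),
        ("Keyboard", 20.00),
        ("Mouse", 15.00),
    ],
)

# BETWEEN low AND high includes both endpoints.
between_rows = conn.execute(
    "SELECT product_name FROM Products "
    "WHERE unit_price BETWEEN 20 AND 600 ORDER BY unit_price DESC"
).fetchall()

# The same filter written with explicit comparison operators.
explicit_rows = conn.execute(
    "SELECT product_name FROM Products "
    "WHERE unit_price >= 20 AND unit_price <= 600 ORDER BY unit_price DESC"
).fetchall()

print(between_rows)
print(between_rows == explicit_rows)
```

The Keyboard row sits exactly on the lower bound and is kept, while the Mouse at $15.00 falls outside the range.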
18. Retrieve the product_name and category from the Products table,
ordering the results by category in ascending order.
Query:
SELECT product_name, category
FROM Products
ORDER BY category ASC;
Output:
product_name category
Laptop Electronics
Smartphone Electronics
Headphones Electronics
Keyboard Electronics
Mouse Electronics
Explanation:
This SQL query selects the product_name and category columns from the Products
table and orders the results by category in ascending order using the ORDER BY
clause with the ASC keyword.
19. Calculate the total quantity_sold of products in the ‘Electronics’
category.
Query:
SELECT SUM(quantity_sold) AS total_quantity_sold
FROM Sales
JOIN Products ON Sales.product_id = Products.product_id
WHERE Products.category = 'Electronics';
Output:
total_quantity_sold
20
Explanation:
This SQL query calculates the total quantity_sold of products in the ‘Electronics’
category by joining the Sales table with the Products table on the product_id column
and filtering sales for products in the ‘Electronics’ category.
20. Retrieve the product_name and total_price from the Sales table,
calculating the total_price as quantity_sold multiplied by unit_price.
Query:
SELECT product_name, quantity_sold * unit_price AS total_price
FROM Sales
JOIN Products ON Sales.product_id = Products.product_id;
Output:
product_name total_price
Laptop 2500.00
Smartphone 900.00
Headphones 60.00
Keyboard 80.00
Mouse 90.00
Explanation:
This SQL query retrieves the product_name from the Products table and calculates the
total_price by multiplying quantity_sold (from Sales) by unit_price (from Products),
joining the Sales table with the Products table on the product_id column.
SQL Practice Exercises for Intermediate
These exercises are designed to challenge you beyond basic queries, delving into
more complex data manipulation and analysis. By tackling these problems, you’ll
solidify your understanding of advanced SQL concepts like joins, subqueries,
functions, and window functions, ultimately boosting your ability to work with
real-world data scenarios effectively.
1. Calculate the total revenue generated from sales for each product
category.
Query:
SELECT p.category, SUM(s.total_price) AS total_revenue
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.category;
Output:
category total_revenue
Electronics 3630.00
Explanation:
This query joins the Sales and Products tables on the product_id column, groups the
results by product category, and calculates the total revenue for each category by
summing up the total_price.
2. Find the product category with the highest average unit price.
Query:
SELECT category
FROM Products
GROUP BY category
ORDER BY AVG(unit_price) DESC
LIMIT 1;
Output:
category
Electronics
Explanation:
This query groups products by category, calculates the average unit price for each
category, orders the results by the average unit price in descending order, and
selects the top category with the highest average unit price using the LIMIT clause.
3. Identify products with total sales exceeding 30.
Query:
SELECT p.product_name
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.product_name
HAVING SUM(s.total_price) > 30;
Output:
product_name
Headphones
Keyboard
Laptop
Mouse
Smartphone
Explanation:
This query joins the Sales and Products tables on the product_id column, groups the
results by product name, calculates the total sales revenue for each product, and
selects products with total sales exceeding 30 using the HAVING clause.
4. Count the number of sales made in each month.
Query:
SELECT DATE_FORMAT(s.sale_date, '%Y-%m') AS month, COUNT(*) AS sales_count
FROM Sales s
GROUP BY month;
Output:
month sales_count
2024-01 5
Explanation:
This query formats the sale_date column to extract the month and year, groups the
results by month, and counts the number of sales made in each month.
5. Determine the average quantity sold for products with a unit price
greater than $100.
Query:
SELECT AVG(s.quantity_sold) AS average_quantity_sold
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
WHERE p.unit_price > 100;
Output:
average_quantity_sold
4.0000
Explanation:
This query joins the Sales and Products tables on the product_id column, filters
products with a unit price greater than $100, and calculates the average quantity
sold for those products.
6. Retrieve the product name and total sales revenue for each product.
Query:
SELECT p.product_name, SUM(s.total_price) AS total_revenue
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.product_name;
Output:
product_name total_revenue
Laptop 2500.00
Smartphone 900.00
Headphones 60.00
Keyboard 80.00
Mouse 90.00
Explanation:
This query joins the Sales and Products tables on the product_id column, groups the
results by product name, and calculates the total sales revenue for each product.
7. List all sales along with the corresponding product names.
Query:
SELECT s.sale_id, p.product_name
FROM Sales s
JOIN Products p ON s.product_id = p.product_id;
Output:
sale_id product_name
1 Laptop
2 Smartphone
3 Headphones
4 Keyboard
5 Mouse
Explanation:
This query joins the Sales and Products tables on the product_id column and
retrieves the sale_id and product_name for each sale.
8. Find the top three product categories by their percentage of total sales revenue.
Query:
SELECT p.category,
       SUM(s.total_price) AS category_revenue,
       (SUM(s.total_price) / (SELECT SUM(total_price) FROM Sales)) * 100 AS revenue_percentage
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.category
ORDER BY revenue_percentage DESC
LIMIT 3;
Output:
category category_revenue revenue_percentage
Electronics 3630.00 100.000000
Explanation:
This query will give you the top three product categories contributing to the highest
percentage of total revenue generated from sales. However, if you only have one
category (Electronics) as in the provided sample data, it will be the only result.
9. Rank products based on total sales revenue.
Query:
SELECT p.product_name, SUM(s.total_price) AS total_revenue,
RANK() OVER (ORDER BY SUM(s.total_price) DESC) AS revenue_rank
FROM Sales s
JOIN Products p ON s.product_id = p.product_id
GROUP BY p.product_name;
Output:
product_name total_revenue revenue_rank
Laptop 2500.00 1
Smartphone 900.00 2
Mouse 90.00 3
Keyboard 80.00 4
Headphones 60.00 5
Explanation:
This query joins the Sales and Products tables on the product_id column, groups the
results by product name, calculates the total sales revenue for each product, and
ranks products based on total sales revenue using the RANK() window function.
10. Calculate the running total revenue for each product category.
Query:
SELECT p.category, p.product_name, s.sale_date,
       SUM(s.total_price) OVER (PARTITION BY p.category ORDER BY s.sale_date) AS running_total_revenue
FROM Sales s
JOIN Products p ON s.product_id = p.product_id;
Output:
category product_name sale_date running_total_revenue
Electronics Laptop 2024-01-01 2500.00
Electronics Smartphone 2024-01-02 3460.00
Electronics Headphones 2024-01-02 3460.00
Electronics Keyboard 2024-01-03 3630.00
Electronics Mouse 2024-01-03 3630.00
Explanation:
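This query joins the Sales and Products tables on the product_id column, partitions
the results by product category, orders the results by sale date, and calculates the
running total revenue for each product category using the SUM() window function.
Rows that share a sale_date receive the same running total because, with the default
window frame, rows with equal ORDER BY values are treated as peers.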
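The effect of the window frame on a running total can be sketched as follows. This is an illustration under assumptions: it uses SQLite (version 3.25 or later, for window function support) through Python's sqlite3 module, a cut-down Sales table without the join or PARTITION BY, and an explicit ROWS frame with sale_id as a tie-breaker, none of which appear in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down Sales table (assumption: only the columns the window needs).
conn.execute(
    "CREATE TABLE Sales (sale_id INTEGER PRIMARY KEY, sale_date TEXT, total_price REAL)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", 2500.00),
        (2, "2024-01-02", 900.00),
        (3, "2024-01-02", 60.00),
        (4, "2024-01-03", 80.00),
        (5, "2024-01-03", 90.00),
    ],
)

running = conn.execute("""
    SELECT sale_id,
           -- Default frame: rows with the same sale_date are peers
           -- and are all included in the sum at once.
           SUM(total_price) OVER (ORDER BY sale_date) AS range_total,
           -- Explicit ROWS frame: one row at a time, ties broken by sale_id.
           SUM(total_price) OVER (
               ORDER BY sale_date, sale_id
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS rows_total
    FROM Sales
    ORDER BY sale_id
""").fetchall()

for row in running:
    print(row)
```

The range_total column matches the running totals in the output above, while rows_total grows with every individual sale.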
11. Categorize sales as “High”, “Medium”, or “Low” based on total price
(e.g., > $200 is High, $100-$200 is Medium, < $100 is Low).
Query:
SELECT sale_id,
CASE
WHEN total_price > 200 THEN 'High'
WHEN total_price BETWEEN 100 AND 200 THEN 'Medium'
ELSE 'Low'
END AS sales_category
FROM Sales;
Output:
sale_id sales_category
1 High
2 High
3 Low
4 Low
5 Low
Explanation:
This query categorizes sales based on total price using a CASE statement. Sales
with a total price greater than $200 are categorized as “High”, sales with a total
price between $100 and $200 are categorized as “Medium”, and sales with a total
price less than $100 are categorized as “Low”.
12. Identify sales where the quantity sold is greater than the average
quantity sold.
Query:
SELECT *
FROM Sales
WHERE quantity_sold > (SELECT AVG(quantity_sold) FROM Sales);
Output:
sale_id product_id quantity_sold sale_date total_price
1 101 5 2024-01-01 2500.00
5 105 6 2024-01-03 90.00
Explanation:
This query selects all sales where the quantity sold is greater than the average
quantity sold across all sales in the Sales table.
13. Extract the month and year from the sale date and count the number
of sales for each month.
Query:
SELECT CONCAT(YEAR(sale_date), '-', LPAD(MONTH(sale_date), 2, '0')) AS month,
COUNT(*) AS sales_count
FROM Sales
GROUP BY YEAR(sale_date), MONTH(sale_date);
Output:
month sales_count
2024-01 5
Explanation:
This query builds a ‘YYYY-MM’ label from the year and month of each sale_date (LPAD
pads the month to two digits), groups the rows by year and month, and counts the
number of sales in each group.
14. Calculate the number of days between the current date and the sale
date for each sale.
Query:
SELECT sale_id, DATEDIFF(NOW(), sale_date) AS days_since_sale
FROM Sales;
Output:
sale_id days_since_sale
1 185
2 184
3 184
4 183
5 183
Explanation:
This query calculates the number of days between the current date and the sale
date for each sale using the DATEDIFF function. Because it relies on NOW(), the
values shown depend on the day the query is run.
15. Identify sales made during weekdays versus weekends.
Query:
SELECT sale_id,
CASE
WHEN DAYOFWEEK(sale_date) IN (1, 7) THEN 'Weekend'
ELSE 'Weekday'
END AS day_type
FROM Sales;
Output:
sale_id day_type
1 Weekday
2 Weekday
3 Weekday
4 Weekday
5 Weekday
Explanation:
This query categorizes sales based on the day of the week using the DAYOFWEEK
function. Sales made on Sunday (1) or Saturday (7) are categorized as “Weekend”,
while sales made on other days are categorized as “Weekday”.
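A quick way to check which branch of the CASE each sample date takes is sketched below. DAYOFWEEK is a MySQL function, so this sketch swaps in SQLite's strftime('%w', ...), which returns '0' for Sunday through '6' for Saturday, and cross-checks the result with Python's datetime module. Sale 6 on 2024-01-06 is a hypothetical extra row added only to exercise the weekend branch.

```python
import sqlite3
from datetime import date

# Sale dates from the sample data; sale 6 is a hypothetical Saturday sale.
sale_dates = {
    1: "2024-01-01",
    2: "2024-01-02",
    3: "2024-01-02",
    4: "2024-01-03",
    5: "2024-01-03",
    6: "2024-01-06",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (sale_id INTEGER PRIMARY KEY, sale_date TEXT)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)", list(sale_dates.items()))

# SQLite substitute for DAYOFWEEK(): '%w' gives '0' (Sunday) .. '6' (Saturday).
day_types = dict(conn.execute("""
    SELECT sale_id,
           CASE
               WHEN strftime('%w', sale_date) IN ('0', '6') THEN 'Weekend'
               ELSE 'Weekday'
           END AS day_type
    FROM Sales
"""))

# Independent cross-check: date.weekday() is 5 for Saturday and 6 for Sunday.
python_check = {
    sale_id: "Weekend" if date.fromisoformat(d).weekday() >= 5 else "Weekday"
    for sale_id, d in sale_dates.items()
}

print(day_types)
print(day_types == python_check)
```

January 1 to 3, 2024 fall on Monday to Wednesday, so only the hypothetical sale on January 6 is classified as a weekend sale.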
Dataset
The dataset consists of two tables. The first one is shown below; you can create this table by
copying and running this query from GitHub.
id first_name last_name department salary
Like any table, it has a name: employees. Each table has columns which also have names. They
describe what data each column contains.
The columns and data in the above table are:
id – The unique ID of the employee and the table’s primary key.
first_name – The employee’s first name.
last_name – The employee’s last name.
department – The employee’s department.
salary – The employee’s monthly salary, in USD.
All this tells us that this table is a list of a company’s employees and their salaries. There is also
data on the employees’ departments. All employees work in the sales division, where the
department can be either Corporate or Private Individuals. In other words, the employees sell the
company’s products to companies and private individuals.
The other table in the dataset is named quarterly_sales. It is shown below, and the query for
creating it is.
employee_id q1_2022 q2_2022 q3_2022 q4_2022
Query
SELECT *
FROM employees;
Explanation
Whenever you want to select any number of columns from any table, you need to use the SELECT
statement. You write it, rather obviously, by using the SELECT keyword.
After the keyword comes an asterisk (*), which is shorthand for “all the columns in the table”.
To specify the table, use the FROM clause and write the table’s name afterward.
Output
The query’s output is the whole table employees, as shown below.
id first_name last_name department salary
Query
SELECT first_name
FROM employees;
Explanation
The approach is similar to the previous query. However, this time, instead of an asterisk, we
write the specific column name in SELECT. In this case, it’s the column first_name.
The second line of the query is the same: it references the table in the FROM clause.
Output
The query returns the list of employees’ first names.
first_name
Paul
Astrid
Matthias
Lucy
Tom
Claudia
Walter
Stephanie
Luca
Victoria
Query
SELECT
first_name,
last_name
FROM employees;
Explanation
Again, the approach is similar to earlier examples. To select two columns, you need to write their
names in SELECT. The important thing is that the columns need to be separated by a comma. You
can see in the example that there’s a comma between the columns first_name and last_name.
Then, as usual, reference the table employees in FROM.
Output
Now the query shows the employees’ full names.
first_name last_name
Paul Garrix
Astrid Fox
Matthias Johnson
Lucy Patterson
Tom Page
Claudia Conte
Walter Deer
Stephanie Marx
Luca Pavarotti
Victoria Pollock
4. Selecting Two (or More) Columns From One Table and Filtering Using
Numeric Comparison in WHERE
Knowing this SQL query will allow you to filter data according to numeric values. You can do
that using comparison operators in the WHERE clause.
Here’s the overview of the SQL comparison operators.
Comparison Operator Description
= Is equal to
Query
SELECT
first_name,
last_name,
salary
FROM employees
WHERE salary > 3800;
Explanation
The query actually selects three, not two columns. It’s the same as with two columns: simply
write them in SELECT and separate them with commas.
Then we reference the table in FROM.
Now, we need to show only employees with a salary above 3,800. To do this, you need to use
WHERE. It’s a clause that accepts conditions and is used for filtering the output. It goes through
the table and returns only the data that satisfies the condition.
In our case, we’re looking for salaries ‘greater than’ a certain number. In other words, a
condition using the > comparison operator.
To set the condition, we write the column name in WHERE. Then comes the comparison operator,
and after that, the value that the data has to be greater than. This condition will now return all the
salaries that are above 3,800.
Output
The query returns four employees and their salaries. As you can see, they all have salaries above
3,800.
first_name last_name salary
Query
SELECT
first_name,
last_name
FROM employees
WHERE first_name = 'Luca';
Explanation
The query selects employees’ first and last names.
However, we want to show only employees whose name is Luca. For this, we again use WHERE.
The approach is similar to the previous example: we use WHERE, write the column name, and
use the comparison operator. This time, our condition uses the equal sign (=).
In other words, the values in the column first_name have to be equal to Luca. Also, when the
value in the condition is text or a date/time rather than a number, it has to be written in
single quotes (''). That’s why our condition is written as 'Luca', not simply Luca.
Output
The output shows there’s only one employee named Luca, and his full name is Luca Pavarotti.
first_name last_name
Luca Pavarotti
6. Selecting Two Columns and Ordering by One Column
Here’s another basic SQL query example that you’ll find useful. It can be used whenever you
have to order the output in a certain way to make it more readable.
Ordering or sorting the output is done using the ORDER BY clause. By default, it orders the output
in ascending order. This works alphabetically (for text data), from the lowest to the highest
number (for numerical data), or from the oldest to the newest date or time (for dates and times).
Query
SELECT
first_name,
last_name
FROM employees
ORDER BY last_name;
Explanation
We again select employees’ first and last names. But now we want to sort the output in a specific
way. In this example, it’s by employees’ surname. To do that, we use ORDER BY. In it, we simply
write the column name.
We might add the keyword ASC after that to sort the output ascendingly. However, that’s not
mandatory, as ascending sorting is a default in SQL.
Output
The query returns a list of employees ordered alphabetically by their last names.
first_name last_name
Claudia Conte
Walter Deer
Astrid Fox
Paul Garrix
Matthias Johnson
Stephanie Marx
Tom Page
Lucy Patterson
Luca Pavarotti
Victoria Pollock
Query
SELECT
first_name,
last_name
FROM employees
ORDER BY last_name DESC;
Explanation
The query is almost exactly the same as in the previous example. The only difference is we’re
ordering the output by the employees’ last names in descending order.
To do that, put the keyword DESC after the last_name column in the ORDER BY clause.
Output
first_name last_name
Victoria Pollock
Luca Pavarotti
Lucy Patterson
Tom Page
Stephanie Marx
Matthias Johnson
Paul Garrix
Astrid Fox
Walter Deer
Claudia Conte
You can see that the output is ordered the way we wanted.
8. Selecting Two Columns From One Table and Ordering Descendingly by Two
Columns
Sorting an SQL query can get more sophisticated. It’s common to sort data by two or more
columns, which you’re probably already familiar with as an Excel or Google Sheets user. The
same can be done in SQL.
Query
SELECT
first_name,
last_name,
salary
FROM employees
ORDER BY salary DESC, last_name ASC;
Explanation
With this query, we’re building on the previous example; we want to sort the output by the
employee’s salary and their last name. This time, we sort by salary descending and then by last
name ascendingly.
We reference the column salary in ORDER BY and follow it with the keyword DESC. The DESC
keyword indicates descending order. Before the second ordering criterion, we need to put a
comma. After it comes the second criterion, which is the column last_name in this case. You
can add or omit the keyword ASC to sort the output in ascending order.
Note: The order of the columns in ORDER BY is important! The query written as it is above
will first sort by salary descendingly and then by the last name ascendingly. If you wrote ORDER
BY last_name ASC, salary DESC, it would sort by last name first and then by salary in
descending order.
Output
first_name last_name salary
The output is ordered by salary. When the salary is the same, the data is ordered
alphabetically by last name.
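To show why the order of the columns in ORDER BY matters, here is a small sketch that runs both orderings on the same rows. It assumes an in-memory SQLite database via Python's sqlite3 module; the last names appear in the notes, but the salary figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT, salary REAL)")
# Names come from the notes; the salaries are hypothetical.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Paul", "Garrix", 3000.0),
        ("Astrid", "Fox", 4000.0),
        ("Claudia", "Conte", 4000.0),
        ("Tom", "Page", 5000.0),
    ],
)

# Salary is the primary sort key; last_name only breaks ties.
salary_first = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM employees ORDER BY salary DESC, last_name ASC"
    )
]

# Last name is the primary sort key; salary would only break ties.
name_first = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM employees ORDER BY last_name ASC, salary DESC"
    )
]

print(salary_first)
print(name_first)
```

In the first result Conte and Fox share a salary and are therefore ordered alphabetically; in the second, the salary never comes into play because all last names are distinct.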
9. Selecting Two Columns With a Complex Logical Condition in WHERE
This example will again demonstrate how to filter output using WHERE. It will be a bit more
advanced this time, as we’ll use a logical operator. In SQL, logical operators allow you to test if
the filtering condition is true or not. They also allow you to set multiple conditions.
The three basic logical operators in SQL are AND, OR, and NOT. In the query below, we’ll use
OR to get salaries below 3,000 or above 5,000.
Query
SELECT
first_name,
last_name,
salary
FROM employees
WHERE salary > 5000 OR salary < 3000;
Explanation
We use this query to select the employee’s first name, last name, and salary from the table
employees.
However, we want to show only those employees whose salaries are either above $5,000 or
below $3,000. We do that by using the logical operator OR and the comparison operators in
WHERE.
We write the first condition in WHERE, where we reference the salary column and set the
condition that the values must be above 5,000. Then we use the OR operator, followed by the
second condition. The second condition again references the salary column and uses the ‘less
than’ operator to return the values below 3,000.
Output
first_name last_name salary
The query returns only three employees and their salaries, as they are the only ones that satisfy
the conditions.
10. Simple Computations on Columns
In this example, we’ll show how you can perform simple mathematical operations on the table’s
columns.
We’ll use one of SQL’s arithmetic operators.
Arithmetic Operator Description
+ Addition
- Subtraction
* Multiplication
/ Division
Query
SELECT
employee_id,
q1_2022 + q2_2022 AS h1_2022
FROM quarterly_sales;
Explanation
In the above query, we want to find the sales in the first half of 2022 for each employee.
We do it by first selecting the column employee_id from the table quarterly_sales.
Then we select the column q1_2022 and use the addition arithmetic operator to add the q2_2022
column. We also give this new calculated column an alias of h1_2022 using the AS keyword.
Output
employee_id h1_2022
8 18,260.66
4 18,264.04
10 2,817.18
1 17,181.20
3 37,558.82
2 10,092.45
7 33,695.03
6 11,240.08
5 13,905.29
9 8,586.86
The output shows all the employees’ IDs and their respective sales in the first half of 2022.
11. Using SUM() and GROUP BY
This query uses the aggregate function SUM() with GROUP BY. In SQL, aggregate functions
work on groups of data; for example, SUM(sales) shows the total of all the values in the sales
column. It’s useful to know about this function when you want to put data into groups and show
the total for each group.
Query
SELECT
department,
SUM(salary) AS total_salaries
FROM employees
GROUP BY department;
Explanation
The purpose of the above query is to find the total salary amount for each department. This is
achieved in the following way.
First, select the column department from the table employees. Then, use the SUM() function. As
we want to add the salary values, we specify the column salary in the function. Also, we give this
calculated column the alias total_salaries.
Finally, the output is grouped by the column department.
Note: Any non-aggregated column appearing in SELECT must also appear in GROUP BY.
But this is logical – the whole purpose is to group data by department, so of course we’ll put it in
GROUP BY.
Output
department total_salaries
Corporate 21,919.82
The output shows all the departments and the sum of total monthly salary costs by department.
12. Using COUNT() and GROUP BY
Here’s another basic SQL query that uses an aggregate function. This time, it’s COUNT(). You
can use it if you want to group data and show the number of occurrences in each group.
Query
SELECT
department,
COUNT(*) AS employees_by_department
FROM employees
GROUP BY department;
Explanation
We want to show the number of employees by department.
Select the department from the table employees. Then, use the COUNT() aggregate function. In
this case, we use the COUNT(*) version, which counts all the rows. We give the column the alias
employees_by_department.
As a final step, we group the output by the department.
Note: COUNT(*) counts all the rows, including those with the NULL values. If you don’t want
to include the possible NULL values in your output, use the COUNT(column_name) version of the
function. We can use COUNT(*) here because we know no NULL values are in the table.
Output
department employees_by_department
Corporate 5
Private Individuals 5
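The difference between COUNT(*) and COUNT(column_name) only shows up when NULL values are present, so the sketch below uses a hypothetical employees table that contains one missing salary (unlike the table in these notes, which has none). It assumes an in-memory SQLite database via Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, department TEXT, salary REAL)"
)
# Hypothetical rows; the second employee has no recorded salary (NULL).
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "Corporate", 4000.0),
        (2, "Corporate", None),
        (3, "Private Individuals", 3500.0),
    ],
)

# COUNT(*) counts rows; COUNT(salary) skips rows where salary IS NULL.
counts = conn.execute("""
    SELECT department,
           COUNT(*)      AS all_rows,
           COUNT(salary) AS rows_with_salary
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

for row in counts:
    print(row)
```

For the Corporate department the two counts differ by one, which is the row with the NULL salary.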
Query
SELECT
department,
AVG(salary) AS average_salary
FROM employees
GROUP BY department;
Explanation
The query is the same as the last one, only this time we use the AVG() function, as we want to
calculate the average salary by department.
We select the department, use AVG() with the salary column, and group the output by
department.
Output
department average_salary
Corporate 4,383.96
Query
SELECT
department,
MIN(salary) AS minimum_salary
FROM employees
GROUP BY department;
Explanation
Again, we use the same query and change only the aggregate function.
The query calculates the minimum salary by department.
Output
department minimum_salary
Corporate 2,894.51
The output shows the departments and the lowest salary in each department.
15. Using MAX() and GROUP BY
This example shows how to use the MAX() aggregate function to show the highest value within
each group.
Query
SELECT
department,
MAX(salary) AS maximum_salary
FROM employees
GROUP BY department;
Explanation
We use the query to show the highest salary in each department, together with the department’s
name.
You already know how this works. The query is the same as in the previous example, but now it
uses the MAX() function.
Output
department maximum_salary
Corporate 5,974.41
The output shows us the highest salaries in the Corporate and Private Individuals department.
16. Using SUM(), WHERE, and GROUP BY
This one might seem more complicated, but it’s still a basic SQL query. It is used when you want
to show the total values for each group but you want to include only specific rows in the sum.
Query
SELECT
department,
SUM(salary) AS total_salary
FROM employees
WHERE salary > 3500
GROUP BY department;
Explanation
The query will show the total salary by department, but it will include only individual salaries
above $3,500 in the sum. Here’s how it works.
First, of course, select the departments and use SUM() with the salary column from the table
employees. You learned that already.
Then use the WHERE clause to specify the values you want included in the sum. In this case, it’s
where the column salary is higher than 3,500. In other words, the query will now sum only
values above 3,500.
Finally, group by department.
Output
department total_salary
Corporate 19,025.31
These totals now include only salaries above $3,500. Compare this to the output from the
eleventh example (shown below; mind the different sorting), and you’ll see that the totals are
lower. It’s logical, as the below output also includes salaries equal to or less than $3,500.
department total_salaries
Corporate 21,919.82
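The effect of WHERE on the totals can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module and invented rows (the "Support" department is hypothetical and not part of the article's dataset), and runs the same query with and without the WHERE clause.

```python
import sqlite3

# Invented sample rows; "Support" is a hypothetical department added for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Corporate", 3000.0),
        ("Corporate", 4000.0),
        ("Corporate", 5000.0),
        ("Private Individuals", 2000.0),
        ("Private Individuals", 3600.0),
        ("Support", 3000.0),
    ],
)

# WHERE removes rows before GROUP BY forms the groups.
filtered = conn.execute(
    """
    SELECT department, SUM(salary) AS total_salary
    FROM employees
    WHERE salary > 3500
    GROUP BY department
    ORDER BY department
    """
).fetchall()

# The same query without WHERE sums every salary.
unfiltered = conn.execute(
    """
    SELECT department, SUM(salary) AS total_salary
    FROM employees
    GROUP BY department
    ORDER BY department
    """
).fetchall()

print(filtered)
print(unfiltered)
```

In this sketch the hypothetical Support department disappears from the filtered result entirely, because none of its rows pass the WHERE condition and so no group is formed for it.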
17. Using COUNT(), WHERE, and GROUP BY
Query
SELECT
department,
COUNT(*) AS number_of_employees
FROM employees
WHERE salary > 3500
GROUP BY department;
Explanation
This is similar to the previous query, only it uses the COUNT() aggregate function. Its goal is to
show the department name and the number of employees in that department, but it counts only
the employees with a salary above $3,500.
To achieve that, first select the department. Then use COUNT(*) to count all the rows within each
department. Each row equals one employee. We can use COUNT(*) here because we know the data
contains no NULLs; COUNT(*) counts every row, while COUNT(column) would skip rows where that
column is NULL.
Now, use WHERE to include only employees with salaries higher than $3,500 in the counting.
In the end, you only need to group data by department.
Output
department number_of_employees
Private Individuals 3
Corporate 4
The output shows there are three employees in the Private Individuals department paid above
$3,500 and there are four such employees in the Corporate department.
Some employees are obviously missing, as they should be. We learned in one of the previous
examples that there are five employees in each department.
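The remark about NULLs can be illustrated with a short sketch, again using Python's built-in sqlite3 module and invented rows. One hypothetical Corporate employee has no salary recorded, which is an assumption made only to show how COUNT(*) and COUNT(salary) differ.

```python
import sqlite3

# Invented rows; the NULL salary is a deliberate assumption to show the difference.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        ("Corporate", 4000.0),
        ("Corporate", 3000.0),
        ("Corporate", None),  # stored as NULL
        ("Private Individuals", 3600.0),
        ("Private Individuals", 5000.0),
    ],
)

# COUNT(*) counts rows; COUNT(salary) counts only rows where salary is not NULL.
counts = conn.execute(
    """
    SELECT department, COUNT(*) AS all_rows, COUNT(salary) AS non_null_salaries
    FROM employees
    GROUP BY department
    ORDER BY department
    """
).fetchall()

# With the WHERE filter, a NULL salary never satisfies salary > 3500,
# so that row is excluded before counting.
above_3500 = conn.execute(
    """
    SELECT department, COUNT(*) AS number_of_employees
    FROM employees
    WHERE salary > 3500
    GROUP BY department
    ORDER BY department
    """
).fetchall()

print(counts)
print(above_3500)
```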
18. Accessing Data in Two Tables Using INNER JOIN
This type of query is used whenever you want to access data from two or more tables. We’ll
show you INNER JOIN, but it’s not the only join type you can use.
Here’s a short overview of join types in SQL. These are the full join names. What’s shown in the
brackets can be omitted in the query and the join will work without it.
SQL Join Type         Description
(INNER) JOIN          Returns only the rows with matching values in both tables.
LEFT (OUTER) JOIN     Returns all the values from the left table and only the matching values from the right table.
RIGHT (OUTER) JOIN    Returns all the values from the right table and only the matching values from the left table.
FULL (OUTER) JOIN     Returns all the rows from both tables.
CROSS JOIN            Returns all combinations of all rows from the first and second table, i.e. the Cartesian product.
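To make the differences between join types concrete, here is a minimal sketch using Python's built-in sqlite3 module. The names and sales figures are invented, and the tables are cut down to two columns each. RIGHT and FULL joins are left out on purpose, because older SQLite versions do not support them.

```python
import sqlite3

# Invented data: three employees, but sales rows for only two of them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("CREATE TABLE quarterly_sales (employee_id INTEGER, q4_2022 REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, "Ana"), (2, "Ben"), (3, "Cy")])
conn.executemany("INSERT INTO quarterly_sales VALUES (?, ?)", [(1, 100.0), (2, 200.0)])

# INNER JOIN: only employees that have a matching sales row.
inner_rows = conn.execute(
    """
    SELECT e.first_name, qs.q4_2022
    FROM employees e
    JOIN quarterly_sales qs ON e.id = qs.employee_id
    ORDER BY e.id
    """
).fetchall()

# LEFT JOIN: every employee; missing sales show up as NULL (None in Python).
left_rows = conn.execute(
    """
    SELECT e.first_name, qs.q4_2022
    FROM employees e
    LEFT JOIN quarterly_sales qs ON e.id = qs.employee_id
    ORDER BY e.id
    """
).fetchall()

# CROSS JOIN: every employee paired with every sales row (3 x 2 combinations).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN quarterly_sales"
).fetchone()[0]

print(inner_rows)
print(left_rows)
print(cross_count)
```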
Query
SELECT
e.id,
e.first_name,
e.last_name,
qs.q1_2022 + qs.q2_2022 + qs.q3_2022 + qs.q4_2022 AS total_sales_2022
FROM employees e
JOIN quarterly_sales qs
ON e.id = qs.employee_id;
Explanation
This query shows each employee’s ID and name, together with their total sales in 2022.
For that, it uses JOIN, as the required data is spread across both tables of our dataset.
Let’s start explaining the query with the FROM clause. This is familiar: to use the data from the
table employees, you need to reference it in FROM. We also give this table an alias (‘e’), so that
we don’t have to write the table’s full name later on.
After that, we use the JOIN keyword to join the second table. We do that by referencing the table
quarterly_sales in JOIN and giving it the alias ‘qs’.
Now comes the ON condition. It is used to specify the columns on which the two tables will be
joined. Usually, those are the columns that store the same data in both tables. In other words, we
join the tables on the primary and foreign keys. A primary key is a column (or columns) that
uniquely defines each row in the table. A foreign key is a column in the second table that refers
to the first table. In our example, the column id from the table employees is its primary key. The
column employee_id from the table quarterly_sales is the foreign key, as it contains the
value of the column id from the first table.
So we’ll use these columns in ON, but we also need to specify which table each column is from.
Remember, we gave our tables aliases. This comes in handy here, as we don’t need to write
the tables’ full names, only their short aliases. We write the first table’s alias, a dot, and
then the column name (e.id). Then comes the equal sign, followed by the second table’s alias,
a dot, and its column name (qs.employee_id).
Now that we have two tables joined, we are free to select any column from both tables. We
select id, first_name, and last_name from employees. Then we add together the quarterly sales
columns from the table quarterly_sales and name the result total_sales_2022. Each column
in SELECT also has the table alias before it, with the alias and the column name separated by a
dot.
Note: When joining tables, using the table names in front of the column names in SELECT is
advisable. This will make it easier to determine which column comes from which table. Also,
the tables can have columns of the same name. However, table names can become wordy, so
giving them aliases in JOIN is also advisable. That way, you can use much shorter aliases
(instead of the full table names) in front of the column names.
Output
id first_name last_name total_sales_2022
The output lists each employee and shows their total sales in 2022.
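The primary key and foreign key relationship described above can be sketched as follows, using Python's built-in sqlite3 module. The rows are invented, and the column names q1_2022 and q2_2022 are assumptions that follow the naming pattern of q3_2022 and q4_2022. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, which the sketch sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# employees.id is the primary key; quarterly_sales.employee_id refers to it.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute(
    """
    CREATE TABLE quarterly_sales (
        employee_id INTEGER REFERENCES employees(id),
        q1_2022 REAL, q2_2022 REAL, q3_2022 REAL, q4_2022 REAL
    )
    """
)

# Invented rows, for illustration only.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [(1, "Ana", "Lee"), (2, "Ben", "Kim")])
conn.executemany(
    "INSERT INTO quarterly_sales VALUES (?, ?, ?, ?, ?)",
    [(1, 10.0, 20.0, 30.0, 40.0), (2, 5.0, 5.0, 5.0, 5.0)],
)

# A sales row pointing at a non-existent employee violates the foreign key.
try:
    conn.execute("INSERT INTO quarterly_sales VALUES (99, 1.0, 1.0, 1.0, 1.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# Join on primary key = foreign key and add up the four quarters.
rows = conn.execute(
    """
    SELECT e.id, e.first_name, e.last_name,
           qs.q1_2022 + qs.q2_2022 + qs.q3_2022 + qs.q4_2022 AS total_sales_2022
    FROM employees e
    JOIN quarterly_sales qs ON e.id = qs.employee_id
    ORDER BY e.id
    """
).fetchall()

print(fk_rejected)
print(rows)
```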
19. Accessing Data in Two Tables Using INNER JOIN and Filtering Using
WHERE
Of course, you can filter data in joined tables the same way as you can with only one table.
You’ll again need the WHERE clause.
Query
SELECT
e.id,
e.first_name,
e.last_name,
qs.q4_2022 - qs.q3_2022 AS sales_change
FROM employees e
JOIN quarterly_sales qs
ON e.id = qs.employee_id
WHERE qs.q4_2022 - qs.q3_2022 < 0;
Explanation
We tweaked the previous query to show the decrease in sales between the third and the fourth
quarter.
Here’s how we did it. Just as we did earlier, we selected the employee’s ID and name.
We subtracted one quarter from another to calculate the change between the quarters. In this
case, it’s the column with the fourth quarter sales minus the third quarter sales. This new column
is named sales_change.
The tables are joined exactly the same way as in the previous example.
To show only the sales decrease, we use the WHERE clause. In it, we again subtract the third
quarter from the fourth and set the condition that the result has to be below zero, i.e. a decrease.
Note that WHERE is written after the JOIN and its ON condition.
Output
id first_name last_name sales_change
The output shows all the employees who had a sales decrease in the last quarter and the amount
of that decrease.
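As a runnable illustration of filtering joined tables, the sketch below uses Python's built-in sqlite3 module with invented names and figures. Only the q3_2022 and q4_2022 columns are created, since they are the only ones this query needs.

```python
import sqlite3

# Invented data: Ana's sales rise from Q3 to Q4, Ben's fall.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute(
    "CREATE TABLE quarterly_sales (employee_id INTEGER, q3_2022 REAL, q4_2022 REAL)"
)
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [(1, "Ana", "Lee"), (2, "Ben", "Kim")])
conn.executemany(
    "INSERT INTO quarterly_sales VALUES (?, ?, ?)",
    [(1, 30.0, 40.0), (2, 50.0, 35.0)],
)

# The join produces one row per employee; WHERE then keeps only the decreases.
rows = conn.execute(
    """
    SELECT e.id, e.first_name, e.last_name,
           qs.q4_2022 - qs.q3_2022 AS sales_change
    FROM employees e
    JOIN quarterly_sales qs ON e.id = qs.employee_id
    WHERE qs.q4_2022 - qs.q3_2022 < 0
    ORDER BY e.id
    """
).fetchall()

print(rows)
```

Changing the condition to > 0 in the same sketch would list the employees whose sales increased instead.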