The Ultimate Guide of SQL
DDL is used to define and manage database structures like tables and schemas.
These clauses and operators are used to filter and organize query results.
5. Aggregate Functions
● GROUP BY: Group rows that have the same values into summary rows.
● HAVING: Filter groups based on a condition.
These clauses are used to group data and filter aggregated results.
7. SQL Joins
Joins are used to combine rows from two or more tables based on related columns.
● CTE: Temporary result set defined within the execution scope of a SELECT,
INSERT, UPDATE, or DELETE statement.
● SUBQUERIES: A query nested inside another query.
Subqueries and CTEs are used to simplify complex queries and improve readability.
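As a small illustration of the two forms, the sketch below answers the same question ("which employees earn more than the average?") once with a subquery and once with a CTE. It runs through Python's sqlite3 module; the emp rows and the avg_pay name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE emp (id INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [(1, 30000), (2, 50000), (3, 70000), (4, 90000)])

# Subquery: the inner SELECT computes the average salary inline.
subquery_sql = """
    SELECT id FROM emp
    WHERE salary > (SELECT AVG(salary) FROM emp)
    ORDER BY id
"""

# CTE: the same average is named once up front, then joined in.
cte_sql = """
    WITH avg_pay AS (SELECT AVG(salary) AS avg_salary FROM emp)
    SELECT e.id
    FROM emp AS e
    CROSS JOIN avg_pay AS a
    WHERE e.salary > a.avg_salary
    ORDER BY e.id
"""

subquery_ids = [row[0] for row in conn.execute(subquery_sql)]
cte_ids = [row[0] for row in conn.execute(cte_sql)]
print(subquery_ids, cte_ids)
```

Both forms return the same ids; the CTE mainly helps when the intermediate result is reused or the query grows long.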
9. Set Operators
● UNION: Combine the result sets of two or more SELECT statements (without
duplicates).
● UNION ALL: Combine the result sets of two or more SELECT statements (with
duplicates).
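The difference is easiest to see on a tiny data set. A minimal sketch using Python's sqlite3 module, with two hypothetical single-column tables a and b:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: a holds 1, 2, 2 and b holds 2, 3.
conn.execute("CREATE TABLE a (n INTEGER)")
conn.execute("CREATE TABLE b (n INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT n FROM a UNION SELECT n FROM b ORDER BY n"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT n FROM a UNION ALL SELECT n FROM b ORDER BY n"
).fetchall()

print(union_rows)      # three distinct values
print(union_all_rows)  # all five input rows
```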
Existential queries check for the existence of data, while CASE WHEN is used for
conditional logic within queries.
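A short sketch of both ideas, run through Python's sqlite3 module. The customers and orders tables, their columns, and the 100 threshold are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and data, for illustration only.
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ben"), (3, "Chen")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 500), (11, 1, 50), (12, 3, 80)])

# EXISTS: keep a customer only if at least one matching order exists.
with_orders = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM customers AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.customer_id)
    ORDER BY c.customer_id
""")]

# CASE WHEN: derive a label from a condition on each row.
labels = conn.execute("""
    SELECT order_id,
           CASE WHEN amount >= 100 THEN 'large' ELSE 'small' END AS size
    FROM orders
    ORDER BY order_id
""").fetchall()

print(with_orders)
print(labels)
```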
Window functions allow for calculations across a set of rows related to the current row,
without collapsing data into groups.
By using the OVER() clause, aggregate functions become window functions, allowing for
aggregate calculations over a defined set of rows (the window) without collapsing the
data like a traditional GROUP BY would.
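To make the contrast concrete, the sketch below computes a department total twice: with GROUP BY (one row per department) and with SUM(...) OVER (every original row kept). It uses Python's sqlite3 module, which needs SQLite 3.25 or later for window functions; the emp table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE emp (id INTEGER, department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(1, "IT", 100), (2, "IT", 300), (3, "HR", 200)])

# GROUP BY collapses the rows: one output row per department.
grouped = conn.execute("""
    SELECT department, SUM(salary)
    FROM emp
    GROUP BY department
    ORDER BY department
""").fetchall()

# OVER (PARTITION BY ...) keeps every row and attaches the total to each.
windowed = conn.execute("""
    SELECT id, department, salary,
           SUM(salary) OVER (PARTITION BY department) AS dept_total
    FROM emp
    ORDER BY id
""").fetchall()

print(grouped)
print(windowed)
```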
These functions allow for the manipulation and comparison of date and time values in
queries.
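Date function names differ between databases. The sketch below uses SQLite's strftime, date and julianday through Python's sqlite3 module; MySQL offers YEAR(), DATE_ADD() and DATEDIFF() for the same tasks. The literal dates are arbitrary examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

row = conn.execute("""
    SELECT strftime('%Y', '2024-03-15')        AS year_part,       -- extract the year
           date('2024-03-15', '+7 days')       AS one_week_later,  -- date arithmetic
           CAST(julianday('2024-03-15')
                - julianday('2024-03-01') AS INTEGER) AS days_between,
           '2024-03-15' > '2024-03-01'         AS is_later         -- ISO dates compare as text
""").fetchone()

print(row)
```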
RESOURCES:
Websites:
1. https://www.w3schools.com/sql/
2. https://sqlbolt.com/
YouTube Playlists:
This playlist contains a complete SQL video tutorial, covering all the required
topics, in English:
https://youtube.com/playlist?list=PLavw5C92dz9Ef4E-1Zi9KfCTXS_IN8gXZ&si=XCwpStf9zZ0YISN8
If you want to learn in Hindi, you can follow this playlist:
https://youtube.com/playlist?list=PLdOKnrf8EcP17p05q13WXbHO5Z_JfXNpw&si=8m4E9IGf-2MR9ZKA
Note - These are some top SQL interview Q&A playlists that I also use to prepare
before any SQL interview:
https://youtube.com/playlist?list=PLavw5C92dz9Hxz0YhttDniNgKejQlPoAn&si=NgKECJfJ8gYCMzxS
https://youtube.com/playlist?list=PLBTZqjSKn0IfuIqbMIqzS-waofsPHMS0E&si=kurTh9-krlyBTZSc
https://youtube.com/playlist?list=PLBTZqjSKn0IeKBQDjLmzisazhqQy4iGkb&si=HFvZN7s3pPAQlpYL
These websites are perfect for beginners who want to start with the basics and
build a solid foundation:
Medium
These platforms are great for those with some SQL knowledge who want to
practice real-world problems and prepare for interviews:
Expert
For advanced practice, you can solve the hard-difficulty questions available on the
medium-level websites mentioned above, such as:
Overview
For any data-related job role, SQL is a key skill, and your interview will mostly
revolve around it. In any SQL round, you can face two types of questions:
Query-Based Questions
Question 1: Write a SQL query to find the second highest salary from the table
emp.
● Table: emp
● Columns: id, salary
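Answer:
-- RANK is a reserved word in MySQL 8+, so the alias is salary_rank.
-- DISTINCT returns one row even if several employees share the salary.
WITH RankedSalaries AS (
SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM emp
)
SELECT DISTINCT salary AS SecondHighestSalary
FROM RankedSalaries
WHERE salary_rank = 2;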
Question 2: Write a SQL query to find the numbers which consecutively occur 3
times.
● Table: table_name
● Columns: id, numbers
Answer:
-- DISTINCT avoids repeating a number that occurs more than 3 times in a row.
SELECT DISTINCT numbers
FROM (
SELECT numbers,
LEAD(numbers, 1) OVER (ORDER BY id) AS next_num,
LEAD(numbers, 2) OVER (ORDER BY id) AS next_next_num
FROM table_name
) t
WHERE numbers = next_num AND numbers = next_next_num;
Question 3: Write a SQL query to find the days when the temperature was higher
than on the previous day.
● Table: table_name
● Columns: Days, Temp
WITH TempWithLag AS (
SELECT Days, Temp, LAG(Temp) OVER (ORDER BY Days) AS prev_temp
FROM table_name
)
SELECT Days
FROM TempWithLag
WHERE Temp > prev_temp;
● Table: table_name
● Columns: column1, column2, ..., columnN
Answer:
Question 5: Write a SQL query for the cumulative sum of salary of each employee
from January to July.
● Table: table_name
● Columns: Emp_id, Month, Salary
Answer:
SELECT Emp_id, Month, SUM(Salary) OVER (
PARTITION BY Emp_id ORDER BY Month ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW
) AS CumulativeSalary
FROM table_name;
Question 6: Write a SQL query to display year-on-year growth for each product.
● Table: table_name
● Columns: transaction_id, Product_id, transaction_date, spend
WITH YearlySpend AS (
SELECT
Product_id,
YEAR(transaction_date) AS year,
SUM(spend) AS total_spend
FROM table_name
GROUP BY Product_id, YEAR(transaction_date)
),
Growth AS (
SELECT
year,
Product_id,
total_spend,
LAG(total_spend) OVER (PARTITION BY Product_id ORDER BY year) AS prev_year_spend
FROM YearlySpend
)
SELECT year, Product_id,
(total_spend - prev_year_spend) / prev_year_spend AS yoy_growth
FROM Growth
WHERE prev_year_spend IS NOT NULL;
Question 7: Write a SQL query to find the rolling average of posts on a daily basis
for each user_id. Round the average to two decimal places.
● Table: table_name
● Columns: user_id, date, post_count
Answer:
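The question does not fix a window size, so one possible reading is a running average per user over all posts up to each date. A minimal sketch under that assumption, run through Python's sqlite3 module with made-up sample rows (dates stored as ISO text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Columns follow the question; the rows are hypothetical.
conn.execute("CREATE TABLE table_name (user_id INTEGER, date TEXT, post_count INTEGER)")
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", [
    (1, "2024-01-01", 1), (1, "2024-01-02", 2), (1, "2024-01-03", 2),
    (2, "2024-01-01", 4),
])

# Assumption: "rolling" means cumulative from the user's first date.
# For a fixed window, e.g. 7 days, use ROWS BETWEEN 6 PRECEDING AND CURRENT ROW.
rolling = conn.execute("""
    SELECT user_id, date,
           ROUND(AVG(post_count) OVER (
               PARTITION BY user_id
               ORDER BY date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ), 2) AS rolling_avg
    FROM table_name
    ORDER BY user_id, date
""").fetchall()

for row in rolling:
    print(row)
```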
Question 8: Write a SQL query to get the emp_id and department for each
department where the most recently joined employee is still working.
● Table: table_name
● Columns: emp_id, first_name, last_name, date_of_join, date_of_exit,
department
Answer:
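One way to read the question: for each department, find the employee with the latest date_of_join, and keep that row only if the employee has not left. The sketch below assumes a NULL date_of_exit means "still working"; it runs through Python's sqlite3 module and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table_name (
        emp_id INTEGER, first_name TEXT, last_name TEXT,
        date_of_join TEXT, date_of_exit TEXT, department TEXT
    )
""")
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "A", "X", "2020-01-10", None, "IT"),
    (2, "B", "Y", "2022-05-01", None, "IT"),          # latest IT joiner, still working
    (3, "C", "Z", "2021-03-15", None, "HR"),
    (4, "D", "W", "2023-07-20", "2024-01-31", "HR"),  # latest HR joiner, has left
])

# Rank employees per department by join date (newest first),
# then keep the newest one only when date_of_exit is NULL.
result = conn.execute("""
    WITH Ranked AS (
        SELECT emp_id, department, date_of_exit,
               ROW_NUMBER() OVER (
                   PARTITION BY department ORDER BY date_of_join DESC
               ) AS rn
        FROM table_name
    )
    SELECT emp_id, department
    FROM Ranked
    WHERE rn = 1 AND date_of_exit IS NULL
    ORDER BY department
""").fetchall()

print(result)
```

HR is absent from the result because its most recent joiner already has an exit date.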
Question 9: How many rows do Left, Right, Inner, and Full Outer Joins return when
both tables contain duplicate values?
● Left Table A:
Column
1
1
1
2
2
3
4
5
● Right Table B:
Column
1
1
2
2
2
3
3
3
4
Answer (joining on Column):
● Left Join: 17 rows
● Right Join: 16 rows
● Inner Join: 16 rows
● Full Outer Join: 17 rows
Explanation:
● Left Join: The left join keeps every row from Table A and pairs it with each
matching row in Table B, so duplicate values multiply. A value in A with no
match in B (5) is still included, with NULL for B's column.
3 rows of 1 from left table * 2 rows of 1 from right table = 6 rows of 1
2 rows of 2 from left table * 3 rows of 2 from right table = 6 rows of 2
1 row of 3 from left table * 3 rows of 3 from right table = 3 rows of 3
1 row of 4 from left table * 1 row of 4 from right table = 1 row of 4
1 row of 5 from left table appears once with NULL, because there is no 5 in
the right table and a left join must keep every row from the left table
So, the total output of the left join is 6 + 6 + 3 + 1 + 1 = 17 rows.
Note - Use the same method to work out the output of the other joins.
● Right Join: The right join behaves symmetrically, keeping all rows from
Table B with their matches in Table A. Every value in B has a match in A, so
no NULL-padded rows appear and the result is 6 + 6 + 3 + 1 = 16 rows.
● Inner Join: The inner join only includes rows with matching values in both
tables. Duplicates multiply the matches, yielding 6 + 6 + 3 + 1 = 16 rows.
● Outer Join: The full outer join includes all rows from both tables,
combining matched rows and appending unmatched rows with NULL
values. Here, only 5 from Table A contributes an unmatched row, giving
16 + 1 = 17 total rows.
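These counts can be checked by loading the two tables into SQLite through Python's sqlite3 module. The column name col is an assumption. Because older SQLite versions lack RIGHT and FULL OUTER JOIN, the sketch writes the right join as a left join with the tables swapped and derives the full outer count from the other three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (col INTEGER)")
conn.execute("CREATE TABLE b (col INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(v,) for v in (1, 1, 1, 2, 2, 3, 4, 5)])
conn.executemany("INSERT INTO b VALUES (?)", [(v,) for v in (1, 1, 2, 2, 2, 3, 3, 3, 4)])

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

inner_rows = count("SELECT COUNT(*) FROM a JOIN b ON a.col = b.col")
left_rows = count("SELECT COUNT(*) FROM a LEFT JOIN b ON a.col = b.col")
# A RIGHT JOIN B is equivalent to B LEFT JOIN A.
right_rows = count("SELECT COUNT(*) FROM b LEFT JOIN a ON a.col = b.col")
# Full outer = matched rows + unmatched left rows + unmatched right rows.
full_rows = left_rows + right_rows - inner_rows

print(inner_rows, left_rows, right_rows, full_rows)
```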
Question 10: Write a query to get mean, median, and mode for earnings.
● Table: table_name
● Columns: Emp_id, salary
Answer:
-- Mean
SELECT AVG(salary) AS MeanSalary FROM table_name;
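-- Median
-- Note: a subquery inside LIMIT/OFFSET works in SQLite but not in MySQL,
-- which only accepts literal values there.
SELECT AVG(salary) AS MedianSalary
FROM (
SELECT salary
FROM table_name
ORDER BY salary
LIMIT 2 - (SELECT COUNT(*) FROM table_name) % 2
OFFSET (SELECT (COUNT(*) - 1) / 2 FROM table_name)
) t;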
-- Mode
SELECT salary AS ModeSalary
FROM table_name
GROUP BY salary
ORDER BY COUNT(*) DESC
LIMIT 1;
Question 11: Determine the count of rows in the output of the following queries
for Table X and Table Y.
● Table X:
ids
1
1
1
1
● Table Y:
ids
1
1
1
1
1
1
1
1
Queries:
Answer:
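Every ids value in both tables is 1, so the join condition X.ids != Y.ids is never
true for any pair of rows. The output then depends on the join type:
● Inner join: 0 rows, because no pair of rows satisfies the condition.
● Left join: 4 rows, because every row of X is kept and padded with NULL.
● Right join: 8 rows, because every row of Y is kept and padded with NULL.
● Full outer join: 12 rows (4 unmatched rows from X plus 8 from Y).
Explanation:
● The condition X.ids != Y.ids checks for inequality between the columns,
which is not possible as every row in both tables has the same value for ids,
so no matched pairs are produced.
● An outer join still returns the rows of its preserved table(s). Every join
type returns 0 rows only when the comparison is placed in a WHERE clause,
which filters rows after the join.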
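The row counts for a never-true ON condition can be checked directly. The sketch below assumes the four queries are an inner, left, right and full outer join of X and Y on X.ids != Y.ids. It uses Python's sqlite3 module, writes the right join as a left join with the tables swapped, and derives the full outer count, so it also runs on SQLite versions without RIGHT or FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (ids INTEGER)")
conn.execute("CREATE TABLE y (ids INTEGER)")
conn.executemany("INSERT INTO x VALUES (?)", [(1,)] * 4)
conn.executemany("INSERT INTO y VALUES (?)", [(1,)] * 8)

def count(sql):
    """Run a COUNT(*) query and return the single number."""
    return conn.execute(sql).fetchone()[0]

# No pair of rows satisfies the inequality, so the inner join is empty.
inner_rows = count("SELECT COUNT(*) FROM x JOIN y ON x.ids != y.ids")
# Outer joins keep the preserved table's rows, padded with NULL.
left_rows = count("SELECT COUNT(*) FROM x LEFT JOIN y ON x.ids != y.ids")
right_rows = count("SELECT COUNT(*) FROM y LEFT JOIN x ON x.ids != y.ids")
# Full outer = matched rows + unmatched left rows + unmatched right rows.
full_rows = left_rows + right_rows - inner_rows

print(inner_rows, left_rows, right_rows, full_rows)
```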
Question 12: Write a SQL query to calculate the percentage of total sales
contributed by each product category in a given year.
● Table: sales
● Columns: product_category, sale_year, revenue
Answer:
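WITH TotalSales AS (
SELECT sale_year, SUM(revenue) AS total_revenue
FROM sales
GROUP BY sale_year
)
SELECT s.product_category, s.sale_year, SUM(s.revenue) * 100.0 / t.total_revenue AS pct_of_total
FROM sales s JOIN TotalSales t ON s.sale_year = t.sale_year
GROUP BY s.product_category, s.sale_year, t.total_revenue;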
Question 13: Write a SQL query to find the longest streak of consecutive days an
employee worked.
● Table: attendance
● Columns: emp_id, work_date
Answer:
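WITH ConsecutiveDays AS (
-- Consecutive dates minus their row number give the same streak_key (MySQL 8+ syntax).
SELECT emp_id, DATE_SUB(work_date, INTERVAL ROW_NUMBER() OVER (PARTITION BY emp_id ORDER BY work_date) DAY) AS streak_key
FROM attendance
)
SELECT emp_id, COUNT(*) AS longest_streak
FROM ConsecutiveDays
GROUP BY emp_id, streak_key
ORDER BY longest_streak DESC
LIMIT 1;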
Question 14: Write a query to identify customers who made purchases in all
quarters of a year.
● Table: transactions
● Columns: customer_id, transaction_date
Answer:
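WITH QuarterlyData AS (
SELECT customer_id,
YEAR(transaction_date) AS purchase_year,
QUARTER(transaction_date) AS purchase_quarter
FROM transactions
)
SELECT customer_id
FROM QuarterlyData
GROUP BY customer_id, purchase_year
HAVING COUNT(DISTINCT purchase_quarter) = 4;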
Question 15: Write a query to find the first and last purchase dates for each
customer, along with their total spending.
● Table: transactions
● Columns: customer_id, transaction_date, amount
Answer:
SELECT customer_id,
MIN(transaction_date) AS first_purchase,
MAX(transaction_date) AS last_purchase,
SUM(amount) AS total_spending
FROM transactions
GROUP BY customer_id;
Question 16: Write a query to find the top 3 employees who generated the highest
revenue in the last year.
● Table: employee_sales
● Columns: emp_id, sale_date, revenue
Answer:
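SELECT emp_id, SUM(revenue) AS total_revenue
FROM employee_sales
WHERE sale_date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR) -- "last year" taken as the past 12 months
GROUP BY emp_id
ORDER BY total_revenue DESC
LIMIT 3;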
Question 17: Write a query to calculate the monthly retention rate for a
subscription-based service.
● Table: subscriptions
● Columns: user_id, start_date, end_date
Answer:
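-- Assumption: a user counts as retained if still subscribed one month after start_date.
WITH MonthlyRetention AS (
SELECT DATE_FORMAT(start_date, '%Y-%m') AS subscription_month,
COUNT(*) AS total_users,
SUM(CASE WHEN end_date IS NULL OR end_date >= DATE_ADD(start_date, INTERVAL 1 MONTH) THEN 1 ELSE 0 END) AS retained_users
FROM subscriptions
GROUP BY subscription_month
)
SELECT subscription_month,
retained_users * 100.0 / total_users AS retention_rate
FROM MonthlyRetention;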
Question 18: Write a query to identify products with declining sales for 3
consecutive months.
● Table: monthly_sales
● Columns: product_id, month, sales
Answer:
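WITH DeclineCheck AS (
SELECT product_id, sales,
LAG(sales, 1) OVER (PARTITION BY product_id ORDER BY month) AS prev_1,
LAG(sales, 2) OVER (PARTITION BY product_id ORDER BY month) AS prev_2
FROM monthly_sales
)
SELECT product_id
FROM DeclineCheck
WHERE sales < prev_1 AND prev_1 < prev_2
GROUP BY product_id;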
Question 19: Write a query to find the average order value (AOV) for customers
who placed at least 5 orders in the last year.
● Table: orders
● Columns: customer_id, order_date, order_amount
Answer:
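WITH OrderCounts AS (
SELECT customer_id, COUNT(*) AS total_orders,
AVG(order_amount) AS avg_order_value
FROM orders
WHERE order_date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR)
GROUP BY customer_id
)
SELECT customer_id, avg_order_value
FROM OrderCounts
WHERE total_orders >= 5;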
Answer:
1. FROM: Specifies the source table or tables and establishes any joins
between them.
2. WHERE: Filters rows based on specified conditions before grouping or
aggregations.
3. GROUP BY: Groups rows into summary rows based on specified columns.
4. HAVING: Filters aggregated groups, often used with aggregate functions.
5. SELECT: Specifies the columns or expressions to include in the final output.
6. ORDER BY: Sorts the result set in ascending or descending order.
7. LIMIT: Restricts the number of rows returned in the final output.
Answer:
Example: Use WHERE to filter employees with a salary above 50,000, and
HAVING to filter departments with an average salary above 60,000.
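The example above can be sketched as a single query: WHERE drops individual rows before grouping, and HAVING drops whole groups after aggregation. The employees table and its rows are hypothetical, and the code runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("A", "IT", 55000), ("B", "IT", 80000), ("C", "IT", 40000),
    ("D", "HR", 52000), ("E", "HR", 58000),
    ("F", "Sales", 90000), ("G", "Sales", 30000),
])

rows = conn.execute("""
    SELECT department, AVG(salary) AS avg_salary
    FROM employees
    WHERE salary > 50000          -- row filter, applied before grouping
    GROUP BY department
    HAVING AVG(salary) > 60000    -- group filter, applied after aggregation
    ORDER BY department
""").fetchall()

print(rows)
```

HR is removed by HAVING (its average over the remaining rows is 55,000), while the rows for C and G never reach the grouping step.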
Answer:
1. INNER JOIN: Returns rows where there is a match in both tables.
○ Example: Find employees with matching departments.
2. LEFT JOIN: Returns all rows from the left table, and matching rows from
the right table. Non-matches are filled with NULL.
○ Example: List all employees with their departments, even if they are
not assigned.
3. RIGHT JOIN: Returns all rows from the right table, and matching rows
from the left table. Non-matches are filled with NULL.
○ Example: List all departments with their employees, even if they have
none.
4. FULL OUTER JOIN: Returns all rows from both tables, with NULL in
places where no match exists.
○ Example: Combine all employees and departments, regardless of
matches.
5. CROSS JOIN: Produces the Cartesian product of both tables.
○ Example: Pair every employee with every department.
Example: Automatically update a log table whenever a row is inserted into the
orders table.
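A minimal sketch of such a trigger, written in SQLite's trigger syntax and run through Python's sqlite3 module. The order_log table, the trigger name and the note text are assumptions for the example; other databases use slightly different syntax (MySQL, for instance, requires FOR EACH ROW).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE order_log (order_id INTEGER, note TEXT);

    -- Fires once for every row inserted into orders.
    CREATE TRIGGER log_new_order
    AFTER INSERT ON orders
    BEGIN
        INSERT INTO order_log (order_id, note)
        VALUES (NEW.order_id, 'order inserted');
    END;
""")

# Only the orders table is written to; the trigger fills order_log.
conn.execute("INSERT INTO orders (order_id, amount) VALUES (1, 250)")
conn.execute("INSERT INTO orders (order_id, amount) VALUES (2, 90)")

log_rows = conn.execute(
    "SELECT order_id, note FROM order_log ORDER BY order_id"
).fetchall()
print(log_rows)
```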
Example: A stored procedure to calculate monthly sales and store the result in a
report table.
Answer:
● RANK: Assigns a rank to rows within a partition. Tied rows share a rank and
the following ranks are skipped (1, 1, 3).
● ROW_NUMBER: Assigns a unique sequential number to rows within a
partition, even when rows tie (1, 2, 3).
● DENSE_RANK: Similar to RANK, but does not skip ranks after ties (1, 1, 2).
● LEAD: Accesses data from the following row in the same partition.
● LAG: Accesses data from the preceding row in the same partition.
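The differences show up as soon as two rows tie. A small sketch with a hypothetical scores table, run through Python's sqlite3 module (SQLite 3.25 or later):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: A and B tie on the top score.
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("A", 90), ("B", 90), ("C", 80), ("D", 70)])

ranked = conn.execute("""
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
           LAG(score)   OVER (ORDER BY score DESC, name) AS prev_score,
           LEAD(score)  OVER (ORDER BY score DESC, name) AS next_score
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

for row in ranked:
    print(row)
```

For C, RANK gives 3 while DENSE_RANK gives 2; LAG and LEAD return NULL (None in Python) at the first and last row.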
Answer:
Question 10: What are aggregate functions, and when do we use them? Explain
with examples.
Answer: Constraints enforce data integrity and rules on tables. Types include:
Question 13: What are keys, and what are their types?
Answer:
Answer:
Question 15: What are indexes, and what are their types?
Question 16: What are views, and what are their limitations?
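Answer: Views are virtual tables defined by a SQL query. A standard view does not
store data; it simplifies query reuse. Limitations (these vary by database):
● Usually cannot be indexed directly; some systems offer indexed or
materialized views as an exception.
● Performance depends on the underlying base tables.
● Some databases (for example SQL Server) do not allow ORDER BY in a view
definition unless it is paired with TOP or OFFSET.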
Answer:
1. One-to-One: Each row in Table A links to exactly one row in Table B.
2. One-to-Many: Each row in Table A links to multiple rows in Table B.
3. Many-to-Many: Rows in Table A link to multiple rows in Table B and vice
versa.
Answer:
WITH Retention AS (
SELECT customer_id, COUNT(*) AS total_orders,
COUNT(CASE WHEN order_date >= DATE_ADD(first_order_date,
INTERVAL 1 MONTH) THEN 1 END) AS retained_orders
FROM orders
GROUP BY customer_id
)
SELECT customer_id, (retained_orders / total_orders) * 100 AS retention_rate
FROM Retention;