Unit - 2
Relational Algebra
Relational Algebra: Introduction to Relational Algebra, Operations: Selection, Projection, Set Operations,
Join Operations, Division. Structured Query Language (SQL): SQL Basics: DDL and DML, Aggregate
Functions (Min(), Max(), Sum(), Avg(), Count()), Logical operators (AND, OR, NOT), Predicates (Like,
Between, Alias, Distinct), Clauses (Group By, Having, Order by, top/limit).
Introduction to Relational Algebra
Relational Algebra:
Relational Algebra is a formal language used to query and manipulate relational databases. It
consists of a set of operations, such as selection, projection, union, and join, each of which
takes one or two relations as input and produces a new relation as output. It serves as the
mathematical foundation for SQL queries.
• It provides a clear, structured approach for formulating database queries.
• Helps in understanding and optimizing query execution plans for better performance.
• SQL queries are based on relational algebra operations, making it essential for learning
SQL.
• Enables complex queries, like joins and nested queries, that are critical for working with
large datasets.
Key Concepts in Relational Algebra:
• Relations: In relational algebra, a relation is a table that consists of rows and columns,
representing data in a structured format. Each relation has a unique name and is made up
of tuples.
• Tuples: A tuple is a single row in a relation, which contains a set of values for each attribute.
It represents a single data entry or record in a relational table.
• Attributes: Attributes are the columns in a relation, each representing a specific
characteristic or property of the data. For example, in a “Students” relation, attributes could
be “Name”, “Age”, and “Grade”.
• Domains: A domain is the set of possible values that an attribute can have. It defines the
type of data that can be stored in each column of a relation, such as integers, strings, or
dates.
Operators in Relational Algebra
Relational algebra provides operators that fetch and manipulate data stored in relational
tables. The fundamental operators, such as selection, projection, and join, are the building
blocks for querying and transforming data in a relational database.
Relational Operations
1. Unary Operation
Operations that act on a single relation.
1. Selection (𝝈):
Select operation chooses the subset of tuples from the relation that satisfies the given condition
mentioned in the syntax of selection. The selection operation is also known as horizontal
partitioning since it partitions the table or relation horizontally.
Notation:
σ c(R)
where:
• ‘σ (sigma)’ denotes the select (choose) operator.
• ‘c’ is the selection condition, a boolean expression. It can be a single condition such as
Roll = 3 or a combination of conditions such as X > 2 AND Y < 1.
• R is a relational algebra expression whose result is a relation.
Each simple condition in ‘c’ can be written in the following form:
<attribute> <comparison operator> <constant or attribute>
Example-1:
σ Place = 'Mumbai' or Salary >= 1000000 (Citizen)
σ Department = 'Analytics'(σ Location = 'NewYork'(Manager))
The second query above is called a nested expression. As usual, we evaluate the inner
expression first (which results in a relation, say Manager1), then we evaluate the outer
expression on Manager1. The result is again a relation with the same attributes as the input
relation Manager.
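The nested selection above can be tried out with a small sketch using Python's built-in sqlite3 module. The Manager rows and the Name column are made up for illustration; only the Department and Location attributes come from the example. The sketch also checks that nesting two selections gives the same tuples as one selection with both conditions joined by AND.

```python
import sqlite3

# Hypothetical Manager relation; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Manager (Name TEXT, Department TEXT, Location TEXT)")
conn.executemany(
    "INSERT INTO Manager VALUES (?, ?, ?)",
    [
        ("Asha", "Analytics", "NewYork"),
        ("Ben", "Analytics", "Boston"),
        ("Chen", "Sales", "NewYork"),
    ],
)

# Inner expression: sigma Location = 'NewYork' (Manager)
inner = "SELECT * FROM Manager WHERE Location = 'NewYork'"

# Outer expression applied to the inner result (called Manager1 here).
nested = conn.execute(
    f"SELECT Name FROM ({inner}) AS Manager1 WHERE Department = 'Analytics'"
).fetchall()

# One selection with both conditions combined by AND.
combined = conn.execute(
    "SELECT Name FROM Manager "
    "WHERE Department = 'Analytics' AND Location = 'NewYork'"
).fetchall()

print(nested, combined)
```

Only the tuple that satisfies both conditions survives, whichever way the query is written.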
2. Projection(𝝅):
Project operation selects (or chooses) certain attributes discarding other attributes. The
Project operation is also known as vertical partitioning since it partitions the relation or
table vertically discarding other columns or attributes.
Notation:
πA(R)
where ‘A’ is the attribute list, it is the desired set of attributes from the attributes of
relation(R),
symbol ‘π(pi)’ is used to denote the Project operator,
R is generally a relational algebra expression, which results in a relation.
Example –
πAge(Student)
πDept, Sex(Emp)
Project Class and Dept from Faculty –
πClass, Dept(Faculty)
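A short sketch of projection, assuming a made-up Emp relation with attributes Name, Dept, and Sex. In relational algebra the result of π is a set, so duplicate tuples disappear; in SQL the same effect needs the DISTINCT keyword, which is why the sketch runs the query both ways.

```python
import sqlite3

# Hypothetical Emp relation; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (Name TEXT, Dept TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [
        ("Asha", "IT", "F"),
        ("Ben", "IT", "M"),
        ("Chen", "IT", "M"),
        ("Dia", "HR", "F"),
    ],
)

# pi Dept, Sex (Emp): keep two attributes and drop duplicate tuples.
projected = conn.execute(
    "SELECT DISTINCT Dept, Sex FROM Emp ORDER BY Dept, Sex"
).fetchall()

# Without DISTINCT, SQL returns one output row per input row.
raw = conn.execute("SELECT Dept, Sex FROM Emp").fetchall()

print(projected)
print(len(raw))
```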
3. Rename(𝝆):
The RENAME operation gives a new name to a relation or to the result of an expression, and
optionally to its attributes. It is often convenient to break a complicated sequence of
operations into steps and give each intermediate result its own name. Reasons to rename a
relation can be many, like
• We may want to save the result of a relational algebra expression as a relation so that we
can use it later.
• We may want to join a relation with itself, in that case, it becomes too confusing to specify
which one of the tables we are talking about, in that case, we rename one of the tables and
perform join operations on them.
Notation:
ρ X (R)
where the symbol ‘ρ’ is used to denote the RENAME operator and R is the result of the
sequence of operation or expression which is saved with the name X.
Example-1: Query to rename the relation Student as MaleStudent and the attributes
of Student – RollNo, SName as (Sno, Name).
ρ MaleStudent(Sno, Name) (πRollNo, SName(σCondition(Student)))
Example-2: Query to rename the attributes Name, Age of table Department to A,B.
ρ (A, B) (Department)
Example-3: Query to rename the table name Project to Pro and its attributes to P, Q, R.
ρ Pro(P, Q, R) (Project)
Example-4: Query to rename the first attribute of the table Student with attributes A, B, C
to P.
ρ (P, B, C) (Student)
2) Intersection(∩):
✓ Returns only the common tuples present in both relations.
✓ Like UNION, both relations must have the same attributes and domains.
Syntax in SQL:
SELECT column_name FROM table1
INTERSECT
SELECT column_name FROM table2;
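The INTERSECT syntax above can be sketched with Python's built-in sqlite3 module. The two tables (ScienceClub and SportsClub) and their rows are assumptions made for this example; both have a single column of the same type, so they are compatible for set operations.

```python
import sqlite3

# Two hypothetical relations with the same attribute and domain.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ScienceClub (name TEXT)")
conn.execute("CREATE TABLE SportsClub (name TEXT)")
conn.executemany(
    "INSERT INTO ScienceClub VALUES (?)", [("Asha",), ("Ben",), ("Dia",)]
)
conn.executemany(
    "INSERT INTO SportsClub VALUES (?)", [("Ben",), ("Chen",), ("Dia",)]
)

# Only the tuples present in both relations are returned.
common = conn.execute(
    "SELECT name FROM ScienceClub "
    "INTERSECT "
    "SELECT name FROM SportsClub "
    "ORDER BY name"
).fetchall()

print(common)
```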
4) Cross Product/Cartesian Product (×):
The Cartesian Product works on two relations and pairs every tuple of the first relation with
every tuple of the second. It is sometimes called the CROSS PRODUCT or CROSS JOIN.
Syntax in SQL:
SELECT * FROM table1, table2;
OR
SELECT * FROM table1 CROSS JOIN table2;
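A small sketch of the two syntaxes above, assuming two made-up tables, Color with 2 rows and Size with 3 rows. The Cartesian product should contain 2 × 3 = 6 rows, and both ways of writing the query should return the same pairs.

```python
import sqlite3

# Hypothetical tables used only to count the resulting combinations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Color (color TEXT)")
conn.execute("CREATE TABLE Size (size TEXT)")
conn.executemany("INSERT INTO Color VALUES (?)", [("red",), ("blue",)])
conn.executemany("INSERT INTO Size VALUES (?)", [("S",), ("M",), ("L",)])

# Comma syntax and CROSS JOIN syntax describe the same Cartesian product.
comma = conn.execute("SELECT * FROM Color, Size").fetchall()
cross = conn.execute("SELECT * FROM Color CROSS JOIN Size").fetchall()

print(len(comma), len(cross))
```

Because the result grows as the product of the two table sizes, a cross product on large tables is usually followed by a condition that keeps only the meaningful pairs.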
JOINS:
A JOIN in SQL combines rows from two or more tables based on matching values in related
columns. Joins let us retrieve related data that is stored across different tables in a single
query. The different join types, including INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL
JOIN, differ in how they treat rows that have no match in the other table.
JOINS (⋈)
1. Inner Join:
The INNER JOIN keyword selects rows from both the tables as long as the join condition is
satisfied. The result set combines each pair of rows, one from each table, for which the
condition holds, i.e., the value of the common field is the same in both rows.
Syntax:
SELECT table1.column1,table1.column2,table2.column1,.... FROM table1 INNER JOIN
table2 ON table1.matching_column = table2.matching_column;
Note: We can also write JOIN instead of INNER JOIN. JOIN is same as INNER JOIN.
Inner Join
Example of INNER JOIN
Consider the two tables, Student and StudentCourse, which share a common
column ROLL_NO. Using SQL JOINS, we can combine data from these tables based on their
relationship, allowing us to retrieve meaningful information like student details along with their
enrolled courses.
Student
2. StudentCourse Table:
Query:
SELECT StudentCourse.COURSE_ID, [Link], [Link] FROM Student
INNER JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;
Output
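Since the sample tables are not reproduced here, the following is only a sketch with made-up data. ROLL_NO and COURSE_ID come from the text; the NAME and AGE columns and all rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NAME and AGE are hypothetical columns; ROLL_NO is the shared column.
conn.execute("CREATE TABLE Student (ROLL_NO INTEGER, NAME TEXT, AGE INTEGER)")
conn.execute("CREATE TABLE StudentCourse (COURSE_ID INTEGER, ROLL_NO INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Asha", 19), (2, "Ben", 20), (3, "Chen", 18)],
)
conn.executemany(
    "INSERT INTO StudentCourse VALUES (?, ?)",
    [(101, 1), (102, 2), (103, 5)],  # roll number 5 has no Student row
)

# Only rows whose ROLL_NO appears in both tables are returned.
inner_rows = conn.execute(
    "SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE "
    "FROM Student "
    "INNER JOIN StudentCourse "
    "ON Student.ROLL_NO = StudentCourse.ROLL_NO "
    "ORDER BY Student.ROLL_NO"
).fetchall()

print(inner_rows)
```

The student with ROLL_NO 3 (no course) and the course row for ROLL_NO 5 (no student) are both dropped, because neither has a match in the other table.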
Outer Joins
Theta Join, Equijoin, and Natural Join are called inner joins. An inner join includes only those
tuples with matching attributes and the rest are discarded in the resulting relation. Therefore,
we need to use outer joins to include all the tuples from the participating relations in the
resulting relation. There are three kinds of outer joins − left outer join, right outer join, and full
outer join.
Outer joins are further divided into three subtypes:
1. Left join
2. Right join
3. Full join
1. SQL LEFT JOIN
A LEFT JOIN returns all rows from the left table, along with matching rows from the right
table. If there is no match, NULL values are returned for columns from the right table. LEFT
JOIN is also known as LEFT OUTER JOIN.
Syntax
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are the same.
Left JOIN
LEFT JOIN Example
In this example, the LEFT JOIN retrieves all rows from the Student table and the matching
rows from the StudentCourse table based on the ROLL_NO column.
Query:
SELECT [Link],StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
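2. SQL RIGHT JOIN
A RIGHT JOIN returns all rows from the right table, along with matching rows from the left
table. If there is no match, NULL values are returned for columns from the left table. RIGHT
JOIN is also known as RIGHT OUTER JOIN.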
RIGHT JOIN Example
In this example, the RIGHT JOIN retrieves all rows from the StudentCourse table and the
matching rows from the Student table based on the ROLL_NO column.
Query:
SELECT [Link], StudentCourse.COURSE_ID
FROM Student
RIGHT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
Output
3. SQL FULL JOIN
Syntax
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
Key Terms
• table1: First table.
• table2: Second table
• matching_column: Column common to both the tables.
FULL JOIN Example
This example demonstrates the use of a FULL JOIN, which combines the results of both
LEFT JOIN and RIGHT JOIN. The query retrieves all rows from the Student and
StudentCourse tables. If a record in one table does not have a matching record in the other
table, the result set still includes that record, with NULL values for the missing fields.
Query:
SELECT [Link],StudentCourse.COURSE_ID
FROM Student
FULL JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
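The behaviour of outer joins can be sketched as follows, using made-up data (the NAME column and all rows are assumptions). Older SQLite versions do not support RIGHT JOIN or FULL JOIN, so this sketch runs a LEFT JOIN directly and emulates the FULL JOIN as the UNION of two LEFT JOINs with the tables swapped. The emulation is a stand-in chosen for portability, not the syntax shown above, and UNION would also merge rows that are exact duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (ROLL_NO INTEGER, NAME TEXT)")
conn.execute("CREATE TABLE StudentCourse (COURSE_ID INTEGER, ROLL_NO INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)", [(1, "Asha"), (2, "Ben"), (3, "Chen")]
)
conn.executemany(
    "INSERT INTO StudentCourse VALUES (?, ?)", [(101, 1), (102, 2), (103, 5)]
)

# LEFT JOIN: every Student row is kept; a missing course shows up as NULL.
left_rows = conn.execute(
    "SELECT Student.NAME, StudentCourse.COURSE_ID "
    "FROM Student LEFT JOIN StudentCourse "
    "ON StudentCourse.ROLL_NO = Student.ROLL_NO "
    "ORDER BY Student.ROLL_NO"
).fetchall()

# FULL JOIN emulated: left join in both directions, combined with UNION.
full_rows = conn.execute(
    "SELECT Student.NAME, StudentCourse.COURSE_ID "
    "FROM Student LEFT JOIN StudentCourse "
    "ON StudentCourse.ROLL_NO = Student.ROLL_NO "
    "UNION "
    "SELECT Student.NAME, StudentCourse.COURSE_ID "
    "FROM StudentCourse LEFT JOIN Student "
    "ON StudentCourse.ROLL_NO = Student.ROLL_NO"
).fetchall()

print(left_rows)
print(full_rows)
```

In Python, SQL NULL values appear as None. The second half of the UNION is what a RIGHT JOIN of Student with StudentCourse would return.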
4. SQL Natural Join
A Natural Join is a type of INNER JOIN that automatically joins two tables based on columns
with the same name and data type. It returns only the rows where the values in the common
columns match.
• It returns rows where the values in these common columns are the same in both tables.
• Common columns appear only once in the result, even if they exist in both tables.
• Unlike a CROSS JOIN, which creates all possible combinations of rows, a Natural Join
only includes rows with matching values.
Example:
Look at the two tables below: Employee and Department
Employee
Emp_id   Emp_name   Dept_id
1        Ram        10
2        Jon        30
3        Bob        50
Department
Dept_id   Dept_name
10        IT
30        HR
40        TIS
Find all Employees and their respective departments.
(Employee) ⋈ (Department)
Output:
Emp_id   Emp_name   Dept_id   Dept_name
1        Ram        10        IT
2        Jon        30        HR
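The natural join of the Employee and Department tables can be checked with a short sketch using Python's built-in sqlite3 module. The data is taken from the tables above; the INTEGER and TEXT column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee (Emp_id INTEGER, Emp_name TEXT, Dept_id INTEGER)"
)
conn.execute("CREATE TABLE Department (Dept_id INTEGER, Dept_name TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Ram", 10), (2, "Jon", 30), (3, "Bob", 50)],
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [(10, "IT"), (30, "HR"), (40, "TIS")],
)

# NATURAL JOIN matches on the column name shared by both tables (Dept_id).
cursor = conn.execute(
    "SELECT * FROM Employee NATURAL JOIN Department ORDER BY Emp_id"
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)
print(rows)
```

Bob (Dept_id 50) and the TIS department (Dept_id 40) do not appear, because neither has a matching Dept_id in the other table, and the common column Dept_id is listed only once.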
Division :
The division operator is used when we have to evaluate queries that contain the keyword 'all',
i.e., queries that look for entities related to every member of another set. DIVISION (÷) is a
special type of operation in Relational Algebra. SQL has no direct division operator, so such
queries are usually written with nested NOT EXISTS subqueries or with GROUP BY and
HAVING.
1. Definition
• The division operation (÷) is used when a relation contains a set of values that need to be
matched with all values in another relation.
• It is used in scenarios where we need to find entities that have a relationship with all
possible values of another set.
2. Syntax
If we have two relations A(X, Y) and B(Y), the result of A ÷ B is the set of values from X
that are related to all values in B.
A(X, Y) ÷ B(Y) = C(X)
• A(X, Y): A table containing attributes X and Y.
• B(Y): A table containing only Y values.
• C(X): The result contains only X values that are associated with all Y values in B.
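A ÷ B can be sketched in SQL with two nested NOT EXISTS subqueries, which is one common way to write it and is an assumption of this example rather than a built-in operator. The tables Completed(student, task) and Required(task) play the roles of A(X, Y) and B(Y); their names and rows are made up. The query reads: keep a student if there is no required task that the student has not completed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Completed (student TEXT, task TEXT)")  # A(X, Y)
conn.execute("CREATE TABLE Required (task TEXT)")                 # B(Y)
conn.executemany(
    "INSERT INTO Completed VALUES (?, ?)",
    [
        ("s1", "t1"), ("s1", "t2"),
        ("s2", "t1"),
        ("s3", "t1"), ("s3", "t2"), ("s3", "t3"),
    ],
)
conn.executemany("INSERT INTO Required VALUES (?)", [("t1",), ("t2",)])

division_sql = """
SELECT DISTINCT c1.student
FROM Completed AS c1
WHERE NOT EXISTS (
    SELECT 1 FROM Required AS r
    WHERE NOT EXISTS (
        SELECT 1 FROM Completed AS c2
        WHERE c2.student = c1.student AND c2.task = r.task
    )
)
ORDER BY c1.student
"""
# Students related to ALL tasks in Required.
result = [row[0] for row in conn.execute(division_sql)]

print(result)
```

Student s2 is missing task t2 and is therefore excluded; the extra task t3 of student s3 does not matter.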
4) TRUNCATE: It is used to remove the whole content of the table along with the deallocation of
the space occupied by the data, without affecting the table's structure.
MySQL Syntax -To remove data present inside a table :
TRUNCATE TABLE table_name;
1. SELECT: It is used to retrieve or fetch data from a database. The SELECT statement cannot
modify data; it can only read it. Hence, it is often classified separately as Data Query Language
(DQL), or treated as a limited form of DML statement.
MySQL Syntax -To fetch an entire table :
SELECT * FROM table_name;
To fetch particular columns from a table :
SELECT column_1, column_2 FROM table_name;
To fetch particular columns from a table based on a condition:
SELECT column_1, column_2
FROM table_name
WHERE <condition>;
Fetching data with various clauses (General SELECT statement):
SELECT column_list FROM table-name
[WHERE Clause]
[GROUP BY clause]
[HAVING clause]
[ORDER BY clause];
4. DELETE: It is used to delete existing records from a table, i.e., it is used to remove one or more
rows from a table.
MySQL Syntax -To delete rows from a table based on a condition :
DELETE FROM table_name [WHERE condition];
Note: The DELETE statement only removes the data from the table, whereas the TRUNCATE
statement also deallocates the space occupied by the data. Hence, TRUNCATE is more efficient
for removing all the data from a table.
6. CALL: It is used to execute a stored procedure, i.e., a structured query language function or a
Java subprogram. To use the CALL statement, we first need to define a procedure using the
CREATE PROCEDURE command. Its syntax is:
CREATE PROCEDURE procedure_name (parameter_1 DATATYPE_1, ...)
AS
BEGIN
QUERY
END
Now, we can execute this procedure using the CALL statement.
MySQL Syntax -
SET @parameter1 = value1;
CALL procedure_name([@parameter1, ...]);
Aggregate Functions
In SQL, grouping and aggregating data are essential techniques for analysing datasets. When
dealing with large volumes of data, we often need to summarize or categorize it into meaningful
groups.
Aggregate functions perform a calculation on a set of values (multiple rows) and return a single
result. Common aggregate functions in MySQL include SUM(), AVG(), COUNT(), MIN(), and
MAX(), which calculate totals, averages, and counts, and find the minimum or maximum value
in a dataset. They are typically used together with the GROUP BY clause to produce one
summary row per category or group. The general form is:
SELECT column1, column2, …, AGGREGATE_FUNCTION(column3)
FROM table_name
WHERE condition
GROUP BY column1, column2, …
HAVING condition
ORDER BY column1;
The five aggregate functions that we can use with the SQL GROUP BY clause are:
• AVG(): Calculates the average of the set of values.
• COUNT(): Returns the count of rows.
• SUM(): Calculates the arithmetic sum of the set of numeric values.
• MAX(): From a group of values, returns the maximum value.
• MIN(): From a group of values, returns the minimum value.
4) MIN(): This function returns the smallest (minimum) value within a set of values in a
specific column.
SELECT MIN(column_name)
FROM table_name
WHERE condition;
5) MAX(): This function returns the largest (maximum) value within a set of values in a
specific column.
SELECT MAX(column_name)
FROM table_name
WHERE condition;
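The five aggregate functions can be tried together in one query. The sketch below assumes a made-up orders table with an amount column; one row has a NULL amount to show that COUNT(column), SUM(), and AVG() skip NULL values while COUNT(*) counts every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the last row has no amount (NULL).
conn.execute("CREATE TABLE orders (order_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, None)],
)

# Each aggregate collapses all rows into a single value.
summary = conn.execute(
    "SELECT MIN(amount), MAX(amount), SUM(amount), AVG(amount), "
    "COUNT(amount), COUNT(*) FROM orders"
).fetchone()

minimum, maximum, total, average, non_null_count, row_count = summary
print(summary)
```

The average is 100 / 4 = 25.0, not 100 / 5, because the NULL amount is left out of both the sum and the count.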
Logical operators
In SQL, logical operators are used to create conditional expressions that evaluate to either true
or false. They are used in the WHERE clause of SELECT, UPDATE, DELETE, and other SQL
statements to filter data based on specified conditions.
1) AND Operator:
The AND operator is used to combine two or more conditions in an SQL query. It returns
records only when all conditions specified in the query are true. This operator is commonly
used when filtering data that must satisfy multiple criteria simultaneously.
Example
Retrieve the records of employees from the employee table who are located in 'Allahabad' and
belong to 'India', ensuring that both conditions are met.
Query:
SELECT * FROM employee WHERE emp_city = 'Allahabad' AND emp_country = 'India';
Output
Explanation:
In the output, both conditions (emp_city = 'Allahabad' and emp_country = 'India') are satisfied
for the listed employees, so these records are returned by the query.
2) OR Operator
The OR operator combines multiple conditions in a SQL query and returns TRUE if at least
one of the conditions is satisfied. It is ideal for situations where you want to retrieve records
that meet any of several possible conditions.
Example
Retrieve the records of employees from the employee table who are either from 'Varanasi' or
have 'India' as their country.
Query
SELECT * FROM employee WHERE emp_city = 'Varanasi' OR emp_country = 'India';
Output
Explanation:
In this case, the output includes employees from 'Varanasi' as well as those who have 'India' as
their country, even if they are from different cities. The query returns all records where at least
one of the conditions is true.
3) NOT Operator
The NOT operator is used to reverse the result of a condition, returning TRUE when the
condition is FALSE. It is typically used to exclude records that match a specific condition,
making it useful for filtering out unwanted data.
Example
Retrieve the records of employees from the employee table whose city names do not start with
the letter 'A'.
Query:
SELECT * FROM employee WHERE emp_city NOT LIKE 'A%';
Output
Explanation:
In this query, the NOT operator negates the LIKE condition. The LIKE operator is used to
match patterns in string data, and the 'A%' pattern matches any city name that starts with the
letter 'A'. By using the NOT operator, we exclude cities starting with 'A' from the result set.
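The three queries above can be run against a small sample table. The column names emp_city and emp_country come from the queries; emp_id, emp_name, and every row are assumptions made for this sketch, since the original output tables are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER, emp_name TEXT, emp_city TEXT, emp_country TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (101, "Utkarsh", "Varanasi", "India"),
        (102, "Abhi", "Varanasi", "India"),
        (103, "Priya", "Allahabad", "India"),
        (104, "Sam", "Austin", "USA"),
        (105, "Lena", "Paris", "France"),
    ],
)


def ids(where_clause):
    """Return the emp_id values of the rows that satisfy the condition."""
    sql = f"SELECT emp_id FROM employee WHERE {where_clause} ORDER BY emp_id"
    return [row[0] for row in conn.execute(sql)]


# AND: both conditions must hold.
and_ids = ids("emp_city = 'Allahabad' AND emp_country = 'India'")
# OR: at least one condition must hold.
or_ids = ids("emp_city = 'Varanasi' OR emp_country = 'India'")
# NOT: reverses the LIKE condition.
not_ids = ids("emp_city NOT LIKE 'A%'")

print(and_ids, or_ids, not_ids)
```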
Predicates
1) LIKE
The LIKE operator in SQL is used in the WHERE clause to search for a specified pattern in a
column. It is particularly useful when we want to perform pattern matching on string data.
The LIKE operator works with two main wildcards:
• %: Represents zero or more characters. It allows matching any sequence of characters in
the string.
• _: Represents exactly one character. It is used when you want to match a specific number
of characters at a given position.
Example
Retrieve the records of employees from the employee table whose city names start with the
letter 'P'.
Query:
SELECT * FROM employee WHERE emp_city LIKE 'P%';
Output
Explanation:
In this case, the output includes only those employees whose emp_city starts with 'P'.
The % wildcard ensures that the query matches any city name starting with the specified letter,
regardless of how many additional characters follow it.
2) BETWEEN
The BETWEEN operator in SQL allows us to test if a value or expression lies within
a specified range.
• Inclusive Range – Includes both the lower and upper limits in the result set.
• Versatile Use – Works with numbers, dates, and text values.
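Example
Retrieve the records of employees from the employee table whose emp_id lies between 101
and 104.
Query:
SELECT * FROM employee WHERE emp_id BETWEEN 101 AND 104;
Explanation: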
In this query, the BETWEEN operator is used to filter employees with emp_id values ranging
from 101 to 104. Since the BETWEEN operator is inclusive, employees with emp_id values
of 101, 102, 103, and 104 will be included in the result set.
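A short sketch of the LIKE wildcards and of the inclusive BETWEEN range, using a made-up employee table (the rows are assumptions). It also checks that BETWEEN gives the same rows as the pair of comparisons >= and <=.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, emp_city TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [
        (100, "Pune"), (101, "Varanasi"), (102, "Patna"),
        (103, "Delhi"), (104, "Paris"), (105, "Agra"),
    ],
)


def ids(where_clause):
    """Return the emp_id values of the rows that satisfy the condition."""
    sql = f"SELECT emp_id FROM employee WHERE {where_clause} ORDER BY emp_id"
    return [row[0] for row in conn.execute(sql)]


# % matches any number of characters: cities starting with 'P'.
starts_with_p = ids("emp_city LIKE 'P%'")
# _ matches exactly one character: cities whose second letter is 'a'.
second_letter_a = ids("emp_city LIKE '_a%'")
# BETWEEN includes both end points.
in_range = ids("emp_id BETWEEN 101 AND 104")
same_range = ids("emp_id >= 101 AND emp_id <= 104")

print(starts_with_p, second_letter_a, in_range)
```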
3) Alias
In a Database Management System (DBMS), an alias is a temporary, alternative name given to
a table or a column within a SQL query. This temporary name exists only for the duration of
that specific query and does not alter the original name of the table or column in the database
schema.
There are two types of aliases in SQL:
• Column Aliases: Temporary names for columns in the result set.
• Table Aliases: Temporary names for tables used within a query.
Example of SQL Aliases
We will use the following Customer table to demonstrate all SQL alias concepts. This table
contains customer information such as ID, name, country, age, and phone number.
CREATE TABLE Customer (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(50),
LastName VARCHAR(50),
Country VARCHAR(50),
Age INT,
Phone VARCHAR(15)
);
-- Inserting sample data into the Customer table
INSERT INTO Customer (CustomerID, CustomerName, LastName, Country, Age, Phone)
VALUES
(1, 'Shubham', 'Thakur', 'India', 23, '9876543210'),
(2, 'Aman', 'Chopra', 'Australia', 21, '9876543211'),
(3, 'Naveen', 'Tulasi', 'Sri Lanka', 24, '9876543212'),
(4, 'Aditya', 'Arpan', 'Austria', 21, '9876543213'),
(5, 'Nishant', 'Jain', 'Spain', 22, '9876543214');
1. Column Aliases
A column alias is used to rename a column just for the output of a query. It is useful for
making the column names in the result set more readable.
2. Table Aliases
A table alias is used when you want to give a table a temporary name for the duration of a
query. Table aliases are especially helpful in JOIN operations to simplify queries, particularly
when the same table is referenced multiple times (like in self-joins).
Example 2: Table Alias for Joining Tables
We want to join the Customer table with itself to find customers who have the same country
and are aged 21. We will use table aliases for each instance of the Customer table.
Query:
SELECT [Link], [Link]
FROM Customer AS c1, Customer AS c2
WHERE [Link] = [Link] AND [Link] = [Link];
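Both kinds of alias can be sketched on the Customer table defined above (LastName and Phone are left out to keep the example short). The self-join here is an illustrative query chosen for this sketch: it lists pairs of different customers who have the same age, and it is not necessarily the exact query of Example 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Subset of the Customer table from the text.
conn.execute(
    "CREATE TABLE Customer ("
    "CustomerID INT PRIMARY KEY, CustomerName VARCHAR(50), "
    "Country VARCHAR(50), Age INT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        (1, "Shubham", "India", 23),
        (2, "Aman", "Australia", 21),
        (3, "Naveen", "Sri Lanka", 24),
        (4, "Aditya", "Austria", 21),
        (5, "Nishant", "Spain", 22),
    ],
)

# Column aliases: the result columns are labelled Name and CustomerAge.
cursor = conn.execute(
    "SELECT CustomerName AS Name, Age AS CustomerAge FROM Customer"
)
alias_names = [col[0] for col in cursor.description]

# Table aliases: c1 and c2 refer to two copies of Customer in a self-join.
# Comparing CustomerID keeps each pair once and skips self-pairs.
same_age_pairs = conn.execute(
    "SELECT c1.CustomerName, c2.CustomerName "
    "FROM Customer AS c1, Customer AS c2 "
    "WHERE c1.Age = c2.Age AND c1.CustomerID < c2.CustomerID"
).fetchall()

print(alias_names)
print(same_age_pairs)
```

Without the aliases c1 and c2 there would be no way to say which copy of Customer a column such as Age belongs to.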
4) Distinct:
The SQL DISTINCT keyword is used to retrieve unique values from a specified column or set
of columns in a database table. It eliminates duplicate records, ensuring that only distinct, non-
repeated values are returned.
When you would use it
You would use the DISTINCT keyword when you want to eliminate duplicate values and
retrieve a list of unique values from one or more columns in a table. It is especially useful when
working with datasets that may contain redundant or repeated information.
Syntax
The syntax for using DISTINCT is as follows:
SELECT DISTINCT column1, column2, ...
FROM table_name
WHERE condition;
• column1, column2, ...: The column(s) from which you want to retrieve distinct values. These
columns should be part of the specified table.
• table_name: The name of the table containing the data.
• condition (optional): A condition that filters the rows before the distinct values are
returned.
Example 1: Fetch Unique Names from the NAME Field.
The query returns only unique names, eliminating the duplicate entries from the table.
Query:
SELECT DISTINCT NAME FROM students;
Clauses:
Structured Query Language (SQL) is a powerful language used to manage and manipulate
relational databases. One of the essential features of SQL is its clauses, which allow users to
filter, sort, group, and limit data efficiently. SQL clauses simplify querying and enhance
database performance by retrieving only the necessary records based on specific conditions.
1) GROUP BY CLAUSE
SQL GROUP BY is used to arrange identical data into groups. It is used with the SQL SELECT
statement. The GROUP BY statement follows the WHERE clause in a SELECT statement and
precedes the ORDER BY clause. It is also used with aggregation functions.
The syntax is as follows −
SELECT column FROM table_name WHERE conditions GROUP BY column ORDER BY
column;
The GROUP BY clause is used to group records with the same values in a column and
perform aggregate functions such as SUM(), COUNT(), etc. This example calculates
the total student fees per class.
Query:
SELECT stu_class, SUM(stu_fees) AS total_fees
FROM Students
GROUP BY stu_class;
2) HAVING CLAUSE
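The HAVING clause specifies a search condition for a group or an aggregate. It is generally
used together with the GROUP BY clause, because the WHERE clause filters individual rows
before grouping and cannot contain aggregate functions. If there is no GROUP BY clause,
some systems such as MySQL let HAVING filter rows much like a WHERE clause.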
The syntax is as follows −
SELECT column1, column2
FROM table_name
WHERE conditions
GROUP BY column1, column2
HAVING conditions
ORDER BY column1, column2;
Query:
SELECT company, COUNT(*) FROM product GROUP BY company
HAVING COUNT(*) > 1;
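The effect of HAVING can be seen by running the same grouping with and without it. The product table below is a sketch: the company column comes from the query above, while the product_name column and the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_name TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [
        ("P1", "Acme"), ("P2", "Acme"),
        ("P3", "Bolt"),
        ("P4", "Core"), ("P5", "Core"), ("P6", "Core"),
    ],
)

# GROUP BY alone: one row per company with its product count.
per_company = conn.execute(
    "SELECT company, COUNT(*) FROM product "
    "GROUP BY company ORDER BY company"
).fetchall()

# HAVING filters the groups after the counts have been computed.
more_than_one = conn.execute(
    "SELECT company, COUNT(*) FROM product "
    "GROUP BY company HAVING COUNT(*) > 1 ORDER BY company"
).fetchall()

print(per_company)
print(more_than_one)
```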
3) ORDER BY Clause
This clause sorts the result in either ascending or descending order. If no order is specified, it
sorts in ascending order by default. ASC and DESC are the keywords used to order the
records.
The syntax is as follows −
SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;
Sorting in ascending order
Use the following command to sort in ascending order −
SELECT *
FROM product
ORDER BY company;
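A minimal sketch of sorting in both directions, assuming a made-up product table (the product_name column and the rows are illustrative). Leaving out the keyword gives the same order as writing ASC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (product_name TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("Pen", "Bolt"), ("Cup", "Acme"), ("Bag", "Core")],
)


def companies(order_clause):
    """Return the company column in the order produced by the clause."""
    sql = f"SELECT company FROM product ORDER BY {order_clause}"
    return [row[0] for row in conn.execute(sql)]


default_order = companies("company")      # ascending by default
ascending = companies("company ASC")
descending = companies("company DESC")

print(default_order, descending)
```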