lec07 (1)
Relational Algebra
CS3402 1
Relational Algebra
Relational algebra: a formal language for the relational model
The operations in relational algebra enable a user to specify basic
retrieval requests (or queries)
Importance of Relational Algebra
Foundation of SQL: Relational algebra forms the theoretical
foundation of SQL. SQL is a practical implementation of the
concepts and operations defined in relational algebra. By learning
relational algebra, you gain a deeper understanding of the
fundamental principles that underpin SQL, allowing you to write
better SQL queries
Relational Algebra Overview
Relational algebra consists of several groups of operations
Unary Relational Operations
SELECT (symbol: σ (sigma))
PROJECT (symbol: π (pi))
RENAME (symbol: ρ (rho))
Binary Relational Operations
JOIN (several variations of JOIN exist)
DIVISION
Relational algebra operations from set theory
UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, − )
CARTESIAN PRODUCT ( x )
Additional Relational Operations
AGGREGATE FUNCTIONS (these compute summary information,
for example SUM, COUNT, AVG, MIN, MAX)
Database State for COMPANY
All examples discussed below refer to the COMPANY database
shown here
The following query results refer to this
database state
Unary Relational Operations: SELECT
The SELECT operation (denoted by σ (sigma)) is used to select a
subset of the tuples from a relation based on a selection condition.
Example 1:
Select the EMPLOYEE tuples whose department number is 4:
σ DNO=4 (EMPLOYEE)
Equivalent to:
SELECT *
FROM EMPLOYEE
WHERE DNO=4;
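The SELECT operation can also be sketched in Python as a filter over a list of tuples. This is only an illustration: the rows below are made-up toy values (not the COMPANY database state), and the helper name select is an assumption introduced here.

```python
# Toy EMPLOYEE rows (hypothetical values, not the COMPANY database state).
EMPLOYEE = [
    {"FNAME": "Ann", "DNO": 4, "SALARY": 26000},
    {"FNAME": "Bob", "DNO": 5, "SALARY": 31000},
    {"FNAME": "Cy", "DNO": 4, "SALARY": 24000},
]


def select(relation, condition):
    """sigma: keep exactly the tuples for which condition(t) is true."""
    return [t for t in relation if condition(t)]


# sigma DNO=4 (EMPLOYEE): every attribute is kept, only the tuples are filtered.
dept4 = select(EMPLOYEE, lambda t: t["DNO"] == 4)
print([t["FNAME"] for t in dept4])
```

Note that SELECT never changes the attributes of the relation; it only decides which tuples survive.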
Example 2:
Select the employee tuples whose department number is 4 and
salary is greater than $25,000, or whose department number is 5 and
salary is greater than $30,000:
σ (Dno=4 AND Salary>25000) OR
(Dno=5 AND Salary>30000) (EMPLOYEE)
Equivalent to:
SELECT *
FROM EMPLOYEE
WHERE (Dno=4 AND
Salary>25000) OR
(Dno=5 AND
Salary>30000);
SELECT Operation Properties
SELECT is commutative:
σ<condition1>(σ<condition2>(R)) = σ<condition2>(σ<condition1>(R))
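Commutativity of SELECT can be checked on a small example. The sketch below uses toy rows and a hypothetical select helper; it also compares both orders against a single SELECT whose condition is the AND of the two conditions, which gives the same result.

```python
# Toy relation R (hypothetical values).
R = [
    {"Dno": 4, "Salary": 26000},
    {"Dno": 4, "Salary": 24000},
    {"Dno": 5, "Salary": 31000},
]


def select(relation, condition):
    """sigma: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def c1(t):
    return t["Dno"] == 4


def c2(t):
    return t["Salary"] > 25000


one_order = select(select(R, c2), c1)      # sigma c1 (sigma c2 (R))
other_order = select(select(R, c1), c2)    # sigma c2 (sigma c1 (R))
combined = select(R, lambda t: c1(t) and c2(t))  # sigma (c1 AND c2) (R)
```

Because the order does not matter, a query processor is free to apply the cheaper or more selective condition first.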
Unary Relational Operations: PROJECT
PROJECT Operation is denoted by π (pi)
Example: To list each employee’s first and last name and salary,
the following is used:
π LNAME, FNAME, SALARY (EMPLOYEE)
Equivalent to:
SELECT LNAME,
FNAME, SALARY
FROM EMPLOYEE;
Examples of applying SELECT and
PROJECT operations
The project operation removes any duplicate tuples
This is because the result of the project operation must be a set of
tuples
Mathematical sets do not allow duplicate elements
Example: π Sex, Salary (EMPLOYEE)
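Duplicate elimination in PROJECT can be illustrated with a short Python sketch. The rows are toy values and the helper name project is an assumption; two employees share the same (Sex, Salary) pair, so the projection has fewer tuples than EMPLOYEE.

```python
# Toy EMPLOYEE rows (hypothetical values).
EMPLOYEE = [
    {"Fname": "Ann", "Sex": "F", "Salary": 25000},
    {"Fname": "Bea", "Sex": "F", "Salary": 25000},
    {"Fname": "Cal", "Sex": "M", "Salary": 30000},
]


def project(relation, attributes):
    """pi: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attributes)
        if key not in seen:          # set semantics: each tuple appears once
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


sex_salary = project(EMPLOYEE, ["Sex", "Salary"])
print(sex_salary)
```

Plain SQL keeps duplicates unless DISTINCT is written, so the closer SQL counterpart of this projection is SELECT DISTINCT Sex, Salary FROM EMPLOYEE.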
Relational Algebra Expressions
We may want to apply several relational algebra operations one
after the other
Either we can write the operations as a single relational algebra
expression by nesting the operations, or
We can apply one operation at a time and create intermediate
result relations
Single expression versus sequence of
relational operations
Example: To retrieve the first name, last name, and salary of all
employees who work in department number 5, we must apply a
select and a project operation
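The two styles can be compared side by side in Python. This is a sketch under assumptions: the rows are toy values, the helpers select and project are hypothetical, and the intermediate name DEP5_EMPS is chosen here for illustration. In algebra notation the nested form would read π Fname, Lname, Salary (σ Dno=5 (EMPLOYEE)).

```python
# Toy EMPLOYEE rows (hypothetical values).
EMPLOYEE = [
    {"Fname": "Ann", "Lname": "Lee", "Salary": 30000, "Dno": 5},
    {"Fname": "Bob", "Lname": "Ray", "Salary": 40000, "Dno": 4},
    {"Fname": "Cy", "Lname": "Kay", "Salary": 25000, "Dno": 5},
]


def select(relation, condition):
    """sigma: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def project(relation, attributes):
    """pi: keep the listed attributes, dropping duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result


# Style 1: a single nested expression.
nested = project(select(EMPLOYEE, lambda t: t["Dno"] == 5),
                 ["Fname", "Lname", "Salary"])

# Style 2: one operation at a time, naming the intermediate relation.
DEP5_EMPS = select(EMPLOYEE, lambda t: t["Dno"] == 5)
RESULT = project(DEP5_EMPS, ["Fname", "Lname", "Salary"])
```

Both styles produce the same relation; the second is often easier to read when the query has many steps.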
Unary Relational Operations: RENAME
The RENAME operator is denoted by ρ (rho)
If we write:
• ρ R(First_name, Last_name, Salary) (RESULT)
• The 3 attributes of RESULT are renamed to First_name,
Last_name and Salary, respectively; and R is the name of the
result relation.
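A minimal Python sketch of RENAME, assuming tuples are stored as dictionaries whose keys appear in attribute order. The helper name rename and the toy row are assumptions; the attribute values are untouched, only the names change.

```python
# Toy RESULT relation (hypothetical values).
RESULT = [{"Fname": "Ann", "Lname": "Lee", "Salary": 30000}]


def rename(relation, new_attributes):
    """rho: rename attributes by position; tuple values stay the same."""
    return [dict(zip(new_attributes, t.values())) for t in relation]


# rho R(First_name, Last_name, Salary) (RESULT)
R = rename(RESULT, ["First_name", "Last_name", "Salary"])
print(R)
```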
Example of applying multiple operations
and RENAME
Relational Algebra Operations from
Set Theory: UNION
UNION Operation
Binary operation, denoted by ∪
Example:
To retrieve the social security numbers of all employees who
either work in department 5 (RESULT1 below) or directly
supervise an employee who works in department 5 (RESULT2
below)
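The UNION example can be sketched with Python sets. The rows are toy values, and the attribute names Ssn, Super_ssn and Dno are assumptions about the EMPLOYEE schema; employees without a supervisor are skipped here as a simplification.

```python
# Toy EMPLOYEE rows (hypothetical values; Super_ssn is the supervisor's Ssn).
EMPLOYEE = [
    {"Ssn": "111", "Super_ssn": "333", "Dno": 5},
    {"Ssn": "222", "Super_ssn": "333", "Dno": 5},
    {"Ssn": "333", "Super_ssn": "444", "Dno": 5},
    {"Ssn": "444", "Super_ssn": None, "Dno": 1},
]

# Employees who work in department 5.
dep5_emps = [t for t in EMPLOYEE if t["Dno"] == 5]

# RESULT1: Ssn of everyone in department 5.
RESULT1 = {t["Ssn"] for t in dep5_emps}

# RESULT2: Ssn of everyone who directly supervises a department-5 employee.
RESULT2 = {t["Super_ssn"] for t in dep5_emps if t["Super_ssn"] is not None}

# RESULT = RESULT1 UNION RESULT2; a value in both operands appears only once.
RESULT = RESULT1 | RESULT2
print(sorted(RESULT))
```

Employee 444 works in department 1 but appears in the result because 444 supervises a department-5 employee.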
Relational Algebra Operations from Set
Theory: INTERSECTION
INTERSECTION is denoted by ∩
Relational Algebra Operations from Set
Theory: SET DIFFERENCE
SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by –
INTERSECTION can be expressed in terms of UNION and SET DIFFERENCE:
R ∩ S = ((R ∪ S) − (R − S)) − (S − R)
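The way INTERSECTION can be rebuilt from UNION and SET DIFFERENCE can be checked with Python's built-in set operators (| for union, & for intersection, - for difference). The two relations below are toy values.

```python
# Two type-compatible toy relations, stored as sets of tuples.
R = {("a", 1), ("b", 2), ("c", 3)}
S = {("a", 1), ("d", 4)}

# (R UNION S) - (R - S) leaves exactly S; removing (S - R) then leaves
# only the tuples that are in both R and S.
via_identity = ((R | S) - (R - S)) - (S - R)

# Equivalent form: remove everything that is in only one of the operands.
via_symmetric = (R | S) - ((R - S) | (S - R))

print(via_identity)
```

Set difference, unlike union and intersection, is not commutative: R - S and S - R are different relations in general.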
Example to illustrate the result of UNION,
INTERSECT, and DIFFERENCE
Requirements of UNION, INTERSECT, and
DIFFERENCE
Type compatibility of operands is required for the binary set
operation UNION ∪ (and also for INTERSECTION ∩ and SET
DIFFERENCE −)
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are type compatible if:
they have the same number of attributes, and
the domains of corresponding attributes are type compatible
(i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n)
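The type-compatibility test can be written as a small Python check. This is a sketch under assumptions: a schema is modelled as a list of (attribute name, domain) pairs, Python types stand in for domains, and the three example schemas are hypothetical.

```python
def type_compatible(schema_r, schema_s):
    """True if both schemas have the same degree and matching domains.

    Attribute names do not have to match; only position and domain count.
    """
    if len(schema_r) != len(schema_s):
        return False
    return all(dom_r == dom_s
               for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s))


# Hypothetical schemas for illustration.
STUDENT = [("Fn", str), ("Ln", str)]
INSTRUCTOR = [("Fname", str), ("Lname", str)]
PROJECT = [("Pname", str), ("Pnumber", int)]

print(type_compatible(STUDENT, INSTRUCTOR))
print(type_compatible(STUDENT, PROJECT))
```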
Properties of UNION, INTERSECT, and
DIFFERENCE
Notice that both union and intersection are commutative operations;
that is
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
Relational Algebra Operations from Set
Theory: CARTESIAN PRODUCT
CARTESIAN (or CROSS) PRODUCT Operation
This operation is used to combine tuples from two relations in a
combinatorial fashion
Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)
Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order
The resulting relation state has one tuple for each combination
of tuples - one from R and one from S
Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS
tuples, then R x S will have nR * nS tuples
The two operands R and S do NOT have to be "type
compatible"
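The size and attribute order of a CARTESIAN PRODUCT can be seen in a few lines of Python. The sketch assumes the two relations have disjoint attribute names and uses toy values; cartesian_product is a hypothetical helper.

```python
from itertools import product

# Toy relations with disjoint attribute names.
R = [{"A1": 1}, {"A1": 2}, {"A1": 3}]   # nR = 3 tuples
S = [{"B1": "x"}, {"B1": "y"}]          # nS = 2 tuples


def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s (attributes of r first)."""
    return [{**tr, **ts} for tr, ts in product(r, s)]


Q = cartesian_product(R, S)
print(len(Q))
```

The operands here have different degrees and domains, which is fine because type compatibility is not required.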
Generally, CROSS PRODUCT is not a meaningful operation
Some tuples in the result do not exist in the mini-world
Can become meaningful when followed by other operations
Figure 8.5 The CARTESIAN PRODUCT
(CROSS PRODUCT) operation
To keep only combinations where the DEPENDENT is related to
the EMPLOYEE, we add a SELECT operation as follows
Example (meaningful):
FEMALE_EMPS ← σ SEX=‘F’ (EMPLOYEE)
EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPS ← σ SSN=ESSN (EMP_DEPENDENTS)
RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPS)
RESULT will now contain the names of female employees and their
dependents
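The five steps can be traced in Python on toy data. The rows are made-up values (not the COMPANY database state), and duplicate elimination in the projections is omitted because these rows produce no duplicates.

```python
# Toy rows (hypothetical values).
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SSN": "111", "SEX": "F"},
    {"FNAME": "Bob", "LNAME": "Ray", "SSN": "222", "SEX": "M"},
]
DEPENDENT = [
    {"ESSN": "111", "DEPENDENT_NAME": "Kim"},
    {"ESSN": "222", "DEPENDENT_NAME": "Max"},
]

# Step 1: SELECT the female employees.
FEMALE_EMPS = [t for t in EMPLOYEE if t["SEX"] == "F"]
# Step 2: PROJECT the attributes needed later.
EMPNAMES = [{a: t[a] for a in ("FNAME", "LNAME", "SSN")} for t in FEMALE_EMPS]
# Step 3: CARTESIAN PRODUCT pairs each employee with every dependent.
EMP_DEPENDENTS = [{**e, **d} for e in EMPNAMES for d in DEPENDENT]
# Step 4: SELECT only the pairs that belong together.
ACTUAL_DEPS = [t for t in EMP_DEPENDENTS if t["SSN"] == t["ESSN"]]
# Step 5: PROJECT the final attributes.
RESULT = [{a: t[a] for a in ("FNAME", "LNAME", "DEPENDENT_NAME")}
          for t in ACTUAL_DEPS]
print(RESULT)
```

After step 3 Ann is paired with Max, who is not her dependent; step 4 is what removes such meaningless combinations.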
Binary Relational Operations: JOIN
JOIN Operation (denoted by ⋈)
The sequence of CARTESIAN PRODUCT followed by SELECT
is used quite commonly to identify and select related tuples from
two relations
A special operation, called JOIN combines this sequence into a
single operation
The general form of a join operation on two relations
R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R ⋈<join condition> S
Example: Suppose that we want to retrieve the name of the manager
of each department
To get the manager’s name, we need to combine each
DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value
matches the MGRSSN value in the department tuple.
DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
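The manager example can be sketched in Python, together with a check that the JOIN gives the same result as a CARTESIAN PRODUCT followed by a SELECT. The rows are toy values and the helper name join is an assumption.

```python
# Toy rows (hypothetical values).
DEPARTMENT = [
    {"DNAME": "Research", "MGRSSN": "333"},
    {"DNAME": "Admin", "MGRSSN": "444"},
]
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Lee"},
    {"SSN": "333", "LNAME": "Ray"},
    {"SSN": "444", "LNAME": "Kay"},
]


def join(r, s, condition):
    """Combine a tuple of r with a tuple of s only if the pair satisfies condition."""
    return [{**tr, **ts} for tr in r for ts in s if condition(tr, ts)]


# DEPT_MGR = DEPARTMENT JOIN(MGRSSN=SSN) EMPLOYEE
DEPT_MGR = join(DEPARTMENT, EMPLOYEE, lambda d, e: d["MGRSSN"] == e["SSN"])

# The same result as a product followed by a selection.
CROSS = [{**d, **e} for d in DEPARTMENT for e in EMPLOYEE]
SELECTED = [t for t in CROSS if t["MGRSSN"] == t["SSN"]]

print([(t["DNAME"], t["LNAME"]) for t in DEPT_MGR])
```

The product has 2 * 3 = 6 tuples, while the join keeps only the 2 that satisfy the join condition.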
Figure 8.6 Result of the JOIN operation
Some properties of JOIN
Consider the following JOIN operation:
R(A1, A2, . . ., An) ⋈ R.Ai=S.Bj S(B1, B2, . . ., Bm)
Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order
The resulting relation state has one tuple for each combination
of tuples: r from R and s from S, but only if they satisfy the join
condition r[Ai]=s[Bj]
Hence, if R has nR tuples, and S has nS tuples, then the join
result will generally have fewer than nR * nS tuples.
Theta-join
The general case of JOIN operation is called a Theta-join:
R ⋈ θ S
The join condition is called theta
Theta can be any general boolean expression on the attributes of R
and S; for example:
R.Ai < S.Bj AND (R.Ak=S.Bl OR R.Ap<S.Bq)
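A theta-join with a non-equality condition can be sketched as follows. The relations are toy values and theta_join is a hypothetical helper; the condition is passed in as an arbitrary boolean function of one tuple from each relation.

```python
# Toy relations (hypothetical values).
R = [{"A": 1}, {"A": 5}]
S = [{"B": 3}, {"B": 7}]


def theta_join(r, s, theta):
    """Keep each combined tuple for which theta(tr, ts) is true."""
    return [{**tr, **ts} for tr in r for ts in s if theta(tr, ts)]


# R JOIN(R.A < S.B) S: an inequality instead of an equality comparison.
Q = theta_join(R, S, lambda tr, ts: tr["A"] < ts["B"])
print(Q)
```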
EQUIJOIN
EQUIJOIN Operation
The most common use of join involves join conditions with equality
comparisons only
NATURAL JOIN Operation
NATURAL JOIN Operation
Another variation of JOIN, called NATURAL JOIN and denoted by *,
was created to get rid of the second attribute of each pair in
an EQUIJOIN condition,
because one attribute of each pair with identical values
is superfluous
The standard definition of natural join requires that the two join
attributes, or each pair of corresponding join attributes, have
the same name in both relations.
e.g. Q ← R(A,B,C,D) * S(C,D,E)
The implicit join condition includes each pair of attributes with
the same name, “AND”ed together: R.C=S.C AND R.D=S.D
Result keeps only one attribute of each such pair:
Q(A,B,C,D,E)
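The R(A,B,C,D) * S(C,D,E) example can be sketched in Python. The tuples are toy values and natural_join is a hypothetical helper; it finds the attribute names common to both relations, requires equality on all of them, and keeps a single copy of each.

```python
# Toy relations matching the schemas R(A,B,C,D) and S(C,D,E).
R = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 5, "B": 6, "C": 7, "D": 8},
]
S = [
    {"C": 3, "D": 4, "E": 9},
    {"C": 7, "D": 0, "E": 10},
]


def natural_join(r, s):
    """Equijoin on all same-named attributes, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]   # here: C and D
    return [{**tr, **ts}                       # merging keeps one C and one D
            for tr in r for ts in s
            if all(tr[a] == ts[a] for a in common)]


Q = natural_join(R, S)
print(Q)
```

The second tuple of R matches S on C but not on D, so it is dropped: both conditions are ANDed together.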
NATURAL JOIN
Example: Suppose we want to combine each PROJECT tuple with
the DEPARTMENT controlling it.
The attribute Dnum is called the join attribute for NATURAL JOIN,
because it is the attribute with the same name in both relations.
Example of NATURAL JOIN operation
Binary Relational Operations: DIVISION
DIVISION Operation
The division operation is applied to two relations
R(Z) ÷ S(X), where X is a subset of Z
Let Y = Z - X (and hence Z = X ∪ Y); that is, let Y be the set of
attributes of R that are not attributes of S
Example: retrieve the Social Security numbers of employees who
work on all the projects that ‘John Smith’ works on.
First, retrieve the list of project numbers that ‘John Smith’ works on
in the intermediate relation SMITH_PNOS:
SMITH ← σ Fname=‘John’ AND Lname=‘Smith’ (EMPLOYEE)
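SMITH_PNOS ← π Pno (WORKS_ON ⋈ Essn=Ssn SMITH)
Next, create a relation that includes a tuple <Essn, Pno> whenever
the employee whose Ssn is Essn works on the project whose number is Pno:
SSN_PNOS ← π Essn, Pno (WORKS_ON)
Finally, apply the DIVISION operation to the two relations:
SSNS(Ssn) ← SSN_PNOS ÷ SMITH_PNOS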
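The DIVISION step can be sketched in Python. The WORKS_ON rows are toy values, the assumption that 'John Smith' has Ssn 111 is made up for this illustration, and divide is a hypothetical helper that keeps each Essn paired with every Pno in the divisor.

```python
# Toy WORKS_ON rows (hypothetical; assume 'John Smith' has Ssn "111").
WORKS_ON = [
    {"Essn": "111", "Pno": 1}, {"Essn": "111", "Pno": 2},
    {"Essn": "222", "Pno": 1}, {"Essn": "222", "Pno": 2},
    {"Essn": "222", "Pno": 3},
    {"Essn": "333", "Pno": 1},
]

# Project numbers that Smith works on.
SMITH_PNOS = {t["Pno"] for t in WORKS_ON if t["Essn"] == "111"}
# All <Essn, Pno> pairs.
SSN_PNOS = {(t["Essn"], t["Pno"]) for t in WORKS_ON}


def divide(pairs, divisor):
    """Return each y such that (y, x) is in pairs for every x in divisor."""
    candidates = {y for y, _ in pairs}
    return {y for y in candidates
            if all((y, x) in pairs for x in divisor)}


SSNS = divide(SSN_PNOS, SMITH_PNOS)
print(sorted(SSNS))
```

Employee 222 qualifies even though 222 also works on project 3; employee 333 does not, because project 2 is missing. Smith appears in the result too.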
Complete Set of Relational Operations
The set of operations including SELECT σ, PROJECT π, UNION ∪,
DIFFERENCE −, RENAME ρ, and CARTESIAN PRODUCT × is
called a complete set because any other relational algebra
operation can be expressed by a combination of these SIX
operations
For example:
R ∩ S = (R ∪ S) − ((R − S) ∪ (S − R))
R ⋈<join condition> S = σ<join condition> (R × S)
Table 8.1 Operations of Relational
Algebra
Additional operators: Grouping and
Aggregate Functions
We can define an AGGREGATE FUNCTION operation, using the
symbol ℱ (pronounced script F) as follows:
<grouping attributes> ℱ <function list> (R)
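Grouping with aggregate functions can be sketched in Python. This is an illustration under assumptions: the rows are toy values, Dno is chosen as the grouping attribute, and COUNT and AVERAGE of Salary are chosen as the function list.

```python
from collections import defaultdict

# Toy EMPLOYEE rows (hypothetical values).
EMPLOYEE = [
    {"Dno": 4, "Salary": 20000},
    {"Dno": 4, "Salary": 30000},
    {"Dno": 5, "Salary": 40000},
]

# Group the salaries by the grouping attribute Dno.
groups = defaultdict(list)
for t in EMPLOYEE:
    groups[t["Dno"]].append(t["Salary"])

# Apply the aggregate functions to each group: one result tuple per group.
RESULT = [
    {"Dno": dno, "COUNT": len(salaries), "AVG_Salary": sum(salaries) / len(salaries)}
    for dno, salaries in sorted(groups.items())
]
print(RESULT)
```

In SQL the same idea is written with GROUP BY Dno together with COUNT(*) and AVG(Salary).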
References
6e
Ch. 6, p. 141-157, 167-170