Module 2 Part2

The document discusses various relational algebra operations like select, project, rename, cartesian product, join, union, intersection and set difference. It explains each operation in detail along with examples.

Uploaded by

gautham
Copyright
© © All Rights Reserved

MODULE 2: RELATIONAL ALGEBRA

PART 2
SYLLABUS
• Structure of Relational Databases - Integrity
Constraints, Synthesizing ER diagram to
relational schema
• Introduction to Relational Algebra - select,
project, cartesian product operations, join -
Equi-join, natural join. Query examples
• Introduction to Structured Query Language
(SQL), Data Definition Language (DDL), Table
definitions and operations – CREATE, DROP,
ALTER, INSERT, DELETE, UPDATE.
Introduction to Relational Algebra
• Select
• Project
• Rename
• Cartesian product operations
• Join - Equi-join
• Natural join
• Query examples
Relational algebra
• Relational algebra is a procedural query
language, which takes instances of relations as
input and yields instances of relations as output.
• It uses operators to perform queries. An
operator can be either unary or binary.
• They accept relations as their input and yield
relations as their output.
• Relational algebra is performed recursively on a
relation and intermediate results are also
considered as relations.
Unary Relational Operations:
1. SELECT
2. PROJECT
3. RENAME
SELECT Operation
• The SELECT operation is used to choose a subset of
the tuples from a relation that satisfies a selection
condition.
• It can be viewed as a horizontal partition of the
relation into two sets of tuples:
• Those tuples that satisfy the condition are
selected, and those tuples that do not satisfy the
condition are discarded.
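• The SELECT operation is denoted by
σ<selection condition>(R)
where the symbol σ (sigma) denotes the SELECT
operator and the selection condition is a Boolean
expression on the attributes of relation R.
• R is generally a relational algebra expression
whose result is a relation; the simplest such
expression is just the name of a database
relation.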
• The relation resulting from the SELECT
operation has the same attributes as R.
• Select the EMPLOYEE tuples whose department
is 4.
• Select the EMPLOYEE tuples whose salary is
greater than $30,000.
• The result of a SELECT operation can be
determined as follows:
• The <selection condition> is applied
independently to each individual tuple t in R.
• This is done by substituting each occurrence of
an attribute Ai in the selection condition with its
value in the tuple t[Ai].
• If the condition evaluates to TRUE, then tuple t
is selected.
• All the selected tuples appear in the result of the
SELECT operation.
• The Boolean expression specified in <selection
condition> is made up of a number of clauses
of the form
<attribute name> <comparison op> <constant value>
or
<attribute name> <comparison op> <attribute name>
• Clauses can be connected by the standard
Boolean operators AND, OR, and NOT to form a
general selection condition.
• Select the tuples for all employees who either
work in department 4 and make over $25,000
per year, or work in department 5 and make over
$30,000
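The three SELECT examples above can be sketched in Python. This is only an illustration, not part of the original slides: a relation is modelled as a list of dictionaries, the attribute names Dno and Salary are assumed, and the sample rows are made up.

```python
# Made-up sample rows; the attribute names Dno and Salary are assumptions.
EMPLOYEE = [
    {"Lname": "Smith", "Dno": 5, "Salary": 30000},
    {"Lname": "Wong", "Dno": 5, "Salary": 40000},
    {"Lname": "Wallace", "Dno": 4, "Salary": 43000},
    {"Lname": "Jabbar", "Dno": 4, "Salary": 25000},
]


def select(relation, condition):
    """SELECT: apply the condition to each tuple independently, keep the TRUE ones."""
    return [t for t in relation if condition(t)]


# Employees whose department is 4.
dept4 = select(EMPLOYEE, lambda t: t["Dno"] == 4)

# Employees whose salary is greater than $30,000.
high_paid = select(EMPLOYEE, lambda t: t["Salary"] > 30000)

# (department 4 and over $25,000) or (department 5 and over $30,000).
mixed = select(
    EMPLOYEE,
    lambda t: (t["Dno"] == 4 and t["Salary"] > 25000)
    or (t["Dno"] == 5 and t["Salary"] > 30000),
)
```

Every result keeps all attributes of EMPLOYEE; only the number of tuples changes.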
Interpretation of Boolean conditions
AND, OR, and NOT
• (cond1 AND cond2) is TRUE
▫ if both (cond1) and (cond2) are TRUE;
▫ otherwise, it is FALSE.
• (cond1 OR cond2) is TRUE
▫ if either (cond1) or (cond2) or both are
TRUE;
▫ otherwise, it is FALSE.
• (NOT cond) is TRUE
▫ if cond is FALSE;
▫ otherwise, it is FALSE.
Degree of the relation from SELECT
operation
• The degree of a relation is its number of attributes.
• The degree of the relation resulting from a SELECT
operation is the same as the degree of R.
• The number of tuples in the resulting relation is
always less than or equal to the number of
tuples in R.
Commutative Property of SELECT
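• The SELECT operation is commutative:
σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
• Hence, a sequence of SELECTs can be applied in any
order.
• We can always combine a cascade
(sequence) of SELECT operations into
a single SELECT operation with a
conjunctive (AND) condition; that is,
σ<cond1>(σ<cond2>(...(σ<condn>(R))...)) = σ<cond1> AND <cond2> AND ... AND <condn>(R)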
Properties of SELECT
• Select operation produces a relation S that has
the same schema as R.
• Select is commutative
• Select operations may be applied in any order.
• Cascade of select operations may be replaced by a
single selection with a conjunction of all the
conditions.
• The number of tuples in the result of a select
operation is less than or equal to the number of
tuples in the input relation R.
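These properties can be checked on a small example. The sketch below is illustrative only: the relation R, its attribute names, and the two conditions are made up for the purpose.

```python
# Made-up relation with six tuples; attribute names are assumptions.
R = [{"Dno": d, "Salary": s} for d in (4, 5) for s in (20000, 30000, 40000)]


def select(relation, condition):
    """SELECT: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def in_dept_5(t):
    return t["Dno"] == 5


def earns_over_25000(t):
    return t["Salary"] > 25000


# Commutativity: the two SELECTs can be applied in either order.
one_order = select(select(R, earns_over_25000), in_dept_5)
other_order = select(select(R, in_dept_5), earns_over_25000)

# Cascade: the same result from a single SELECT with an AND condition.
single = select(R, lambda t: in_dept_5(t) and earns_over_25000(t))
```

All three results contain the same two tuples, and none of them has more tuples than R.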
PROJECT Operation
• PROJECT operation selects certain columns
from the table and discards the other columns.
• If we are interested in only certain attributes of a
relation, we use the PROJECT operation to
project the relation over these attributes only
• Vertical partition of the relation into two
relations
▫ one has the needed columns (attributes) and
contains the result of the operation, and
▫ the other contains the discarded columns.
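• The general form of the PROJECT operation is
π<attribute list>(R)
where π (pi) denotes the PROJECT operator and
<attribute list> is the desired sublist of
attributes of R.
• Example: List each employee’s last name, first
name and salary.
• The result of the PROJECT operation has only
the attributes specified in <attribute list>, in the
same order as they appear in the list.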
• Hence, its degree is equal to the number of
attributes in <attribute list>.
Properties
• The PROJECT operation removes any duplicate
tuples, so the result of the PROJECT operation is
a set of distinct tuples, and hence a valid relation.
• The number of tuples in a relation resulting from
a PROJECT operation is always less than or equal
to the number of tuples in R.
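• Commutativity does not hold for PROJECT:
π<list1>(π<list2>(R)) = π<list1>(R) holds only if
<list2> contains the attributes in <list1>;
otherwise the left-hand side is not a valid expression.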
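A minimal Python sketch of PROJECT, written for illustration: it keeps the listed attributes in the listed order and removes duplicate tuples. The sample rows and attribute names are made up.

```python
# Made-up sample rows; attribute names are assumptions.
EMPLOYEE = [
    {"Fname": "Ana", "Lname": "Diaz", "Salary": 30000, "Dno": 5},
    {"Fname": "Ben", "Lname": "Okoro", "Salary": 40000, "Dno": 5},
    {"Fname": "Cai", "Lname": "Lin", "Salary": 25000, "Dno": 5},
    {"Fname": "Dev", "Lname": "Rao", "Salary": 25000, "Dno": 5},
]


def project(relation, attributes):
    """PROJECT: keep only the listed attributes, in list order, without duplicates."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # duplicate elimination keeps the result a set
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


names_and_salary = project(EMPLOYEE, ["Lname", "Fname", "Salary"])
dept_and_salary = project(EMPLOYEE, ["Dno", "Salary"])
```

Projecting on Dno and Salary returns three tuples rather than four, because two employees share the same department and salary.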
Sequences of Operations
• For most queries, we need to apply several
relational algebra operations one after the other.
• Either we can write the operations as a single
relational algebra expression by nesting the
operations, or we can apply one operation at a
time and create intermediate results.
• In the latter case, we must give names to the
relations that hold the intermediate results.
Example: Retrieve the first name, last
name, and salary of all employees who work
in department number 5.

• We must apply a SELECT and a PROJECT
operation.
Example- Intermediate Relation

• It is simpler to break down a complex sequence
of operations by specifying intermediate results
than to write a single relational algebra
expression.
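The two styles can be compared in a short Python sketch. It is illustrative only: the helper functions, the sample rows, and the attribute names Fname, Lname, Salary and Dno are assumptions; DEP5_EMPS is used as the name of the intermediate relation.

```python
# Made-up sample rows; attribute names are assumptions.
EMPLOYEE = [
    {"Fname": "Ana", "Lname": "Diaz", "Salary": 30000, "Dno": 5},
    {"Fname": "Ben", "Lname": "Okoro", "Salary": 43000, "Dno": 4},
    {"Fname": "Cai", "Lname": "Lin", "Salary": 38000, "Dno": 5},
]


def select(relation, condition):
    return [t for t in relation if condition(t)]


def project(relation, attributes):
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:  # remove duplicates
            result.append(row)
    return result


# Style 1: a single nested expression.
nested = project(
    select(EMPLOYEE, lambda t: t["Dno"] == 5), ["Fname", "Lname", "Salary"]
)

# Style 2: one operation at a time, naming the intermediate relation.
DEP5_EMPS = select(EMPLOYEE, lambda t: t["Dno"] == 5)
RESULT = project(DEP5_EMPS, ["Fname", "Lname", "Salary"])
```

Both styles give the same relation; the second only adds a name for the intermediate step.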
RENAME Operation
• To rename the attributes in a relation, we simply
list the new attribute names in parentheses after
the relation name.
• If no renaming is applied, the names of the
attributes in the resulting relation of a SELECT
operation are the same as those in the original
relation and in the same order.
• For a PROJECT operation with no renaming, the
resulting relation has the same attribute names
as those in the projection list and in the same
order in which they appear in the list.
RENAME Operation General form
• The RENAME operation, when applied to a relation R
of degree n, is denoted by any of the following three
forms:
ρS(B1, B2, ..., Bn)(R) or ρS(R) or ρ(B1, B2, ..., Bn)(R)
• The symbol ρ (rho) is used to denote the RENAME
operator,
• S is the new relation name, and
• B1, B2, ..., Bn are the new attribute names.
▫ The first expression renames both the relation and its
attributes,
▫ the second renames the relation only, and
▫ the third renames the attributes only.
• If the attributes of R are (A1, A2, ..., An) in that
order, then each Ai is renamed as Bi.
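Attribute renaming by position can be sketched in Python as follows. This is an illustrative sketch with made-up data and attribute names; renaming the relation itself corresponds here to binding the result to a new variable name.

```python
def rename(relation, new_attributes):
    """RENAME attributes by position: the i-th attribute Ai becomes Bi."""
    renamed = []
    for t in relation:
        if len(t) != len(new_attributes):
            raise ValueError("need exactly one new name per attribute")
        # dict preserves attribute order, so values line up with the new names
        renamed.append(dict(zip(new_attributes, t.values())))
    return renamed


# Made-up relation R(Fname, Lname, Salary).
R = [{"Fname": "Ana", "Lname": "Diaz", "Salary": 30000}]

# S(First_name, Last_name, Pay): new relation name S, new attribute names.
S = rename(R, ["First_name", "Last_name", "Pay"])
```

The tuples themselves are unchanged; only the attribute names differ.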
The UNION, INTERSECTION, and
MINUS Operations
• UNION:
▫ The result of this operation, denoted by R ∪ S, is a
relation that includes all tuples that are either in R or
in S or in both R and S.
▫ Duplicate tuples are eliminated.
• INTERSECTION:
▫ The result of this operation, denoted by R ∩ S, is a
relation that includes all tuples that are in both R and
S.
• SET DIFFERENCE (or MINUS):
▫ The result of this operation, denoted by R – S, is a
relation that includes all tuples that are in R but not
in S.
Example
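• To retrieve the Social Security numbers of all
employees who either work in department 5 or
directly supervise an employee who works in
department 5, we can apply the UNION operation
to two intermediate relations: RESULT1, the
employees who work in department 5, and RESULT2,
the supervisors of those employees.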

• The UNION operation produces the tuples that are in
either RESULT1 or RESULT2 or both, while
eliminating any duplicates.
(Figure: the intermediate relation DEP5_EMPS.)
• UNION, INTERSECTION, and SET
DIFFERENCE are binary operations;
▫ that is, each is applied to two sets of tuples.
Union compatibility or
Type compatibility
• Two relations R(A1, A2, ..., An) and S(B1, B2, ...,
Bn) are said to be union compatible (or type
compatible)
▫ if they have the same degree n and if dom(Ai) =
dom(Bi) for 1 ≤ i ≤ n.
• This means that the two relations have the same
number of attributes and each corresponding
pair of attributes has the same domain.
(Figure: (a) Two union-compatible relations, STUDENT and INSTRUCTOR.
(b) STUDENT ∪ INSTRUCTOR. (c) STUDENT ∩ INSTRUCTOR.
(d) STUDENT − INSTRUCTOR. (e) INSTRUCTOR − STUDENT.)
Properties
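• UNION and INTERSECTION are commutative:
R ∪ S = S ∪ R and R ∩ S = S ∩ R
• UNION and INTERSECTION are also associative:
R ∪ (S ∪ T) = (R ∪ S) ∪ T and (R ∩ S) ∩ T = R ∩ (S ∩ T)
• The MINUS operation is not commutative; in general,
R − S ≠ S − R
• INTERSECTION can be expressed in terms of
union and set difference as follows:
R ∩ S = ((R ∪ S) − (R − S)) − (S − R)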
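Python's built-in sets behave like these three operations when each tuple is stored as a Python tuple. The sketch below uses two union-compatible relations named STUDENT and INSTRUCTOR with made-up rows (first name, last name) and checks the properties listed above.

```python
# Made-up rows; both relations have the same degree and domains.
STUDENT = {("Ann", "Lee"), ("Bo", "Kim"), ("Cy", "Roy")}
INSTRUCTOR = {("Di", "Paz"), ("Ann", "Lee"), ("Bo", "Kim")}

union = STUDENT | INSTRUCTOR          # duplicates are eliminated automatically
intersection = STUDENT & INSTRUCTOR
student_only = STUDENT - INSTRUCTOR   # STUDENT − INSTRUCTOR
instructor_only = INSTRUCTOR - STUDENT

# INTERSECTION expressed with union and set difference only.
via_identity = (union - student_only) - instructor_only
```

Here the union has four tuples, not six, and the two differences are not equal, which shows that MINUS is not commutative.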
CARTESIAN PRODUCT
(CROSS PRODUCT) OR
CROSS JOIN
• The CARTESIAN PRODUCT is denoted by ×.
• This is also a binary set operation, but the relations
on which it is applied do not have to be union
compatible.
• The result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is
a relation Q of degree n + m with attributes Q(A1, A2,
..., An, B1, B2, ..., Bm), in that order.
• The resulting relation Q has one tuple for each
combination of tuples—one from R and one from S.
Example
• We want to retrieve a list of names of each
female employee’s dependents
• The CARTESIAN PRODUCT creates tuples with
the combined attributes of two relations.

• We can SELECT related tuples only from the
two relations by specifying an appropriate
selection condition after the Cartesian product.
• Because this sequence of CARTESIAN
PRODUCT followed by SELECT is quite
commonly used to combine related tuples from
two relations, a special operation, called JOIN,
was created to specify this sequence as a single
operation.
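The sequence of CARTESIAN PRODUCT followed by SELECT can be sketched in Python for the dependents example. This is an illustration with made-up rows; the attribute names Sex, Ssn, Essn and Dependent_name are assumptions.

```python
from itertools import product

# Made-up sample rows; attribute names are assumptions.
EMPLOYEE = [
    {"Ssn": "111", "Fname": "Ana", "Sex": "F"},
    {"Ssn": "222", "Fname": "Ben", "Sex": "M"},
]
DEPENDENT = [
    {"Essn": "111", "Dependent_name": "Abner"},
    {"Essn": "222", "Dependent_name": "Maya"},
    {"Essn": "111", "Dependent_name": "Theo"},
]


def cartesian_product(r, s):
    """One combined tuple per pairing of a tuple from r with a tuple from s.

    Assumes r and s have no attribute names in common.
    """
    return [{**t1, **t2} for t1, t2 in product(r, s)]


female_emps = [t for t in EMPLOYEE if t["Sex"] == "F"]
emp_dependents = cartesian_product(female_emps, DEPENDENT)  # 1 x 3 tuples

# Keep only the combinations where the dependent belongs to the employee.
actual_dependents = [t for t in emp_dependents if t["Ssn"] == t["Essn"]]
result = [t["Dependent_name"] for t in actual_dependents]
```

Without the final selection, every female employee would be paired with every dependent, including dependents of other employees.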
The JOIN Operation
• The JOIN operation, denoted by ⋈, is used to
combine related tuples from two relations into
single “longer” tuples.
• The general form of a JOIN operation on two
relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is
R ⋈<join condition> S
• The result of the JOIN is a relation Q with n +
m attributes Q(A1, A2, ..., An, B1, B2, ... , Bm)
in that order.
• Q has one tuple for each combination of tuples
(one from R and one from S) whenever the
combination satisfies the join condition.
Example
• Retrieve the name of the manager of each
department.
▫ To get the manager’s name, we need to combine
each department tuple with the employee tuple
whose Ssn value matches the Mgr_ssn value in the
department tuple.
▫ We do this by using the JOIN operation and then
projecting the result over the necessary attributes,
as follows
RESULT
Dname           Lname    Fname
Research        Wong     Franklin
Administration  Wallace  Jennifer
Headquarters    Borg     James
• The JOIN operation can be specified as a
CARTESIAN PRODUCT operation followed by a
SELECT operation.
• These two operations can be replaced with a
single JOIN operation.
• In JOIN, only combinations of tuples
satisfying the join condition appear in the
result, whereas in the CARTESIAN PRODUCT
all combinations of tuples are included in the
result.
Different Types of JOINs
• INNER JOIN
▫ Returns records that have matching values in both
tables
1. Theta join
2. EQUI join
3. Natural join
• OUTER JOIN
▫ In an outer join, along with tuples that satisfy the
matching criteria, we also include some or all tuples that
do not match the criteria.
1. Left Outer JOIN
2. Right Outer Join
3. Full Outer Join
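THETA JOIN
• A general join condition is of the form
<condition> AND <condition> AND ... AND <condition>
where each <condition> is of the form Ai θ Bj, Ai
is an attribute of R, Bj is an attribute of S, Ai and
Bj have the same domain, and
θ (theta) is one of the comparison operators {=, <,
≤, >, ≥, ≠}.
• A JOIN operation with such a general join
condition is called a THETA JOIN.
• Tuples whose join attributes are NULL or for which
the join condition is FALSE do not appear in the
result.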
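A theta join with a single condition Ai θ Bj can be sketched in Python as a nested loop. This is an illustrative sketch, not an efficient implementation: the rows are made up, NULL is modelled as None, and the comparison operator θ is passed in as a function from the standard operator module.

```python
import operator


def theta_join(r, s, left_attr, theta, right_attr):
    """Join r and s on the condition  r.left_attr  theta  s.right_attr."""
    result = []
    for t1 in r:
        for t2 in s:
            a, b = t1[left_attr], t2[right_attr]
            if a is None or b is None:
                continue  # tuples with NULL join attributes never match
            if theta(a, b):
                result.append({**t1, **t2})
    return result


# Made-up rows; Planning has no manager recorded (NULL).
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "333"},
    {"Dname": "Planning", "Mgr_ssn": None},
]
EMPLOYEE = [
    {"Ssn": "333", "Lname": "Wong"},
    {"Ssn": "888", "Lname": "Borg"},
]

# Condition Mgr_ssn = Ssn: each department with its manager.
dept_mgr = theta_join(DEPARTMENT, EMPLOYEE, "Mgr_ssn", operator.eq, "Ssn")

# Condition Mgr_ssn ≠ Ssn: each department with every non-manager.
dept_others = theta_join(DEPARTMENT, EMPLOYEE, "Mgr_ssn", operator.ne, "Ssn")
```

The Planning department appears in neither result because its join attribute is NULL.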
EQUIJOIN
• A JOIN, where the only comparison operator
used is =, is called an EQUIJOIN
• In the result of an EQUIJOIN we always have
one or more pairs of attributes that have
identical values in every tuple
NATURAL JOIN

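• NATURAL JOIN, denoted by *, is an EQUIJOIN in
which the second (superfluous) attribute of each
pair of join attributes is removed from the result.
• NATURAL JOIN requires that the two join
attributes (or each pair of join attributes) have
the same name in both relations.
• If this is not the case, a renaming operation is
applied first.
• General form: Q ← R * S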
Example
• Combine each PROJECT tuple with the
DEPARTMENT tuple that controls the project.
▫ First we rename the Dnumber attribute of
DEPARTMENT to Dnum,
▫ so that it has the same name as the Dnum attribute
in PROJECT, and then we apply NATURAL JOIN.
• The same query can be done in two steps by
creating an intermediate table DEPT.
• In the PROJ_DEPT relation, each tuple
combines a PROJECT tuple with the
DEPARTMENT tuple for the department that
controls the project, but only one join attribute
value is kept.
• If the attributes on which the natural join is
specified already have the same names in both
relations, renaming is unnecessary.
• For example, to apply a natural join on the Dnumber
attributes of DEPARTMENT and
DEPT_LOCATIONS, it is sufficient to write
DEPARTMENT * DEPT_LOCATIONS
• The join condition for NATURAL JOIN is
constructed by equating each pair of join
attributes that have the same name in the two
relations and combining these conditions with
AND.
• A more general, but nonstandard, definition
for NATURAL JOIN is
Q ← R *(<list1>),(<list2>) S
where <list1> specifies a list of attributes from R, and
<list2> specifies a list of attributes from S.
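The standard NATURAL JOIN (matching on attributes with the same name) can be sketched in Python for the PROJECT and DEPARTMENT example. The sketch is illustrative: the rows and the attributes Pname and Dname are made up, and the renaming step is written out by hand.

```python
def natural_join(r, s):
    """Equate every pair of same-named attributes (ANDed) and keep one copy."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for t1 in r:
        for t2 in s:
            if all(t1[a] == t2[a] for a in common):
                # merging the dicts keeps a single copy of each join attribute
                result.append({**t1, **t2})
    return result


# Made-up rows.
DEPARTMENT = [
    {"Dname": "Research", "Dnumber": 5},
    {"Dname": "Administration", "Dnumber": 4},
]
PROJECT = [
    {"Pname": "ProductX", "Dnum": 5},
    {"Pname": "Computerization", "Dnum": 4},
    {"Pname": "Reorganization", "Dnum": 1},
]

# Step 1: rename Dnumber to Dnum so the join attributes share a name.
DEPT = [{"Dname": d["Dname"], "Dnum": d["Dnumber"]} for d in DEPARTMENT]

# Step 2: PROJ_DEPT <- PROJECT * DEPT
PROJ_DEPT = natural_join(PROJECT, DEPT)
```

Each tuple of PROJ_DEPT has a single Dnum value, and the project controlled by department 1 is dropped because no DEPT tuple matches it. With no attribute names in common, this sketch degenerates to a Cartesian product.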
Left Outer Join
• The left outer join keeps every tuple of the left
relation.
• If no matching tuple is found in the right relation,
then the attributes of the right relation in the join
result are filled with null values.
Right Outer Join
• The right outer join keeps every tuple of the right
relation.
• If no matching tuple is found in the left relation,
then the attributes of the left relation in the join
result are filled with null values.
Full Outer Join
• In a full outer join, all tuples from both relations are
included in the result, whether or not they have a match;
for a tuple with no match, the attributes of the other
relation are filled with null values.
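The three outer joins differ only in which unmatched tuples are kept, so one Python function with two flags can sketch all of them. This is an illustration with made-up rows; NULL is modelled as None, and the attribute lists are passed in explicitly so that padding works even for an empty relation.

```python
def outer_join(r, s, condition, r_attrs, s_attrs, keep_left=True, keep_right=True):
    """Inner join plus, optionally, unmatched tuples padded with None (NULL)."""
    result = []
    matched_right = set()
    for t1 in r:
        found = False
        for j, t2 in enumerate(s):
            if condition(t1, t2):
                result.append({**t1, **t2})
                matched_right.add(j)
                found = True
        if keep_left and not found:
            result.append({**t1, **dict.fromkeys(s_attrs)})
    if keep_right:
        for j, t2 in enumerate(s):
            if j not in matched_right:
                result.append({**dict.fromkeys(r_attrs), **t2})
    return result


# Made-up rows: Boyd manages nothing, Planning has no matching employee.
EMPLOYEE = [{"Ssn": "1", "Lname": "Ames"}, {"Ssn": "2", "Lname": "Boyd"}]
DEPARTMENT = [
    {"Dname": "Research", "Mgr_ssn": "1"},
    {"Dname": "Planning", "Mgr_ssn": "9"},
]
EMP_ATTRS = ["Ssn", "Lname"]
DEPT_ATTRS = ["Dname", "Mgr_ssn"]


def manages(e, d):
    return e["Ssn"] == d["Mgr_ssn"]


left = outer_join(EMPLOYEE, DEPARTMENT, manages, EMP_ATTRS, DEPT_ATTRS, keep_right=False)
right = outer_join(EMPLOYEE, DEPARTMENT, manages, EMP_ATTRS, DEPT_ATTRS, keep_left=False)
full = outer_join(EMPLOYEE, DEPARTMENT, manages, EMP_ATTRS, DEPT_ATTRS)
```

The left result keeps Boyd with NULL department attributes, the right result keeps Planning with NULL employee attributes, and the full result keeps both.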
DIVISION Operation
• Retrieve the names of employees who work on all the projects that
‘John Smith’ works on.
▫ To express this query using the DIVISION operation (denoted by ÷),
proceed as follows.
▫ First, retrieve the list of project numbers that ‘John Smith’ works on
in the intermediate relation SMITH_PNOS.
▫ Next, create a relation that includes a tuple <Pno, Essn> whenever the
employee whose Ssn is Essn works on the project whose number is Pno
in the intermediate relation SSN_PNOS.
▫ Finally, apply the DIVISION operation to the
two relations, which gives the desired
employees’ Social Security numbers.
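The final DIVISION step can be sketched in Python on made-up data. In this illustration SSN_PNOS is stored as a set of (Essn, Pno) pairs and SMITH_PNOS as a set of project numbers; the result variable SSNS is a name chosen for the sketch.

```python
# Made-up data: who works on which project.
SSN_PNOS = {
    ("123", 1), ("123", 2),
    ("453", 1), ("453", 2), ("453", 3),
    ("666", 3),
}
SMITH_PNOS = {1, 2}  # project numbers that 'John Smith' works on


def divide(pairs, divisor):
    """Return every Essn that is paired with ALL values in the divisor."""
    candidates = {essn for essn, _ in pairs}
    return {e for e in candidates if all((e, p) in pairs for p in divisor)}


# SSNS <- SSN_PNOS ÷ SMITH_PNOS
SSNS = divide(SSN_PNOS, SMITH_PNOS)
```

Employee 666 is excluded because that employee works on neither project 1 nor project 2; working on extra projects, as employee 453 does, is allowed.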
AGGREGATE FUNCTIONS & GROUPING
• Used to specify mathematical functions:
• SUM
• AVERAGE
• MAXIMUM
• MINIMUM
• COUNT
• Example: To find the total number of employees and their average
salary.
• Example: To retrieve each department number, the number of
employees in the department, and their average salary.
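Both examples can be sketched in Python: the first applies COUNT and AVERAGE to the whole relation, the second groups the tuples by department first. The rows, the attribute names Dno and Salary, and the output attribute names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows; attribute names are assumptions.
EMPLOYEE = [
    {"Ssn": "1", "Dno": 5, "Salary": 30000},
    {"Ssn": "2", "Dno": 5, "Salary": 40000},
    {"Ssn": "3", "Dno": 4, "Salary": 26000},
]

# Aggregates over the whole relation: COUNT and AVERAGE.
total_employees = len(EMPLOYEE)
average_salary = mean(t["Salary"] for t in EMPLOYEE)

# Grouping: collect the salaries of each department, then aggregate per group.
salaries_by_dept = defaultdict(list)
for t in EMPLOYEE:
    salaries_by_dept[t["Dno"]].append(t["Salary"])

per_department = [
    {"Dno": dno, "Count": len(salaries), "Avg_salary": mean(salaries)}
    for dno, salaries in sorted(salaries_by_dept.items())
]
```

The grouped result has one tuple per department, each carrying the grouping attribute and the aggregate values.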
• Query 1. Retrieve the name and address of all
employees who work for the ‘Research’ department.

• This query could be specified in other ways; for
example, the order of the JOIN and SELECT
operations could be reversed, or
the JOIN could be replaced by a NATURAL
JOIN after renaming one of the join attributes
to match the other join attribute name.
• Query 2. For every project located in
‘Stafford’, list the project number, the
controlling department number, and the
department manager’s last name, address,
and birth date.
• Query 3. Make a list of project numbers for
projects that involve an employee whose last
name is ‘Smith’, either as a worker or as a
manager of the department that controls the
project.
PART 2 ENDS
