Relational Algebra Anshul
Introduction
• Query languages are specialized languages for
asking questions, or queries, that involve the data
in a database.
• Queries in relational algebra are composed of a
collection of operators.
• Every operator in relational algebra accepts (one
or two) relation instances as arguments and returns
a relation instance as the result.
– A relational algebra expression is recursively defined to
be a relation, a unary operator applied to a single
expression, or a binary operator applied to two expressions.
– Relational algebra is a procedural query language.
• An expression defines a step-by-step procedure for computing
the desired answer.
Basic Operators
• There are six basic operators in relational
algebra:
– select ( σ )
– project ( π )
– union ( ∪ )
– set difference ( − )
– Cartesian product ( × )
– rename ( ρ )
• Selection
– The selection operation specifies the tuples to
retain through a selection condition.
– A selection condition is a Boolean combination
(an expression using the logical connectives ∧ and
∨) of terms that have the form
• attribute op constant
• attribute1 op attribute2
(op is one of the comparison operators <, ≤, =, ≠,
≥, >).
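Example: σ_{a=b ∧ d>5}( R )

Relation R
a    b    c    d
…    …    1    7
…    …    5    7
…    …    12   3
…    …    23   10

σ_{a=b ∧ d>5}( R )
a    b    c    d
…    …    1    7
…    …    23   10

(… marks a and b values that are missing from the source.)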
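The selection example can be sketched in Python, assuming a relation is modelled as a list of dicts (one dict per tuple). The `select` helper is illustrative, and the a and b values below are placeholders chosen only so that the rows reproduce the c and d values of the slide.

```python
def select(relation, condition):
    """sigma_condition(relation): keep the tuples that satisfy the condition."""
    return [row for row in relation if condition(row)]

# c and d follow the slide; the a and b values are assumed placeholders.
R = [
    {"a": "x", "b": "x", "c": 1, "d": 7},
    {"a": "x", "b": "y", "c": 5, "d": 7},
    {"a": "y", "b": "y", "c": 12, "d": 3},
    {"a": "y", "b": "y", "c": 23, "d": 10},
]

# sigma_{a=b AND d>5}(R): both terms must hold for a tuple to be retained.
result = select(R, lambda t: t["a"] == t["b"] and t["d"] > 5)
```

The condition is passed as an ordinary function, so any Boolean combination of comparison terms can be expressed with `and`, `or`, and `not`.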
• Projection
– The projection operator allows us to extract
columns from a relation.
– The subscript specifies the fields to be retained.
(The other fields are ‘projected out’).
– The schema of the result of a projection is
determined by the fields that are projected.
– Duplicate rows are eliminated from the final
result.
• This follows from the definition of a relation as a set
of tuples.
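Example: π_{a,c}( R )

Relation R
a    b    c
…    10   1
…    20   1
…    30   1
…    40   2

π_{a,c}( R ) before duplicate elimination
a    c
…    1
…    1
…    1
…    2

π_{a,c}( R ) after eliminating duplicate rows
a    c
…    1
…    1
…    2

(… marks a values that are missing from the source; the two remaining rows with c = 1 differ in a.)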
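A small Python sketch of projection with duplicate elimination, again assuming a list-of-dicts model. The `project` helper is illustrative, and the a values are placeholders; only the b and c values come from the slide.

```python
def project(relation, attrs):
    """pi_attrs(relation): keep only attrs and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:              # set semantics: emit each tuple once
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# b and c follow the slide; the a values are assumed placeholders.
R = [
    {"a": "x", "b": 10, "c": 1},
    {"a": "x", "b": 20, "c": 1},
    {"a": "y", "b": 30, "c": 1},
    {"a": "y", "b": 40, "c": 2},
]

result = project(R, ["a", "c"])   # four input rows collapse to three
```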
Employee
f_name   l_name   id       sex   salary   superid   dno
Joseph   Chan     999999   M     29500    654321    4
Victor   Wong     001100   M     30000    888555    5
Carrie   Kwan     898989   F     26000    654321    4
Joyce    Fong     345345   F     12000    777888    4
1. Find the employee names and department number of all
employees
– Composing operations yields a relational-algebra
expression.
– The result of a relational-algebra expression is
always a relation.
f_name l_name
Joseph Chan
Carrie Kwan
• Union
– R ∪ S returns a relation instance containing all tuples
that occur in either relation instance R or relation instance S (or
both).
– The union operation is commutative: R ∪ S = S ∪ R.
– Duplicate tuples are eliminated.
– R and S must be union-compatible:
• They have the same number of fields.
• The corresponding fields have the same domains.
– The field names are not important.
– The schema of the result is defined to be identical to the
schema of R.
• The fields of R ∪ S inherit their names from R.
Example: R ∪ S

R            S            R ∪ S
a    b       a    b       a    b
…    1       …    2       …    1
…    2       …    3       …    2
…    1                    …    1
                          …    3

(… marks a values that are missing from the source.)
• Set Difference
– Set difference R − S returns a relation instance
containing all the tuples that occur in R but not
in S.
– Set difference is not commutative: in
general, R − S ≠ S − R.
– The relations R and S must be union-compatible.
– The schema of the result is defined to be
identical to the schema of R.
Example: R − S

R            S            R − S
a    b       a    b       a    b
…    1       …    2       …    1
…    2       …    3       …    1
…    1

(… marks a values that are missing from the source.)
• Intersection
– R ∩ S returns a relation instance containing all
tuples that occur in both R and S.
– The relations R and S must be union-compatible.
– The schema of the result is defined to be
identical to the schema of R.
– Intersection is not considered a basic operation,
as it can be derived from the basic operations as
shown below:
R ∩ S = R − ( R − S )
Example: R ∩ S

R            S            R ∩ S
a    b       a    b       a    b
…    1       …    2       …    2
…    2       …    3
…    1

(… marks a values that are missing from the source.)
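The three set operators map directly onto Python's built-in `set` type when each tuple is a Python tuple. The sketch below assumes placeholder a values ("x", "y") and checks the identity that derives intersection from set difference.

```python
# Each relation is a set of (a, b) tuples; the a values are assumed placeholders.
R = {("x", 1), ("x", 2), ("y", 1)}
S = {("x", 2), ("y", 3)}

union = R | S                  # R ∪ S: duplicates disappear automatically
difference = R - S             # R − S: tuples in R but not in S
intersection = R - (R - S)     # R ∩ S expressed with set difference only

# Set difference is not commutative: S − R gives a different relation.
reverse_difference = S - R
```

Because both relations here have two fields with matching domains, they are union-compatible; Python itself would not enforce that, so a real implementation would need an explicit schema check.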
• Rename ( ρ )
– ρ( R(F), E ) or ρ( R, E )
• where E is an arbitrary relational algebra expression
– Result: a relation named R.
– R has the same tuples as E, except that the fields may be renamed
according to F.
– F is called the renaming list; its terms have the form
oldname → newname or position → newname.
Example
ρ( C( sid → identity ), E )
Example
ρ( C( 3 → identity ), E )
The result is a relation named C.
The 3rd attribute of E is renamed to "identity" in C.
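A possible Python sketch of the renaming list, assuming tuples are dicts whose key order is the field order. The `rename` helper and the schema of E (sname, rating, sid) are hypothetical; they are chosen so that sid is the 3rd field and both forms of the renaming list give the same result.

```python
def rename(relation, renaming):
    """rho: copy relation, renaming fields by old name or 1-based position."""
    result = []
    for row in relation:
        new_row = {}
        for pos, old in enumerate(row, start=1):
            # Look up the old name first, then the position, else keep the name.
            new = renaming.get(old, renaming.get(pos, old))
            new_row[new] = row[old]
        result.append(new_row)
    return result

# Hypothetical instance of E with sid as its 3rd field.
E = [{"sname": "Dustin", "rating": 7, "sid": 22}]

C_by_name = rename(E, {"sid": "identity"})   # oldname -> newname
C_by_position = rename(E, {3: "identity"})   # position -> newname
```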
• Cartesian product (Cross product)
– R × S returns a relation instance whose schema
contains all the fields of R followed by all the
fields of S.
– The result contains one tuple <r,s>
(concatenation of tuples r and s) for each pair
of tuples r ∈ R, s ∈ S.
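Example: R × S

R            S                 R × S
a    b       c    d    e       a    b    c    d    e
…    1       …    10   +       …    1    …    10   +
…    2       …    10   +       …    1    …    10   +
             …    20   -       …    1    …    20   -
             …    10   -       …    1    …    10   -
                               …    2    …    10   +
                               …    2    …    10   +
                               …    2    …    20   -
                               …    2    …    10   -

(… marks a and c values that are missing from the source.)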
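The cross product can be sketched with `itertools.product`, assuming tuples are plain Python tuples so that concatenation is `r + s`. The a and c values are placeholders; b, d, and e follow the slide.

```python
from itertools import product

def cross(r_rows, s_rows):
    """R x S: concatenate every tuple of R with every tuple of S."""
    return [r + s for r, s in product(r_rows, s_rows)]

# b, d, e follow the slide; the a and c values are assumed placeholders.
R = [("x", 1), ("y", 2)]
S = [("x", 10, "+"), ("y", 10, "+"), ("y", 20, "-"), ("z", 10, "-")]

RxS = cross(R, S)   # |R| * |S| = 2 * 4 = 8 tuples, each with 5 fields
```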
Query: Retrieve, for each female employee, a list
of the names of her dependents.
EMPLOYEE
DEPENDENT
Female_emps ← σ_{sex='F'}( Employee )
Empnames ← π_{f_name,l_name,id}( Female_emps )

Female_emps
f_name     l_name   id       bdate        address                 sex   salary   superid   dno
Alicia     Chan     998877   2-Jul-70     231, Cai Road, HK       F     9500     654321    4
Jennifer   Wong     654321   20-June-60   342, Cheung Road, HK    F     30000    888555    4
Joyce      Fong     345345   19-Dec-80    23, Young Road, HK      F     12000    777888    5

Empnames
f_name     l_name   id
Alicia     Chan     998877
Jennifer   Wong     654321
Joyce      Fong     345345
Emp_dependents ← Empnames × Dependents
Emp_dependents
f_name l_name id eid dep_name sex bdate relationship
Alicia Chan 998877 334455 Alice F 5-Apr-90 Daughter
Alicia Chan 998877 334455 Theodore M 3-Mar-92 Son
Alicia Chan 998877 654321 Abner M 29-Feb-94 Son
Alicia Chan 998877 123456 Alice F 2-Nov-97 Daughter
Jennifer Wong 654321 334455 Alice F 5-Apr-90 Daughter
Jennifer Wong 654321 334455 Theodore M 3-Mar-92 Son
Jennifer Wong 654321 654321 Abner M 29-Feb-94 Son
Jennifer Wong 654321 123456 Alice F 2-Nov-97 Daughter
Joyce Fong 345345 334455 Alice F 5-Apr-90 Daughter
Joyce Fong 345345 334455 Theodore M 3-Mar-92 Son
Joyce Fong 345345 654321 Abner M 29-Feb-94 Son
Joyce Fong 345345 123456 Alice F 2-Nov-97 Daughter
Actual_dependents
Result
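The whole query can be traced in Python with the Empnames and Dependents rows shown on the slides. The defining expressions for Actual_dependents and Result did not survive in the source, so the last two steps below are an assumption: a selection id = eid on the cross product, followed by a projection onto f_name, l_name, and dep_name.

```python
from itertools import product

# Empnames(f_name, l_name, id), as shown on the slide.
empnames = [
    ("Alicia", "Chan", "998877"),
    ("Jennifer", "Wong", "654321"),
    ("Joyce", "Fong", "345345"),
]

# Dependents(eid, dep_name, sex, bdate, relationship), read off Emp_dependents.
dependents = [
    ("334455", "Alice", "F", "5-Apr-90", "Daughter"),
    ("334455", "Theodore", "M", "3-Mar-92", "Son"),
    ("654321", "Abner", "M", "29-Feb-94", "Son"),
    ("123456", "Alice", "F", "2-Nov-97", "Daughter"),
]

# Emp_dependents <- Empnames x Dependents
emp_dependents = [e + d for e, d in product(empnames, dependents)]

# Assumed step: keep only pairs where the employee id equals the dependent's eid.
actual_dependents = [t for t in emp_dependents if t[2] == t[3]]

# Assumed step: project onto (f_name, l_name, dep_name).
result = sorted({(t[0], t[1], t[4]) for t in actual_dependents})
```

Only one of the twelve combined tuples survives the selection, which illustrates why a cross product followed by a selection is usually written as a single join.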
• Join
– Because the sequence of operations × followed by σ is
quite common, a special operation, called the "join"
operation ( ⋈ ), was created to specify it as a single
operation.
– A join can be defined as a cross-product followed by
selections and, sometimes, projections.
– The result of a cross-product is typically much larger
than the result of a join, so it is very important to
recognize joins and implement them without
materializing the underlying cross-product.
• Condition Join
– The most general version of join operation
accepts a join condition c.
– The join condition is identical to a selection
condition in form.
– The operation is defined as follows:
R ⋈_c S = σ_c( R × S )
S                                   R
sid   sname    rating   age         sid   bid   day
22    Dustin   7        45.0        22    101   10/10/96
31    Lubber   8        55.5        58    103   11/12/96
58    Rusty    10       35.5
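A condition join over these two instances can be sketched as a filtered cross product. The join condition used on the slide is not preserved in the source, so the condition S.sid < R.sid below is an assumption made purely for illustration, and `theta_join` is a hypothetical helper name.

```python
# S(sid, sname, rating, age) and R(sid, bid, day), as shown on the slide.
S = [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.5)]
R = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

def theta_join(left, right, condition):
    """left join_c right = sigma_c(left x right), without storing the full product."""
    return [l + r for l in left for r in right if condition(l, r)]

# Assumed condition: S.sid < R.sid (field 0 of each tuple).
joined = theta_join(S, R, lambda s, r: s[0] < r[0])
```

The comprehension tests each pair as it is generated, so the six-tuple cross product is never materialized as a separate list.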
Example
π_{l_name,f_name}( Employee ) −
π_{Employee.l_name,Employee.f_name}(
• Equi-join
– A special case of the join operation is when the
join condition consists solely of equalities
(connected by ∧) of the form
R.name1 = S.name2
– In the resulting relation, S.name2 will be
dropped by an additional projection operation.
[Figure: example instances of R( a, b, c ) and S( b, e, f ), and the result of R ⋈_{R.b=S.b} S with schema ( a, b, c, e, f ). The tuple values are not recoverable from the source.]
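An equi-join on a single pair of fields can be sketched as follows, assuming dict-based tuples. The `equijoin` helper and all sample values are hypothetical; the sketch also assumes that R and S share no field names other than the join field, since a shared name would otherwise be overwritten.

```python
def equijoin(r_rows, s_rows, r_attr, s_attr):
    """R join_{R.r_attr = S.s_attr} S, dropping the redundant S column."""
    result = []
    for r in r_rows:
        for s in s_rows:
            if r[r_attr] == s[s_attr]:
                merged = dict(r)
                # The extra projection: S's join field is not copied over.
                merged.update({k: v for k, v in s.items() if k != s_attr})
                result.append(merged)
    return result

# Hypothetical instances of R(a, b, c) and S(b, e, f).
R = [{"a": 1, "b": 1, "c": "p"}, {"a": 2, "b": 3, "c": "q"}]
S = [{"b": 1, "e": "u", "f": "X"}, {"b": 2, "e": "v", "f": "Y"}]

result = equijoin(R, S, "b", "b")
```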
• Natural Join
– A further special case of the join operation R ⋈ S
is an equijoin in which equalities
are specified on all fields having the same
names in R and S.
– In this case we can simply omit the join condition.
– The resulting schema contains the attributes of
R followed by the attributes in S that are not in
R.
– If the two relations R and S have no attributes in
common, R ⋈ S is simply the cross-product.
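Example: R ⋈ S

R                      S                 R ⋈ S
a    b    c    d       b    d    e       a    b    c    d    e
…    1    …    X       1    X    …       …    1    …    X    …
…    2    …    X       3    X    …       …    1    …    X    …
…    4    …    Y       1    X    …       …    2    …    Y    …
…    1    …    Y       2    Y    …
…    2    …    Y       3    Y    …

(… marks a, c, and e values that are missing from the source.)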
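A natural join can be sketched in Python by comparing all fields that the two tuples have in common. The b and d values below follow the slide; the a, c, and e values are placeholders, and `natural_join` is an illustrative helper.

```python
def natural_join(r_rows, s_rows):
    """R natural-join S: equate every field name the two schemas share."""
    result = []
    for r in r_rows:
        for s in s_rows:
            common = r.keys() & s.keys()
            if all(r[k] == s[k] for k in common):
                # R's fields first, then the fields of S that are not in R.
                result.append({**r, **s})
    return result

# b and d follow the slide; a, c, e are assumed placeholders.
R = [
    {"a": "a1", "b": 1, "c": "c1", "d": "X"},
    {"a": "a2", "b": 2, "c": "c2", "d": "X"},
    {"a": "a3", "b": 4, "c": "c3", "d": "Y"},
    {"a": "a4", "b": 1, "c": "c4", "d": "Y"},
    {"a": "a5", "b": 2, "c": "c5", "d": "Y"},
]
S = [
    {"b": 1, "d": "X", "e": "e1"},
    {"b": 3, "d": "X", "e": "e2"},
    {"b": 1, "d": "X", "e": "e3"},
    {"b": 2, "d": "Y", "e": "e4"},
    {"b": 3, "d": "Y", "e": "e5"},
]

joined = natural_join(R, S)

# With no common fields the condition is vacuously true: a cross product.
no_common = natural_join([{"p": 1}, {"p": 2}], [{"q": 3}])
```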
• Division
– The division operation is useful for expressing
certain kinds of queries, for example, "find the
names of sailors who have reserved all boats."
– Example
• Consider two relations A and B.
• A has exactly two fields x and y.
• B has just one field y, with the same domain as in A.
• The division operation A/B is the set of all x values
(in the form of unary tuples) such that for every y
value in a tuple of B, there is a tuple <x,y> in A.
[Figure: example instances of A( x, y ) and B( y ), with B containing the y values 1 and 2, and the resulting A/B( x ). The x values are missing from the source.]
Another way to understand division is as follows:
for each x value in A, consider the set of y
values that appear in tuples of A with that
x value. If this set contains all y values in
B, the x value is in the result of A/B.
An analogy with integer division may also help
to understand division. For integers A and B,
A/B is the largest integer Q such that Q * B ≤ A.
For relation instances A and B, A/B is the
largest relation instance Q such that Q × B ⊆ A.
The x values that are disqualified, because some y value of B is missing for them, are given by
π_x( ( π_x( A ) × B ) − A )
Thus, A/B is
π_x( A ) − π_x( ( π_x( A ) × B ) − A )
The division above can be written in terms of basic operations:
Temp1 ← π_{A−B}( A )
Temp2 ← π_{A−B}( ( Temp1 × B ) − A )
Result = Temp1 − Temp2
Here the subscript A−B stands for the attributes in A but not in B (in this example, x). Note that it is not a valid notation in relational algebra.
[Figure: instances of A, B, Temp1, ( Temp1 × B ) − A, Temp2, and A/B for this example; the tuple values are not recoverable from the source.]
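The Temp1 and Temp2 steps translate almost line for line into Python set operations. The sketch below assumes A is a set of (x, y) pairs and B a set of y values; the `divide` helper and the sample x values ("s1" to "s4") are hypothetical.

```python
def divide(A, B):
    """A/B for A a set of (x, y) pairs and B a set of y values."""
    temp1 = {x for x, _ in A}                          # pi_x(A)
    missing = {(x, y) for x in temp1 for y in B} - A   # (Temp1 x B) - A
    temp2 = {x for x, _ in missing}                    # disqualified x values
    return temp1 - temp2                               # Temp1 - Temp2

# Hypothetical instances: B holds the y values 1 and 2.
A = {
    ("s1", 1), ("s1", 2), ("s1", 3),
    ("s2", 1),
    ("s3", 1), ("s3", 3),
    ("s4", 1), ("s4", 2),
}
B = {1, 2}

quotient = divide(A, B)   # only s1 and s4 are paired with every y in B
```

The result also satisfies the integer-division analogy: every pair in quotient × B is present in A.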
Employee ( fname, lname, id, bdate, address, salary, sid, dno )
Works_on ( id, pno )
Query: Retrieve the names of employees who work on all the projects
that 'John Sung' works on.
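One way to evaluate this query is as a division of Works_on by the set of John Sung's projects. The solution on the original slide is not preserved, so the Python sketch below is only one possible reading, and all sample data (ids e1 to e3, projects p1 to p3, the other employee names) is hypothetical.

```python
# Employee reduced to id -> (fname, lname); the other fields are omitted.
employee = {
    "e1": ("John", "Sung"),
    "e2": ("Mary", "Lee"),
    "e3": ("Peter", "Ho"),
}
# Works_on(id, pno)
works_on = {
    ("e1", "p1"), ("e1", "p2"),
    ("e2", "p1"), ("e2", "p2"), ("e2", "p3"),
    ("e3", "p1"),
}

# Projects that John Sung works on (the divisor).
john_ids = {i for i, name in employee.items() if name == ("John", "Sung")}
john_projects = {p for i, p in works_on if i in john_ids}

# Works_on / john_projects: ids whose project set covers all of John's projects.
all_ids = {i for i, _ in works_on}
qualifying = {
    i for i in all_ids
    if john_projects <= {p for j, p in works_on if j == i}
}

names = sorted(employee[i] for i in qualifying)
```

Under this reading John Sung himself is part of the answer, since he trivially works on all of his own projects.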
4.8 More examples
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )
- Solution 3:
π_{sname}( σ_{bid=103}( Reserves ⋈ Sailors ) )
This expression returns the names of sailors who have reserved boat 103.
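- Solution 1:
π_{sname}( ( σ_{color='red'} Boats ) ⋈ Reserves ⋈ Sailors )
- Solution 2:
π_{sname}( π_{sid}( ( π_{bid}( σ_{color='red'} Boats ) ) ⋈ Reserves ) ⋈ Sailors )
Both expressions return the names of sailors who have reserved a red boat.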
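- Solution 1:
ρ( Tempboats, ( σ_{color='red' ∨ color='green'} Boats ) )
π_{sname}( Tempboats ⋈ Reserves ⋈ Sailors )
- Solution 2:
π_{sname}( ( ( σ_{color='red'} Boats ) ⋈ Reserves ) ⋈ Sailors ) ∪
π_{sname}( ( ( σ_{color='green'} Boats ) ⋈ Reserves ) ⋈ Sailors )
Both expressions return the names of sailors who have reserved a red or a green boat.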
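The two solutions can be compared on a small instance. The Python sketch below uses hypothetical data (sailor, boat, and reservation values are made up for illustration) and a helper, `names_for_colors`, that evaluates select-on-Boats, then the joins with Reserves and Sailors, then the projection onto sname.

```python
# Hypothetical instances of Sailors(sid, sname, age), Boats(bid, bname, color),
# and Reserves(sid, bid, date).
sailors = {(22, "Dustin", 45.0), (31, "Lubber", 55.5), (58, "Rusty", 35.5)}
boats = {(101, "b1", "blue"), (102, "b2", "red"), (103, "b3", "green"), (104, "b4", "red")}
reserves = {(22, 101, "d1"), (22, 102, "d2"), (31, 103, "d3"), (58, 101, "d4")}

def names_for_colors(colors):
    """pi_sname((sigma_{color in colors} Boats) join Reserves join Sailors)."""
    bids = {bid for bid, _, color in boats if color in colors}
    sids = {sid for sid, bid, _ in reserves if bid in bids}
    return {sname for sid, sname, _ in sailors if sid in sids}

# Solution 1: a single selection with color='red' OR color='green'.
solution1 = names_for_colors({"red", "green"})

# Solution 2: the union of the red query and the green query.
solution2 = names_for_colors({"red"}) | names_for_colors({"green"})
```

On this instance both solutions produce the same set of names, as expected from the equivalence of a disjunctive selection and a union.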
Note that sid is a key for Sailors, but sname is not a key.
What if we simply do
Reserves / ( π_{bid} Boats ) ?

date       sid   bid
2-3-2002   007   A
3-7-2002   007   B
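The question can be explored with a short Python sketch. It assumes, as an illustration only, that Boats contains exactly the two boats A and B shown in the table; the `divide` helper is hypothetical. Dividing Reserves directly treats (sid, date) as the quotient attributes, so a sailor qualifies only if all boats were reserved on the same date.

```python
def divide(pairs, divisor):
    """Return the x values that are paired with every y in divisor."""
    xs = {x for x, _ in pairs}
    return {x for x in xs if divisor <= {y for x2, y in pairs if x2 == x}}

# Reserves(sid, bid, date) with the two tuples from the slide.
reserves = {("007", "A", "2-3-2002"), ("007", "B", "3-7-2002")}
boat_ids = {"A", "B"}   # assumed content of pi_bid(Boats)

# Reserves / pi_bid(Boats): the quotient attributes are (sid, date).
naive = divide({((sid, date), bid) for sid, bid, date in reserves}, boat_ids)

# pi_{sid,bid}(Reserves) / pi_bid(Boats): project the date away first.
fixed = divide({(sid, bid) for sid, bid, _ in reserves}, boat_ids)
```

Under these assumptions the direct division is empty even though sailor 007 has reserved every boat, while projecting Reserves onto (sid, bid) before dividing returns 007.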
Remarks
• Only standard relational algebra is covered
in the lecture.
– A relation is assumed to be a set of records (no
duplicate records).
– Extended operators such as sorting operators
and aggregation operators are not covered.