Relation Algebra Anshul

Uploaded by

Abhishek Chauhan
Relational Algebra

BY: Dr. PARUL MADAN

1
Introduction
• Query languages are specialized languages for
asking questions, or queries, that involve the data
in a database.
• Queries in relational algebra are composed using a
collection of operators.
• Every operator in relational algebra accepts (one
or two) relation instances as arguments and returns
a relation instance as the result.
– A relational algebra expression is recursively defined to
be a relation, a unary operator applied to a single expression,
or a binary operator applied to two expressions.
– Relational algebra is a procedural query language.
• An expression defines a step-by-step procedure for computing
the desired answer.
2
Basic Operators
• There are six basic operators in relational
algebra:
– select ( σ )
– project ( π )
– union ( ∪ )
– set difference ( − )
– Cartesian product ( × )
– rename ( ρ )

3
• Selection
– The selection operation σ specifies the tuples to
retain through a selection condition.
– A selection condition is a Boolean combination
(an expression using the logical connectives ∧ and
∨) of terms that have the form
• attribute op constant
• attribute1 op attribute2
(op is one of the comparison operators <, ≤, =, ≠,
≥, >).

4
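The selection operator can be sketched in a few lines of Python. This is only an illustration under assumptions: a relation is modelled as a set of tuples with a separate schema, the helper name `select` is made up, and the integer values are placeholders rather than data from the slides.

```python
# A relation is modelled as a set of tuples plus a schema (attribute names).
SCHEMA = ("a", "b", "c", "d")
R = {(1, 1, 1, 7), (1, 2, 5, 7), (2, 2, 12, 3), (2, 2, 23, 10)}  # placeholder values

def select(rows, schema, predicate):
    """sigma: keep the tuples for which the predicate holds.

    The predicate receives each tuple as a dict keyed by attribute name.
    """
    return {t for t in rows if predicate(dict(zip(schema, t)))}

# sigma a=b AND d>5 (R)
result = select(R, SCHEMA, lambda t: t["a"] == t["b"] and t["d"] > 5)
print(sorted(result))  # [(1, 1, 1, 7), (2, 2, 23, 10)]
```

The condition is an ordinary Boolean expression over attribute-to-attribute and attribute-to-constant comparisons, which mirrors the two term forms listed above.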
Relation R
a  b  c   d
α  α  1   7
α  β  5   7
β  β  12  3
β  β  23  10

σ a=b ∧ d>5 ( R )
a  b  c   d
α  α  1   7
β  β  23  10

5
• Projection
– The projection operator π allows us to extract
columns from a relation.
– The subscript specifies the fields to be retained.
(The other fields are ‘projected out’.)
– The schema of the result of a projection is
determined by the fields that are projected.
– Duplicate rows are eliminated from the final
result.
• This follows from the definition of a relation as a set
of tuples.
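A minimal Python sketch of projection, assuming a relation is a set of tuples with a separate schema. The helper name `project` and the sample values are illustrative, not taken from the slides. Because the result is a Python set, duplicate rows disappear automatically.

```python
SCHEMA = ("a", "b", "c")
R = {("x", 10, 1), ("x", 20, 1), ("y", 30, 1), ("y", 40, 2)}  # placeholder values

def project(rows, schema, fields):
    """pi: keep only the named fields; building a set removes duplicate rows."""
    idx = [schema.index(f) for f in fields]
    return {tuple(t[i] for i in idx) for t in rows}

# pi a,c (R): four input rows collapse to three distinct output rows.
result = project(R, SCHEMA, ("a", "c"))
print(sorted(result))  # [('x', 1), ('y', 1), ('y', 2)]
```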

6
Relation R
a  b   c
α  10  1
α  20  1
β  30  1
β  40  2

π a,c ( R ), before eliminating duplicate rows
a  c
α  1
α  1
β  1
β  2

π a,c ( R ), final result
a  c
α  1
β  1
β  2

7
Employee
f_name l_name id sex salary superid dno
Joseph Chan 999999 M 29500 654321 4
Victor Wong 001100 M 30000 888555 5
Carrie Kwan 898989 F 26000 654321 4
Joyce Fong 345345 F 12000 777888 4

Find all employees who work in department 4 and whose
salary is greater than 25000.

σ dno=4 ∧ salary>25000 ( Employee )


f_name l_name id sex salary superid dno
Joseph Chan 999999 M 29500 654321 4
Carrie Kwan 898989 F 26000 654321 4

8
1. Find the employee names and department numbers of all
employees.

π f_name,l_name,dno ( Employee )

f_name l_name dno
Joseph Chan 4
Victor Wong 5
Carrie Kwan 4
Joyce Fong 4

2. Find the sex and department number of all employees.

π sex,dno ( Employee )

sex dno
M 4
M 5
F 4

9
– Composing operations gives a relational-algebra
expression.
– The result of a relational-algebra expression is
always a relation.

π f_name,l_name ( σ dno=4 ∧ salary>25000 ( Employee ) )

f_name l_name
Joseph Chan
Carrie Kwan
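Because every operator returns a relation, operators can be nested freely. The sketch below shows this in Python using the Employee rows from the earlier slide. The (schema, rows) pair representation and the helper names `select` and `project` are assumptions made for illustration.

```python
EMP_SCHEMA = ("f_name", "l_name", "id", "sex", "salary", "superid", "dno")
EMPLOYEE = {
    ("Joseph", "Chan", "999999", "M", 29500, "654321", 4),
    ("Victor", "Wong", "001100", "M", 30000, "888555", 5),
    ("Carrie", "Kwan", "898989", "F", 26000, "654321", 4),
    ("Joyce", "Fong", "345345", "F", 12000, "777888", 4),
}

def select(rel, predicate):
    """sigma: a relation in, a relation out."""
    schema, rows = rel
    return schema, {t for t in rows if predicate(dict(zip(schema, t)))}

def project(rel, fields):
    """pi: a relation in, a relation out (with a smaller schema)."""
    schema, rows = rel
    idx = [schema.index(f) for f in fields]
    return tuple(fields), {tuple(t[i] for i in idx) for t in rows}

employee = (EMP_SCHEMA, EMPLOYEE)
# pi f_name,l_name ( sigma dno=4 AND salary>25000 ( Employee ) )
names = project(
    select(employee, lambda t: t["dno"] == 4 and t["salary"] > 25000),
    ("f_name", "l_name"),
)
print(sorted(names[1]))  # [('Carrie', 'Kwan'), ('Joseph', 'Chan')]
```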

10
• Union
– R ∪ S returns a relation instance containing all tuples
that occur in either relation instance R or relation instance S (or
both).
– The union operation is commutative: R ∪ S = S ∪ R
– Duplicate tuples are eliminated.
– R and S must be union-compatible.
• They have the same number of fields,
• The corresponding fields have the same domains.
– The field names are not important.
– The schema of the result is defined to be identical to the
schema of R.
• The fields of R ∪ S inherit names from R.

11
R
a  b
α  1
α  2
β  1

S
a  b
α  2
β  3

R ∪ S
a  b
α  1
α  2
β  1
β  3

12
• Set Difference
– Set difference R − S returns a relation instance
containing all the tuples that occur in R but not
in S.
– Set difference is not commutative: in
general, R − S ≠ S − R.
– The relations R and S must be union-compatible.
– The schema of the result is defined to be
identical to the schema of R.

13
R
a  b
α  1
α  2
β  1

S
a  b
α  2
β  3

R − S
a  b
α  1
β  1

14
• Intersection
– R ∩ S returns a relation instance containing all
tuples that occur in both R and S.
– The relations R and S must be union-compatible.
– The schema of the result is defined to be
identical to the schema of R.
– Intersection is not considered a basic operation,
as it can be derived from the basic operations as
shown below:
R ∩ S = R − ( R − S )
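The three set operations map directly onto Python sets, which makes the derivation of intersection easy to check. This is a sketch with placeholder data. The `union_compatible` helper is a made-up, rough stand-in that compares only the arity and Python type of each column, not real attribute domains.

```python
R = {("x", 1), ("x", 2), ("y", 1)}  # placeholder values
S = {("x", 2), ("y", 3)}

def union_compatible(r, s):
    """Rough check: every tuple has the same arity and column types."""
    shapes = {tuple(type(v) for v in t) for t in r | s}
    return len(shapes) <= 1

assert union_compatible(R, S)

union = R | S
difference = R - S
intersection = R - (R - S)      # derived from set difference only

print(sorted(union))            # [('x', 1), ('x', 2), ('y', 1), ('y', 3)]
print(sorted(difference))       # [('x', 1), ('y', 1)]
print(intersection == (R & S))  # True
```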
15
R
a  b
α  1
α  2
β  1

S
a  b
α  2
β  3

R ∩ S
a  b
α  2

16
• Rename (ρ)
– ρ ( R(F), E ) or ρ ( R, E )
• where E is an arbitrary relational algebra expression
– Result: a relation named R.
– R = E, except that the fields may be renamed
according to F.
– F is called the renaming list; its terms have the form
oldname → newname or position → newname.

17
Example
ρ ( C ( sid → identity ), E )

We may rename more fields:

ρ ( C ( sid → identity, child → dependent ), E )

Example
ρ ( C ( 3 → identity ), E )

The result is C.
The 3rd attribute in E is renamed as “identity” in C.
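Renaming changes only the schema, never the tuples. A small Python sketch of the renaming list follows. The function name `rename` and the three-field schema for E are hypothetical. Keys of the renaming dict are either old names or 1-based positions, matching the two forms above.

```python
def rename(schema, renaming):
    """rho: return a new schema; keys are old names or 1-based positions."""
    new_schema = list(schema)
    for old, new in renaming.items():
        pos = old - 1 if isinstance(old, int) else schema.index(old)
        new_schema[pos] = new
    return tuple(new_schema)

E_SCHEMA = ("sid", "child", "age")  # hypothetical schema for E

# rho ( C ( sid -> identity, child -> dependent ), E )
print(rename(E_SCHEMA, {"sid": "identity", "child": "dependent"}))
# ('identity', 'dependent', 'age')

# rho ( C ( 3 -> identity ), E )
print(rename(E_SCHEMA, {3: "identity"}))
# ('sid', 'child', 'identity')
```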

18
• Cartesian product (Cross product)
– R × S returns a relation instance whose schema
contains all the fields of R followed by all the
fields of S.
– The result contains one tuple <r,s>
(the concatenation of tuples r and s) for each pair
of tuples r ∈ R, s ∈ S.

19
R
a  b
α  1
β  2

S
c  d   e
α  10  +
β  10  +
β  20  -
γ  10  -

R × S
a  b  c  d   e
α  1  α  10  +
α  1  β  10  +
α  1  β  20  -
α  1  γ  10  -
β  2  α  10  +
β  2  β  10  +
β  2  β  20  -
β  2  γ  10  -

20
Query: To retrieve for each female employee a list
of the names of her dependents.

EMPLOYEE

f_name l_name id bdate addr sex salary super_id dno

DEPENDENT

eid dependent-name sex bdate relationship

21
Female_emps ← σ sex=‘F’ ( Employee )
Empnames ← π f_name,l_name,id ( Female_emps )

Female_emps
f_name l_name id bdate address sex salary superid dno
Alicia Chan 998877 2-Jul-70 231, Cai Road, HK F 9500 654321 4
Jennifer Wong 654321 20-June-60 342, Cheung Road, HK F 30000 888555 4
Joyce Fong 345345 19-Dec-80 23, Young Road, HK F 12000 777888 5

Empnames
f_name l_name id
Alicia Chan 998877
Jennifer Wong 654321
Joyce Fong 345345

Dependents
eid dep_name sex bdate relationship
334455 Alice F 5-Apr-90 Daughter
334455 Theodore M 3-Mar-92 Son
654321 Abner M 29-Feb-94 Son
123456 Alice F 2-Nov-97 Daughter

22
Emp_dependents ← Empnames × Dependents

Emp_dependents
f_name l_name id eid dep_name sex bdate relationship
Alicia Chan 998877 334455 Alice F 5-Apr-90 Daughter
Alicia Chan 998877 334455 Theodore M 3-Mar-92 Son
Alicia Chan 998877 654321 Abner M 29-Feb-94 Son
Alicia Chan 998877 123456 Alice F 2-Nov-97 Daughter
Jennifer Wong 654321 334455 Alice F 5-Apr-90 Daughter
Jennifer Wong 654321 334455 Theodore M 3-Mar-92 Son
Jennifer Wong 654321 654321 Abner M 29-Feb-94 Son
Jennifer Wong 654321 123456 Alice F 2-Nov-97 Daughter
Joyce Fong 345345 334455 Alice F 5-Apr-90 Daughter
Joyce Fong 345345 334455 Theodore M 3-Mar-92 Son
Joyce Fong 345345 654321 Abner M 29-Feb-94 Son
Joyce Fong 345345 123456 Alice F 2-Nov-97 Daughter

23
Query: To retrieve for each female employee a list of the names of her dependents

Actual_dependents ← σ id=eid ( Emp_dependents )

Result ← π f_name,l_name,dep_name ( Actual_dependents )

Actual_dependents

f_name l_name id eid dep_name sex bdate relationship


Jennifer Wong 654321 654321 Abner M 29-Feb-94 Son

Result

f_name l_name dep_name


Jennifer Wong Abner
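The whole pipeline (product, then selection, then projection) can be traced in Python with the rows shown on the previous slides. This is a sketch: relations are plain sets of tuples and columns are addressed by position, which is an implementation shortcut rather than part of the algebra.

```python
from itertools import product

# Empnames(f_name, l_name, id)
empnames = {
    ("Alicia", "Chan", "998877"),
    ("Jennifer", "Wong", "654321"),
    ("Joyce", "Fong", "345345"),
}
# Dependents(eid, dep_name, sex, bdate, relationship)
dependents = {
    ("334455", "Alice", "F", "5-Apr-90", "Daughter"),
    ("334455", "Theodore", "M", "3-Mar-92", "Son"),
    ("654321", "Abner", "M", "29-Feb-94", "Son"),
    ("123456", "Alice", "F", "2-Nov-97", "Daughter"),
}

# Emp_dependents <- Empnames x Dependents (every pairing: 3 * 4 = 12 tuples)
emp_dependents = {e + d for e, d in product(empnames, dependents)}

# Actual_dependents <- sigma id=eid (Emp_dependents); id is column 2, eid is column 3
actual_dependents = {t for t in emp_dependents if t[2] == t[3]}

# Result <- pi f_name,l_name,dep_name (Actual_dependents)
result = {(t[0], t[1], t[4]) for t in actual_dependents}

print(len(emp_dependents))  # 12
print(result)               # {('Jennifer', 'Wong', 'Abner')}
```

Only one of the twelve product tuples survives the selection, which illustrates how wasteful it is to build the full product first.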

24
• Join
– Because the sequence of operations × followed by σ is
quite common, a special operation, called the “join”
operation ( ⋈ ), was created to specify these as a single
operation.
– Join can be defined as a cross-product followed by
selections and, sometimes, projections.
– The result of a cross-product is typically much larger
than the result of a join, so it is very important to
recognize joins and implement them without
materializing the underlying cross-product.

25
• Condition Join
– The most general version of the join operation
accepts a join condition c.
– The join condition is identical to a selection
condition in form.
– The operation is defined as follows:
R ⋈ c S = σ c ( R × S )
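The definition can be followed literally in Python: build the product and filter it with the condition. The sketch below uses sailor and reservation rows as sample data with the condition S.sid < R.sid. The helper name `condition_join` is an assumption, and a real system would avoid materializing the product.

```python
from itertools import product

# S(sid, sname, rating, age) and R(sid, bid, day)
S = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.5)}
R = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def condition_join(left, right, cond):
    """Definition-level join: sigma_c(left x right)."""
    return {l + r for l, r in product(left, right) if cond(l, r)}

# Join condition: S.sid < R.sid (sid is column 0 in both relations)
result = condition_join(S, R, lambda s, r: s[0] < r[0])
for row in sorted(result):
    print(row)
# (22, 'Dustin', 7, 45.0, 58, 103, '11/12/96')
# (31, 'Lubber', 8, 55.5, 58, 103, '11/12/96')
```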

26
S
sid sname rating age
22 Dustin 7 45.0
31 Lubber 8 55.5
58 Rusty 10 35.5

R
sid bid day
22 101 10/10/96
58 103 11/12/96

S ⋈ S.sid<R.sid R

(sid) sname rating age (sid) bid day
22 Dustin 7 45.0 58 103 11/12/96
31 Lubber 8 55.5 58 103 11/12/96

27
Example

Query: Find the names of employees with the highest salary.

π l_name,f_name ( Employee ) −
π Employee.l_name,Employee.f_name (
Employee ⋈ Employee.salary<F.salary ρ ( F, Employee ) )

Note: We assume that <l_name,f_name> is a key in this relation.
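The idea behind this expression is "everyone, minus anyone who earns less than somebody else". A Python sketch using the names and salaries from the earlier Employee slide follows. Reducing Employee to three columns is a simplification made for the example.

```python
# Employee reduced to (l_name, f_name, salary)
employee = {
    ("Chan", "Joseph", 29500),
    ("Wong", "Victor", 30000),
    ("Kwan", "Carrie", 26000),
    ("Fong", "Joyce", 12000),
}

# pi l_name,f_name (Employee)
all_names = {(l, f) for l, f, _ in employee}

# Names of employees who earn less than some employee in the renamed copy F.
not_highest = {
    (l, f)
    for l, f, salary in employee
    for _, _, f_salary in employee
    if salary < f_salary
}

highest = all_names - not_highest
print(highest)  # {('Wong', 'Victor')}
```

The set difference works on names only, which is why the note requires <l_name,f_name> to be a key.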

29
• Equi-join
– A special case of the join operation is when the
join condition consists solely of equalities
(connected by ∧) of the form
R.name1 = S.name2
– In the resulting relation, S.name2 is
dropped by an additional projection operation.

30
R
a  b  c
α  1  α
β  5  γ
γ  4  β
α  1  γ
δ  2  β

S
b  e  f
1  X  α
3  X  β
1  X  γ
2  Y  δ
3  Y  ε

R ⋈ R.b=S.b S
a  b  c  e  f
α  1  α  X  α
α  1  α  X  γ
α  1  γ  X  α
α  1  γ  X  γ
δ  2  β  Y  δ

31
• Natural Join
– A further special case of the join operation R ⋈ S
is an equijoin in which equalities
are specified on all fields having the same
names in R and S.
– In this case, we can simply omit the join condition.
– The resulting schema contains the attributes of
R followed by the attributes in S that are not in
R.
– If the two relations have no attributes in
common, R ⋈ S is simply the cross-product.
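A natural join can be sketched in Python by finding the shared attribute names, comparing them, and keeping a single copy of each. The function name `natural_join` and the sample rows are hypothetical. When no names are shared, the `all(...)` over an empty list is true for every pair, so the function falls back to the cross-product.

```python
def natural_join(r_schema, r_rows, s_schema, s_rows):
    """Equijoin on all shared attribute names, keeping one copy of each."""
    common = [a for a in r_schema if a in s_schema]
    s_extra = [a for a in s_schema if a not in r_schema]
    schema = tuple(r_schema) + tuple(s_extra)
    rows = set()
    for r in r_rows:
        r_named = dict(zip(r_schema, r))
        for s in s_rows:
            s_named = dict(zip(s_schema, s))
            if all(r_named[a] == s_named[a] for a in common):
                rows.add(r + tuple(s_named[a] for a in s_extra))
    return schema, rows

# Hypothetical sample data: the shared attributes are b and d.
R = {("r1", 1, "X"), ("r2", 2, "X"), ("r3", 2, "Y")}
S = {(1, "X", "s1"), (2, "Y", "s2"), (3, "Y", "s3")}

schema, rows = natural_join(("a", "b", "d"), R, ("b", "d", "e"), S)
print(schema)        # ('a', 'b', 'd', 'e')
print(sorted(rows))  # [('r1', 1, 'X', 's1'), ('r3', 2, 'Y', 's2')]
```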
33
R
a  b  c  d
α  1  α  X
β  2  γ  X
γ  4  β  Y
α  1  γ  Y
δ  2  β  Y

S
b  d  e
1  X  α
3  X  β
1  X  γ
2  Y  δ
3  Y  ε

R ⋈ S
a  b  c  d  e
α  1  α  X  α
α  1  α  X  γ
δ  2  β  Y  δ

34
• Division
– The division operation is useful for expressing
certain kinds of queries, for example, “find the
names of sailors who have reserved all boats”.
– Example
• Consider two relations A and B.
• A has exactly two fields x and y.
• B has just one field y, with the same domain as in A.
• The division operation A/B is the set of all x values
(in the form of unary tuples) such that for every y
value in a tuple of B, there is a tuple <x,y> in A.

36
A
x  y
α  1
α  2
α  3
β  1
γ  1
δ  1
δ  3
δ  4
ε  1
β  2

B
y
1
2

A/B
x
α
β

The division operation A/B is the set of all x
values (in the form of unary tuples) such that
for every y value in a tuple of B, there is a
tuple <x,y> in A.

37
Another way to understand division is as
follows:
For each x value in A, consider the set of y
values that appear in tuples of A with that
x value. If this set contains all y values in
B, the x value is in the result of A/B.

38
An analogy with integer division may also help
to understand division. For integers A and B,
A/B is the largest integer Q such that Q * B ≤ A.
For relation instances A and B, A/B is the
largest relation instance Q such that Q × B ⊆ A.

An x value is disqualified if, for some y value in B,
the tuple <x,y> is not in A. In fact, we can compute
the disqualified x values using the following algebra
expression:

π x ( ( π x ( A ) × B ) − A )

Thus, A/B is

π x ( A ) − π x ( ( π x ( A ) × B ) − A )
39
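The expression for A/B in terms of the basic operators can be transcribed almost line by line into Python. This is a sketch for the two-column case A(x, y), B(y) only. The function name `divide` and the sample instance are hypothetical, and the result is returned as a set of x values rather than unary tuples.

```python
from itertools import product

def divide(a_rows, b_rows):
    """A/B for A(x, y) and B(y): pi_x(A) - pi_x((pi_x(A) x B) - A)."""
    xs = {x for x, _ in a_rows}                # pi_x(A)
    b_values = {y for (y,) in b_rows}
    all_pairs = set(product(xs, b_values))     # pi_x(A) x B
    disqualified = {x for x, _ in all_pairs - a_rows}
    return xs - disqualified

# Hypothetical instance: p occurs with y in {1, 2, 3}, q with {1}, r with {1, 2}.
A = {("p", 1), ("p", 2), ("p", 3), ("q", 1), ("r", 1), ("r", 2)}
B = {(1,), (2,)}
print(sorted(divide(A, B)))  # ['p', 'r']
```

Here q is disqualified because the pair (q, 2) is missing from A.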
The division above can be written in terms of basic operations:

Temp1 ← π A−B ( A )
Temp2 ← π A−B ( ( Temp1 × B ) − A )
Result = Temp1 − Temp2

Here A−B denotes the attributes in A but not in B. Note that it is not a
valid notation in relational algebra.

Temp1
x
α
β
γ
δ
ε

( Temp1 × B ) − A
x  y
γ  2
δ  2
ε  2

Temp2
x
γ
δ
ε

A/B
x
α
β
40
Employee fname lname id bdate address salary sid dno

Works_on id pno

Query: Retrieve the names of employees who work on all the projects
that ‘John Sung’ works on.

Sung ← σ fname=“John” ∧ lname=“Sung” ( Employee )

Sung_pnos ← π pno ( Works_on ⋈ Sung )
fname lname id bdate address salary sid dno pno

Result_id ← Works_on / Sung_pnos

Result ← π fname,lname ( Result_id ⋈ Employee )

41
4.8 More examples
Sailors ( sid, sname, age )

Boats ( bid, bname, color )

Reserves ( sid, bid, date )

Consider the schemas above; the primary key fields are underlined.

42
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Query 1: Find the names of sailors who have
reserved the boat with bid = 103.

- Solution 1 (more efficient):
π sname ( ( σ bid=103 Reserves ) ⋈ Sailors )

- Solution 2:
ρ ( Temp1, σ bid=103 Reserves )
ρ ( Temp2, Temp1 ⋈ Sailors )
π sname ( Temp2 )

- Solution 3:
π sname ( σ bid=103 ( Reserves ⋈ Sailors ) )
43
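The solutions to Query 1 return the same answer and differ only in the size of the intermediate results. The Python sketch below compares "select, then join" with "join, then select" on a small hypothetical instance. The helper name `join_on_sid` and all sample rows are assumptions.

```python
# Hypothetical instances: Sailors(sid, sname, age), Reserves(sid, bid, date)
sailors = {(22, "Dustin", 45.0), (31, "Lubber", 55.5), (58, "Rusty", 35.5)}
reserves = {(22, 101, "10/10/96"), (22, 103, "10/11/96"), (58, 103, "11/12/96")}

def join_on_sid(res, sail):
    """Natural join on sid; result columns are (sid, bid, date, sname, age)."""
    return {r + s[1:] for r in res for s in sail if r[0] == s[0]}

# Select first, then join: the join only sees the matching reservations.
selected = {r for r in reserves if r[1] == 103}
plan1 = {t[3] for t in join_on_sid(selected, sailors)}

# Join first, then select: the join processes every reservation.
joined = join_on_sid(reserves, sailors)
plan3 = {t[3] for t in joined if t[1] == 103}

print(plan1 == plan3, sorted(plan1))  # True ['Dustin', 'Rusty']
print(len(selected), len(joined))     # 2 3
```

On realistic data the gap between the two intermediate sizes is usually far larger than in this toy instance.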
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Query 2: Find the names of sailors who have reserved at
least one red boat.

- Solution 1:
π sname ( ( σ color=‘red’ Boats ) ⋈ Reserves ⋈ Sailors )

- Solution 2:
π sname ( π sid ( ( π bid σ color=‘red’ Boats ) ⋈ Reserves ) ⋈ Sailors )

44
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Query 3: Find the names of sailors who have reserved at
least one red or green boat.
We can identify all red or green boats, then find sailors who have
reserved one of these boats.

- Solution 1:
ρ ( Tempboats, ( σ color=‘red’ ∨ color=‘green’ Boats ) )
π sname ( Tempboats ⋈ Reserves ⋈ Sailors )

What happens if ∨ is replaced by ∧ in this query?
45
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

- Solution 2:
π sname ( ( ( σ color=‘red’ Boats ) ⋈ Reserves ) ⋈ Sailors ) ∪
π sname ( ( ( σ color=‘green’ Boats ) ⋈ Reserves ) ⋈ Sailors )

Query 4: Find the names of sailors who have reserved at least one
red boat and at least one green boat.

Replacing ∨ by ∧ in the previous Solution 1 would not work, because
no single boat is both red and green. We must identify sailors
who have reserved red boats and sailors who have reserved green
boats, then find the intersection.

46
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Note that sid is a key for Sailors, but sname is not a key.

ρ ( Tempred, π sid ( ( σ color=‘red’ Boats ) ⋈ Reserves ) )

ρ ( Tempgreen, π sid ( ( σ color=‘green’ Boats ) ⋈ Reserves ) )

π sname ( ( Tempred ∩ Tempgreen ) ⋈ Sailors )

47
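The reason for intersecting on sid rather than on sname shows up as soon as two sailors share a name. The Python sketch below uses a hypothetical instance with two sailors named Horatio. The relations are cut down to the columns the query needs, and the helper names are made up.

```python
# Hypothetical instances; two different sailors share the name "Horatio".
sailors = {(22, "Dustin"), (64, "Horatio"), (74, "Horatio")}  # (sid, sname)
boats = {(101, "red"), (102, "green")}                        # (bid, color)
reserves = {(22, 101), (22, 102), (64, 101), (74, 102)}       # (sid, bid)

def sids_reserving(color):
    """pi_sid((sigma_color(Boats)) join Reserves)"""
    bids = {bid for bid, c in boats if c == color}
    return {sid for sid, bid in reserves if bid in bids}

def names_of(sids):
    """pi_sname(sids join Sailors)"""
    return {sname for sid, sname in sailors if sid in sids}

tempred, tempgreen = sids_reserving("red"), sids_reserving("green")

correct = names_of(tempred & tempgreen)          # intersect on the key sid
wrong = names_of(tempred) & names_of(tempgreen)  # intersect on the non-key sname

print(sorted(correct))  # ['Dustin']
print(sorted(wrong))    # ['Dustin', 'Horatio']
```

Neither Horatio reserved both colors, yet intersecting on names reports one, because the two sailors are indistinguishable by sname.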
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Query 5: Find the names of sailors who have reserved
all boats.

ρ ( Tempsids, ( π sid,bid Reserves ) / ( π bid Boats ) )

π sname ( Tempsids ⋈ Sailors )

What if we simply do
Reserves / ( π bid Boats ) ?
date sid bid
2-3-2002 007 A
3-7-2002 007 B
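One way to explore the question is to evaluate both divisions on the two rows above. The sketch below assumes that A and B are the only boats, which the slide does not state. Dividing Reserves directly keeps date in the result, so it asks for (date, sid) pairs that occur with every boat.

```python
reserves = {("2-3-2002", "007", "A"), ("3-7-2002", "007", "B")}  # (date, sid, bid)
boat_bids = {"A", "B"}  # assumption: these are the only two boats

# (pi_sid,bid Reserves) / (pi_bid Boats): sids that occur with every bid.
sid_bid = {(sid, bid) for _, sid, bid in reserves}
with_projection = {
    sid for sid, _ in sid_bid
    if all((sid, b) in sid_bid for b in boat_bids)
}

# Reserves / (pi_bid Boats): (date, sid) pairs that occur with every bid.
without_projection = {
    (date, sid) for date, sid, _ in reserves
    if all((date, sid, b) in reserves for b in boat_bids)
}

print(with_projection)     # {'007'}
print(without_projection)  # set()
```

Under that assumption, sailor 007 has reserved all boats, but not all on the same date, so the unprojected division returns nothing.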

48
Sailors ( sid, sname, age )
Boats ( bid, bname, color )
Reserves ( sid, bid, date )

Query 6: Find the sailors who have reserved all red boats.

ρ ( Tempsids, ( π sid,bid Reserves ) / ( π bid ( σ color=‘red’ Boats ) ) )

π sname ( Tempsids ⋈ Sailors )

49
Remarks
• Only standard relational algebra is covered
in the lecture.
– A relation is assumed to be a set of records (no
duplicate records).
– Extended operators, such as sorting operators
and aggregation operators, are not covered.

50