Relational Algebra: Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

The document summarizes Relational Algebra, which is a mathematical query language that forms the basis for SQL and database implementation. Relational Algebra uses set operations like selection, projection, union, and join to manipulate relations and retrieve data. It operates on tables of data and outputs new tables based on the specified operations. The key operators and their functions are defined.


Relational Algebra

Chapter 4

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1


Relational Query Languages
• Query languages: Allow manipulation and retrieval of data from a database.
• The relational model supports simple, powerful query languages (QLs):
  - Strong formal foundation based on algebra/logic.
  - Allows for much optimization.


Formal Relational Query Languages
• Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for implementation:
  - Relational Algebra: More operational, very useful for representing execution plans.
  - Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.) Not covered in this course.


Overview

• Notation
• Relational Algebra
  - Relational Algebra basic operators.
  - Relational Algebra derived operators.


Preliminaries
• A query is applied to relation instances, and the result of a query is also a relation instance.
  - Schemas of input relations for a query are fixed.
  - The schema for the result of a given query is also fixed! It is determined by the definitions of the query language constructs.


Preliminaries

• Positional vs. named-attribute notation:
  - Positional notation
    • Ex: Sailor(1,2,3,4)
    • easier for formal definitions
  - Named-attribute notation
    • Ex: Sailor(sid, sname, rating, age)
    • more readable
• Advantages/disadvantages of one over the other?

Example Instances
• “Sailors” and “Reserves” relations for our examples.
• We’ll use positional or named field notation.
• Assume that names of fields in query results are inherited from names of fields in query input relations.

R1:
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

Relational Algebra



Algebra
• In math, algebraic operations like +, -, ×, /.
• Operate on numbers: inputs are numbers, outputs are numbers.
• Can also do Boolean algebra on sets, using union, intersect, difference.
• Focus on algebraic identities, e.g.
  - x (y+z) = xy + xz.
• (Relational algebra lies between propositional and 1st-order logic.)

Relational Algebra
• Every operator takes one or two relation instances.
• A relational algebra expression is recursively defined to be a relation.
  - Result is also a relation.
  - Can apply operator to
    • Relation from database
    • Relation as a result of another operator


Relational Algebra Operations
• Basic operations:
  - Selection (σ): Selects a subset of rows from relation.
  - Projection (π): Deletes unwanted columns from relation.
  - Cross-product (×): Allows us to combine two relations.
  - Set-difference (−): Tuples in reln. 1, but not in reln. 2.
  - Union (∪): Tuples in reln. 1 or in reln. 2 (or both).
• Additional derived operations:
  - Intersection, join, division, renaming. Not essential, but very useful.
• Since each operation returns a relation, operations can be composed!

Basic Relational Algebra Operations



Projection
• Deletes attributes that are not in projection list.
• Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation.
• Projection operator has to eliminate duplicates! (Why??)

π_{sname,rating}(S2):
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

π_{age}(S2):
age
35.0
55.5

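The projection slide can be tried out with a minimal Python sketch. The representation (a relation as a set of tuples plus a separate schema tuple) and the helper name `project` are assumptions made for illustration; only the S2 data comes from the slides. Collecting the shortened tuples into a set shows why duplicates disappear: three sailors share the age 35.0, but the result holds that value once.

```python
# S2 from the example instances, as a set of (sid, sname, rating, age) tuples.
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def project(schema, rows, fields):
    """pi_fields(rows): keep only `fields`, in the order given.

    Collecting the shortened tuples into a set removes duplicates.
    (Hypothetical helper, not part of the slides.)
    """
    positions = [schema.index(f) for f in fields]
    return tuple(fields), {tuple(row[i] for i in positions) for row in rows}


name_schema, names_ratings = project(S2_SCHEMA, S2, ["sname", "rating"])
age_schema, ages = project(S2_SCHEMA, S2, ["age"])

print(sorted(names_ratings))  # four (sname, rating) pairs
print(sorted(ages))           # [(35.0,), (55.5,)]
```
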
Selection
• Selects rows that satisfy selection condition.
• No duplicates in result! (Why?)
• Schema of result identical to schema of (only) input relation.
• Selection conditions:
  - simple conditions comparing attribute values (variables) and / or constants, or
  - complex conditions that combine simple conditions using logical connectives AND and OR.

σ_{rating>8}(S2):
sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

π_{sname,rating}(σ_{rating>8}(S2)):
sname  rating
yuppy  9
rusty  10

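As a rough sketch of selection and of composing it with projection, the following Python uses S2 from the slides. The helpers `select` and `project` and the set-of-tuples representation are illustrative assumptions, not something the slides prescribe. Since the input is already a set and selection only drops rows, no duplicates can appear in its result.

```python
S2_SCHEMA = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def select(schema, rows, predicate):
    """sigma_predicate(rows): keep rows whose named-attribute view passes the test."""
    return schema, {row for row in rows if predicate(dict(zip(schema, row)))}


def project(schema, rows, fields):
    """pi_fields(rows): keep only `fields`; the set drops duplicates."""
    positions = [schema.index(f) for f in fields]
    return tuple(fields), {tuple(row[i] for i in positions) for row in rows}


# sigma_{rating>8}(S2): the schema is unchanged, only rows are filtered.
schema, high = select(S2_SCHEMA, S2, lambda t: t["rating"] > 8)

# pi_{sname,rating}(sigma_{rating>8}(S2)): operators compose because
# each one returns a relation.
_, result = project(schema, high, ["sname", "rating"])
print(sorted(result))  # [('rusty', 10), ('yuppy', 9)]
```
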
Union, Intersection, Set-Difference
• All of these operations take two input relations, which must be union-compatible:
  - Same number of fields.
  - “Corresponding” fields have the same type.
• What is the schema of result?

S1 ∪ S2:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

S1 − S2:
sid  sname   rating  age
22   dustin  7       45.0

S1 ∩ S2:
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0

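The three set operations map directly onto Python's built-in set operators. The sketch below uses S1 and S2 from the slides; the schema encoding as (name, type) pairs and the helper `union_compatible` are assumptions chosen to mirror the two compatibility conditions listed above.

```python
# Schemas as (field name, field type) pairs; this encoding is an assumption.
SAILORS_SCHEMA = (("sid", int), ("sname", str), ("rating", int), ("age", float))
RESERVES_SCHEMA = (("sid", int), ("bid", int), ("day", str))

S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}


def union_compatible(schema_a, schema_b):
    """Same number of fields, and corresponding fields have the same type."""
    return len(schema_a) == len(schema_b) and all(
        type_a == type_b for (_, type_a), (_, type_b) in zip(schema_a, schema_b)
    )


print(union_compatible(SAILORS_SCHEMA, SAILORS_SCHEMA))   # True
print(union_compatible(SAILORS_SCHEMA, RESERVES_SCHEMA))  # False

union = S1 | S2
intersection = S1 & S2
difference = S1 - S2

print(len(union))            # 5
print(sorted(intersection))  # [(31, 'lubber', 8, 55.5), (58, 'rusty', 10, 35.0)]
print(sorted(difference))    # [(22, 'dustin', 7, 45.0)]
```
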
Exercise on Union

Blue blocks (BB):
Number  shape      holes
1       round      2
2       square     4
3       rectangle  8

Yellow blocks (YB):
Number  shape      holes
4       round      2
5       square     4
6       rectangle  8

Stacked (S):
bottom  top
4       2
4       6
6       2

1. Which tables are union-compatible?
2. What is the result of the possible unions?

Cross-Product
• Each row of S1 is paired with each row of R1.
• Result schema has one field per field of S1 and R1, with field names inherited if possible.
  - Conflict: Both S1 and R1 have a field called sid.

S1 × R1:
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

• Renaming operator: ρ(C(1→sid1, 5→sid2), S1 × R1)

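A small Python sketch of the cross-product and the positional renaming, using S1 and R1 from the slides. The helpers `cross` and `rename` are hypothetical names; `rename` takes 1-based positions to match the ρ(C(1→sid1, 5→sid2), ...) notation.

```python
from itertools import product

S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
R1_SCHEMA = ("sid", "bid", "day")
R1 = {
    (22, 101, "10/10/96"),
    (58, 103, "11/12/96"),
}


def cross(schema_a, rows_a, schema_b, rows_b):
    """A x B: every row of A concatenated with every row of B."""
    return schema_a + schema_b, {a + b for a, b in product(rows_a, rows_b)}


def rename(schema, new_names):
    """Rename fields by 1-based position, leaving the other names as inherited."""
    return tuple(new_names.get(i, name) for i, name in enumerate(schema, start=1))


schema, C = cross(S1_SCHEMA, S1, R1_SCHEMA, R1)
print(schema.count("sid"))  # 2: the naming conflict from the slide

schema = rename(schema, {1: "sid1", 5: "sid2"})
print(schema)  # ('sid1', 'sname', 'rating', 'age', 'sid2', 'bid', 'day')
print(len(C))  # 6 = 3 rows of S1 times 2 rows of R1
```
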
Exercise on Cross-Product

Blue blocks (BB):
Number  shape      holes
1       round      2
2       square     4
3       rectangle  8

Yellow blocks (YB):
Number  shape      holes
4       round      2
5       square     4
6       rectangle  8

Stacked (S):
bottom  top
4       2
4       6
6       2

1. Write down 2 tuples in BB × S.
2. What is the cardinality of BB × S?

Derived Operators
Join and Division



Joins
• Condition Join: R ⋈_c S = σ_c(R × S)

S1 ⋈_{S1.sid < R1.sid} R1:
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96

• Result schema same as that of cross-product.
• Fewer tuples than cross-product, might be able to compute more efficiently. How?
• Sometimes called a theta-join.
• π-σ-× = SQL in a nutshell.

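The definition R ⋈_c S = σ_c(R × S) can be checked on S1 and R1 with a short Python sketch. The function `theta_join` is a hypothetical helper; it tests the condition on each pair as it goes, so the full cross-product is never stored, which is one modest answer to the "How?" above. It still examines every pair, though; real systems may do better, for example with indexes or sorting.

```python
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
R1 = {
    (22, 101, "10/10/96"),
    (58, 103, "11/12/96"),
}


def theta_join(rows_a, rows_b, condition):
    """Nested-loop condition join: emit a + b only for pairs that satisfy `condition`."""
    return {a + b for a in rows_a for b in rows_b if condition(a, b)}


def sid_less(s, r):
    """The condition S1.sid < R1.sid; sid is the first field of both relations."""
    return s[0] < r[0]


joined = theta_join(S1, R1, sid_less)

# The same result via the definition: build S1 x R1, then select.
# In the concatenated tuple, position 0 is S1.sid and position 4 is R1.sid.
cross = {s + r for s in S1 for r in R1}
via_cross = {t for t in cross if t[0] < t[4]}

print(joined == via_cross)  # True
print(len(joined))          # 2
```
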
Exercise on Join

Blue blocks (BB):
Number  shape      holes
1       round      2
2       square     4
3       rectangle  8

Yellow blocks (YB):
Number  shape      holes
4       round      2
5       square     4
6       rectangle  8

BB ⋈_{BB.holes θ YB.holes} YB

Write down 2 tuples in this join.


Joins
• Equi-Join: A special case of condition join where the condition c contains only equalities.

S1 ⋈_{S1.sid = R1.sid} R1:
sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96

• Result schema similar to cross-product, but only one copy of fields for which equality is specified.
• Natural Join: Equijoin on all common fields. Without a specified condition, A ⋈ B means the natural join of A and B.

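A minimal sketch of the natural join on S1 and R1, assuming relations are sets of tuples with a separate schema tuple. The helper `natural_join` is hypothetical; it matches rows on every field name the two schemas share and keeps one copy of those fields.

```python
S1_SCHEMA = ("sid", "sname", "rating", "age")
S1 = {
    (22, "dustin", 7, 45.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
}
R1_SCHEMA = ("sid", "bid", "day")
R1 = {
    (22, 101, "10/10/96"),
    (58, 103, "11/12/96"),
}


def natural_join(schema_a, rows_a, schema_b, rows_b):
    """Equijoin on all common field names, keeping one copy of each common field."""
    common = [f for f in schema_a if f in schema_b]
    extra = [f for f in schema_b if f not in common]
    out = set()
    for a in rows_a:
        ta = dict(zip(schema_a, a))
        for b in rows_b:
            tb = dict(zip(schema_b, b))
            if all(ta[f] == tb[f] for f in common):
                out.add(a + tuple(tb[f] for f in extra))
    return schema_a + tuple(extra), out


schema, rows = natural_join(S1_SCHEMA, S1, R1_SCHEMA, R1)
print(schema)     # ('sid', 'sname', 'rating', 'age', 'bid', 'day')
print(len(rows))  # 2
```

If the two schemas share no field names, the condition is vacuously true for every pair, so this sketch returns the cross-product.
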
Example for Natural Join

Blue blocks (BB):
Number  shape      holes
1       round      2
2       square     4
3       rectangle  8

Yellow blocks (YB):
shape      holes
round      2
square     4
rectangle  8

What is the natural join of BB and YB?


Join Examples



Find names of sailors who’ve reserved boat #103

• Solution 1: π_{sname}((σ_{bid=103} Reserves) ⋈ Sailors)

• Solution 2: ρ(Temp1, σ_{bid=103} Reserves)
              ρ(Temp2, Temp1 ⋈ Sailors)
              π_{sname}(Temp2)

• Solution 3: π_{sname}(σ_{bid=103}(Reserves ⋈ Sailors))

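The three solutions can be compared on the example instances, taking S1 as Sailors and R1 as Reserves. This is a sketch under an assumed representation: each tuple is a frozenset of (field, value) pairs, and `rel`, `select`, `project`, and `join` are hypothetical helpers written for this illustration.

```python
def rel(*rows):
    """Build a relation: a set of tuples, each stored as a frozenset of (field, value)."""
    return {frozenset(r.items()) for r in rows}


def select(r, pred):
    return {t for t in r if pred(dict(t))}


def project(r, fields):
    return {frozenset((k, v) for k, v in t if k in fields) for t in r}


def join(r, s):
    """Natural join: merge tuples that agree on all shared field names."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


Sailors = rel(
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
)
Reserves = rel(
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
)


def is_103(t):
    return t["bid"] == 103


# Solution 1: select first, then join, then project.
sol1 = project(join(select(Reserves, is_103), Sailors), {"sname"})

# Solution 2: the same plan with named intermediate results.
temp1 = select(Reserves, is_103)
temp2 = join(temp1, Sailors)
sol2 = project(temp2, {"sname"})

# Solution 3: join first, then select, then project.
sol3 = project(select(join(Reserves, Sailors), is_103), {"sname"})

print(sol1 == sol2 == sol3)                   # True
print(sorted(dict(t)["sname"] for t in sol1))  # ['rusty']
```

All three produce the same relation; they differ in how large the intermediate results are, which is the kind of difference an optimizer weighs.
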
Exercise: Find names of sailors who’ve reserved a red boat

• Information about boat color only available in Boats; so need an extra join:
    π_{sname}((σ_{color='red'} Boats) ⋈ Reserves ⋈ Sailors)

• A more efficient solution:
    π_{sname}(π_{sid}((π_{bid} σ_{color='red'} Boats) ⋈ Reserves) ⋈ Sailors)

A query optimizer can find this, given the first solution!

Find sailors who’ve reserved a red or a green boat
• Can identify all red or green boats, then find sailors who have reserved one of these boats:
    ρ(Tempboats, (σ_{color='red' ∨ color='green'} Boats))
    π_{sname}(Tempboats ⋈ Reserves ⋈ Sailors)

• Can also define Tempboats using union! (How?)

• What happens if ∨ is replaced by ∧ in this query?

Exercise: Find sailors who’ve reserved a red and a green boat
• Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):
    ρ(Tempred, π_{sid}((σ_{color='red'} Boats) ⋈ Reserves))
    ρ(Tempgreen, π_{sid}((σ_{color='green'} Boats) ⋈ Reserves))
    π_{sname}((Tempred ∩ Tempgreen) ⋈ Sailors)
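To see why the "or" approach does not carry over to "and", here is a sketch on made-up data. The slides give no Boats instance, so the Boats and Reserves rows below are hypothetical (Reserves omits the day field for brevity); Sailors reuses S1. The helpers follow the same assumed frozenset-of-pairs representation and are not from the slides.

```python
def rel(*rows):
    """A relation as a set of tuples, each a frozenset of (field, value) pairs."""
    return {frozenset(r.items()) for r in rows}


def select(r, pred):
    return {t for t in r if pred(dict(t))}


def project(r, fields):
    return {frozenset((k, v) for k, v in t if k in fields) for t in r}


def join(r, s):
    """Natural join on all shared field names."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out


def names(r):
    return sorted(dict(t)["sname"] for t in project(r, {"sname"}))


Sailors = rel(
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
)
# Hypothetical instances, invented for this example only.
Boats = rel(
    {"bid": 101, "color": "red"},
    {"bid": 102, "color": "green"},
    {"bid": 103, "color": "green"},
)
Reserves = rel(
    {"sid": 22, "bid": 101},
    {"sid": 22, "bid": 102},
    {"sid": 31, "bid": 101},
    {"sid": 58, "bid": 103},
)

# Red OR green: one selection with a disjunction.
tempboats = select(Boats, lambda t: t["color"] == "red" or t["color"] == "green")
red_or_green = names(join(join(tempboats, Reserves), Sailors))

# Replacing OR by AND selects no boat at all: no single boat has both colors.
no_boats = select(Boats, lambda t: t["color"] == "red" and t["color"] == "green")

# Red AND green: intersect the two sets of sids, then look up the names.
tempred = project(join(select(Boats, lambda t: t["color"] == "red"), Reserves), {"sid"})
tempgreen = project(join(select(Boats, lambda t: t["color"] == "green"), Reserves), {"sid"})
red_and_green = names(join(tempred & tempgreen, Sailors))

print(red_or_green)   # ['dustin', 'lubber', 'rusty']
print(len(no_boats))  # 0
print(red_and_green)  # ['dustin']
```
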


Division
• Not supported as a primitive operator, but useful for expressing queries like:
    Find sailors who have reserved all boats.
• Typical set-up: A has 2 fields (x,y) that are foreign key pointers, B has 1 matching field (y).
• Then A/B returns the set of x’s that match all y values in B.
• Example: A = Friend(x,y). B = set of 354 students. Then A/B returns the set of all x’s that are friends with all 354 students.


Examples of Division A/B

A:
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

B1:
pno
p2

B2:
pno
p2
p4

B3:
pno
p1
p2
p4

A/B1:
sno
s1
s2
s3
s4

A/B2:
sno
s1
s4

A/B3:
sno
s1

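The three division examples can be reproduced with a direct reading of the definition: keep each x that is paired in A with every y in B. The function `divide` below is a hypothetical helper for the two-field set-up; the data is A, B1, B2, and B3 from the slide.

```python
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}


def divide(a, b):
    """A/B for A(x, y) and B(y): the x values paired in A with every y in B."""
    xs = {x for x, _ in a}
    return {x for x in xs if all((x, y) in a for y in b)}


print(sorted(divide(A, B1)))  # ['s1', 's2', 's3', 's4']
print(sorted(divide(A, B2)))  # ['s1', 's4']
print(sorted(divide(A, B3)))  # ['s1']
```
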
Find the names of sailors who’ve reserved all boats

• Uses division; schemas of the input relations to / must be carefully chosen:
    ρ(Tempsids, (π_{sid,bid} Reserves) / (π_{bid} Boats))
    π_{sname}(Tempsids ⋈ Sailors)

• To find sailors who’ve reserved all ‘red’ boats:
    ..... / π_{bid}(σ_{color='red'} Boats)

Division in General
• In general, x and y can be any lists of fields; y is the list of fields in B, and (x,y) is the list of fields of A.
• Then A/B returns the set of all x-tuples such that for every y-tuple in B, the tuple (x,y) is in A.
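Because division is a derived operator, it can be built from the basic ones. A standard identity, not stated on these slides, is A/B = π_x(A) − π_x((π_x(A) × B) − A): take every candidate x, generate all the (x,y) pairs it would need, and disqualify any x for which one of those pairs is missing from A. The Python below is a sketch of that identity for general field lists; the function name and the schema-tuple representation are assumptions.

```python
def divide(schema_a, rows_a, schema_b, rows_b):
    """A/B via basic operators: pi_x(A) - pi_x((pi_x(A) x B) - A).

    y is the list of fields of B; x is the remaining fields of A.
    """
    x_pos = [i for i, f in enumerate(schema_a) if f not in schema_b]
    y_pos = [schema_a.index(f) for f in schema_b]
    # View each A tuple as an (x-part, y-part) pair, with y in B's field order.
    a_pairs = {
        (tuple(t[i] for i in x_pos), tuple(t[i] for i in y_pos)) for t in rows_a
    }
    xs = {x for x, _ in a_pairs}                         # pi_x(A)
    candidates = {(x, y) for x in xs for y in rows_b}    # pi_x(A) x B
    disqualified = {x for x, _ in candidates - a_pairs}  # x values missing some y
    return xs - disqualified


A_SCHEMA = ("sno", "pno")
A = {
    ("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
    ("s2", "p1"), ("s2", "p2"),
    ("s3", "p2"),
    ("s4", "p2"), ("s4", "p4"),
}
B_SCHEMA = ("pno",)
B2 = {("p2",), ("p4",)}

print(sorted(divide(A_SCHEMA, A, B_SCHEMA, B2)))  # [('s1',), ('s4',)]
```
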


Summary

• The relational model supports rigorously defined query languages that are simple and powerful.
• Relational algebra is more operational.
  - Useful as internal representation for query evaluation plans.
• Several ways of expressing a given query; a query optimizer should choose the most efficient version.
• Book has lots of query examples.
