Lecture 2
Relational Model
• Basic Notions
• Fundamental Relational Algebra Operations
• Additional Relational Algebra Operations
• Extended Relational Algebra Operations
• Null Values
• Modification of the Database
• Views
• Bags and Bag operations
Basic Structure
• Formally, given sets D1, D2, …, Dn, a relation r is a subset of
D1 × D2 × … × Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di
• Example:
customer_name = {Jones, Smith, Curry, Lindsay}
customer_street = {Main, North, Park}
customer_city = {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield) }
is a relation over
customer_name × customer_street × customer_city
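The definition can be checked mechanically. The following is a minimal Python sketch, assuming domains are modelled as Python sets and tuples as Python tuples (a modelling choice made here for illustration only):

```python
from itertools import product

# The three domains from the example.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The full Cartesian product of the domains: every possible 3-tuple.
all_tuples = set(product(customer_name, customer_street, customer_city))

# A relation is any subset of that product.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

print(len(all_tuples))  # 4 * 3 * 3 = 36
print(r <= all_tuples)  # True: r is a subset of the product
```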
Attribute Types
• Each attribute of a relation has a name
• The set of allowed values for each attribute is called the domain
of the attribute
• Attribute values are (normally) required to be atomic; that is,
indivisible
– Note: multivalued attribute values are not atomic; for example,
{secretary, clerk} is a value of a multivalued attribute position
– Note: composite attribute values are not atomic
• The special value null is a member of every domain
• The null value causes complications in the definition of many
operations
– We shall ignore the effect of null values in our main
presentation and consider their effect later
Relation Schema
• A1, A2, …, An are attributes
customer
Database
• A database consists of multiple relations
• Information about an enterprise is broken up into parts, with
each relation storing one part of the information
• Select
• Project
• Union
• Set Difference (or Subtract or Minus)
• Cartesian Product
Select Operation
• Notation: σ_p(r)
• p is called the selection predicate
• Defined as:
σ_p(r) = {t | t ∈ r and p(t)}
where p is a formula in propositional calculus consisting of terms
connected by: ∧ (and), ∨ (or), ¬ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, ≠, >, ≥, <, ≤
• Example of selection:
account(account_number, branch_name, balance)
σ_{branch_name = “Perryridge”}(account)
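Selection can be sketched as a filter over a set of tuples. In the Python sketch below, the helper name `select` and the sample account rows are assumptions made for illustration; they do not come from the lecture.

```python
def select(rows, predicate):
    """Selection: keep exactly the tuples t in rows for which predicate(t) holds."""
    return {t for t in rows if predicate(t)}

# Hypothetical rows for account(account_number, branch_name, balance).
account = {
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-201", "Perryridge", 900),
}

# Selection with the predicate branch_name = "Perryridge" (position 1 in each tuple).
perryridge = select(account, lambda t: t[1] == "Perryridge")
print(sorted(perryridge))
```

The predicate is an ordinary function of one tuple, so terms can be combined with Python's `and`, `or` and `not` in the same way the formula combines them.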
Select Operation – Example
• Relation r:
A B C  D
α α 1  7
α β 5  7
β β 12 3
β β 23 10
• σ_{A=B ∧ D>5}(r):
A B C  D
α α 1  7
β β 23 10
Project Operation
• Notation: π_{A1, A2, …, Ak}(r)
• Relation r:
A B  C
α 10 1
α 20 1
β 30 1
β 40 2
• π_{A,C}(r):
A C       A C
α 1       α 1
α 1   =   β 1
β 1       β 2
β 2
That is, the projection of a relation on a set of attributes is a set
of tuples, so duplicate rows are removed.
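Projection keeps the listed columns and, because the result is a set, drops duplicate rows. A minimal Python sketch follows; the helper name `project` is hypothetical, and the A values "α" and "β" are assumed labels for the example relation.

```python
def project(rows, schema, attrs):
    """Projection: keep only the columns named in attrs; the set removes duplicates."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in positions) for t in rows}

schema = ("A", "B", "C")
r = {("α", 10, 1), ("α", 20, 1), ("β", 30, 1), ("β", 40, 2)}

result = project(r, schema, ("A", "C"))
print(len(r), len(result))  # 4 3: the two ("α", 1) rows collapse into one
```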
Union Operation
• Consider relational schemas:
depositor(customer_name, account_number)
borrower(customer_name, loan_number)
• For r ∪ s to be valid:
1. r, s must have the same number of attributes (the same arity)
2. The attribute domains must be compatible (e.g., 2nd
column of r deals with the same type of values as does the
2nd column of s)
• Example: find all customers with either an account or a loan
π_{customer_name}(depositor) ∪ π_{customer_name}(borrower)
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
Union Operation – Example
• Relations r, s:
r:      s:
A B     A B
α 1     α 2
α 2     β 3
β 1
• r ∪ s:
A B
α 1
α 2
β 1
β 3
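The arity requirement for union can be enforced before the set union is taken. The sketch below is illustrative only: the helper name `union` and the depositor and borrower rows are assumptions, and domain compatibility is not checked.

```python
def union(r, s):
    """Set union of two relations, after checking that all tuples share one arity."""
    arities = {len(t) for t in r} | {len(t) for t in s}
    if len(arities) > 1:
        raise ValueError("r and s must have the same number of attributes")
    return r | s

# Hypothetical sample data.
depositor = {("Jones", "A-101"), ("Smith", "A-215")}
borrower = {("Smith", "L-23"), ("Hayes", "L-15")}

# Project both relations onto customer_name, then take the union.
names = union({(name,) for name, _ in depositor},
              {(name,) for name, _ in borrower})
print(sorted(names))  # [('Hayes',), ('Jones',), ('Smith',)]
```

Projecting both relations onto customer_name first is what makes them union-compatible: account numbers and loan numbers never meet in the same column.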
Set Difference Operation
• Notation r – s
• Defined as:
r – s = {t | t ∈ r and t ∉ s}
• Set differences must be taken between compatible
relations.
– r and s must have the same number of attributes
– attribute domains of r and s must be compatible
Set Difference Operation – Example
• Relations r, s:
r:      s:
A B     A B
α 1     α 2
α 2     β 3
β 1
• r – s:
A B
α 1
β 1
Cartesian-Product Operation
• Notation: r × s
• Defined as:
r × s = {t q | t ∈ r and q ∈ s}
• Assume that attributes of r(R) and s(S) are disjoint. (That
is, R ∩ S = ∅).
• If attributes of r(R) and s(S) are not disjoint, then renaming
must be used.
Cartesian-Product Operation-Example
• Relations r, s:
r:      s:
A B     C D  E
α 1     α 10 a
β 2     β 10 a
        β 20 b
        γ 10 b
• r × s:
A B C D  E
α 1 α 10 a
α 1 β 10 a
α 1 β 20 b
α 1 γ 10 b
β 2 α 10 a
β 2 β 10 a
β 2 β 20 b
β 2 γ 10 b
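The product pairs every tuple of r with every tuple of s, so the result has |r| · |s| rows. A minimal Python sketch, assuming schemas are passed as tuples of attribute names; the helper name `cartesian_product` and the Greek labels in the sample data are assumptions.

```python
def cartesian_product(r, r_schema, s, s_schema):
    """Concatenate every tuple of r with every tuple of s.

    Raises if the schemas share an attribute name, because renaming
    would be needed first.
    """
    overlap = set(r_schema) & set(s_schema)
    if overlap:
        raise ValueError(f"rename shared attributes first: {sorted(overlap)}")
    rows = {t + q for t in r for q in s}
    return tuple(r_schema) + tuple(s_schema), rows

r = {("α", 1), ("β", 2)}
s = {("α", 10, "a"), ("β", 10, "a"), ("β", 20, "b"), ("γ", 10, "b")}

schema, rows = cartesian_product(r, ("A", "B"), s, ("C", "D", "E"))
print(schema, len(rows))  # ('A', 'B', 'C', 'D', 'E') 8
```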
Composition of Operations
• Can build expressions using multiple operations
• Example: σ_{A=C}(r × s)
• r × s:
A B C D  E
α 1 α 10 a
α 1 β 10 a
α 1 β 20 b
α 1 γ 10 b
β 2 α 10 a
β 2 β 10 a
β 2 β 20 b
β 2 γ 10 b
• σ_{A=C}(r × s):
A B C D  E
α 1 α 10 a
β 2 β 10 a
β 2 β 20 b
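Composing operations amounts to feeding the result of one into the next. The short Python sketch below builds the product and then selects the rows where A = C; the Greek labels for columns A and C are assumed sample values.

```python
r = {("α", 1), ("β", 2)}                          # schema (A, B)
s = {("α", 10, "a"), ("β", 10, "a"),
     ("β", 20, "b"), ("γ", 10, "b")}              # schema (C, D, E)

# Step 1: Cartesian product, schema (A, B, C, D, E).
product_rs = {t + q for t in r for q in s}

# Step 2: selection A = C, i.e. position 0 equals position 2.
result = {t for t in product_rs if t[0] == t[2]}

for row in sorted(result):
    print(row)
```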
Rename Operation
• Allows us to name, and therefore to refer to, the results of
relational-algebra expressions.
• Allows us to refer to a relation by more than one name.
Example:
ρ_X(E)
returns the expression E under the name X
If a relational-algebra expression E has arity n, then
ρ_{X(A1, A2, …, An)}(E)
returns the result of expression E under the name X, and with
the attributes renamed to A1, A2, …, An.
Banking Example
Figure: superkeys, candidate keys K, primary key
Example Queries
• Find the loan number for each loan of an amount greater than
$1200
• Set intersection
• Natural join
• Division
• Assignment
Set-Intersection Operation
• Notation: r ∩ s
• Defined as:
• r ∩ s = { t | t ∈ r and t ∈ s }
• Assume:
– r, s have the same arity
– attributes of r and s are compatible
• Note: r ∩ s = r – (r – s)
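The identity in the note says that intersection needs no operator of its own: it can be built from two set differences. A quick Python check on sample relations (the Greek labels are assumed values) might look like this:

```python
def intersection_via_difference(r, s):
    """Compute the intersection of r and s using set difference only."""
    return r - (r - s)

r = {("α", 1), ("α", 2), ("β", 1)}
s = {("α", 2), ("β", 3)}

print(r - s)                              # tuples in r but not in s
print(intersection_via_difference(r, s))  # {('α', 2)}
print(intersection_via_difference(r, s) == r & s)  # True
```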
Set-Intersection Operation - Example
• Relations r, s:
r:      s:
A B     A B
α 1     α 2
α 2     β 3
β 1
• r ∩ s:
A B
α 2
Natural-Join Operation
• Notation: r ⋈ s
• Let r and s be relations on schemas R and S respectively.
Then, r ⋈ s is a relation on schema R ∪ S obtained as follows:
– Consider each pair of tuples tr from r and ts from s.
– If tr and ts have the same value on each of the attributes in
R ∩ S, add a tuple t to the result, where
• t has the same value as tr on r
• t has the same value as ts on s
• Example:
R = (A, B, C, D)
S = (E, B, D)
– Result schema = (A, B, C, D, E)
– r ⋈ s is defined as:
π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
Natural Join Operation – Example
• Relations r, s:
r:            s:
A B C D       B D E
α 1 α a       1 a α
β 2 γ a       3 a β
γ 4 β b       1 a γ
α 1 γ a       2 b δ
δ 2 β b       3 b ε
• r ⋈ s:
A B C D E
α 1 α a α
α 1 α a γ
α 1 γ a α
α 1 γ a γ
δ 2 β b δ
Natural Join Example
R1
sid bid day
22  101 10/10/96
58  103 11/12/96
S1
sid sname  rating age
22  dustin 7      45.0
31  lubber 8      55.5
58  rusty  10     35.0
R1 ⋈ S1 =
sid bid day      sname  rating age
22  101 10/10/96 dustin 7      45.0
58  103 11/12/96 rusty  10     35.0
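The tuple-pairing procedure for natural join can be sketched directly in Python. The helper name `natural_join` and the choice to pass schemas as tuples of attribute names are assumptions for illustration; the data are the R1 and S1 rows of this example.

```python
def natural_join(r, r_schema, s, s_schema):
    """Pair tuples that agree on all shared attributes; keep each shared column once."""
    common = [a for a in r_schema if a in s_schema]
    s_only = [a for a in s_schema if a not in r_schema]
    out_schema = tuple(r_schema) + tuple(s_only)
    out = set()
    for tr in r:
        for ts in s:
            if all(tr[r_schema.index(a)] == ts[s_schema.index(a)] for a in common):
                out.add(tr + tuple(ts[s_schema.index(a)] for a in s_only))
    return out_schema, out

r1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}
s1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}

schema, rows = natural_join(r1, ("sid", "bid", "day"),
                            s1, ("sid", "sname", "rating", "age"))
print(schema)
for row in sorted(rows):
    print(row)
```

Sailor 31 has no matching reservation, so no row for lubber appears in the result. When the two schemas share no attribute, the `all(...)` test is vacuously true and the function degenerates to the Cartesian product.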
Division Operation
• Notation: r ÷ s
• Suited to queries that include the phrase “for all”.
• Let r and s be relations on schemas R and S respectively where
– R = (A1, …, Am, B1, …, Bn)
– S = (B1, …, Bn)
The result of r ÷ s is a relation on schema
R – S = (A1, …, Am)
r ÷ s = { t | t ∈ π_{R–S}(r) ∧ ∀u ∈ s (tu ∈ r) }
where tu denotes the concatenation of tuples t and u
Division Operation – Example
• Relations r, s:
r:      s:
A B     B
α 1     1
α 2     2
α 3
β 1
γ 1
δ 1
δ 3
δ 4
ε 6
ε 1
β 2
• r ÷ s:
A
α
β
Another Division Example
• Relations r, s:
r:              s:
A B C D E       D E
α a α a 1       a 1
α a γ a 1       b 1
α a γ b 1
β a γ a 1
β a γ b 3
γ a γ a 1
γ a γ b 1
γ a β b 1
• r ÷ s:
A B C
α a γ
γ a γ
Division Operation
• Definition in terms of the basic algebra operation
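Let r(R) and s(S) be relations, and let S ⊆ R
r ÷ s = π_{R–S}(r) – π_{R–S}((π_{R–S}(r) × s) – π_{R–S,S}(r))
To see why:
– π_{R–S,S}(r) simply reorders attributes of r
– π_{R–S}((π_{R–S}(r) × s) – π_{R–S,S}(r)) gives those tuples t in
π_{R–S}(r) such that for some tuple u ∈ s, tu ∉ r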
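Division can be computed from projection, product and difference alone: build every candidate pairing, find the pairings that are absent from r, and discard the candidates they belong to. The Python sketch below follows that route; the helper name `divide` and the Greek labels in the sample data are assumptions.

```python
def divide(r, r_schema, s, s_schema):
    """Division of r by s using only projection, product and difference."""
    head_idx = [i for i, a in enumerate(r_schema) if a not in s_schema]
    tail_idx = [r_schema.index(a) for a in s_schema]
    k = len(head_idx)

    # Projection of r onto the attributes that are not in S.
    heads = {tuple(t[i] for i in head_idx) for t in r}
    # r with its columns reordered so the S attributes come last.
    reordered = {tuple(t[i] for i in head_idx + tail_idx) for t in r}
    # Pairings (head, u) with u in s that do NOT occur in r.
    missing = {h + u for h in heads for u in s} - reordered
    # Heads that fail for at least one u in s are removed.
    return heads - {t[:k] for t in missing}

r = {("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1), ("δ", 1),
     ("δ", 3), ("δ", 4), ("ε", 6), ("ε", 1), ("β", 2)}
s = {(1,), (2,)}

print(sorted(divide(r, ("A", "B"), s, ("B",))))  # [('α',), ('β',)]
```

Only α and β are paired in r with every B value in s, which is the "for all" reading of division. With an empty s the condition holds vacuously and every head is returned.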
• Generalized Projection
• Outer Join
• Aggregate Functions
Generalized Projection
Aggregate Operation – Example
• Relation r:
A B C
α α 7
α β 7
β β 3
β β 10
• 𝒢_{sum(C)}(r):
sum-C
27
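The sum over column C can be reproduced in a few lines of Python. The A and B labels below are assumed sample values, and the per-group totals are an added illustration of grouping that is not part of the slide.

```python
from collections import defaultdict

# Rows of r with schema (A, B, C); only the C values matter for sum(C).
r = [("α", "α", 7), ("α", "β", 7), ("β", "β", 3), ("β", "β", 10)]

# Aggregate without grouping: one value for the whole relation.
sum_c = sum(c for _, _, c in r)
print(sum_c)  # 7 + 7 + 3 + 10 = 27

# Aggregate with grouping on A (illustrative): one value per group.
sum_c_by_a = defaultdict(int)
for a, _, c in r:
    sum_c_by_a[a] += c
print(dict(sum_c_by_a))
```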
Aggregate Operation – Example
• Relation account grouped by branch-name:
Relation borrower
customer-name loan-number
Jones L-170
Smith L-230
Hayes L-155
Outer Join
• An extension of the join operation that avoids loss of
information.
• Computes the join and then adds to the result the tuples from one
relation that do not match any tuple in the other relation.
• Uses null values:
– null signifies that the value is unknown or does not exist
– All comparisons involving null are (roughly speaking)
false by definition.
• We shall study precise meaning of comparisons with
nulls later
Outer Join – Example
• Join: loan ⋈ borrower
• Full outer join: loan ⟗ borrower
loan-number  branch-name  amount  customer-name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-260        Perryridge   1700    null
L-155        null         null    Hayes
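A full outer join can be sketched as a join followed by padding of the unmatched tuples on both sides. In the Python sketch below, `None` stands in for null, the function name `full_outer_join` is hypothetical, and the loan rows are inferred from the result table since the loan relation itself is not listed.

```python
# loan(loan_number, branch_name, amount): rows inferred from the result table.
loan = {("L-170", "Downtown", 3000), ("L-230", "Redwood", 4000),
        ("L-260", "Perryridge", 1700)}
# borrower(customer_name, loan_number)
borrower = {("Jones", "L-170"), ("Smith", "L-230"), ("Hayes", "L-155")}

def full_outer_join(loan, borrower):
    """Join on loan_number, then pad unmatched tuples from both sides with None."""
    out = set()
    matched = set()
    for loan_number, branch, amount in loan:
        partners = [b for b in borrower if b[1] == loan_number]
        for b in partners:
            out.add((loan_number, branch, amount, b[0]))
            matched.add(b)
        if not partners:
            # Unmatched loan: customer_name is null.
            out.add((loan_number, branch, amount, None))
    for name, loan_number in borrower - matched:
        # Unmatched borrower: branch_name and amount are null.
        out.add((loan_number, None, None, name))
    return out

for row in sorted(full_outer_join(loan, borrower), key=lambda t: t[0]):
    print(row)
```

Dropping the final loop gives the left outer join; dropping the `if not partners` branch instead gives the right outer join.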
Null Values
• It is possible for tuples to have a null value, denoted by
null, for some of their attributes
• null signifies an unknown value or that a value does not
exist.
• The result of any arithmetic expression involving null is
null.
• Aggregate functions simply ignore null values
• For duplicate elimination and grouping, null is treated like
any other value, and two nulls are assumed to be the same
Null Values
• Comparisons with null values return the special truth value
unknown
– If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
– OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
– AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
– NOT: (not unknown) = unknown
• Result of select predicate is treated as false if it evaluates to
unknown
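The three-valued truth tables are small enough to encode directly. The Python sketch below assumes `None` represents both null and the truth value unknown; the function names are hypothetical.

```python
def and3(a, b):
    """Three-valued AND: false dominates, then unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    """Three-valued OR: true dominates, then unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    """Three-valued NOT: unknown stays unknown."""
    return None if a is None else not a

def less_than(x, y):
    """A comparison involving null evaluates to unknown."""
    return None if x is None or y is None else x < y

def select3(rows, predicate):
    """Selection keeps a tuple only if the predicate is true, not merely unknown."""
    return [t for t in rows if predicate(t) is True]

rows = [(3,), (7,), (None,)]
print(select3(rows, lambda t: less_than(t[0], 5)))        # [(3,)]
print(select3(rows, lambda t: not3(less_than(t[0], 5))))  # [(7,)]
```

The row with a null value is returned by neither query, which is the behaviour that results from treating unknown as false only at the final selection step.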
Modification of the Database
• Each Fi is either
– the ith attribute of r, if the ith attribute is not updated, or,
– if the attribute is to be updated Fi is an expression,
involving only constants and the attributes of r, which
gives the new value for the attribute
Update Examples
balance(account) - account.balance
Query 1
π_CN(σ_{BN=“Downtown”}(depositor ⋈ account))
π_CN(σ_{BN=“Uptown”}(depositor ⋈ account))
depositor ⋈ account
Relational Algebra on Bags
Bag r:
A B C
1 2 3
1 2 5
2 3 7
Projection on A, B (duplicates are kept):
A B
1 2
1 2
2 3
Other Bag Operations
• An element appears in the union of two bags the sum of the number of
times it appears in each bag
• Example: {1, 2, 3, 1} UNION {1, 1, 2, 3, 4, 1} =
{1, 1, 1, 1, 1, 2, 2, 3, 3, 4}
• An element appears in the intersection of two bags the minimum of the
number of times it appears in either.
• Example (cont’d): {1, 2, 3, 1} INTERSECTION
{1, 1, 2, 3, 4, 1} = {1, 1, 2, 3}
• An element appears in the difference of two bags A and B as many times
as it appears in A minus the number of times it appears in B, but never
fewer than 0 times
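Python's `collections.Counter` behaves like a bag, so the three definitions can be tried out directly. This is a sketch for illustration; choosing `Counter` as the bag representation is an assumption made here.

```python
from collections import Counter

a = Counter([1, 2, 3, 1])
b = Counter([1, 1, 2, 3, 4, 1])

bag_union = a + b         # counts are added
bag_intersection = a & b  # minimum of the two counts
bag_difference = a - b    # counts are subtracted, never dropping below 0

print(sorted(bag_union.elements()))         # [1, 1, 1, 1, 1, 2, 2, 3, 3, 4]
print(sorted(bag_intersection.elements()))  # [1, 1, 2, 3]
print(sorted(bag_difference.elements()))    # []
print(sorted((b - a).elements()))           # [1, 4]
```

Note that `Counter`'s `|` operator takes the maximum of the two counts, which is not the bag union defined here; `+` is the operator that matches.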
Bag Laws
• Not all laws for set operations are valid for bags:
• Commutative law for union does hold for bags:
R UNION S = S UNION R
• However, S UNION S = S holds for sets, but it does not hold when S
is a non-empty bag
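Both laws can be checked on a small example. The Python sketch below assumes `Counter` as the bag representation and uses an arbitrary sample bag.

```python
from collections import Counter

r = Counter([1, 1, 2])
s = Counter([2, 3])

# The commutative law for union holds for bags.
print(r + s == s + r)  # True

# Idempotence holds for sets but fails for a non-empty bag.
s_as_set = set(s.elements())
print(s_as_set | s_as_set == s_as_set)  # True
print(s + s == s)                       # False: every count is doubled
```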
Examples
Reserves
sid bid day
22  101 10/10/96
58  103 11/12/96
Sailors
sid sname  rating age
22  dustin 7      45.0
31  lubber 8      55.5
58  rusty  10     35.0
Boats
bid bname color
101 Interlake Blue
102 Interlake Red
103 Clipper Green
104 Marine Red
Find names of sailors who’ve reserved boat #103
• Solution 1: π_sname((σ_{bid=103} Reserves) ⋈ Sailors)
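Solution 1 can be traced step by step on the sample instances. The Python sketch below hard-codes column positions for brevity, which is a simplification made for illustration.

```python
# Reserves(sid, bid, day) and Sailors(sid, sname, rating, age)
reserves = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5),
           (58, "rusty", 10, 35.0)}

# Step 1: select the reservations with bid = 103.
selected = {t for t in reserves if t[1] == 103}

# Step 2: natural join with Sailors on the shared attribute sid.
joined = {r + s[1:] for r in selected for s in sailors if r[0] == s[0]}

# Step 3: project onto sname (position 3 of the joined tuple).
names = {(t[3],) for t in joined}
print(names)  # {('rusty',)}
```

Selecting before joining keeps the intermediate result small: only one Reserves tuple reaches the join.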