Relational Model
Introduction
Basic concepts
Relation
Attribute
Key
Integrity constraint
Database
Relational data manipulation languages
Relational Algebra
Basic operators
Additional operators
Extended operations
Views and Snapshots
Tuple Relational Calculus
Domain Relational Calculus
3.1
Introduction to Relational Model
3.2
Relation
Informally, a relation is a table; a tuple corresponds to
a row of such a table and an attribute to a column; the
number of tuples is called the cardinality and the
number of attributes is called the degree. Order of
tuples is irrelevant (tuples may be stored in an
arbitrary order); duplicate tuples are not allowed in a
relation.
Formally, given sets D1, D2, …, Dn, a relation r is a
subset of D1 × D2 × … × Dn.
Thus a relation is a set of n-tuples (a1, a2, …, an) where
each ai ∈ Di.
Example (columns are attributes, rows are tuples):
customer-name   customer-street   customer-city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield
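The formal definition can be tried out directly. The sketch below treats the customer relation as a Python set of 3-tuples and checks that it is a subset of D1 × D2 × D3. The three domains are small hypothetical sets chosen only so that the product can be enumerated; real domains would be whole data types.

```python
from itertools import product

# Hypothetical, deliberately tiny domains (assumption for illustration).
names = {"Jones", "Smith", "Curry", "Lindsay"}
streets = {"Main", "North", "Park"}
cities = {"Harrison", "Rye", "Pittsfield"}

# The example relation: a set of 3-tuples, so order is irrelevant
# and duplicate tuples cannot occur.
customer = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

cartesian = set(product(names, streets, cities))  # D1 x D2 x D3
is_relation = customer <= cartesian               # subset test
cardinality = len(customer)                       # number of tuples
degree = len(next(iter(customer)))                # number of attributes
```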
3.3
Attribute
Each attribute of a relation has a name
The set of allowed values for each attribute is called
the domain of the attribute. A domain is a data type,
which is a set of values associated with operators. E.g.,
the type INTEGER is the set of all integers with
operators +,-,*,/, etc. Data types can be system-defined
or user-defined.
Attribute values are (normally) required to be atomic,
that is, indivisible
E.g. multivalued attribute values are not atomic
E.g. composite attribute values are not atomic
The special value null is a member of every domain if
allowed
The null value causes complications in the definition of
many operations
We shall ignore the effect of null values in our main
presentation and consider their effect later.
3.4
Relation Schema and Instance
3.5
Keys (1)
Let R be a relation schema, r(R) be any relation on R,
and K ⊆ R
K is a superkey of R if values for K are sufficient to
identify a unique tuple of each possible relation r(R).
By “possible r” we mean a relation r that could exist
in the enterprise we are modeling.
Example: {customer-name, customer-street} and
{customer-name}
are both superkeys of Customer, if no two
customers can possibly have the same name.
K is a candidate key if K is a minimal superkey
Example: {customer-name} is a candidate key for
Customer, since it is a superkey (assuming no two
customers can possibly have the same name), and
no subset of it is a superkey.
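As a rough illustration, the two definitions can be tested against a single relation instance. This is only a sketch: a superkey must hold for every possible relation r(R), so a check on one instance can refute a key but never prove it. The function names and the sample rows are assumptions made for this example.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """A superkey none of whose proper subsets is also a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

# Illustrative instance of Customer.
customer = [
    {"customer-name": "Jones", "customer-street": "Main"},
    {"customer-name": "Smith", "customer-street": "North"},
    {"customer-name": "Curry", "customer-street": "North"},
]

name_only = ["customer-name"]
name_and_street = ["customer-name", "customer-street"]
```

On this instance both attribute sets behave as superkeys, but only {customer-name} is minimal.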
3.6
Keys (2)
A primary key is a candidate key which is chosen by
the database designer as the principal means of
identifying tuples within a relation; the other
candidate keys are called alternate keys.
A foreign key K is a set of attributes of one relation
R1 whose values are required to match values of
some candidate key of another relation R2. K is
used in R1 for referencing R2.
3.7
Integrity Constraints
Integrity constraints are used to ensure the correctness
of the data
Type constraint: specify the legal values of a type
Attribute constraint: declare that a specific attribute
is of a specific type
Not-Null constraint: null value is not allowed for an
attribute if so specified
Entity constraints: no primary key value can be null.
Key constraints: all tuples in a relation are distinct.
Referential integrity constraints: ensure that a value
of a foreign key in relation R1 also appears in the
corresponding candidate key of the referenced
relation R2 (or null)
Some other constraints will be discussed later
3.8
Database (1)
A database consists of relations and integrity constraints
Information about an enterprise is broken up into parts,
with each relation storing one part of the information
3.10
E-R Diagram for the Banking
Enterprise
3.11
Schema Diagram for the Banking
Enterprise
3.12
Relational Data Manipulation
Languages
Abstract (Mathematical) languages
a) Relational Algebra --- procedural
b) Tuple Relational Calculus --- nonprocedural
c) Domain Relational Calculus --- nonprocedural
The three languages are equivalent, i.e., they have
the same expressive power
Commercial languages
SQL --- Structured (Standard) Query Language; based
on (a)(b)(c)
QBE --- Query By Example; based on (c)
QUEL --- Query Language; based on (b)
ISBL --- Information System Base Language; based on
(a)
….
3.13
Relational Algebra
Basically, the relational algebra is a set of operators
that take relations as their operands and return a
relation as their result.
Six basic operators
Select (called “Restrict” in the textbook)
project
union
set difference
Cartesian product
rename
Each operator takes one or two relations as input
and gives a new relation as a result.
3.14
Select Operation – Example
• Relation r:
A   B   C   D
α   α   1   7
α   β   5   7
β   β   12  3
β   β   23  10
• σ_{A=B ∧ D>5}(r):
A   B   C   D
α   α   1   7
β   β   23  10
3.15
Select Operation
Notation: σ_p(r)
p is called the selection predicate
Defined as:
σ_p(r) = {t | t ∈ r and p(t)}
Where p is a formula in propositional calculus
consisting of terms connected by: ∧ (and), ∨ (or),
¬ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, ≠, >, ≥, <, ≤
Example of selection:
σ_{branch-name=“Perryridge”}(account)
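The definition of selection maps closely onto a filter. The sketch below is one possible rendering, assuming a relation is a list of dictionaries keyed by attribute name and a predicate is an ordinary Python function; the helper name `select` and the sample rows are illustrative.

```python
def select(relation, predicate):
    """sigma_p(r): keep the tuples t of r for which p(t) holds."""
    return [t for t in relation if predicate(t)]

# Illustrative rows of the account relation.
account = [
    {"account-number": "A-102", "branch-name": "Perryridge", "balance": 400},
    {"account-number": "A-201", "branch-name": "Perryridge", "balance": 900},
    {"account-number": "A-217", "branch-name": "Brighton", "balance": 750},
]

# sigma_{branch-name = "Perryridge"}(account)
perryridge = select(account, lambda t: t["branch-name"] == "Perryridge")

# Terms may be combined with and, or, not.
large_perryridge = select(
    account,
    lambda t: t["branch-name"] == "Perryridge" and t["balance"] > 500,
)
```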
Important properties.
3.16
Project Operation – Example
Relation r:
A   B   C
α   10  1
α   20  1
β   30  1
β   40  2
Π_{A,C}(r):
A   C        A   C
α   1        α   1
α   1    =   β   1
β   1        β   2
β   2
3.17
Project Operation
Notation:
3.18
Union Operation – Example
Relations r, s:
r:          s:
A   B       A   B
α   1       α   2
α   2       β   3
β   1
r ∪ s:
A   B
α   1
α   2
β   1
β   3
3.19
Union Operation
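Notation: r ∪ s
Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
For r ∪ s to be valid, r and s must have the same arity
and compatible attribute domains (e.g., the 2nd column of r
holds the same type of values as the 2nd column of s)
E.g. to find all customers with either an account or a loan
Π_{customer-name}(depositor) ∪ Π_{customer-name}(borrower)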
Properties.
3.20
Set Difference Operation –
Example
Relations r, s:
r:          s:
A   B       A   B
α   1       α   2
α   2       β   3
β   1
r – s:
A   B
α   1
β   1
3.21
Set Difference Operation
Notation r – s
Defined as:
r – s = {t | t ∈ r and t ∉ s}
Set differences must be taken between compatible
relations.
r and s must have the same arity
attribute domains of r and s must be compatible
Property.
3.22
Cartesian-Product Operation-
Example
Relations r, s:
r:          s:
A   B       C   D   E
α   1       α   10  a
β   2       β   10  a
            β   20  b
            γ   10  b
r × s:
A   B   C   D   E
α   1   α   10  a
α   1   β   10  a
α   1   β   20  b
α   1   γ   10  b
β   2   α   10  a
β   2   β   10  a
β   2   β   20  b
β   2   γ   10  b
3.23
Cartesian-Product Operation
Notation: r × s
Defined as:
r × s = {t q | t ∈ r and q ∈ s}
Assume that attributes of r(R) and s(S) are disjoint.
(That is,
R ∩ S = ∅).
If attributes of r(R) and s(S) are not disjoint, then
renaming must be used.
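A minimal sketch of the product, assuming tuples are dictionaries keyed by attribute name. The disjointness requirement shows up as an explicit check: if the two schemas share an attribute, the function refuses to proceed, which is the point at which renaming would be needed. The function name and sample values are illustrative.

```python
def cartesian_product(r, s):
    """r x s: concatenate every tuple t of r with every tuple q of s."""
    result = []
    for t in r:
        for q in s:
            if set(t) & set(q):
                raise ValueError("schemas are not disjoint; rename first")
            result.append({**t, **q})
    return result

# Illustrative relations with disjoint schemas (A, B) and (C, D, E).
r = [{"A": "alpha", "B": 1}, {"A": "beta", "B": 2}]
s = [
    {"C": "alpha", "D": 10, "E": "a"},
    {"C": "beta", "D": 10, "E": "a"},
    {"C": "beta", "D": 20, "E": "b"},
    {"C": "gamma", "D": 10, "E": "b"},
]

rs = cartesian_product(r, s)
```

The result has |r| · |s| tuples, each of degree 2 + 3.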
3.24
Rename Operation
Allows us to name, and therefore to refer to, the
results of relational-algebra expressions.
Allows us to refer to a relation by more than one
name.
Example:
ρ_x(E)
returns the expression E under the name x
If a relational-algebra expression E has arity n, then
ρ_{x(A1, A2, …, An)}(E)
returns the result of expression E under the name x,
and with the
attributes renamed to A1, A2, …, An.
3.25
Composition of Operations
Can build expressions using multiple operations
Example: σ_{A=C}(r × s)
r × s:
A   B   C   D   E
α   1   α   10  a
α   1   β   10  a
α   1   β   20  b
α   1   γ   10  b
β   2   α   10  a
β   2   β   10  a
β   2   β   20  b
β   2   γ   10  b
σ_{A=C}(r × s):
A   B   C   D   E
α   1   α   10  a
β   2   β   10  a
β   2   β   20  b
3.26
Expressions
A basic expression in the relational algebra is
one of the following:
A relation in the database
A constant relation
Let E1 and E2 be relational-algebra expressions; the
following are all relational-algebra expressions:
E1 ∪ E2
E1 − E2
E1 × E2
σ_P(E1), P is a predicate on attributes in E1
Π_S(E1), S is a list consisting of some of the attributes in
E1
ρ_x(E1), x is the new name for the result of E1
3.27
Banking Example
3.28
Example Queries
3.29
Example Queries
3.30
Example Queries
Find the names of all customers who have a loan at the
Perryridge branch.
Π_{customer-name}(σ_{branch-name=“Perryridge”}
(σ_{borrower.loan-number = loan.loan-number}(borrower × loan)))
Find the names of all customers who have a loan at the
Perryridge branch but do not have an account at any branch
of the bank.
Π_{customer-name}(σ_{branch-name=“Perryridge”}
(σ_{borrower.loan-number = loan.loan-number}(borrower × loan)))
– Π_{customer-name}(depositor)
3.31
Example Queries
Find the names of all customers who have a loan at the
Perryridge branch.
Query 1
Π_{customer-name}(σ_{branch-name = “Perryridge”}
(σ_{borrower.loan-number = loan.loan-number}(borrower × loan)))
Query 2
Π_{customer-name}(σ_{loan.loan-number = borrower.loan-number}(
(σ_{branch-name = “Perryridge”}(loan)) × borrower))
3.32
Example Queries
Find the largest account balance
Rename account relation as d
The query is:
Π_{balance}(account) − Π_{account.balance}
(σ_{account.balance < d.balance}(account × ρ_d(account)))
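The idea behind this query is to compute every balance that is smaller than some other balance, then subtract those from the set of all balances. The sketch below mirrors that structure with Python sets; the balance values are illustrative, and the function name is an assumption of this example.

```python
def largest_balance(balances):
    """Largest value via difference, mirroring the algebra expression."""
    all_balances = set(balances)  # projection of account onto balance
    # Pair account with its renamed copy d and keep balances that lose
    # the comparison account.balance < d.balance to some other balance.
    not_largest = {a for a in all_balances for d in all_balances if a < d}
    return all_balances - not_largest

balances = [400, 900, 750, 750, 700]  # illustrative account balances
result = largest_balance(balances)
```

The result is a relation (a set) with one tuple, not a bare number, which matches what the algebra expression returns.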
3.33
Additional Operators
We define additional operators that do not add any
power to the relational algebra, but that simplify
common queries. That is, each additional operator
can be expressed using the six basic operators.
Set intersection
Natural join
Division
Assignment
3.34
Set-Intersection Operation
Notation: r ∩ s
Defined as:
r ∩ s = { t | t ∈ r and t ∈ s }
Assume:
r, s have the same arity
attributes of r and s are compatible
Note: r ∩ s = r − (r − s)
3.35
Set-Intersection Operation -
Example
Relations r, s:
r:          s:
A   B       A   B
α   1       α   2
α   2       β   3
β   1
r ∩ s:
A   B
α   2
3.36
Natural-Join Operation
Notation: r ⋈ s
Let r and s be relations on schemas R and S respectively. The
result is a relation on schema R ∪ S which is obtained by
considering each pair of tuples tr from r and ts from s.
If tr and ts have the same value on each of the attributes in R
∩ S, a tuple t is added to the result, where
t has the same value as tr on r
t has the same value as ts on s
Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
r ⋈ s is defined as:
Π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
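The pairwise procedure described above can be sketched as a nested loop. This is a simple illustration rather than how a database system would evaluate a join; tuples are assumed to be dictionaries keyed by attribute name, and the loan rows are illustrative values.

```python
def natural_join(r, s):
    """Combine each pair (tr, ts) that agrees on all shared attributes."""
    result = []
    for tr in r:
        for ts in s:
            common = set(tr) & set(ts)           # attributes in R ∩ S
            if all(tr[a] == ts[a] for a in common):
                merged = {**tr, **ts}            # tuple on schema R ∪ S
                if merged not in result:         # relations hold no duplicates
                    result.append(merged)
    return result

borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]
# Illustrative loan rows (assumed for this example).
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]

joined = natural_join(borrower, loan)
```

Hayes and loan L-260 have no partner, so neither appears in the result.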
3.37
Natural Join Operation – Example
Relations r, s:
r:                  s:
A   B   C   D       B   D   E
α   1   α   a       1   a   α
β   2   γ   a       3   a   β
γ   4   β   b       1   a   γ
α   1   γ   a       2   b   δ
δ   2   β   b       3   b   ε
r ⋈ s:
A   B   C   D   E
α   1   α   a   α
α   1   α   a   γ
α   1   γ   a   α
α   1   γ   a   γ
δ   2   β   b   δ
Theta join operation
3.38
Division Operation
r ÷ s
Suited to queries that include the phrase “for all”.
Let r and s be relations on schemas R and S
respectively where
R = (A1, …, Am, B1, …, Bn)
S = (B1, …, Bn)
The result of r ÷ s is a relation on schema
R – S = (A1, …, Am)
r ÷ s = { t | t ∈ Π_{R−S}(r) ∧ ∀u ∈ s (tu ∈ r) }
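The set-builder definition of division translates almost word for word. In the sketch below a tuple of r is simplified to a pair (a, b), where a stands for the R − S part and b for the S part; that encoding, the function name, and the sample values are assumptions made for illustration.

```python
def divide(r, s):
    """r ÷ s = {a | a in proj(r) and (a, b) in r for all b in s}."""
    candidates = {a for a, _ in r}  # projection of r onto R - S
    return {a for a in candidates if all((a, b) in r for b in s)}

# Illustrative relations: r on schema (A, B), s on schema (B).
r = {
    ("alpha", 1), ("alpha", 2), ("alpha", 3),
    ("beta", 1), ("beta", 2),
    ("gamma", 1),
    ("delta", 1), ("delta", 3), ("delta", 4),
    ("epsilon", 1), ("epsilon", 6),
}
s = {1, 2}

q = divide(r, s)
# Every pairing of a result tuple with a tuple of s occurs in r.
product_in_r = all((a, b) in r for a in q for b in s)
```

Only the A values paired with every B value in s survive, which is the "for all" reading of the operator.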
3.39
Division Operation – Example
Relations r, s:
r:          s:
A   B       B
α   1       1
α   2       2
α   3
β   1
γ   1
δ   1
δ   3
δ   4
ε   6
ε   1
β   2
r ÷ s:
A
α
β
3.40
Another Division Example
Relations r, s:
r:                      s:
A   B   C   D   E       D   E
α   a   α   a   1       a   1
α   a   γ   a   1       b   1
α   a   γ   b   1
β   a   γ   a   1
β   a   γ   b   3
γ   a   γ   a   1
γ   a   γ   b   1
γ   a   β   b   1
r ÷ s:
A   B   C
α   a   γ
γ   a   γ
3.41
Division Operation (Cont.)
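Property
Let q = r ÷ s
Then q is the largest relation satisfying q × s ⊆ r
Definition in terms of the basic algebra operation
Let r(R) and s(S) be relations, and let S ⊆ R
r ÷ s = Π_{R−S}(r) − Π_{R−S}((Π_{R−S}(r) × s) − Π_{R−S,S}(r))
To see why
Π_{R−S,S}(r) simply reorders attributes of r
Π_{R−S}((Π_{R−S}(r) × s) − Π_{R−S,S}(r)) gives those tuples t in
Π_{R−S}(r) such that for some tuple u ∈ s, tu ∉ r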
3.42
Assignment Operation
The assignment operation (←) provides a convenient way to
express complex queries: write the query as a sequential program
consisting of a series of assignments followed by an
expression whose value is displayed as a result of the query.
Assignments can also be used to update the database.
Example: Write r ÷ s as
temp1 ← Π_{R−S}(r)
temp2 ← Π_{R−S}((temp1 × s) − Π_{R−S,S}(r))
result = temp1 − temp2
3.43
Example Queries
Find all customers who have an account from at
least the “Downtown” and the “Uptown” branches.
Query 1
Π_CN(σ_{BN=“Downtown”}(depositor ⋈ account)) ∩
Π_CN(σ_{BN=“Uptown”}(depositor ⋈ account))
where CN denotes customer-name and BN denotes
branch-name.
Query 2
Π_{customer-name, branch-name}(depositor ⋈ account)
÷ ρ_{temp(branch-name)}({(“Downtown”), (“Uptown”)})
3.44
Example Queries
Find all customers who have an account at all
branches located in Brooklyn city.
3.45
Extended Operators
Generalized Projection
Aggregate Functions
Outer Join
3.46
Generalized Projection
Extends the projection operation by allowing
arithmetic functions to be used in the projection
list.
Π_{F1, F2, …, Fn}(E)
E is any relational-algebra expression
Each of F1, F2, …, Fn is an arithmetic expression
involving constants and attributes in the schema of
E.
Given relation credit-info(customer-name, limit,
credit-balance), find how much more each person
can spend:
Π_{customer-name, limit – credit-balance}(credit-info)
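A generalized projection computes a new column per tuple. The sketch below applies the credit-info query to a few rows; the rows themselves are illustrative values, not data from the slides.

```python
# Illustrative credit-info rows: (customer-name, limit, credit-balance).
credit_info = [
    ("Jones", 6000, 700),
    ("Smith", 2000, 400),
    ("Hayes", 1500, 1500),
]

# Projection list: customer-name, limit - credit-balance.
remaining = [
    (name, limit - credit_balance)
    for name, limit, credit_balance in credit_info
]
```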
3.47
Aggregate Functions and
Operations
An aggregate function takes a collection of values and
returns a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra
3.48
Aggregate Operation – Example
Relation r:
A   B   C
α   α   7
α   β   7
β   β   3
β   β   10
g_{sum(C)}(r):
sum-C
27
3.49
Aggregate Operation – Example
branch-name   account-number   balance
Perryridge    A-102            400
Perryridge    A-201            900
Brighton      A-217            750
Brighton      A-215            750
Redwood       A-222            700
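One natural aggregate over this table is the total balance per branch. The sketch below assumes that grouping by branch-name and summing balance is the intended illustration; the helper name `sum_by_group` is hypothetical.

```python
from collections import defaultdict

# The account rows shown above: (branch-name, account-number, balance).
account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def sum_by_group(rows, group_index, value_index):
    """Sum one column within each group of equal grouping values."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_index]] += row[value_index]
    return dict(totals)

balance_by_branch = sum_by_group(account, 0, 2)
total_balance = sum(row[2] for row in account)  # aggregate with no grouping
```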
3.50
Aggregate Functions (Cont.)
Result of aggregation does not have a name
Can use rename operation to give it a name
For convenience, we permit renaming as part of
aggregate operation
3.51
Outer Join
An extension of the join operation that avoids loss
of information.
Computes the join and then adds tuples from one
relation that do not match tuples in the other
relation to the result of the join.
Uses null values:
null signifies that the value is unknown or does not
exist
All comparisons involving null are (roughly speaking)
false by definition.
Will study precise meaning of comparisons with
nulls later
3.52
Outer Join – Example
Relation loan
Relation borrower
customer-name   loan-number
Jones           L-170
Smith           L-230
Hayes           L-155
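A left outer join can be sketched by padding unmatched tuples with nulls, represented here by Python's None. The loan rows are illustrative values, the join is restricted to a single shared attribute for brevity, and the function name is an assumption of this example. Swapping the arguments gives the right outer join of the original order.

```python
def left_outer_join(left, right, key):
    """Join on one shared attribute; pad unmatched left tuples with None."""
    right_attrs = {a for row in right for a in row} - {key}
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **{a: None for a in right_attrs}})
    return result

# Illustrative loan rows (assumed for this example).
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]

loan_left = left_outer_join(loan, borrower, "loan-number")
loan_right = left_outer_join(borrower, loan, "loan-number")
```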
3.53
Outer Join – Example
Inner Join
loan ⋈ borrower
3.54
Outer Join – Example
Right Outer Join
loan ⟖ borrower
3.55
Null Values
It is possible for tuples to have a null value, denoted by null,
for some of their attributes
null signifies an unknown value or that a value does not
exist.
The result of any arithmetic expression involving null is null.
Comparisons with null values return false.
Aggregate functions simply ignore null values
This is an arbitrary decision; the result could have been
null instead.
We follow the semantics of SQL in its handling of null values
For duplicate elimination and grouping, null is treated like
any other value, and two nulls are assumed to be the same
Alternative: assume each null is different from each other
Both are arbitrary decisions, so we simply follow SQL
3.56
Views
In some cases, it is not desirable for all users to see
the entire logical model (i.e., all the actual relations
stored in the database.)
Consider a person who needs to know a customer’s
loan number but has no need to see the loan
amount. This person should see a relation
described, in the relational algebra, by
Π_{customer-name, loan-number}(borrower ⋈ loan)
Any relation that is not part of the conceptual model but
is made visible to a user as a “virtual relation” is
called a view.
3.57
View Definition
A view is defined using the create view statement
which has the form
create view v as <query expression>
where v is the name of the view
3.58
View Examples
Consider the view (named all-customer) consisting
of branches and their customers.
create view all-customer as
Π_{branch-name, customer-name}(depositor ⋈ account)
∪ Π_{branch-name, customer-name}(borrower ⋈ loan)
3.59
Views Defined Using Other Views
One view may be used in the expression defining
another view
A view relation v1 is said to depend directly on a
view relation v2 if v2 is used in the expression
defining v1
A view relation v1 is said to depend on view relation
v2 if either v1 depends directly on v2 or there is a
path of dependencies from v1 to v2
A view relation v is said to be recursive if it
depends on itself.
3.60
Snapshots
Any relation that is not part of the conceptual model but
is made visible to a user to reflect the status of the
database at the moment when the relation is
created is called a snapshot.
A view is like a “mirror” always reflecting the
current status of the database; it changes when the
database is updated. A snapshot reflects just the
status when the snapshot is created; once it’s
created, it doesn’t change.
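The difference between a view and a snapshot can be imitated in a few lines. In this sketch a view is a function that recomputes its result on every access, while a snapshot is the stored result of one evaluation. The names and rows are illustrative, and this is only an analogy for the behavior described above.

```python
# Illustrative base relation: (customer-name, loan-number).
borrower = {("Jones", "L-170"), ("Smith", "L-230")}

def customer_view():
    """A view: evaluated against the current database on each access."""
    return {name for name, _ in borrower}

snapshot = customer_view()           # a snapshot: frozen at creation time

borrower.add(("Hayes", "L-155"))     # the database is updated

view_after_update = customer_view()  # reflects the update
```

After the update, the view includes Hayes while the snapshot still shows the earlier state.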
A snapshot is defined using the create snapshot
statement which has the form
3.61
Tuple Relational Calculus
A nonprocedural query language, where each query is
of the form
{t | P (t) }
It is the set of all tuples t such that predicate P is true
for t
t is a tuple variable, t[A] denotes the value of tuple t
on attribute A
t ∈ r denotes that tuple t is in relation r
P is a formula similar to that of the predicate calculus
3.62
Predicate Calculus Formula
1. Set of attributes and constants
2. Set of comparison operators: (e.g., <, ≤, =, ≠, >, ≥)
3. Set of connectives: and (∧), or (∨), not (¬)
4. Implication (⇒): x ⇒ y, if x is true, then y is true
x ⇒ y ≡ ¬x ∨ y
5. Set of quantifiers:
∃t ∈ r (Q(t)) ≡ “there exists” a tuple t in relation r
such that predicate Q(t) is true
∀t ∈ r (Q(t)) ≡ Q is true “for all” tuples t in relation r
3.63
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street,
customer-city)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
3.64
Example Queries
Find the loan-number, branch-name, and amount
for loans of over $1200
{t | t ∈ loan ∧ t[amount] > 1200}
or {t | loan(t) ∧ t.amount > 1200}
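A tuple-calculus query reads much like a set comprehension: it states which tuples qualify, not how to find them. The sketch below writes the loan query that way, with a quantified query added for comparison; all rows are illustrative values.

```python
# Illustrative rows: loan(loan-number, branch-name, amount).
loan = {
    ("L-11", "Round Hill", 900),
    ("L-170", "Downtown", 3000),
    ("L-230", "Redwood", 4000),
    ("L-260", "Perryridge", 1700),
}
# Illustrative rows: borrower(customer-name, loan-number).
borrower = {("Jones", "L-170"), ("Smith", "L-11")}

AMOUNT = 2  # position of the amount attribute, so t[AMOUNT] plays t[amount]

# {t | t in loan and t[amount] > 1200}
big_loans = {t for t in loan if t[AMOUNT] > 1200}

# Names of customers for whom there exists a loan tuple over 1200:
# an existential quantifier becomes any(...).
big_borrowers = {
    s[0]
    for s in borrower
    if any(u[0] == s[1] and u[AMOUNT] > 1200 for u in loan)
}
```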
3.65
Example Queries
Find the names of all customers having a loan, an
account, or both at the bank
{t | ∃s ∈ borrower(t[customer-name] = s[customer-name])
∨ ∃u ∈ depositor(t[customer-name] = u[customer-name])}
Customers having both a loan and an account at the bank:
{t | ∃s ∈ borrower(t[customer-name] = s[customer-name])
∧ ∃u ∈ depositor(t[customer-name] = u[customer-name])}
3.66
Example Queries
Find the names of all customers having a loan at the
Perryridge branch
{t | ∃s ∈ borrower(t[customer-name] = s[customer-name]
∧ ∃u ∈ loan(u[branch-name] = “Perryridge”
∧ u[loan-number] = s[loan-number]))}
3.67
Example Queries
Find the names of all customers having a loan from
the Perryridge branch, and the cities they live in
{t | ∃s ∈ loan(s[branch-name] = “Perryridge”
∧ ∃u ∈ borrower(u[loan-number] = s[loan-number]
∧ t[customer-name] = u[customer-name]
∧ ∃v ∈ customer(u[customer-name] = v[customer-name]
∧ t[customer-city] = v[customer-city])))}
3.68
Example Queries
Find the names of all customers who have an account at
all branches located in Brooklyn:
3.69
Universal and Existential Quantifiers
(∀x)(P(x)) <==> not (∃x)(not P(x))
(∃x)(P(x)) <==> not (∀x)(not P(x))
(∀x)(P(x) and Q(x)) <==> not (∃x)(not P(x) or not Q(x))
(∀x)(P(x) or Q(x)) <==> not (∃x)(not P(x) and not Q(x))
(∃x)(P(x) or Q(x)) <==> not (∀x)(not P(x) and not Q(x))
(∃x)(P(x) and Q(x)) <==> not (∀x)(not P(x) or not Q(x))
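Over a finite domain these equivalences can be checked exhaustively, with all() standing in for the universal quantifier and any() for the existential one. The sketch below tries every pair of predicates P and Q on a three-element domain. It is a sanity check on a small case, not a proof.

```python
from itertools import product

DOMAIN_SIZE = 3  # arbitrary small size, chosen for illustration

def equivalences_hold(p, q):
    """p and q are truth tables: p[x] is the value of P(x)."""
    xs = range(len(p))
    return (
        all(p[x] for x in xs) == (not any(not p[x] for x in xs))
        and any(p[x] for x in xs) == (not all(not p[x] for x in xs))
        and all(p[x] and q[x] for x in xs)
        == (not any(not p[x] or not q[x] for x in xs))
        and all(p[x] or q[x] for x in xs)
        == (not any(not p[x] and not q[x] for x in xs))
        and any(p[x] or q[x] for x in xs)
        == (not all(not p[x] and not q[x] for x in xs))
        and any(p[x] and q[x] for x in xs)
        == (not all(not p[x] or not q[x] for x in xs))
    )

# Every possible predicate over the domain, as a truth table.
tables = list(product([False, True], repeat=DOMAIN_SIZE))
all_hold = all(equivalences_hold(p, q) for p in tables for q in tables)
```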
3.70
Examples
Find the names of employees who have no
dependents.
{e.name | e ∈ employee and (not (∃d)(d ∈ dependent
and
e.ssn = d.essn))}
or
{e.name | e ∈ employee and ((∀d)(not (d ∈ dependent)
or
not (e.ssn = d.essn)))}
(notice that:
(∀d)(d ∈ dependent ==> not (e.ssn =
d.essn)) )
3.71
Domain Relational Calculus
A nonprocedural query language equivalent in
power to the tuple relational calculus
Each query is an expression of the form:
3.72
Example Queries
Find the branch-name, loan-number, and amount for loans
of over $1200
{<l, b, a> | <l, b, a> ∈ loan ∧ a > 1200}
Find the names of all customers who have a loan of over
$1200
3.73
Example Queries
Find the names of all customers having a loan, an
account, or both at the Perryridge branch:
{<c> | ∃l (<c, l> ∈ borrower
∧ ∃b, a (<l, b, a> ∈ loan ∧ b = “Perryridge”))
∨ ∃a (<c, a> ∈ depositor
∧ ∃b, n (<a, b, n> ∈ account ∧ b = “Perryridge”))}
3.74