Relational Algebra Updated

The document discusses relational algebra and its operators. It defines relational algebra and its domains, operands, and operators. It then explains classical relational algebra operators such as selection, projection, join, and set operations. It provides examples of each operator and how they map to SQL counterparts.

IN3020/4020 – Database Systems, Spring 2020, Week 3.2 & 4.1

RELATIONAL ALGEBRA
Calculating with relations

Dr. M. Naci Akkøk, Chief Architect, Oracle Nordics


Based upon slides by E. Thorstensen from Spring 2019
Relational Algebra…
o Defines a set of operations on relations
o Gives us a language to describe questions about the content
of relations
o Is a procedural language: We say how the answer should be
calculated. (The alternative is a declarative query language
like SQL, where we only say what conditions the answer must fulfill)
o It is the theoretical basis for processing SQL queries on
relational databases
Algebra…
o Domain (collection of values)
o (Atomic) operands
  o Constants (represent concrete values in the domain)
  o Variables (represent arbitrary values from the domain)
o Operators
  o Take operands as arguments and deliver an operand as the result
o Expressions
  o Built up of operands with operators and brackets
Example: Integer Algebra
o Domain: Integers
o Operands:
o Constants: ..., -3, -2, -1, 0, 1, 2, 3, ...
o Variables: x, y, z, ...
o Operators: +, −, ×, /

o Expression examples:
2+5
((2 − x) × 5) + (y / z)
Classical Relational Algebra
o Domain: Finite relations
o Operands:
  o Constants: All finite relations
  o Variables: Arbitrary finite relations
o Operators:
  1. (Set) Union
  2. (Set) Difference
  3. (Set) Intersection
  4. Projection
  5. Selection
  6. Cartesian product (Cross Product or Cross Join)
  o Join (all the others)
  o Renaming
  o Division
«Relations are sets, so we can apply set-theoretic
operators. However, we want the results to be
relations (that is, homogeneous sets of tuples). It is
therefore meaningful to only apply union, intersection,
difference to pairs of relations defined over the same
attributes»
Before we look at Relational Algebra and its operators
o Relational Algebra and its operators are there for at
least two reasons
  o We will need the algebra, its operators and its laws
  in query compilation & optimization (next weeks)
  o The counterparts of the operations actually exist in
  SQL (but are not always implemented under the same
  name)
Practical counterparts of the operators
o We will use examples from w3resource
(https://www.w3resource.com/) or w3schools
(https://www.w3schools.com/sql/)
o w3resource has tutorials and examples for 2003 standard ANSI SQL
(use that primarily), as well as MySQL, PostgreSQL, Oracle etc., and for
NoSQL, GraphQL and others that you will need later in this course (and
in life)
o w3schools lets you “Try it Yourself” (the green button), which can help
understanding
o There are very many SQL help pages & tutorials, also on each DBMS's own site
Set Operations
o Union: R∪S
o Intersection: R∩S
o Difference: R−S

o R and S must have schemas with the same number of attributes and
identical domains.
o Before performing the operation, S must be arranged so that the
attributes are in the same order as in R.
Set Operations: UNION
o R ∪ S is a relation where
o All tuples in R or in S or in both R and S are in R ∪ S.
o If t is in both R and S, t still occurs only once in R ∪ S
(because a relation is a set).
o No other tuples are in R ∪ S

o Example of regular set union:
{a, b, c} ∪ {a, c, d} = {a, b, c, d}
Set Operations: INTERSECTION
o R ∩ S is a relation where
o Only those tuples that are in both R and S are in R ∩ S
o No other tuples appear in R ∩ S

o Example of regular set intersection:
{a, b, c} ∩ {a, c, d} = {a, c}
Set Operations: DIFFERENCE
o R – S is a relation where
o All tuples that are in R but not in S are in R – S
o No other tuples appear in R – S

o Example of regular set difference:
{a, b, c} – {a, c, d} = {b}
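A minimal Python sketch of the three set operations, assuming (for illustration only) that a relation is modelled as a Python set of tuples. The one-attribute relations R and S reuse the example values a, b, c, d from the slides:

```python
# Relations modelled as sets of tuples (an illustrative assumption).
R = {("a",), ("b",), ("c",)}
S = {("a",), ("c",), ("d",)}

union = R | S          # R ∪ S: a tuple found in both still occurs once
intersection = R & S   # R ∩ S: only tuples found in both
difference = R - S     # R – S: tuples in R that are not in S

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Because Python sets discard duplicates, the result of each operation is again a set of tuples, which mirrors the requirement that the result is a relation.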
Operators that remove parts of a relation
o Selection: σ_C(R)
o Projection: π_L(R)
SELECTION (σ)
o σ_C(R) is the relation obtained from R by selecting the tuples in R that
satisfy the condition C
o C is any Boolean expression made up of atoms of the form
op1 cmp op2, where
  o The operator cmp ∈ { =, ≠, <, >, <=, >=, LIKE }
  o Operands op1 and op2 are
    o either two attributes in R with the same domain
    o or one attribute in R and a constant from the domain of the attribute
o In (A LIKE e), e is a string constant or a pattern (SQL LIKE patterns use the wildcards % and _)
SELECTION σ_C(R) Example
Tutorials
ID  site        tutorial     topic
1   w3schools   SQL_2003STD  Database
2   w3schools   HTML_5       WebDev
3   w3schools   CSS_3        WebDev
4   w3resource  SQL_2003STD  Database
5   w3resource  MySQL        Database

σ_{topic = "Database"}(Tutorials)
ID  site        tutorial     topic
1   w3schools   SQL_2003STD  Database
4   w3resource  SQL_2003STD  Database
5   w3resource  MySQL        Database
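The selection on the Tutorials relation can be sketched in Python as a filter over a set of tuples. The class name Tutorial and the helper select are hypothetical names chosen for this illustration; the condition C is passed as a function that returns True or False for each tuple:

```python
from typing import NamedTuple

class Tutorial(NamedTuple):
    id: int
    site: str
    tutorial: str
    topic: str

tutorials = {
    Tutorial(1, "w3schools", "SQL_2003STD", "Database"),
    Tutorial(2, "w3schools", "HTML_5", "WebDev"),
    Tutorial(3, "w3schools", "CSS_3", "WebDev"),
    Tutorial(4, "w3resource", "SQL_2003STD", "Database"),
    Tutorial(5, "w3resource", "MySQL", "Database"),
}

def select(relation, condition):
    """Selection: keep exactly the tuples for which the condition holds."""
    return {t for t in relation if condition(t)}

databases = select(tutorials, lambda t: t.topic == "Database")
print(sorted(t.id for t in databases))
```

The schema is unchanged by a selection; only the number of tuples can shrink.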
PROJECTION (π)
o π_L(R), where R is a relation and L is a list of attributes in R, is
the relation obtained from R by selecting the columns of
the attributes in L
o The relation has a schema with the attributes in L
o No tuple can occur more than once in π_L(R)
PROJECTION π_L(R) Example
Tutorials
ID  site        tutorial     topic
1   w3schools   SQL_2003STD  Database
2   w3schools   HTML_5       WebDev
3   w3schools   CSS_3        WebDev
4   w3resource  SQL_2003STD  Database
5   w3resource  MySQL        Database

π_{site, topic}(Tutorials)
(duplicate rows collapse, because the result is a set)
site        topic
w3schools   Database
w3schools   WebDev
w3resource  Database
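A small Python sketch of set projection on the Tutorials relation, assuming the relation is a set of tuples and the schema is kept separately as a tuple of attribute names. The helper name project is hypothetical. Because the result is built as a set, the five input tuples yield only three distinct (site, topic) pairs:

```python
SCHEMA = ("ID", "site", "tutorial", "topic")
TUTORIALS = {
    (1, "w3schools", "SQL_2003STD", "Database"),
    (2, "w3schools", "HTML_5", "WebDev"),
    (3, "w3schools", "CSS_3", "WebDev"),
    (4, "w3resource", "SQL_2003STD", "Database"),
    (5, "w3resource", "MySQL", "Database"),
}

def project(schema, relation, attributes):
    """Set projection: keep the listed columns, duplicates collapse."""
    positions = [schema.index(a) for a in attributes]
    return {tuple(row[p] for p in positions) for row in relation}

result = project(SCHEMA, TUTORIALS, ["site", "topic"])
print(sorted(result))
```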
RENAMING (ρ)
o ρ_S(A1,A2,...,An)(R) renames R to a relation with name S and
attributes A1, A2, ..., An
o ρ_S(R) renames R to a relation with name S. Attribute names
from R are kept as-is.
o In certain cases (operations on “self”), renaming the
relation (giving it another name) may be necessary to avoid
semantic misinterpretation
WE OFTEN USE THESE TOGETHER
o We often use selection, projection in the same SFW
(Select-From-Where) construct
o And we often use renaming as well.

o Sometimes we have to use renaming, remember?


(Next two slides from the 2nd week SQL lectures)
Self-join
o We want to “print the name of every employee and of each
employee's manager” given Employee(Id, Name),
Manager(empId, mgrId).
o THIS IS WRONG:
SELECT e.Name, e.Name FROM Employee e
JOIN Manager ON empId=Id AND mgrId=Id;
o There is obviously something very wrong with this query.
We need TWO names!
Self-join continued
o Try again: “print the name of every employee and of each
employee's manager” given Employee(Id, Name),
Manager(empId, mgrId)
o We need an extra copy of Employee:
o This is correct:
SELECT e.Name, s.Name FROM Employee e
JOIN Manager ON empId=e.Id
JOIN Employee s ON s.Id = mgrId;
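The correct self-join can be mimicked in Python by iterating over Employee twice, once in the role of e and once in the role of s. The sample rows are made up for this sketch and are not from the lecture:

```python
# Hypothetical sample data: Employee(Id, Name), Manager(empId, mgrId).
employee = {(1, "Kari"), (2, "Ola"), (3, "Per")}
manager = {(2, 1), (3, 1)}

# Two "copies" of Employee, playing the roles of the aliases e and s.
e = employee
s = employee

pairs = {
    (e_name, s_name)
    for (e_id, e_name) in e
    for (emp_id, mgr_id) in manager
    for (s_id, s_name) in s
    if emp_id == e_id and s_id == mgr_id
}
print(sorted(pairs))
```

The two loop variables over the same relation correspond to the renaming: without a second name there would be no way to refer to the manager's tuple separately from the employee's tuple.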
Operators that combine tuples
o Cartesian product: R×S
o Natural join: R ⋈ S
o Theta-join: R ⋈θ S
Cartesian Product R×S
o R×S is the relation obtained from R and S by forming all possible
combinations of one tuple from R and one tuple from S
o We often say that one tuple t from R and one tuple u from S are
concatenated into a tuple v = tu in R×S
o In the resulting schema (form), any name clash between
attributes in R and S is resolved by qualifying the names with
the origin relation: R.A, S.A
o If R and S are the same relation, one of them must first be
renamed
Cartesian Product R×S Visual Example
R = {1, 2, 3}
S = {a, b, c}
R × S (EACH by EACH, ALL TUPLES):
(1, a), (1, b), (1, c)
(2, a), (2, b), (2, c)
(3, a), (3, b), (3, c)
Cartesian Product R×S SQL Example
Left operand:
SID  site                HQ
101  w3schools           USA
102  Udemy               UK
103  Folkeuniversitetet  NO

Right operand:
CID  topic
501  Database
503  Language

Result (left × right):
SID  site                HQ   CID  topic
101  w3schools           USA  501  Database
102  Udemy               UK   501  Database
103  Folkeuniversitetet  NO   501  Database
101  w3schools           USA  503  Language
102  Udemy               UK   503  Language
103  Folkeuniversitetet  NO   503  Language

o NOTE that it is usually wise to add some condition (at least so
that the result makes sense).
o In SQL, the cartesian product is a CROSS JOIN
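A short Python sketch of the Cartesian product from the SQL example, using itertools.product. The variable names sites and topics are assumptions, since the slide does not name the two tables:

```python
from itertools import product

sites = {
    (101, "w3schools", "USA"),
    (102, "Udemy", "UK"),
    (103, "Folkeuniversitetet", "NO"),
}
topics = {(501, "Database"), (503, "Language")}

# Each tuple from sites is concatenated with each tuple from topics.
cross = {s + t for s, t in product(sites, topics)}

for row in sorted(cross):
    print(row)
```

With 3 tuples on the left and 2 on the right, the product has 3 × 2 = 6 tuples, each with 3 + 2 = 5 attributes.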
The Joy of Joins
o There are in practice two types of joins:

o EQUI-joins
Joins whose conditions use only equality

o NON-EQUI-joins
All other joins, whose conditions may use comparisons other than
equality
Natural Join
o R⋈S is the relation obtained from R and S by forming all
possible mergers of one tuple from R with one from S where
the two tuples agree on all attributes with the same name
o Common attributes occur only once in the merged
tuples
o The resulting schema has the attributes in R followed by
those attributes in S that do not also occur in R
o Natural join is an EQUI-JOIN!
Dangling Tuple
o A dangling tuple is a tuple in one of the relations that has
no matching tuple in the other relation
o Dangling tuples are not represented in the result relation
after a join
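A minimal Python sketch of natural join, assuming each tuple is a dict from attribute name to value. The helper natural_join and the sample rows for foods and company are hypothetical (the slide figure with the real tables is not reproduced here). The example contains one dangling tuple on each side:

```python
def natural_join(r, s):
    """Merge every pair of rows that agree on all attributes they share."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                # Attributes of R first, then those of S not already in R.
                result.append({**t, **u})
    return result

# Hypothetical sample data.
foods = [
    {"item_name": "Biscuit", "company_id": 1},
    {"item_name": "Juice", "company_id": 2},
    {"item_name": "Candy", "company_id": 9},   # dangling: no company 9
]
company = [
    {"company_id": 1, "company_name": "Acme"},
    {"company_id": 2, "company_name": "Globex"},
    {"company_id": 3, "company_name": "Initech"},  # dangling: no food
]

joined = natural_join(foods, company)
for row in joined:
    print(row)
```

Neither "Candy" nor "Initech" appears in the output, which is exactly what it means for dangling tuples not to be represented in the result.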
Natural Join R ⋈ S
Practical Example
[Figure: the tables foods and company]
SELECT * FROM foods
NATURAL JOIN company;
o Dangling?
o NOTE that it is an “equi-join” on COMPANY_ID
Theta Join (⋈θ)
o Generalization of a natural join
o The relation R ⋈θ S, where θ is a condition (boolean expression), is
calculated as follows:
1. Calculate R × S
2. Pick the tuples that satisfy the condition θ
o The constituents (atoms) in θ have the form A cmp B, where A and B are
attributes in R and S respectively, A and B have the same domain,
and cmp ∈ { =, ≠, <, >, <=, >= } (plus LIKE 'pattern' in practice!)
o AGAIN, NOTE that a theta join links tables based on a relationship
other than «natural» equality, but the condition can use equality!
Theta Join (⋈θ)
Practical Example
[Figure: the tables foods and company]
o Start with the natural join:
SELECT * FROM foods
NATURAL JOIN company;
o Dangling?
o Then add a condition θ, like ITEM_ID < 3
Equi-join (special case of theta join)
Special case of a theta-join ⋈θ where the condition θ satisfies the
following requirements:
1. θ contains no boolean operators other than AND, i.e., θ has
the form θ1 AND θ2 AND ... AND θm
2. Each θk for 1 ≤ k ≤ m has the form A = B, where A is an
attribute in R and B is an attribute in S, with A and B having
the same domain
(In other words: when the theta join uses only “=”)
Theta Join (⋈θ) – An EQUI-JOIN Example
SELECT * FROM Beers B JOIN
Likes L ON B.name = L.beer;

Beers
name          manufacturer
Efes Pilsen   Anadolu Gurubu
Ringnes Pils  Ringnes
Ringnes Lite  Ringnes

Likes
drinker  beer
Ada      Efes Pilsen
Naci     Efes Pilsen
Bjørn    Ringnes Lite

Beers ⋈ Likes (the beer column, which equals name, is not shown)
name          manufacturer    drinker
Efes Pilsen   Anadolu Gurubu  Ada
Efes Pilsen   Anadolu Gurubu  Naci
Ringnes Lite  Ringnes         Bjørn
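Evaluating the equi-join on the Beers and Likes data by hand is easy to get wrong, so here is a small Python sketch that computes it. The list-of-tuples representation is an assumption for illustration; the condition is B.name = L.beer as in the SQL query:

```python
beers = [
    ("Efes Pilsen", "Anadolu Gurubu"),
    ("Ringnes Pils", "Ringnes"),
    ("Ringnes Lite", "Ringnes"),
]
likes = [
    ("Ada", "Efes Pilsen"),
    ("Naci", "Efes Pilsen"),
    ("Bjørn", "Ringnes Lite"),
]

# Equi-join on Beers.name = Likes.beer; SELECT * keeps all four columns.
joined = [
    (name, manufacturer, drinker, beer)
    for (name, manufacturer) in beers
    for (drinker, beer) in likes
    if name == beer
]
for row in joined:
    print(row)
```

Ringnes Pils has no matching tuple in Likes, so it is a dangling tuple and does not appear in the result.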
Division
o Let R(A,B) and S(B) be two relations, where A and B are disjoint sets of
attributes
o R div S is the set of all tuples t from π_A(R) such that {t} × S is contained in R
o In other words: for each such t, R contains one tuple tu for each u from S
o Division is the inverse of cartesian product: (R’ × S’) div S’ = R’
o NOTE that the opposite does not hold in general: (R div S) × S ≠ R

o Queries with “all” often indicate division

o No division operator in SQL STD 2003! You need a technique.
You can derive division using projection,
cartesian product and difference

R(A,B) div S(B) = π_A(R) – π_A((π_A(R) × S) – R)

Note:
(π_A(R) × S) – R contains the (A, B) combinations that are missing from R,
so its A values are those that do NOT satisfy the condition
Typical steps for computation of R(A,B) div S(B):
o Find all possible combinations of the A values in R with S(B) by computing
π_A(R) × S(B). Call it R1
o Subtract the actual R(A,B) from R1. Call it R2
o The A values in R2 are those that are not associated with every value in S(B);
therefore π_A(R) – π_A(R2) gives us the A values that are associated with all values
in S
o Take a look at the examples here:
o https://www.geeksforgeeks.org/sql-division/
o https://www.studytonight.com/dbms/division-operator.php
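The steps for computing a division can be sketched in Python as follows, assuming R is a set of (a, b) pairs and S is a set of b values. The function name divide and the room and equipment values are hypothetical:

```python
from itertools import product

def divide(r, s):
    """R(A,B) div S(B) via projection, Cartesian product and difference."""
    all_a = {a for a, _ in r}                  # projection of R on A
    r1 = set(product(all_a, s))                # every A value paired with every B in S
    r2 = r1 - r                                # required pairs that R lacks
    return all_a - {a for a, _ in r2}          # A values with nothing missing

# Hypothetical data: which room has which equipment.
room_equipment = {
    ("room1", "piano"), ("room1", "speakers"),
    ("room2", "piano"),
    ("room3", "speakers"), ("room3", "piano"), ("room3", "projector"),
}
required = {"piano", "speakers"}

print(sorted(divide(room_equipment, required)))
```

In this sketch room2 is dropped because the pair (room2, speakers) is in R1 but not in R, so room2 shows up in R2.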
Division: Example of use
o Romutstyr (room equipment)
shows the equipment that
exists
o Aktivitetskrav (activity requirements)
shows the kind of equipment
needed for a given activity
Room that covers all equipment needs
o Let R = Romutstyr (room equipment) and A = Aktivitetskrav (activity
requirements)
o Room that covers all equipment requirements for MUS1225-hørelære
(ear training):
R div π_utstyr(σ_{aktivitet = 'MUS1225-hørelære'}(A))
o Room that covers all equipment requirements for MUS1225,
i.e., both hørelære (ear training) and musikkproduksjon
(music production):
R div π_utstyr(σ_{aktivitet LIKE 'MUS1225%'}(A))
Result of the division
Other derivable operators
o R ∩ S = R – (R – S)
  o which means that we don't need ∩ (we can derive it)
o R ⋈θ S = σ_θ(R × S)
  o which means that we don't need ⋈θ
o R ⋈ S = π_L(σ_θ(R × S)),
where L is the list of attributes in R followed by the attributes in
S that don't occur in R, and θ is R.A1 = S.A1 AND ... AND R.Ak = S.Ak
where A1, ... , Ak are all the attributes that occur in both R and S
  o which means that we don't need ⋈
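Two of these derivations can be checked on small examples in Python. The relations P, Q, R(A, B) and S(B, C) below are made up for this sketch, and column positions stand in for the qualified attribute names:

```python
from itertools import product

# Intersection derived from difference alone.
P = {1, 2, 3, 4}
Q = {3, 4, 5}
derived_intersection = P - (P - Q)

# Natural join derived from product, selection and projection.
R = {(1, "x"), (2, "y")}          # schema (A, B)
S = {("x", 10), ("z", 30)}        # schema (B, C)

cross = {r + s for r, s in product(R, S)}           # columns: R.A, R.B, S.B, S.C
selected = {t for t in cross if t[1] == t[2]}       # condition R.B = S.B
natural = {(t[0], t[1], t[3]) for t in selected}    # keep A, B, C

print(derived_intersection)
print(natural)
```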
The minimal set of operators
o None of the operators in the set {∪, –, π, σ, ×, ρ} can be expressed
using the other operators in the set
o They form a minimal independent set of operators

o We still wish to keep the other operators because there are
efficient algorithms for them and because it is often simpler
to formulate queries using them
Bag
o Commercial DBMSs use Bag (multiset) and not Set as the
basic type for realizing relations
o Set (D):
Each element in D occurs at most once. The order of the
elements does not matter
{a, b, c} = {a, c, b} = {a, a, b, c} = {c, a, b, a}
o Bag (D):
Each element in D can occur more than once. The order of the
elements does not matter
{a, b, c} = {a, c, b} ≠ {a, a, b, c} = {c, a, b, a}
Why Bag and not Set?
o Bag provides more efficient union and projection
calculations than Set
o In aggregation, we need Bag functionality

o But: Bag is more space consuming than Set


Relational operators on Bags
o The definitions become slightly different

o Not all algebraic laws that hold for Sets hold for Bags
Example: (R ∪ S) – T = (R – T) ∪ (S – T) holds for sets but not for bags

o When we later in the lectures mention «bag relation», we
mean a relation schema and an instance that is a bag
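A small counterexample for the law (R ∪ S) – T = (R – T) ∪ (S – T), sketched in Python. Bags are modelled with collections.Counter, where + adds multiplicities and - subtracts them (dropping anything at or below zero); the choice R = S = T = {a} is an assumption picked to keep the example tiny:

```python
from collections import Counter

# Bag semantics: one copy of "a" in each of R, S and T.
R = Counter(["a"])
S = Counter(["a"])
T = Counter(["a"])

lhs = (R + S) - T          # {a, a} minus {a} leaves {a}
rhs = (R - T) + (S - T)    # {} plus {} leaves {}
print(lhs, rhs)

# Set semantics: the same law holds.
Rs, Ss, Ts = {"a"}, {"a"}, {"a"}
set_lhs = (Rs | Ss) - Ts
set_rhs = (Rs - Ts) | (Ss - Ts)
print(set_lhs == set_rhs)
```

Under bag semantics the union holds two copies of a, and removing T takes away only one of them, which is why the two sides differ.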
Bag Union
o Let R and S be bag relations
o If t is a tuple that occurs n times in R and m times in S, then
t occurs n + m times in the bag relation R ∪ S

o Example of typical Bag union:
{a, a, b, c, c} ∪ {a, c, c, c, d} = {a, a, a, b, c, c, c, c, c, d}
Bag Intersection
o Let R and S be bag relations
o If t is a tuple that occurs n times in R and m times in S, then
t occurs min(n, m) times in the bag relation R ∩ S

o Example of typical Bag intersection:
{a, a, b, c, c} ∩ {a, c, c, c, d} = {a, c, c}
Bag Difference
o Let R and S be bag relations
o If t is a tuple that occurs n times in R and m times in S, then
t occurs max(0, n – m) times in the bag relation R – S

o Example of typical Bag difference:
{a, a, b, c, c} – {a, c, c, c, d} = {a, b}
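The three bag operations map directly onto Python's collections.Counter, which is used here as an assumed stand-in for a bag relation: + gives n + m, & gives min(n, m), and - gives max(0, n - m). Note that Counter's | operator takes the maximum of the counts, so it is not the bag union defined in these slides:

```python
from collections import Counter

R = Counter("aabcc")   # the bag {a, a, b, c, c}
S = Counter("acccd")   # the bag {a, c, c, c, d}

bag_union = R + S          # multiplicities are added
bag_intersection = R & S   # minimum of the multiplicities
bag_difference = R - S     # subtraction, never below zero

print(sorted(bag_union.elements()))
print(sorted(bag_intersection.elements()))
print(sorted(bag_difference.elements()))
```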
Bag Selection
o If R is a bag relation, then σ_C(R) is the bag relation obtained
from R by applying C to each tuple individually and selecting
the tuples in R that satisfy the condition C
Bag Projection
o If R is a bag relation and L is a (non-empty) list of attributes,
then π_L(R) is the bag relation obtained from R by selecting
the columns of the attributes in L
o π_L(R) has as many tuples as R


Cartesian Product of Bags
o R × S is the bag relation obtained from the bag relations R
and S by forming all possible concatenations of one tuple
from R and one tuple from S
o If R has n tuples and S has m tuples, there will be nm
tuples in R × S
Theta-Join of Bags
o If R and S are bag relations, the bag relation R ⋈θ S, where θ
is a condition, is formed as follows:
1. Calculate R × S (Cartesian bag product)
2. Select the tuples that satisfy the condition
Natural Join of Bags
o If R and S are bag relations, then R⋈S is the bag relation
obtained by merging matching tuples in R and S individually
Additional operators in relational algebra
o Duplicate elimination
o Aggregation operators
o Grouping
o Sorting
o Extended projection
o Outer Join
Duplicate elimination
o δ(R) removes multiple occurrences of tuples from the bag
relation R
o The result is a set
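A Python sketch contrasting bag projection with duplicate elimination on the Tutorials data. The bag is modelled as a collections.Counter (an assumption for illustration): the projection keeps one output tuple per input tuple, and converting to a set afterwards plays the role of duplicate elimination:

```python
from collections import Counter

tutorials = [
    (1, "w3schools", "SQL_2003STD", "Database"),
    (2, "w3schools", "HTML_5", "WebDev"),
    (3, "w3schools", "CSS_3", "WebDev"),
    (4, "w3resource", "SQL_2003STD", "Database"),
    (5, "w3resource", "MySQL", "Database"),
]

# Bag projection on (site, topic): as many tuples as the input.
projected = Counter((site, topic) for _id, site, _tutorial, topic in tutorials)

# Duplicate elimination: every tuple is kept exactly once.
distinct = set(projected)

print(projected)
print(sorted(distinct))
```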
Aggregation operations
o Used on bags of atomic values for an attribute A
o Used in combination with the grouping operator
Standard aggregation operations #1
o SUM(A):
o Sums all values in the column of A
o A´s domain must be numeric values
o AVG(A):
o Calculates the average of the values in the column of A
(The column must have at least one value)
o A´s domain must be numeric values
Standard aggregation operations #2
o MIN (A), MAX (A):
o Selects the smallest / largest value in the column of A
(The column must have at least one value)
o The domain of A must have an order relation
o For numeric values this is <
o Lexicographic arrangement is used for strings
o COUNT (A):
o Counts the number of tuples in the relation with values in the
column of A
o (so tuples where A is nil are not counted)
Grouping
o Used when we want to apply an aggregation operator to groups of values
o Form: γ_L(R), where L is a list of items with all the items in the list different. The
elements are in one of the following two forms:
  o A
    o A is an attribute in R
    o A is called a grouping attribute
  o AGG(A) → AggRes
    o AGG is an aggregation operator
    o AggRes is an unused attribute name
    o A is called an aggregation attribute
The resulting relation after grouping
Given γ_L(R), the result relation is constructed as follows
1. Partition R in groups, one group for each collection of tuples
that are equal in all grouping attributes in L
2. For each group, produce a tuple consisting of
i. The values of the grouping attributes in the group
ii. For each aggregation attribute in L, the aggregation over all
the tuples in the group
The result relation gets as many attributes as there are elements in
L, and attribute names as specified by L. The result instance
contains one tuple per group.
Grouping use & example
SELECT MAX(mycount) FROM
(SELECT
agent_code, COUNT(agent_code) AS mycount
FROM orders
GROUP BY agent_code
) counts; -- subquery alias, required by some DBMSs
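The grouping construction (partition into groups, then aggregate per group) can be sketched in Python as below. The helper group_aggregate and the sample orders rows are hypothetical; the last line mirrors the SQL query by taking the maximum of the per-agent counts:

```python
from collections import defaultdict

# Hypothetical sample data: (agent_code, amount).
orders = [
    ("A001", 200), ("A002", 150), ("A001", 50),
    ("A003", 75), ("A001", 25),
]

def group_aggregate(rows, key, aggregates):
    """Partition rows by key, then apply each aggregate to every group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return {k: tuple(agg(g) for agg in aggregates) for k, g in groups.items()}

# Group by agent_code; compute COUNT and SUM(amount) per group.
result = group_aggregate(
    orders,
    key=lambda r: r[0],
    aggregates=[len, lambda g: sum(r[1] for r in g)],
)
print(result)

# Outer query of the SQL example: the largest per-agent count.
max_count = max(count for count, _total in result.values())
print(max_count)
```

The result has one entry per group, in line with the rule that the result instance contains one tuple per group.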
Sorting
o τ_L(R), where R is a relation and L a list of attributes
A1, A2, ..., Ak, results in a list of tuples sorted first by A1, then
by A2 within each run of equal A1 values, etc.
o Attributes not included in the list do not influence the order: tuples
that agree on all of A1, ..., Ak appear in arbitrary order
o Result is a list, so the operation is meaningful only as a last,
final operation on relations
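Sorting on a list of attributes corresponds to sorting with a composite key. The sketch below sorts the Tutorials relation by site and then topic, using operator.itemgetter with the assumed column positions 1 and 3:

```python
from operator import itemgetter

tutorials = {
    (1, "w3schools", "SQL_2003STD", "Database"),
    (2, "w3schools", "HTML_5", "WebDev"),
    (3, "w3schools", "CSS_3", "WebDev"),
    (4, "w3resource", "SQL_2003STD", "Database"),
    (5, "w3resource", "MySQL", "Database"),
}

# Sort first by site (position 1), then by topic (position 3).
sorted_rows = sorted(tutorials, key=itemgetter(1, 3))

for row in sorted_rows:
    print(row)
```

The input is a set and the output is a list, which reflects why sorting only makes sense as the final operation. Rows that tie on both site and topic may come out in any relative order.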
Extended projection
o π_L(R), classic: L is a list of attributes in R
o π_L(R), extended: L is a list where each item can be
i. A simple attribute in R
ii. An expression A → B, where A is an attribute in R and B
is an unused attribute name
Renames A in R to B in the result relation
iii. An expression E → B, where E is an expression built up of
attributes in R, constants, arithmetic operators and
string operators, and B is an unused attribute name
Extended projection – The result relation
The result relation π_L(R) is obtained from R as follows:
o Consider each tuple in R separately
o Substitute the tuple´s values for the attribute names in L
and calculate the expressions in L
o The result relation is a bag with as many attributes as items
in L, and with names as given in L
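A Python sketch of extended projection with one item of each kind: a simple attribute (item), a rename (price → unit_price) and a computed expression (price * qty → total). The relation and all attribute names are hypothetical. The input contains a duplicate row on purpose, to show that the result is a bag:

```python
# Hypothetical bag relation with attributes item, price, qty.
rows = [
    {"item": "pen", "price": 10, "qty": 3},
    {"item": "ink", "price": 25, "qty": 2},
    {"item": "pen", "price": 10, "qty": 3},   # duplicate row is kept
]

# L = item, price → unit_price, price * qty → total
extended = [
    {
        "item": r["item"],                # i.   simple attribute
        "unit_price": r["price"],         # ii.  rename A → B
        "total": r["price"] * r["qty"],   # iii. expression E → B
    }
    for r in rows
]

for row in extended:
    print(row)
```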
Outer Join
o Outer join is used when you want to preserve dangling tuples
from natural join
o R ⋈_O S, outer join:
  o Start with R ⋈ S
  o Add dangling tuples from R and S
  o Missing attribute values are filled in with ⊥ (nil)
o R ⋈_OL S, left outer join: Only dangling tuples from R are added
o R ⋈_OR S, right outer join: Only dangling tuples from S are added
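A minimal Python sketch of the full outer join, following the construction above: start with the natural join, then add the dangling tuples from both sides, padded with None in the role of ⊥. The function full_outer_join, the explicit schema arguments and the sample rows are assumptions made for this illustration:

```python
def full_outer_join(r, r_schema, s, s_schema):
    """Natural join plus dangling tuples from both sides, padded with None."""
    common = [a for a in r_schema if a in s_schema]
    schema = list(r_schema) + [a for a in s_schema if a not in r_schema]

    def matches(t, u):
        return all(t[a] == u[a] for a in common)

    result = [{**t, **u} for t in r for u in s if matches(t, u)]
    for t in r:  # dangling tuples from R
        if not any(matches(t, u) for u in s):
            result.append({a: t.get(a) for a in schema})
    for u in s:  # dangling tuples from S
        if not any(matches(t, u) for t in r):
            result.append({a: u.get(a) for a in schema})
    return result

# Hypothetical sample data.
foods = [
    {"item_name": "Biscuit", "company_id": 1},
    {"item_name": "Candy", "company_id": 9},
]
company = [
    {"company_id": 1, "company_name": "Acme"},
    {"company_id": 3, "company_name": "Initech"},
]

joined = full_outer_join(
    foods, ["item_name", "company_id"],
    company, ["company_id", "company_name"],
)
for row in joined:
    print(row)
```

Dropping the second loop gives a left outer join, and dropping the first loop gives a right outer join.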
Outer Join Example
https://www.w3resource.com/sql/joins/perform-an-outer-join.php
[Figure: the tables foods and company]
SELECT company.company_name,
company.company_id,
foods.company_id,
foods.item_name,
foods.item_unit
FROM company, foods
-- (+) is Oracle-specific outer-join syntax: every company row is kept,
-- i.e. company LEFT OUTER JOIN foods ON company.company_id = foods.company_id
WHERE company.company_id =
foods.company_id(+);
Relations and rules of integrity
o We can express referential integrity, functional
dependencies and multi-value dependencies - and also
other classes of integrity rules - in relational algebra!
Examples of integrity rules in classical
relational algebra
o If E is an expression in relational algebra, then E = ∅ is an integrity rule
that says that E does not have any tuples
o If E1 and E2 are expressions in relational algebra, then
E1 ⊆ E2 is an integrity rule that says that each tuple in E1 shall also
be in E2
o Note that E1 ⊆ E2 and E1 – E2 = ∅ are equivalent,
as are E = ∅ and E ⊆ ∅. Thus, only one of the forms above is sufficient
o Strictly speaking, ∅ is not a relational algebra expression. We could
have written R – R instead (for an arbitrary relation R with the same
schema as E)
Examples of integrity rules in classical
relational algebra
o Referential integrity: ”A is a foreign key referencing S”, where B is the primary key in S:
δ(π_A(R)) ⊆ π_B(S)
o FDs: ”A1 A2 ... An → B1 B2 ... Bm” in R:
σ_θ(ρ_R1(R) × ρ_R2(R)) = ∅
  o where θ is the expression
  R1.A1 = R2.A1 AND ... AND R1.An = R2.An
  AND (R1.B1 ≠ R2.B1 OR ... OR R1.Bm ≠ R2.Bm)
o Domain constraints:
σ_{A≠’F’ AND A≠’M’}(R) = ∅
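Two of these integrity rules can be sketched as Python checks over sets of tuples. The function names, the use of column positions instead of attribute names, and the sample relations are all assumptions made for this illustration. An FD holds when the list of violating pairs is empty, which corresponds to the rule that the selection over the self-product equals ∅:

```python
from itertools import product

def fd_violations(relation, lhs, rhs):
    """Pairs of rows that agree on the lhs positions but differ on some rhs position."""
    return [
        (t, u)
        for t, u in product(relation, repeat=2)
        if all(t[i] == u[i] for i in lhs) and any(t[j] != u[j] for j in rhs)
    ]

def referential_ok(r, a, s, b):
    """Foreign key check: every value in column a of r occurs in column b of s."""
    return {t[a] for t in r} <= {u[b] for u in s}

# Hypothetical relations with schema (A, B).
good = {(1, "x"), (2, "z")}
bad = {(1, "x"), (1, "y"), (2, "z")}

print(fd_violations(good, lhs=[0], rhs=[1]))
print(len(fd_violations(bad, lhs=[0], rhs=[1])))

# Hypothetical orders(order_id, customer_id) and customers(customer_id, name).
orders = {(10, 1), (11, 2)}
customers = {(1, "Ada"), (2, "Naci")}
print(referential_ok(orders, 1, customers, 0))
```

Each violating pair appears twice in the list, once in each order, because the self-product contains both (t, u) and (u, t).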
