
Lecture 12

The document discusses functional dependencies and relational schema design. It covers topics like relational decomposition, normal forms like BCNF and 3NF, and relational algebra operators like selection, projection, join, and renaming.

Uploaded by

Karen Joy Claro
Copyright
© Attribution Non-Commercial (BY-NC)

Lecture 11: Functional Dependencies

January 31st, 2003

Outline
Relational decomposition
Normal forms
Begin relational algebra

Relational Schema Design


Recall set attributes (persons with several phones):
Name   SSN           PhoneNumber    City
Fred   123-45-6789   206-555-1234   Seattle
Fred   123-45-6789   206-555-6543   Seattle
Joe    987-65-4321   908-555-2121   Westfield
Joe    987-65-4321   908-555-1234   Westfield

SSN → Name, City, but not SSN → PhoneNumber

Anomalies:
Redundancy = repeated data
Update anomalies = Fred moves to Bellevue
Deletion anomalies = Fred drops all phone numbers: what is his city?

Relation Decomposition
Break the relation into two:
Name   SSN           City
Fred   123-45-6789   Seattle
Joe    987-65-4321   Westfield

SSN           PhoneNumber
123-45-6789   206-555-1234
123-45-6789   206-555-6543
987-65-4321   908-555-2121
987-65-4321   908-555-1234

Relational Schema Design


Conceptual Model: [E/R diagram: Product (name, price), Person (name, ssn), relationship buys]

Relational Model: plus FDs

Normalization: Eliminates anomalies

Decompositions in General
R(A1, ..., An)
Create two relations R1(B1, ..., Bm) and R2(C1, ..., Cp) such that:
{B1, ..., Bm} ∪ {C1, ..., Cp} = {A1, ..., An}
and:
R1 = projection of R on B1, ..., Bm
R2 = projection of R on C1, ..., Cp

Incorrect Decomposition
Sometimes it is incorrect:
Name          Price   Category
Gizmo         19.99   Gadget
OneClick      24.99   Camera
DoubleClick   29.99   Camera

Decompose on: Name, Category and Price, Category

Incorrect Decomposition

Name          Category
Gizmo         Gadget
OneClick      Camera
DoubleClick   Camera

Price   Category
19.99   Gadget
24.99   Camera
29.99   Camera

When we put it back:

Name          Price   Category
Gizmo         19.99   Gadget
OneClick      24.99   Camera
OneClick      29.99   Camera
DoubleClick   24.99   Camera
DoubleClick   29.99   Camera

Cannot recover information
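The lossy decomposition above can be replayed in a few lines. This is a minimal sketch, assuming relations are stored as sets of attribute/value pairs; the helper names project and natural_join are illustrative and not part of the lecture.

```python
def project(rows, attrs):
    """Projection with duplicate elimination (set semantics)."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all common attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                out.add(frozenset({**d1, **d2}.items()))
    return out

product = [
    {"Name": "Gizmo", "Price": 19.99, "Category": "Gadget"},
    {"Name": "OneClick", "Price": 24.99, "Category": "Camera"},
    {"Name": "DoubleClick", "Price": 29.99, "Category": "Camera"},
]

r1 = project(product, ["Name", "Category"])
r2 = project(product, ["Price", "Category"])
rejoined = natural_join(r1, r2)

original = {frozenset(row.items()) for row in product}
spurious = rejoined - original  # tuples that were never in the original relation
```

The join returns every original tuple plus the spurious ones, so the original prices of OneClick and DoubleClick can no longer be told apart.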

Normal Forms
First Normal Form = all attributes are atomic
Second Normal Form (2NF) = old and obsolete
Third Normal Form (3NF) = this lecture
Boyce Codd Normal Form (BCNF) = this lecture
Others...

Boyce-Codd Normal Form


A simple condition for removing anomalies from relations:

A relation R is in BCNF if:
Whenever there is a nontrivial dependency A1, ..., An → B in R, {A1, ..., An} is a superkey for R

In English (though a bit vague): Whenever a set of attributes of R determines another attribute, it should determine all the attributes of R.
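The BCNF condition can be checked mechanically with an attribute closure. The sketch below assumes FDs are given as (left side, right side) pairs of attribute lists; closure and bcnf_violations are illustrative names. It only inspects the listed FDs, which is enough for a relation together with its own FD set.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the nontrivial FDs whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = set(schema) <= closure(lhs, fds)
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad

schema = ["Name", "SSN", "PhoneNumber", "City"]
fds = [(["SSN"], ["Name", "City"])]
violations = bcnf_violations(schema, fds)
```

Here SSN determines Name and City but not PhoneNumber, so SSN → Name, City is reported as a violation.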

Example
Name   SSN           PhoneNumber    City
Fred   123-45-6789   206-555-1234   Seattle
Fred   123-45-6789   206-555-6543   Seattle
Joe    987-65-4321   908-555-2121   Westfield
Joe    987-65-4321   908-555-1234   Westfield

What are the dependencies? SSN → Name, City
What are the keys? {SSN, PhoneNumber}
Is it in BCNF?

Decompose it into BCNF


Name   SSN           City
Fred   123-45-6789   Seattle
Joe    987-65-4321   Westfield

SSN → Name, City

SSN           PhoneNumber
123-45-6789   206-555-1234
123-45-6789   206-555-6543
987-65-4321   908-555-2121
987-65-4321   908-555-1234

Summary of BCNF Decomposition


Find a dependency that violates the BCNF condition:
A1, A2, ..., An → B1, B2, ..., Bm
Heuristic: choose B1, B2, ..., Bm "as large as possible"
Decompose:
R1 = (A's, B's)
R2 = (A's, Others)
Continue until there are no BCNF violations left.
Is there a 2-attribute relation that is not in BCNF?
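One way to turn this procedure into code is sketched below. It is a brute-force illustration, not the lecture's own algorithm: it tries every attribute subset as a left side, takes the closure as the "largest possible" right side, and recurses. The function names are assumptions, and the subset search is exponential, so it only suits small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(rel, fds):
    """Find X in rel that determines more of rel than X, but not all of rel."""
    rel = set(rel)
    for size in range(1, len(rel)):
        for x in combinations(sorted(rel), size):
            determined = closure(x, fds) & rel
            if determined != set(x) and determined != rel:
                return set(x), determined
    return None

def bcnf_decompose(rel, fds):
    """Split rel on a violating X until no violation is left."""
    violation = find_violation(rel, fds)
    if violation is None:
        return [set(rel)]
    x, determined = violation
    r1 = determined                     # the A's together with the B's
    r2 = x | (set(rel) - determined)    # the A's together with the other attributes
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)

person = ["name", "SSN", "age", "hairColor", "phoneNumber"]
fds = [(["SSN"], ["name", "age"]), (["age"], ["hairColor"])]
result = bcnf_decompose(person, fds)
```

Different choices of the violating dependency can lead to different, equally valid BCNF decompositions; this sketch simply takes the first violation it finds.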

Example Decomposition
Person(name, SSN, age, hairColor, phoneNumber)
SSN → name, age
age → hairColor

Decompose in BCNF (in class):
Step 1: find all keys
Step 2: now decompose

Other Example
R(A,B,C,D)
A → B, B → C

Keys:
Violations of BCNF:

Correct Decompositions
A decomposition is lossless if we can recover:
R(A,B,C)
Decompose: R1(A,B), R2(A,C)
Recover: R'(A,B,C) should be the same as R(A,B,C)

R' is in general larger than R. Must ensure R' = R

Correct Decompositions
Given R(A,B,C) s.t. A → B, the decomposition into R1(A,B), R2(A,C) is lossless
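This generalizes to a standard test for two-way decompositions: the split is lossless when the shared attributes functionally determine all of R1 or all of R2. A minimal sketch, with is_lossless_binary as an illustrative name:

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the attributes common to r1 and r2 determine all of r1 or all of r2."""
    determined = closure(set(r1) & set(r2), fds)
    return set(r1) <= determined or set(r2) <= determined

# R(A,B,C) with A -> B, split into R1(A,B) and R2(A,C)
ok = is_lossless_binary(["A", "B"], ["A", "C"], [(["A"], ["B"])])

# The Product split on Category, with no FD from Category to anything
lossy = is_lossless_binary(["Name", "Category"], ["Price", "Category"], [])
```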

3NF: A Problem with BCNF


(Unit, Company, Product)
FDs: Unit → Company; Company, Product → Unit
So, there is a BCNF violation, and we decompose.
(Unit, Company): Unit → Company
(Unit, Product): No FDs

So What's the Problem?


Unit       Company
Galaga99   UW
Bingo      UW

Unit       Product
Galaga99   databases
Bingo      databases

No problem so far. All local FDs are satisfied. Let's put all the data back into a single table again:

Unit       Company   Product
Galaga99   UW        databases
Bingo      UW        databases

Violates the dependency: Company, Product → Unit!
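The same check can be run by hand in code. This sketch assumes each table is a list of dictionaries and uses an illustrative helper, satisfies, to test one FD against one instance.

```python
def satisfies(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

unit_company = [
    {"Unit": "Galaga99", "Company": "UW"},
    {"Unit": "Bingo", "Company": "UW"},
]
unit_product = [
    {"Unit": "Galaga99", "Product": "databases"},
    {"Unit": "Bingo", "Product": "databases"},
]

# Join the two tables back together on Unit
joined = [
    {**uc, **up}
    for uc in unit_company
    for up in unit_product
    if uc["Unit"] == up["Unit"]
]

local_ok = satisfies(unit_company, ["Unit"], ["Company"])
global_ok = satisfies(joined, ["Company", "Product"], ["Unit"])
```

The local FD holds in the decomposed table, while the FD that spans both tables fails on the rejoined data. Neither table alone could have detected that.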

Solution: 3rd Normal Form (3NF)


A simple condition for removing anomalies from relations:
A relation R is in 3rd normal form if:
Whenever there is a nontrivial dependency A1, A2, ..., An → B for R, then {A1, A2, ..., An} is a super-key for R, or B is part of a key.
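A brute-force check of this definition is sketched below. It enumerates candidate keys by trying attribute subsets in order of size, then tests each listed FD. The names candidate_keys and is_3nf are illustrative, and the search is exponential, so this is only meant for slide-sized schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure covers the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(schema, size):
            if any(key <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) >= set(schema):
                keys.append(set(combo))
    return keys

def is_3nf(schema, fds):
    """Each nontrivial FD needs a super-key on the left or a key attribute on the right."""
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(schema):
            continue
        if any(b not in prime for b in set(rhs) - set(lhs)):
            return False
    return True

unit_schema = ["Unit", "Company", "Product"]
unit_fds = [(["Unit"], ["Company"]), (["Company", "Product"], ["Unit"])]
unit_in_3nf = is_3nf(unit_schema, unit_fds)
```

On the Unit, Company, Product relation the check passes: Unit is not a super-key, but Company belongs to the key {Company, Product}.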

Relational Algebra
Formalism for creating new relations from existing ones
Its place in the big picture:
Declarative query language: SQL, relational calculus
Algebra: relational algebra, relational bag algebra
Implementation

Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: π
Cartesian Product: ×

Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ

1. Union and 2. Difference


R1 ∪ R2
Example:
ActiveEmployees ∪ RetiredEmployees

R1 − R2
Example:
AllEmployees − RetiredEmployees

What about Intersection ?


It is a derived operator
R1 ∩ R2 = R1 − (R1 − R2)
Also expressed as a join (will see later)
Example:
UnionizedEmployees ∩ RetiredEmployees
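The identity is easy to confirm with Python sets, which follow set semantics. The employee names below are made up for illustration.

```python
# Hypothetical instances of the two relations, reduced to a single name column
unionized_employees = {"Alice", "Bob", "Carol"}
retired_employees = {"Bob", "Carol", "Dave"}

# Intersection built only from difference: R1 - (R1 - R2)
derived = unionized_employees - (unionized_employees - retired_employees)

# Built-in intersection for comparison
direct = unionized_employees & retired_employees
```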

3. Selection
Returns all tuples which satisfy a condition
Notation: σ_c(R)
Examples:
σ_{Salary > 40000}(Employee)
σ_{name = "Smith"}(Employee)

The condition c can be =, <, ≤, >, ≥, <>

Selection Example
Employee
SSN         Name    DepartmentID   Salary
999999999   John    1              30,000
777777777   Tony    1              32,000
888888888   Alice   2              45,000

Find all employees with salary more than $40,000.
σ_{Salary > 40000}(Employee)

SSN         Name    DepartmentID   Salary
888888888   Alice   2              45,000
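Selection maps directly onto a filter. A minimal sketch, assuming the Employee table is a list of dictionaries and the condition is passed as a function; select is an illustrative name.

```python
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def select(rows, condition):
    """Selection: keep the tuples for which condition(row) is true."""
    return [row for row in rows if condition(row)]

high_paid = select(employee, lambda row: row["Salary"] > 40000)
```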

4. Projection
Eliminates columns, then removes duplicates
Notation: π_{A1, ..., An}(R)
Example: project social-security number and names:
π_{SSN, Name}(Employee)
Output schema: Answer(SSN, Name)

Projection Example
Employee
SSN         Name    DepartmentID   Salary
999999999   John    1              30,000
777777777   Tony    1              32,000
888888888   Alice   2              45,000

π_{SSN, Name}(Employee)

SSN         Name
999999999   John
777777777   Tony
888888888   Alice
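A small sketch of projection under set semantics follows. The duplicate-elimination step matters as soon as the kept columns are not a key, as with DepartmentID here; project is an illustrative name.

```python
employee = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]

def project(rows, attrs):
    """Projection: keep only attrs, then drop duplicate tuples."""
    seen = set()
    out = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

answer = project(employee, ["SSN", "Name"])
departments = project(employee, ["DepartmentID"])
```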

5. Cartesian Product
Each tuple in R1 with each tuple in R2
Notation: R1 × R2
Example:
Employee × Dependents

Very rare in practice; mainly used to express joins

Cartesian Product Example
Employee
Name   SSN
John   999999999
Tony   777777777

Dependents
EmployeeSSN   Dname
999999999     Emily
777777777     Joe

Employee × Dependents
Name   SSN         EmployeeSSN   Dname
John   999999999   999999999     Emily
John   999999999   777777777     Joe
Tony   777777777   999999999     Emily
Tony   777777777   777777777     Joe
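In code, the Cartesian product is a nested loop over both tables. The sketch assumes the two schemas share no attribute names, as in this example; cartesian_product is an illustrative name.

```python
from itertools import product

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"EmployeeSSN": "999999999", "Dname": "Emily"},
    {"EmployeeSSN": "777777777", "Dname": "Joe"},
]

def cartesian_product(r1, r2):
    """Pair each tuple of r1 with each tuple of r2 (attribute names assumed disjoint)."""
    return [{**t1, **t2} for t1, t2 in product(r1, r2)]

pairs = cartesian_product(employee, dependents)
```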

Relational Algebra
Five operators:
Union: ∪
Difference: −
Selection: σ
Projection: π
Cartesian Product: ×

Derived or auxiliary operators:
Intersection, complement
Joins (natural, equi-join, theta join, semi-join)
Renaming: ρ

Renaming
Changes the schema, not the instance
Notation: ρ_{B1, ..., Bn}(R)
Example:
ρ_{LastName, SocSocNo}(Employee)
Output schema: Answer(LastName, SocSocNo)

Renaming Example
Employee
Name   SSN
John   999999999
Tony   777777777

ρ_{LastName, SocSocNo}(Employee)

LastName   SocSocNo
John       999999999
Tony       777777777

Natural Join
Notation: R1 ⋈ R2
Meaning: R1 ⋈ R2 = π_A(σ_C(R1 × R2))
Where:
The selection σ_C checks equality of all common attributes
The projection eliminates the duplicate common attributes

Natural Join Example
Employee
Name   SSN
John   999999999
Tony   777777777

Dependents
SSN         Dname
999999999   Emily
777777777   Joe

Employee ⋈ Dependents = π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))

Name   SSN         Dname
John   999999999   Emily
Tony   777777777   Joe

Natural Join
R =
A   B
X   Y
X   Z
Y   Z
Z   V

S =
B   C
Z   U
V   W
Z   V

R ⋈ S =
A   B   C
X   Z   U
X   Z   V
Y   Z   U
Y   Z   V
Z   V   W
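The R and S instances above can be joined with a short nested loop. This is a sketch under set semantics, with tuples as dictionaries; natural_join is an illustrative name.

```python
def natural_join(r1, r2):
    """Combine tuples that agree on every attribute the two schemas share."""
    out = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                merged = {**t1, **t2}
                if merged not in out:  # set semantics: no duplicate tuples
                    out.append(merged)
    return out

R = [
    {"A": "X", "B": "Y"},
    {"A": "X", "B": "Z"},
    {"A": "Y", "B": "Z"},
    {"A": "Z", "B": "V"},
]
S = [
    {"B": "Z", "C": "U"},
    {"B": "V", "C": "W"},
    {"B": "Z", "C": "V"},
]

joined = natural_join(R, S)
```

The tuple (X, Y) of R finds no partner in S and drops out, which is why the result has five rows rather than six or more.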

Natural Join
Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S?
Given R(A, B, C), S(D, E), what is R ⋈ S?
Given R(A, B), S(A, B), what is R ⋈ S?

Theta Join
A join that involves a predicate
R1 ⋈_θ R2 = σ_θ(R1 × R2)
Here θ can be any condition

Equi-join
A theta join where θ is an equality
R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
Example:
Employee ⋈_{SSN=SSN} Dependents

Most useful join in practice
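Since a theta join is a selection over a Cartesian product, it can be sketched as one comprehension with the predicate passed in as a function. The sketch borrows the Dependents(EmployeeSSN, Dname) schema from the Cartesian product example so that attribute names do not clash; theta_join is an illustrative name.

```python
employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
dependents = [
    {"EmployeeSSN": "999999999", "Dname": "Emily"},
    {"EmployeeSSN": "777777777", "Dname": "Joe"},
]

def theta_join(r1, r2, theta):
    """Keep the pairs from the Cartesian product that satisfy theta(t1, t2)."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if theta(t1, t2)]

# Equi-join: the predicate is an equality between one attribute from each side
matched = theta_join(employee, dependents, lambda e, d: e["SSN"] == d["EmployeeSSN"])
```

Unlike the natural join, the equi-join keeps both join columns in the output.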

Semijoin
R ⋉ S = π_{A1, ..., An}(R ⋈ S)
Where A1, ..., An are the attributes in R
Example:
Employee ⋉ Dependents
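A semijoin keeps the tuples of R that have at least one join partner in S, without adding any columns from S. The sketch below uses made-up rows: Alice and the dependent Sam are not in the lecture's tables and are there only to show a non-matching employee and a repeated match.

```python
def semijoin(r, s):
    """Tuples of r that agree with some tuple of s on all common attributes."""
    out = []
    for t in r:
        has_partner = any(
            all(t[a] == u[a] for a in t.keys() & u.keys()) for u in s
        )
        if has_partner and t not in out:
            out.append(t)
    return out

employee = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
    {"Name": "Alice", "SSN": "888888888"},  # hypothetical: has no dependents
]
dependents = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "999999999", "Dname": "Sam"},   # hypothetical second dependent
    {"SSN": "777777777", "Dname": "Joe"},
]

with_dependents = semijoin(employee, dependents)
```

John appears once in the result even though he matches two dependents, which is the effect of projecting back onto the attributes of R.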

Semijoins in Distributed Databases


Semijoins are used in distributed databases

[Figure: Employee(SSN, Name, ...) and Dependents(SSN, Dname, Age, ...) stored at two sites connected by a network]

Employee ⋈_{ssn=ssn} (σ_{age>71}(Dependents))

T = π_{SSN} σ_{age>71}(Dependents)
R = Employee ⋉ T
Answer = R ⋈ Dependents

Complex RA Expressions
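[Figure: expression tree with root π_{name}; joins on buyer-ssn=ssn, pid=pid and seller-ssn=ssn; subexpressions π_{ssn} σ_{name=fred}(Person) and π_{pid} σ_{name=gizmo}(Product); remaining leaves Purchase and Person]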

Operations on Bags
A bag = a set with repeated elements
All operations need to be defined carefully on bags
{a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
{a,b,b,b,c,c} − {b,c,c,c,d} = {a,b,b}
σ_C(R): preserve the number of occurrences
π_A(R): no duplicate elimination
Cartesian product, join: no duplicate elimination
Important! Relational engines work on bags, not sets!
Reading assignment: 5.3 5.4
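Python's collections.Counter behaves like a bag, so the two bag operations can be tried directly. With Counter, + adds the occurrence counts (bag union) and - subtracts them, dropping any element whose count would fall to zero or below (bag difference).

```python
from collections import Counter

# Bag union: occurrence counts add up
left = Counter(["a", "b", "b", "c"])
right = Counter(["a", "b", "b", "b", "e", "f", "f"])
bag_union = left + right

# Bag difference: counts are subtracted and never go below zero
minuend = Counter(["a", "b", "b", "b", "c", "c"])
subtrahend = Counter(["b", "c", "c", "c", "d"])
bag_difference = minuend - subtrahend

union_elements = sorted(bag_union.elements())
difference_elements = sorted(bag_difference.elements())
```

An element such as d, which occurs only in the subtracted bag, cannot appear in the difference.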

Finally: RA has Limitations !


Cannot compute transitive closure
Name1   Name2   Relationship
Fred    Mary    Father
Mary    Joe     Cousin
Mary    Bill    Spouse
Nancy   Lou     Sister

Find all direct and indirect relatives of Fred
Cannot express in RA! Need to write a C program
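The missing ingredient is iteration to a fixpoint, which a general-purpose language provides. The sketch below is in Python rather than C and assumes the relation is followed in the Name1 to Name2 direction only; reachable is an illustrative name.

```python
def reachable(start, edges):
    """Everyone reachable from start by following (Name1, Name2) pairs any number of times."""
    found = set()
    frontier = {start}
    while frontier:
        # One join step: follow the pairs whose first component is in the frontier
        next_people = {b for a, b in edges if a in frontier} - found
        found |= next_people
        frontier = next_people
    return found

relatives = [
    ("Fred", "Mary"),
    ("Mary", "Joe"),
    ("Mary", "Bill"),
    ("Nancy", "Lou"),
]

freds_relatives = reachable("Fred", relatives)
```

Each pass of the loop is a single join, which RA can express. What RA lacks is a way to repeat that join an unbounded number of times.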
