Normalization (2)
Outline
• Features of ER
• Features of Normalization
• Consider the new relation in_dep that combines the instructor and
department tables
– For no K, R
• Functional dependencies allow us to express constraints that cannot be expressed
using superkeys. Consider the schema:
in_dep (ID, name, salary, dept_name, building, budget).
We expect these functional dependencies to hold:
dept_name → building
ID → building
but would not expect the following to hold:
dept_name → salary
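A functional dependency can be tested against a concrete table instance. The sketch below is only an illustration: the helper name fd_holds and the sample in_dep rows are made up, not taken from the slides. Note that an instance can refute a dependency, but it cannot prove that the dependency holds for every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Illustrative in_dep rows (made-up values, not from the slides).
in_dep = [
    {"ID": 1, "name": "Lee", "salary": 65000, "dept_name": "Physics",
     "building": "Watson", "budget": 70000},
    {"ID": 2, "name": "Kim", "salary": 90000, "dept_name": "Physics",
     "building": "Watson", "budget": 70000},
    {"ID": 3, "name": "Roy", "salary": 80000, "dept_name": "Music",
     "building": "Packard", "budget": 80000},
]

print(fd_holds(in_dep, ["dept_name"], ["building"]))  # True
print(fd_holds(in_dep, ["ID"], ["building"]))         # True
print(fd_holds(in_dep, ["dept_name"], ["salary"]))    # False
```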
Definition
• This is the process which allows you to winnow out
redundant data within your database.
• This involves restructuring the tables to successively
meet higher forms of normalization.
• A properly normalized database should have the
following characteristics:
– Scalar values in each field
– Absence of redundancy.
– Minimal use of null values.
– Minimal loss of information.
Levels of Normalization
• Levels of normalization are based on the amount
of redundancy in the database.
• Various levels of normalization are:
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
– Fifth Normal Form (5NF)
– Domain Key Normal Form (DKNF)
[Figure: arrows labeled Redundancy, Number of Tables, and Complexity alongside the list]
Most databases should be in 3NF or BCNF in order to avoid
database anomalies.
Levels of Normalization
1NF
2NF
3NF
4NF
5NF
DKNF
Each higher level is a subset of the lower level.
First Normal Form (1NF)
A table is considered to be in 1NF if all the fields contain
only scalar values (as opposed to lists of values).
Example (Not 1NF)
ISBN Title AuName AuPhone PubName PubPhone Price
The author (AuName) and AuPhone columns are not scalar.
1NF - Decomposition
1. Place all items that appear in the repeating group
in a new table
2. Designate a primary key for each new table
produced.
3. Duplicate in the new table the primary key of the
table from which the repeating group was
extracted or vice versa.
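The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a general tool: the function name to_1nf and the pairing of authors with books are assumptions made for the example, and the author columns are modeled as lists to stand in for the repeating group.

```python
# Un-normalized rows: the author columns hold lists (a repeating group).
# The pairing of authors with books here is illustrative only.
books_unf = [
    {"ISBN": "0-55-123456-9", "Title": "Main Street", "Price": 22.95,
     "AuName": ["Jones", "Smith"],
     "AuPhone": ["123-333-3333", "654-223-3455"]},
    {"ISBN": "0-321-32132-1", "Title": "Balloon", "Price": 34.00,
     "AuName": ["Sleepy"], "AuPhone": ["321-321-1111"]},
]

def to_1nf(rows, key, repeating):
    """Move a repeating group into its own table, keyed back to the parent."""
    parent, child = [], []
    for row in rows:
        # The parent table keeps only the scalar columns.
        parent.append({c: v for c, v in row.items() if c not in repeating})
        # Step 1: each item of the repeating group becomes a row in the new table.
        # Step 3: the parent's primary key is duplicated into the new table.
        for values in zip(*(row[c] for c in repeating)):
            child.append({key: row[key], **dict(zip(repeating, values))})
    return parent, child

# Step 2: (ISBN, AuName) can serve as the primary key of the new table.
books, book_authors = to_1nf(books_unf, "ISBN", ["AuName", "AuPhone"])
for row in book_authors:
    print(row)
```

Every field in both resulting tables is scalar, and the ISBN column links each author row back to its book.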
Example (1NF)
ISBN            Title         PubName       PubPhone       Price
0-55-123456-9   Main Street   Small House   714-000-0000   $22.95

ISBN            AuName   AuPhone
0-321-32132-1   Sleepy   321-321-1111
0-55-123456-9   Jones    123-333-3333
Functional Dependencies
Example 1
Table Scheme: {ISBN, Title, Price}
Functional Dependencies: {ISBN} → {Title}
                         {ISBN} → {Price}
ISBN            Title          Price
0-321-32132-1   Balloon        $34.00
0-55-123456-9   Main Street    $22.95
0-123-45678-0   Ulysses        $34.00
1-22-233700-0   Visual Basic   $25.00

Example 2
Table Scheme: {PubID, PubName, PubPhone}
Functional Dependencies: {PubID} → {PubPhone}
                         {PubID} → {PubName}
                         {PubName, PubPhone} → {PubID}
PubID   PubName       PubPhone
1       Big House     999-999-9999
2       Small House   123-456-7890
3       Alpha Press   111-111-1111

Example 3
Table Scheme: {AuID, AuName, AuPhone}
Functional Dependencies: {AuID} → {AuPhone}
                         {AuID} → {AuName}
                         {AuName, AuPhone} → {AuID}
AuID   AuName   AuPhone
1      Sleepy   321-321-1111
2      Snoopy   232-234-1234
3      Grumpy   665-235-6532
4      Jones    123-333-3333
5      Smith    654-223-3455
6      Joyce    666-666-6666
7      Roman    444-444-4444
FD – Example
Database to track reviews of papers submitted to an
academic conference. Prospective authors submit
papers for review and possible acceptance in the
published conference proceedings. Details of the entities:
– Author information includes a unique author number, a
name, a mailing address, and a unique (optional) email
address.
– Paper information includes the primary author, the paper
number, the title, the abstract, and review status
(pending, accepted, rejected)
– Reviewer information includes the reviewer number, the
name, the mailing address, and a unique (optional) email
address
– A completed review includes the reviewer number, the
date, the paper number, comments to the authors,
comments to the program chairperson, and ratings
(overall, originality, correctness, style, clarity)
FD – Example
Functional Dependencies
– AuthNo → AuthName, AuthEmail, AuthAddress
– AuthEmail → AuthNo
– PaperNo → Primary-AuthNo, Title, Abstract,
Status
– RevNo → RevName, RevEmail, RevAddress
– RevEmail → RevNo
– RevNo, PaperNo → AuthComm, Prog-Comm,
Date, Rating1, Rating2, Rating3, Rating4,
Rating5
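One way to reason about a set of dependencies like this is to compute the closure of an attribute set, that is, everything it determines. The sketch below uses the standard fixed-point closure computation; the function name closure and the encoding of each dependency as a pair of Python sets are choices made for this illustration.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# The dependencies of the conference-review example.
fds = [
    ({"AuthNo"}, {"AuthName", "AuthEmail", "AuthAddress"}),
    ({"AuthEmail"}, {"AuthNo"}),
    ({"PaperNo"}, {"Primary-AuthNo", "Title", "Abstract", "Status"}),
    ({"RevNo"}, {"RevName", "RevEmail", "RevAddress"}),
    ({"RevEmail"}, {"RevNo"}),
    ({"RevNo", "PaperNo"}, {"AuthComm", "Prog-Comm", "Date", "Rating1",
                            "Rating2", "Rating3", "Rating4", "Rating5"}),
]

author = {"AuthNo", "AuthName", "AuthEmail", "AuthAddress"}
print(closure({"AuthEmail"}, fds) == author)  # True
print(closure({"AuthName"}, fds))             # {'AuthName'}
```

Because AuthEmail → AuthNo, an email address determines every author attribute, while a name alone determines nothing else.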
Second Normal Form (2NF)
For a table to be in 2NF, there are two requirements:
– The table is in first normal form
– All nonkey attributes in the table must be functionally
dependent on the entire primary key
Note: Remember that we are dealing with non-key attributes
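The second requirement can be checked mechanically by looking for nonkey attributes that are determined by a proper subset of the primary key. The sketch below is illustrative only: the combined Review table and the helper names closure and partial_dependencies are hypothetical, reusing attribute names from the conference example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, attrs, fds):
    """Map each proper subset of the key to the nonkey attributes it determines."""
    nonkey = set(attrs) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            hit = closure(subset, fds) & nonkey
            if hit:
                found[subset] = hit
    return found

# Hypothetical combined table: Review(RevNo, PaperNo, RevName, Title, Date).
attrs = {"RevNo", "PaperNo", "RevName", "Title", "Date"}
key = {"RevNo", "PaperNo"}
fds = [
    ({"RevNo"}, {"RevName"}),
    ({"PaperNo"}, {"Title"}),
    ({"RevNo", "PaperNo"}, {"Date"}),
]

print(partial_dependencies(key, attrs, fds))
# {('PaperNo',): {'Title'}, ('RevNo',): {'RevName'}}
```

A non-empty result means the table violates 2NF: here RevName depends only on RevNo and Title only on PaperNo, so each would move to its own table.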
Use your own judgment when decomposing schemas
BCNF - Decomposition
Example 2 (Convert to BCNF)
Old Scheme {MovieTitle, MovieID, PersonName, Role, Payment}
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieTitle, PersonName}
• Loss of the dependency {MovieID} → {MovieTitle}
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieID, MovieTitle}
• We got the {MovieID} → {MovieTitle} dependency back
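The difference between the two decompositions in Example 2 can be checked with a simple test: a dependency is kept if all of its attributes end up together in one of the new schemes. The sketch below assumes this sufficient (not necessary) condition; the function name directly_preserved is hypothetical, and a full dependency-preservation test would also consider dependencies implied across schemes.

```python
def directly_preserved(fd, schemes):
    """True if every attribute of the dependency lies inside a single scheme."""
    lhs, rhs = fd
    return any(lhs | rhs <= scheme for scheme in schemes)

# The dependency {MovieID} -> {MovieTitle} from Example 2.
fd = ({"MovieID"}, {"MovieTitle"})

first_attempt = [
    {"MovieID", "PersonName", "Role", "Payment"},
    {"MovieTitle", "PersonName"},
]
second_attempt = [
    {"MovieID", "PersonName", "Role", "Payment"},
    {"MovieID", "MovieTitle"},
]

print(directly_preserved(fd, first_attempt))   # False
print(directly_preserved(fd, second_attempt))  # True
```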
Example 3 (Convert to BCNF)
Old Scheme {Client, Problem, Consultant}
New Scheme {Client, Consultant}
New Scheme {Client, Problem}
Fourth Normal Form (4NF)
• Fourth normal form eliminates independent many-to-one
relationships between columns.
• To be in Fourth Normal Form,
– a relation must first be in Boyce-Codd Normal Form.
– a given relation may not contain more than one multi-valued
attribute.
2. Each manager can have more than one child
3. Each manager can supervise more than one employee
4. 4NF Violated
Mary NULL Adam
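A table that records both a manager's children and a manager's employees mixes two independent multi-valued facts, and the usual remedy is to project it into one table per fact. The sketch below is a hypothetical illustration: the column names Manager, Child, Employee and the sample rows are assumed for the example, and the helper name project is made up.

```python
# Hypothetical rows in the order (Manager, Child, Employee).
# The column names and values are assumed for illustration.
rows = [
    ("Jim", "Beth", "Alice"),
    ("Mary", "Bob", "Jane"),
    ("Mary", None, "Adam"),
]

def project(rows, columns):
    """Project rows onto the given column indexes, dropping rows with NULLs."""
    return sorted({
        tuple(r[i] for i in columns)
        for r in rows
        if all(r[i] is not None for i in columns)
    })

# One table per independent multi-valued fact.
manager_child = project(rows, (0, 1))
manager_employee = project(rows, (0, 2))

print(manager_child)     # [('Jim', 'Beth'), ('Mary', 'Bob')]
print(manager_employee)  # [('Jim', 'Alice'), ('Mary', 'Adam'), ('Mary', 'Jane')]
```

After the split, neither table needs a NULL placeholder, and adding a child no longer requires repeating employee rows.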