Logical Design

Normalization

Compiled by E.Maina SCT 307 Slides 1


Objectives
• At the end of the topic the student should be able to:
1. Define the term normalization and explain why
normalization is done
2. Explain how to gather and list the un-normalized
data set
3. Explain how to identify key attributes and the
commonly used normal forms: 1NF, 2NF and 3NF
4. Explain how to normalize step by step to 3NF



What is normalisation?

• A process by which a flat file (list) of data may be
converted into a set of well-structured relations. A well-structured relation:
– contains the minimum amount of data redundancy
– allows users to modify rows in a table without producing
anomalies, errors or inconsistencies
• Normalisation works through a series of stages, called normal
forms:
– first normal form (1NF); second normal form (2NF); third normal
form (3NF); etc.



Why Normalise?
• To reduce/minimize data redundancy in database
tables.
• To convert data in forms & spreadsheets into well-
structured relational database tables/relations.
• To ensure that the transformation of data from such
sources is carried out systematically, so that
possible anomalies are eliminated.



Data Anomalies

• There are three main types of data anomalies:
– Insertion Anomalies
– Deletion Anomalies
– Update Anomalies



Insertion Anomalies

An insertion anomaly occurs when adding a new row to a table requires
duplicating data that already exists, or supplying unrelated data just to
complete the row.

StudentNo  Student Name       Unit Code  Unit Name
0972343    Eric Cartman       SCT1207    Systems & Database Design
0982342    Kyle Broflowski    SCTI2441   Application Development
2013442    Stan Marsh         SCTI2441   Application Development
3992342    Kenny McCormack    SCTI2441   Application Development
2303232    Wendy Testabutger  SCTI2441   Application Development

Adding a new unit would require adding a student's details
as well, just to complete the row.



Deletion Anomalies
A deletion anomaly occurs when deleting a single piece of data also causes
the loss of other valid data stored in the same row.

StudentNo  Student Name       Unit Code  Unit Name
0972343    Eric Cartman       SCT1207    Systems & Database Design
0982342    Kyle Broflowski    SCTI2441   Application Development
2013442    Stan Marsh         SCTI2441   Application Development
3992342    Kenny McCormack    SCTI2441   Application Development
2303232    Wendy Testabutger  SCTI2441   Application Development

Removing student 0972343 also removes all information about
unit SCT1207, so this table suffers from a deletion
anomaly.

Update Anomalies
An update anomaly occurs when the same information is stored multiple times in one
table, so that any update to that information requires multiple changes.

StudentNo  Student Name       Unit Code  Unit Name
0972343    Eric Cartman       SCT1207    Systems & Database Design
0982342    Kyle Broflowski    SCTI2441   Application Development
2013442    Stan Marsh         SCTI2441   Application Development
3992342    Kenny McCormack    SCTI2441   Application Development
2303232    Wendy Testabutger  SCTI2441   Application Development

Renaming unit code SCTI2441 to SCT1208 would require updating every row
that contains it; missing any one of them leaves the table inconsistent.
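
The three anomalies can be reproduced on a small in-memory copy of the flat table. This is only an illustrative sketch in Python: the tuple layout and the new unit "SCT9999" are assumptions made for the example, not part of the slides.

```python
# Flat table from the slides: (StudentNo, Student Name, Unit Code, Unit Name).
rows = [
    ("0972343", "Eric Cartman", "SCT1207", "Systems & Database Design"),
    ("0982342", "Kyle Broflowski", "SCTI2441", "Application Development"),
    ("2013442", "Stan Marsh", "SCTI2441", "Application Development"),
    ("3992342", "Kenny McCormack", "SCTI2441", "Application Development"),
    ("2303232", "Wendy Testabutger", "SCTI2441", "Application Development"),
]

# Update anomaly: one logical change (renaming a unit code) touches many rows.
rows_to_update = [r for r in rows if r[2] == "SCTI2441"]

# Deletion anomaly: deleting the only student on SCT1207 loses the unit too.
after_delete = [r for r in rows if r[0] != "0972343"]
units_after_delete = {r[2] for r in after_delete}

# Insertion anomaly: a unit with no students yet has no student values to
# store, so the row can only be added with empty placeholders.
# "SCT9999" / "Hypothetical Unit" are made-up values for illustration.
new_unit_row = (None, None, "SCT9999", "Hypothetical Unit")

print(len(rows_to_update))              # rows that a single rename must change
print("SCT1207" in units_after_delete)  # whether the unit survived the delete
```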



Keys

• The term “key” refers to attributes in any data set used to:
– identify a row uniquely in a given data set - known as
a Primary Key.
– link a column in one data set to a matching column in another
data set - known as a Foreign Key.
NOTE: Identifying the correct keys is central to normalisation: if the keys
and the dependencies on them are identified correctly, the three
anomalies described previously can be avoided.



Primary Keys
• One or more attributes in the data set used to uniquely
identify a row.
• Usually denoted by underlining (or bolding) the name of
the primary key attribute/s, e.g.,
(StudentNo, Student Name, Address), with StudentNo underlined
• Once a primary key has been declared, duplicate
values are not allowed in the primary key
column/s.



Compound (Primary) Keys
If a primary key consists of multiple columns, it is collectively known as a
“compound key”.

StudentNo  Student Name     Unit Code  Unit Name
0972343    Eric Cartman     SCT1207    Systems & Database Design
0982342    Kyle Broflowski  SCT1207    Systems & Database Design
2013442    Stan Marsh       SCTI2441   Application Development
0972343    Eric Cartman     SCTI2441   Application Development
0982342    Kyle Broflowski  SCTI2441   Application Development

In this example, two columns are needed to uniquely identify a row, so
StudentNo and Unit Code together form the compound key of
(StudentNo, Student Name, Unit Code, Unit Name)
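
Whether a set of columns can serve as a key for a given sample can be checked mechanically. The sketch below is a minimal illustration in Python; the helper name is_key and the dictionary layout are assumptions for the example. A sample can only show that columns are not a key; it cannot prove they will stay unique for all future data.

```python
def is_key(rows, columns):
    """Return True if the given columns uniquely identify every row."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            return False  # two rows share the same key value
        seen.add(key)
    return True


header = ["StudentNo", "Student Name", "Unit Code", "Unit Name"]
data = [
    ("0972343", "Eric Cartman", "SCT1207", "Systems & Database Design"),
    ("0982342", "Kyle Broflowski", "SCT1207", "Systems & Database Design"),
    ("2013442", "Stan Marsh", "SCTI2441", "Application Development"),
    ("0972343", "Eric Cartman", "SCTI2441", "Application Development"),
    ("0982342", "Kyle Broflowski", "SCTI2441", "Application Development"),
]
rows = [dict(zip(header, values)) for values in data]

print(is_key(rows, ["StudentNo"]))               # StudentNo alone repeats
print(is_key(rows, ["StudentNo", "Unit Code"]))  # the compound key is unique
```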



Foreign Key

• One or more attributes in a data set that link to
the primary key of a related data set.
– Unlike primary keys, foreign keys are not
required to contain unique values, and the foreign keys
in a data set exist independently of each other.
• Foreign keys are usually denoted by
italicizing the name of their attributes.
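
The link between a foreign key and the primary key it refers to can be checked with a simple lookup. The following is a rough sketch in Python, not a database feature: the function name dangling_references and the student number "9999999" are made up for illustration.

```python
def dangling_references(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign-key value matches no parent row."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


students = [
    {"StudentNo": "0972343", "Student Name": "Eric Cartman"},
    {"StudentNo": "0982342", "Student Name": "Kyle Broflowski"},
]
enrolments = [
    {"StudentNo": "0972343", "Unit Code": "SCT1207"},
    {"StudentNo": "0972343", "Unit Code": "SCT1234"},  # FK values may repeat
    {"StudentNo": "9999999", "Unit Code": "SCT1207"},  # hypothetical bad row
]

bad = dangling_references(enrolments, "StudentNo", students, "StudentNo")
print(bad)
```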



Relationship between P- & F-Keys
An F-key usually has the same attribute/s as the P-key of the related table:

Student = (StudentNo, Student Name, Address, DOB)
Unit = (Unit Code, Unit Name)

StudentNo  Student Name     Address               DOB         Unit Code  Unit Name
0972343    Eric Cartman     1 Normalization Rd    12/10/1989  SCT1207    Systems & Database Design
0982342    Kyle Broflowski  12 Smith Street       30/03/1986  SCT1209    Application development
2013442    Stan Marsh       2A Evergreen Terrace  18/09/1990  SCT2441    SCTe Appreciation
0972343    Eric Cartman     42 Hitch Avenue       05/05/1985  SCT1234    Tavern Studies

Enrolment = (EnrolmentNo, StudentNo, Unit Code, Semester, Year)

EnrolmentNo  StudentNo  Unit Code  Semester  Year
1            0972343    SCT1207    1         2009
2            0982342    SCT1207    1         2009
2            0982342    SCT1209    1         2009
3            2013442    SCT2441    1         2009
4            0972343    SCT1234    2         2009

The enrolment data set contains the duplicate columns StudentNo
and Unit Code, which are italicised in the schema (denoting them as foreign
keys).

Normalisation Process

• The normalisation process follows a standard series of steps
– Gather the un-normalised data set (0NF)
– Convert to first normal form (1NF)
– Convert to second normal form (2NF)
– Convert to third normal form (3NF)
– (Higher normal forms are possible)

NOTE: Unless each step is carried out properly the next step will be
flawed, i.e., unless a data set is in first normal form it will never be in
a valid second or third normal form.



Stages of Normalisation

Unnormalised (0NF)
   ↓ Remove repeating groups
First normal form (1NF)
   ↓ Remove partial dependencies
Second normal form (2NF)
   ↓ Remove transitive dependencies
Third normal form (3NF)
   ↓ Remove remaining dependencies
Further normal forms



Normalization Process
• Each normal form is stricter than the one before it: 2NF removes
redundancy left by 1NF, and 3NF removes redundancy left by 2NF
• For most business database design purposes,
3NF is as high as we need to go in the
normalization process
• The highest level of normalization is not always
the most desirable



Gather un-normalised data set
• The un-normalised data set represents logically related data in levels: data at
lower levels are in a nested format, called repeating
groups
– A repeating group is a set of attributes that can have more than one
value for a single primary key value
• This step is often rushed, but is perhaps the most
critical of the entire normalisation process.
– If the un-normalised data set contains errors, it is more than likely
that these errors will be carried through the entire normalisation,
resulting in possible data anomalies.



Gather un-normalised data set cont…
For example, given the data:

StudentNo  Student Name     Unit Code  Unit Name
0972343    Eric Cartman     SCT1207    Systems & Database Design
0982342    Kyle Broflowski  SCT1207    Systems & Database Design
2013442    Stan Marsh       SCTI2441   Application Development
0972343    Eric Cartman     SCTI2441   Application Development
0982342    Kyle Broflowski  SCTI2441   Application Development

the un-normalised data set would be:

R1 = (StudentNo, Student Name, {Unit Code, Unit Name})
or R2 = (Unit Code, Unit Name, {StudentNo, Student Name})

A repeating group is shown by a pair of braces { } within the relational schema.



Gather un-normalised data set cont…
R1 = (StudentNo, Student Name, {Unit Code, Unit Name}) represents:

StudentNo  Student Name     Unit Code  Unit Name
0972343    Eric Cartman     SCT1207    Systems & Database Design
                            SCTI2441   Application Development
0982342    Kyle Broflowski  SCT1207    Systems & Database Design
                            SCTI2441   Application Development
2013442    Stan Marsh       SCTI2441   Application Development

R2 = (Unit Code, Unit Name, {StudentNo, Student Name}) represents:

Unit Code  Unit Name                  StudentNo  Student Name
SCT1207    Systems & Database Design  0972343    Eric Cartman
                                      0982342    Kyle Broflowski
SCTI2441   Application Development    2013442    Stan Marsh
                                      0972343    Eric Cartman
                                      0982342    Kyle Broflowski
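
One way to picture a repeating group is as a nested list inside each outer record. The sketch below groups the flat rows into the R1 shape using plain Python dictionaries; the function name nest_by_student and the "Units" field name are assumptions made for this illustration.

```python
flat = [
    ("0972343", "Eric Cartman", "SCT1207", "Systems & Database Design"),
    ("0982342", "Kyle Broflowski", "SCT1207", "Systems & Database Design"),
    ("2013442", "Stan Marsh", "SCTI2441", "Application Development"),
    ("0972343", "Eric Cartman", "SCTI2441", "Application Development"),
    ("0982342", "Kyle Broflowski", "SCTI2441", "Application Development"),
]


def nest_by_student(rows):
    """Build R1 = (StudentNo, Student Name, {Unit Code, Unit Name})."""
    nested = {}
    for student_no, name, unit_code, unit_name in rows:
        # One outer record per student; units accumulate in the repeating group.
        record = nested.setdefault(
            student_no, {"Student Name": name, "Units": []}
        )
        record["Units"].append((unit_code, unit_name))
    return nested


r1 = nest_by_student(flat)
for student_no, record in r1.items():
    print(student_no, record["Student Name"], record["Units"])
```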



Gather un-normalised data set cont…

The following principles should be applied to
ensure an easier transition throughout the
process:
– Items that are likely primary keys should be
placed to the left of their dependent items.
– Repeating groups are usually placed to the right
in an un-normalised data set.



Gather un-normalised data set cont…
Example 2: Pilot Monthly Flight Log Report
Pilot Number: 410-26-4111    Name: Biggles Barton

Date Flown  Aircraft No.  Aircraft Type  Hours Flown
03-Aug-08   0776          Boeing 757     10.1
04-Aug-08   7628          Boeing 757     7.5
23-Aug-08   7448          Boeing 767     4.5
25-Aug-08   7448          Boeing 767     4.5
            7628          A340           5.0
28-Aug-08   2342          C-30           3.6

R3 = (Pilot Number, Name, {Date Flown, Aircraft Number, Aircraft Type, Hours Flown})
Or
R4 = (Pilot Number, Name, {Date Flown, {Aircraft Number, Aircraft Type, Hours Flown}})
A single-attribute data set should be excluded, thus R4 is not a good choice.



Gather un-normalised data set cont…

Issues:
• A single-attribute/column data set should not
appear in any repeating group/s
(i.e., use R3, not R4, from the last slide)
• Calculated fields should be excluded from any
repeating groups
– They cause data redundancy, if included;
– The data can be calculated/generated at the time when the final report is
displayed/printed.



Gather un-normalised data set cont…

Example 3: Part sale report

InvoiceNo: 12345     CustomerNo: 78901
Date: 29/5/08        Customer: Fred Bloggs
Contact Ph: 9370 6111
Address: 3 Uphill Rise, Ferndale, WA 6303

ItemNo  Description      Qty  Unit Price  GST Code  GST Rate  Tax    Subtotal
9898    Bearing, Ball    25   $2.50       1         10%       $0.25  $68.75
9999    Bearing, Roller  10   $5.00       1         10%       $0.50  $55.00
8888    Seal, shaft      10   $3.00       1         10%       $0.30  $33.00
777     Glasses, Safety  10   $10.00      0         0%        $0.00  $100.00
1555    Punch, 5mm       1    $4.00       1         10%       $0.40  $4.40
                                                              Total: $261.15

• GST Rate, Tax, Subtotal and Total are calculated fields, thus should be
excluded from the dataset. Thus, the dataset would be:
R5 = (InvoiceNo, Date, CustomerNo, Customer Name, Contact Ph,
Address, {ItemNo, Description, Qty, Unit Price, GST Code})
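
The excluded fields can be regenerated from the stored ones when the report is printed. The sketch below is one possible way to do this in Python. The mapping from GST Code to GST Rate (1 is 10%, 0 is 0%) is an assumption read off the sample report, and the names GST_RATES and line_values are made up for the example.

```python
from decimal import Decimal

# Assumed lookup, taken from the sample report: GST Code -> GST Rate.
GST_RATES = {1: Decimal("0.10"), 0: Decimal("0.00")}

# Stored fields only: (ItemNo, Description, Qty, Unit Price, GST Code).
items = [
    (9898, "Bearing, Ball", 25, Decimal("2.50"), 1),
    (9999, "Bearing, Roller", 10, Decimal("5.00"), 1),
    (8888, "Seal, shaft", 10, Decimal("3.00"), 1),
    (777, "Glasses, Safety", 10, Decimal("10.00"), 0),
    (1555, "Punch, 5mm", 1, Decimal("4.00"), 1),
]


def line_values(qty, unit_price, gst_code):
    """Recompute the calculated fields for one invoice line."""
    tax = unit_price * GST_RATES[gst_code]  # tax per unit
    subtotal = qty * (unit_price + tax)     # line total including tax
    return tax, subtotal


total = sum(line_values(qty, price, code)[1] for _, _, qty, price, code in items)
print(total.quantize(Decimal("0.01")))
```

Because every calculated value is a function of the stored fields, storing it as well would only create a second copy that could fall out of step.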



Gather un-normalised data set cont…
Example 4: Construction company managing several projects,
whose charges are dependent on employees’ position

Project No.  Project Name            Employee No.  Employee Name    Rate Category  Rate
1203         Madagascar travel site  11            Jessica Brookes  A              $90
                                     12            Andy Evans       B              $80
                                     16            Max Fat          C              $70
1506         Online estate agency    11            Jessica Brookes  A              $90
                                     17            Alex Branton     B              $80

Thus the un-normalised dataset is:

R1 = (Project No., Project Name, {Employee No., Employee
Name, Rate Category, Rate})



Normalisation Process

• The normalisation process follows a
standard series of steps
– Gather the un-normalised data set (0NF)
• (Covered last week)
– Convert to first normal form (1NF)
– Convert to second normal form (2NF)
– Convert to third normal form (3NF)
Note: We could go on to higher normal forms, e.g., 4NF and 5NF,
but they are mainly of theoretical value.



First Normal Form (1NF)

• 1NF: A relation is in 1NF if and only if all its underlying
attributes contain atomic values only (i.e., no repeating
groups)
• Once we have identified the un-normalised data set we
must convert it into first normal form (1NF).
• This is achieved through:
1) the removal of any repeating groups in the un-normalised
data set
2) the identification of primary keys in any resultant data
sets
3) the introduction of foreign keys when data sets are split



Steps from 0NF to 1NF
• How to split a dataset with repeating group/s
1) Identify a P-key for the outermost dataset and for each level of (nested)
repeating groups;
R1 = (StudentNo, Student Name, {Unit Code, Unit Name})
with P-key StudentNo for the outer dataset and Unit Code for the repeating group
2) Remove the outermost repeating group (and any nested repeating
groups it may contain) and create a new relation (R12) to contain it;
R12 = (Unit Code, Unit Name)
Then add to this relation a copy of the P-key of the relation
immediately enclosing it, as its F-key.
R11 = (StudentNo, Student Name)
R12 = (StudentNo, Unit Code, Unit Name)



Steps from 0NF to 1NF cont…

3) If the added F-key does not depend on the P-key of the new relation, re-define the
P-key by including the F-key attribute/s as part of it (thus forming a
compound key);
R11 = (StudentNo, Student Name)
R12 = (StudentNo, Unit Code, Unit Name), with compound P-key (StudentNo, Unit Code)

4) For each resultant dataset, repeat steps 1) to 3) until no
more repeating groups can be found.
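
The steps above can be sketched as a small function that splits one repeating group out of a nested data set. This is an illustrative Python sketch, not a general normalisation tool: the function name to_1nf, its parameters and the "Units" field are assumptions made for the example, and it handles a single level of nesting per call.

```python
def to_1nf(nested, parent_key, parent_attrs, group_name):
    """Split one repeating group out of a nested data set.

    Returns (parent_rows, child_rows). Each child row carries a copy of the
    parent's P-key as an F-key, as in step 2.
    """
    parent_rows, child_rows = [], []
    for record in nested:
        parent_rows.append({a: record[a] for a in [parent_key] + parent_attrs})
        for item in record[group_name]:
            child_rows.append({parent_key: record[parent_key], **item})
    return parent_rows, child_rows


r1 = [
    {"StudentNo": "0972343", "Student Name": "Eric Cartman", "Units": [
        {"Unit Code": "SCT1207", "Unit Name": "Systems & Database Design"},
        {"Unit Code": "SCTI2441", "Unit Name": "Application Development"}]},
    {"StudentNo": "0982342", "Student Name": "Kyle Broflowski", "Units": [
        {"Unit Code": "SCT1207", "Unit Name": "Systems & Database Design"},
        {"Unit Code": "SCTI2441", "Unit Name": "Application Development"}]},
    {"StudentNo": "2013442", "Student Name": "Stan Marsh", "Units": [
        {"Unit Code": "SCTI2441", "Unit Name": "Application Development"}]},
]

r11, r12 = to_1nf(r1, "StudentNo", ["Student Name"], "Units")
print(len(r11), len(r12))
```

For a nested repeating group (as in R4 of the pilot example), the same function would be applied again to the child rows, which is step 4.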



First Normal Form Cont…
Summary on the Example:

R1 = (StudentNo, Student Name, {Unit Code, Unit Name})
where {Unit Code, Unit Name} is the outermost repeating group

Becomes
R11 = (StudentNo, Student Name)
R12 = (StudentNo, Unit Code, Unit Name)
As there are no remaining repeating groups, the data set is now in 1NF.



First Normal Form Cont…

Example 2: The pilot flight log report example

0NF (un-normalised form):
R1 = (Pilot Number, Name, {Date Flown, Aircraft Number, Aircraft Type, Hours Flown})

1NF (First Normal Form):

R11 = (Pilot Number, Name)
R12 = (Pilot Number, Date Flown, Aircraft Number, Aircraft Type, Hours Flown)

There are no remaining repeating groups, thus the data set is now in
1NF.



First Normal Form Cont…
Example 2 revisited: Why single attribute datasets should be excluded
Now look at the other form of 0NF we got for the same example
R1 = (Pilot Number, Name, {Date Flown, {Aircraft Number, Aircraft Type, Hours Flown}})

Becomes
R11 = (Pilot Number, Name)
R12 = (Pilot Number, Date Flown, {Aircraft Number, Aircraft Type, Hours Flown})
Then
R12 = (Pilot Number, Date Flown, {Aircraft Number, Aircraft Type, Hours Flown})
Becomes
R121 = (Pilot Number, Date Flown)
R122 = (Pilot Number, Date Flown, Aircraft Number, Aircraft Type, Hours Flown)



First Normal Form Cont…

In Summary:
0NF (Un-normalised):
R1 = (Pilot Number, Name, {Date Flown, {Aircraft Number, Aircraft Type, Hours Flown}})
1NF (First Normal Form):

R11 = (Pilot Number, Name)


R121 = (Pilot Number, Date Flown)
R122 = (Pilot Number, Date Flown, Aircraft Number, Aircraft Type, Hours Flown)
Note: R12 is left out as it was not in 1NF and was further normalised into R121 and R122.

However, this solution raises a common problem, where one of the data sets (e.g., R121) IS
PART OF another data set (e.g., R122). This indicates a flaw in the normalisation process.
The problem can be solved by excluding the single-attribute data set from R1 and redoing
the normalisation.

Relational Symbol Notation
• We use relational symbol notation, e.g., R1, R2, …, to
represent data sets (or relations), rather than relation
name/s, during the normalization process
• When a relation (or data set), say R1, is split into two or
more data sets, the new data sets will be named R11, R12, …
– For each split, add one more subscript digit to the right of
R1 to indicate that they all originated from R1
– Similarly, if R1223 is split, then the resultant data sets will be named
R12231, R12232, R12233, …
– This makes it easy for us to trace back
the splitting when we find anything wrong at a later
stage of the normalization process.
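
The naming rule is simple enough to express in a few lines. The following Python sketch is only an illustration; the helper names split_names and ancestry are made up, and it assumes no relation is split into more than nine parts, so that each level uses exactly one digit.

```python
def split_names(parent, parts):
    """Names for the data sets produced by splitting `parent` into `parts`."""
    return [f"{parent}{i}" for i in range(1, parts + 1)]


def ancestry(name):
    """Trace a data set name back to the relation it originated from."""
    # "R1212" -> ["R1", "R12", "R121", "R1212"]
    return [name[:i] for i in range(2, len(name) + 1)]


print(split_names("R1223", 3))
print(ancestry("R1212"))
```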



Second Normal Form – 2NF

• 2NF: A relation is in 2NF if and only if it is in 1NF and every
non-key attribute is fully dependent on the (entire) primary
key
• Once in 1NF, the resulting data sets are revised further
to upgrade them to 2NF
• This is done by removing all partial functional dependencies.
• Partial functional dependency
– where an attribute is not wholly dependent on the primary key (i.e.,
it depends on only part of the primary key).
– may only occur in data sets whose primary key has more than one
attribute (i.e. a compound key).
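
A functional dependency can be tested against sample data: if two rows agree on the determinant columns, they must also agree on the dependent columns. The Python sketch below uses a made-up helper name, determines. Note that sample data can only disprove a dependency; whether it truly holds is a question about the meaning of the data.

```python
def determines(rows, lhs, rhs):
    """True if, in this sample, equal lhs values always give equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        # Remember the first rhs seen for this lhs; any later mismatch
        # means lhs does not determine rhs.
        if seen.setdefault(left, right) != right:
            return False
    return True


header = ["StudentNo", "Unit Code", "Unit Name"]
r12 = [dict(zip(header, values)) for values in [
    ("0972343", "SCT1207", "Systems & Database Design"),
    ("0982342", "SCT1207", "Systems & Database Design"),
    ("2013442", "SCTI2441", "Application Development"),
    ("0972343", "SCTI2441", "Application Development"),
    ("0982342", "SCTI2441", "Application Development"),
]]

# Unit Name depends on Unit Code alone, which is only part of the
# compound key (StudentNo, Unit Code): a partial dependency.
print(determines(r12, ["Unit Code"], ["Unit Name"]))
print(determines(r12, ["StudentNo"], ["Unit Name"]))
```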



Steps from 1NF to 2NF

1) Find all partial functional dependencies;
e.g., in R12 = (StudentNo, Unit Code, Unit Name), Unit Name depends only on Unit Code
2) Remove the offending attributes that are only partially dependent on the
compound key, and place them in a new relation.
e.g., R122 = (Unit Name)
3) Add to this relation a copy of the attribute/s which are determinants of
these offending attributes. These will automatically become the P-key of the
new relation.
e.g., R122 = (Unit Code, Unit Name)
4) Repeat steps 2) and 3) for each partial functional dependency
5) The original P-key and the other remaining attributes in the original dataset
form another (new) relation, in which each determinant becomes an F-key.
e.g., R121 = (StudentNo, Unit Code)
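
On actual rows, the split amounts to projecting the table onto two column sets and dropping duplicate rows. The sketch below shows this in Python under stated assumptions: project is a made-up helper, and the result names follow the convention used in the worked example (R121 for the enrolment part, R122 for the unit part).

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicates but keeping order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out


header = ["StudentNo", "Unit Code", "Unit Name"]
r12 = [dict(zip(header, values)) for values in [
    ("0972343", "SCT1207", "Systems & Database Design"),
    ("0982342", "SCT1207", "Systems & Database Design"),
    ("2013442", "SCTI2441", "Application Development"),
    ("0972343", "SCTI2441", "Application Development"),
    ("0982342", "SCTI2441", "Application Development"),
]]

r121 = project(r12, ["StudentNo", "Unit Code"])  # original P-key, Unit Code is now an F-key
r122 = project(r12, ["Unit Code", "Unit Name"])  # determinant plus dependent attribute

# Joining the two relations on Unit Code gives back the original rows,
# so no information was lost by the split.
names = {u["Unit Code"]: u["Unit Name"] for u in r122}
rejoined = [{**e, "Unit Name": names[e["Unit Code"]]} for e in r121]
print(len(r121), len(r122), rejoined == r12)
```

Each unit name is now stored once in R122, so renaming a unit becomes a single-row change.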



Second Normal Form Cont…
Example: Now Looking at the Enrolment example (in 1NF):

R11 = (StudentNo, Student Name)


R12 = (StudentNo, Unit Code, Unit Name)
R11 is already in 2NF. However, in R12 Unit Name is wholly dependent on Unit Code
and has no dependency on StudentNo (i.e., a partial dependency exists). This means that
R12 is not in 2NF and needs to be split into two new data sets as follows:

R121 = (StudentNo, Unit Code)


R122 = (Unit Code, Unit Name)



Second Normal Form Cont…

Thus 2NF (Second Normal Form) is:


R11 = (StudentNo, Student Name)
R121 = (StudentNo, Unit Code)
R122 = (Unit Code, Unit Name)



Second Normal Form Cont…
Pilot example (in 1NF):
R11 = (Pilot Number, Name)
R12 = (Pilot Number, Date Flown, Aircraft Number, Aircraft Type, Hours Flown)
ARE THERE ANY PARTIAL FUNCTIONAL DEPENDENCIES?

Date Flown  Aircraft No.  Aircraft Type  Hours Flown
03-Aug-08   0776          Boeing 757     10.1
04-Aug-08   7628          Boeing 757     7.5
23-Aug-08   7448          Boeing 767     4.5
25-Aug-08   7448          Boeing 767     4.5
            7628          A340           5.0
28-Aug-08   2342          C-30           3.6

Aircraft Type is dependent on Aircraft No.



Second Normal Form Cont…

R11 doesn’t have a compound key, so it is already in 2NF.
However, there is a partial dependency in R12: Aircraft Type depends only on Aircraft
Number. Thus we create:
R121 = (Pilot Number, Date Flown, Aircraft Number, Hours Flown)
R122 = (Aircraft Number, Aircraft Type)

And we get a 2NF of:

R11 = (Pilot Number, Name)
R121 = (Pilot Number, Date Flown, Aircraft Number, Hours Flown)
R122 = (Aircraft Number, Aircraft Type)

Third Normal Form – (3NF)

• 3NF: A relation is in 3NF if and only if it is in 2NF and every
non-key attribute is mutually independent
• The 3NF step removes all transitive (or hidden)
dependencies between non-key attributes.
– A transitive/hidden dependency exists where one or more
non-key attributes depend on
another non-key attribute(s), not just on the designated
primary key.
NOTE: Often 3NF is achieved without having to change any of the
existing data sets.



Steps from 2NF to 3NF

1) Find all transitive dependencies;
2) Remove the offending attributes that are transitively
dependent on the non-key attribute(s), and place them in a
new relation.
3) Add to this relation a copy of the attribute/s which are
determinants of these offending attributes. These will
automatically become the P-key of this new relation.
4) Repeat steps 2) and 3) for each transitive dependency
5) The original P-key and the other remaining attributes in the
original dataset form another (new) relation, in which
each determinant becomes an F-key.



Third Normal Form Cont…

Looking at the enrolment example:


R11 = (StudentNo, Student Name)
R121 = (StudentNo, Unit Code)
R122 = (Unit Code, Unit Name)

There are NO transitive dependencies thus it is already in 3NF.



Third Normal Form Cont…

Similarly the pilot example is also in 3NF:

R11 = (Pilot Number, Name)


R121 = (Pilot Number, Date Flown, Aircraft Number, Hours Flown)
R122 = (Aircraft Number, Aircraft Type)



Third Normal Form Cont…

Trying a new example from scratch, create an un-normalised data set for:

Driver Number: 41    Driver Name: Eddie Vedder

Race Date  Race Number  Car Number  Car Class  Car Limit  Car Owner  Owner Address
06 Jun     3            2476        Sedan      1800cc     T Barnes   3 Bradford St
06 Jun     5            1973        Touring    2200cc     J Gaden    12 Bourke St
13 Jun     1            2997        Touring    2200cc     B Mills    24 Alexander Dr
13 Jun     4            1774        Rally      1100cc     J Gaden    12 Bourke St
20 Jun     2            2476        Sedan      1800cc     T Barnes   3 Bradford St



Third Normal Form Cont…

You should have got:

R1 = (Driver-Nr, Driver-Name, {Race-Date, Race-Nr, Car-Nr,
Car-Class, Class-Limit, Owner, O-Address})

Now attempt to convert into 1NF & 2NF



Third Normal Form Cont…

1NF:

R11 = (Driver-Nr, Driver-Name)


R12 = (Driver-Nr, Race-Nr, Race-Date, Car-Nr, Car-Class,
Class-Limit, Owner, O-Address)
2NF:

R11 = (Driver-Nr, Driver-Name)


R121= (Driver-Nr, Race-Nr, Car-Nr, Car-Class, Class-Limit,
Owner, O-Address)
R122 = (Race-Nr, Race-Date)



Third Normal Form Cont…

Resulting in 3NF of:


R11 = (Driver-Nr, Driver-Name)
R1211 = (Driver-Nr, Race-Nr, Car-Nr)
R12121 = (Car-Nr, Car-Class, Owner)
R12122 = (Car-Class, Class-Limit)
R12123 = (Owner, O-Address)
R122 = (Race-Nr, Race-Date)
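
The 3NF split of R121 can be checked on the sample rows by projecting onto each new relation and joining them back. The sketch below is a Python illustration under stated assumptions: project is a made-up helper, and the row values are those of the race table for driver 41.

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicates but keeping order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out


header = ["Driver-Nr", "Race-Nr", "Car-Nr", "Car-Class",
          "Class-Limit", "Owner", "O-Address"]
r121 = [dict(zip(header, values)) for values in [
    (41, 3, 2476, "Sedan", "1800cc", "T Barnes", "3 Bradford St"),
    (41, 5, 1973, "Touring", "2200cc", "J Gaden", "12 Bourke St"),
    (41, 1, 2997, "Touring", "2200cc", "B Mills", "24 Alexander Dr"),
    (41, 4, 1774, "Rally", "1100cc", "J Gaden", "12 Bourke St"),
    (41, 2, 2476, "Sedan", "1800cc", "T Barnes", "3 Bradford St"),
]]

r1211 = project(r121, ["Driver-Nr", "Race-Nr", "Car-Nr"])
r12121 = project(r121, ["Car-Nr", "Car-Class", "Owner"])
r12122 = project(r121, ["Car-Class", "Class-Limit"])
r12123 = project(r121, ["Owner", "O-Address"])

# Follow the F-keys (Car-Nr, Car-Class, Owner) to rebuild the 2NF rows.
cars = {c["Car-Nr"]: c for c in r12121}
limits = {c["Car-Class"]: c["Class-Limit"] for c in r12122}
addresses = {o["Owner"]: o["O-Address"] for o in r12123}
rebuilt = []
for entry in r1211:
    car = cars[entry["Car-Nr"]]
    rebuilt.append({
        **entry,
        "Car-Class": car["Car-Class"],
        "Class-Limit": limits[car["Car-Class"]],
        "Owner": car["Owner"],
        "O-Address": addresses[car["Owner"]],
    })

print(len(r12121), len(r12122), len(r12123), rebuilt == r121)
```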



Name the Resultant Data sets
• The final step in this process is to give each of the final
data sets a meaningful name.
• While this step is not mandatory, it is useful, especially
when we cover converting normalised data sets into E-R
models (see next week's lecture).
• The naming of the data sets is based on the type of data
they will be used to store.
• At first you may find it hard to think of a proper name but
through experience you will begin to see trends and gain a
better ‘feel’ for the purpose of each data set.



Name Resultant Data sets Cont…
For the Student-unit example:

R11 = (StudentNo, Student Name) → Student
R121 = (StudentNo, Unit Code) → Enrolment
R122 = (Unit Code, Unit Name) → Unit

This gives the relational schema:

Student (StudentNo, Student Name)
Enrolment (StudentNo, Unit Code)
Unit (Unit Code, Unit Name)
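
As a rough illustration of how this schema could be declared, the sketch below creates the three tables in an in-memory SQLite database using Python's standard sqlite3 module. The column names without spaces, the column types and the unit code "NOPE000" are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks F-keys only when enabled
conn.executescript("""
CREATE TABLE Student (StudentNo TEXT PRIMARY KEY, StudentName TEXT NOT NULL);
CREATE TABLE Unit (UnitCode TEXT PRIMARY KEY, UnitName TEXT NOT NULL);
CREATE TABLE Enrolment (
    StudentNo TEXT NOT NULL REFERENCES Student(StudentNo),
    UnitCode  TEXT NOT NULL REFERENCES Unit(UnitCode),
    PRIMARY KEY (StudentNo, UnitCode)   -- compound key
);
""")
conn.execute("INSERT INTO Student VALUES (?, ?)", ("0972343", "Eric Cartman"))
conn.execute("INSERT INTO Unit VALUES (?, ?)", ("SCT1207", "Systems & Database Design"))
conn.execute("INSERT INTO Enrolment VALUES (?, ?)", ("0972343", "SCT1207"))


def try_insert(sql, params):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A repeated compound key and an enrolment in a non-existent unit are both rejected.
duplicate_ok = try_insert("INSERT INTO Enrolment VALUES (?, ?)", ("0972343", "SCT1207"))
dangling_ok = try_insert("INSERT INTO Enrolment VALUES (?, ?)", ("0972343", "NOPE000"))

# Renaming a unit now changes exactly one row, however many students are enrolled.
renamed = conn.execute(
    "UPDATE Unit SET UnitName = ? WHERE UnitCode = ?",
    ("Systems and Database Design", "SCT1207"),
).rowcount
print(duplicate_ok, dangling_ok, renamed)
```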



Name Resultant Data sets Cont…
For the Pilot example:

R11 = (Pilot Number, Name) → Pilot
R121 = (Pilot Number, Aircraft Number,
Date Flown, Hours Flown) → Flight
R122 = (Aircraft Number, Aircraft Type) → Aircraft

This gives the relational schema:

Pilot (Pilot Number, Name)
Flight (Pilot Number, Aircraft Number, Date Flown, Hours Flown)
Aircraft (Aircraft Number, Aircraft Type)



Name Resultant Data sets Cont…
Now try the car racing problem on your own…
R11 = (Driver-Nr, Driver-Name) → ???
R1211 = (Driver-Nr, Race-Nr, Car-Nr) → ???
R12121 = (Car-Nr, Car-Class, Owner) → ???
R12122 = (Car-Class, Class-Limit) → ???
R12123 = (Owner, O-Address) → ???
R122 = (Race-Nr, Race-Date) → ???



Name Resultant Data sets Cont…

R11 = (Driver-Nr, Driver-Name) → Driver
R1211 = (Driver-Nr, Race-Nr, Car-Nr) → Race Entry
R12121 = (Car-Nr, Car-Class, Owner) → Car
R12122 = (Car-Class, Class-Limit) → Car Class
R12123 = (Owner, O-Address) → Owner
R122 = (Race-Nr, Race-Date) → Race



Summary

• Normalisation steps:
1. Gather the un-normalised data set (covered in week 01)
2. Remove the repeating groups and identify keys (1NF)
3. Remove all partial functional dependencies (2NF)
4. Remove all transitive dependencies (3NF)
5. Name the resultant data sets
Note: These steps MUST be performed in sequence
to ensure that the correct result is
reached.



Discussion Questions
• See attached word document which has some
normalization problems
