Database Normalization

The document discusses database normalization and different normal forms. It explains anomalies that can occur in databases and how normalization addresses them by breaking tables into smaller, less redundant tables through a multi-step process from 1NF to 3NF. The goal of normalization is to minimize redundancy and inconsistencies in databases.

Uploaded by

Hasitha Sanjaya
DATABASE NORMALIZATION
M.Y.A RAHMAN
ANOMALIES

• AN ANOMALY IS AN ERROR OR INCONSISTENCY THAT MAY RESULT WHEN A USER
ATTEMPTS TO UPDATE A TABLE THAT CONTAINS REDUNDANT (UNNECESSARILY
REPEATED) DATA.
• INSERTION ANOMALY: INABILITY TO ADD DATA TO THE DATABASE DUE TO THE
ABSENCE OF RELATED DATA.
• EG: A NEW COURSE CANNOT BE INSERTED UNTIL AT LEAST ONE STUDENT
FOLLOWS IT.
• UPDATE ANOMALY: DATA INCONSISTENCY RESULTING FROM DATA REDUNDANCY AND
A PARTIAL UPDATE. CHANGING A DATA ITEM REQUIRES CHANGING EVERY
OCCURRENCE OF THAT DATA ITEM.
• EG: WHEN CHANGING THE COURSE TITLE COMPUTER ARCHITECTURE, THE CHANGE
MUST BE MADE TO EVERY OCCURRENCE.
• DELETION ANOMALY: UNINTENDED LOSS OF DATA DUE TO THE DELETION OF
RELATED DATA.
• EG: WHEN STUDENT 12 IS DELETED, THE DETAILS OF THE ACCOUNTING COURSE
ARE ALSO REMOVED FROM THE DATABASE (AND IF THAT STUDENT FOLLOWS OTHER
COURSES, THOSE ROWS ARE DELETED TOO).
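The update and deletion anomalies can be reproduced in a few lines of Python. This is only a sketch: the `rows` list is a simplified subset of the student/course columns, and the variable names are illustrative rather than taken from the slides.

```python
# Flat (unnormalized) table: the course title is repeated on every enrolment row.
rows = [
    {"student_id": 12, "course_id": "CA", "course_title": "Comp. Archi", "mark": 55},
    {"student_id": 12, "course_id": "AA", "course_title": "Accounting", "mark": 78},
    {"student_id": 17, "course_id": "CA", "course_title": "Comp. Archi", "mark": 80},
    {"student_id": 17, "course_id": "DM", "course_title": "Data mining", "mark": 65},
]

# Update anomaly: renaming the course on only one row leaves two titles for CA.
partial = [dict(r) for r in rows]
partial[0]["course_title"] = "Computer Architecture"
titles_for_ca = {r["course_title"] for r in partial if r["course_id"] == "CA"}

# Deletion anomaly: removing student 12 also removes the only row describing AA.
after_delete = [r for r in rows if r["student_id"] != 12]
courses_left = {r["course_id"] for r in after_delete}

print(len(titles_for_ca))    # number of distinct titles now stored for CA
print("AA" in courses_left)  # whether the Accounting course is still known
```

Both problems come from storing course facts on enrolment rows instead of in a table of their own.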
• INSERT ANOMALY:
• STUDENT DETAILS CANNOT BE ADDED UNTIL THE STUDENT STARTS TO FOLLOW A COURSE.
• IF A NEW EMPLOYEE IS HIRED BUT NOT IMMEDIATELY ASSIGNED TO A STUDENT_GROUP,
THEN THIS EMPLOYEE CANNOT BE ENTERED INTO THE DATABASE.
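An insertion anomaly can be seen with Python's built-in `sqlite3` module. The sketch assumes a hypothetical flat `enrolment` table whose key includes the student, and a made-up course `DB` that no student follows yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flat table: the key needs a student, so a course cannot stand alone.
conn.execute(
    """
    CREATE TABLE enrolment (
        student_id   INTEGER NOT NULL,
        course_id    TEXT    NOT NULL,
        course_title TEXT,
        PRIMARY KEY (student_id, course_id)
    )
    """
)

try:
    # A new course with no student yet: student_id would have to be NULL.
    conn.execute(
        "INSERT INTO enrolment (student_id, course_id, course_title) VALUES (?, ?, ?)",
        (None, "DB", "Databases"),
    )
    inserted = True
except sqlite3.IntegrityError:
    # The NOT NULL key column rejects the row.
    inserted = False

print(inserted)
```

The course cannot be recorded until a student exists to complete the key, which is the situation the slide describes.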
NORMALIZATION

• NORMALIZATION IS THE PROCESS OF DECIDING WHICH ATTRIBUTES SHOULD BE
GROUPED TOGETHER IN A RELATION.
• IT USUALLY INVOLVES DIVIDING A LARGE TABLE INTO SMALLER (LESS
REDUNDANT) TABLES.
WHY NORMALIZATION?

• AVOID UNNECESSARY DUPLICATION OF DATA
• REDUCE DATA ANOMALIES
• REDUCE DISK SPACE
• AVOID FREQUENT RESTRUCTURING OF DATA
• MAKE DATA SIMPLER TO MANIPULATE
NORMALIZATION PROCESS
Table with repeating groups(UNF)

1st normal form(1NF)

2nd Normal Form(2NF)

3rd Normal Form(3NF)

Boyce-Codd Normal Form(BCNF)

4th Normal Form(4NF)

5th Normal Form(5NF)


Student_ID  Name   University  Major   Course_ID  Course_title    Lec_name  Lec_loc  Mark
12          Aruni  Colombo     Acct    CA         Comp. Archi     Roshan    X01      55
                                       AA         Accounting      Balu      Y02      78
17          Dasun  Moratuwa    Com.sc  CA         Comp. Archi     Anjala    X01      80
                                       DM         Data mining     Ravindu   Z05      65
                                       CS         Comp. Security  Solomn    L03      73
FIRST NORMALIZATION

• A RELATION IS IN 1NF IF IT DOES NOT CONTAIN REPEATING GROUPS OR MULTIVALUED
ATTRIBUTES.
• EACH TABLE CELL SHOULD CONTAIN A SINGLE VALUE.
• EACH RECORD NEEDS TO BE UNIQUE. HENCE:

1. EXAMINE THE RELATION FOR REPEATING GROUPS.
2. REMOVE THE REPEATING GROUP FROM THE RELATION.
3. CREATE NEW RELATION(S) TO HOLD THE REPEATED DATA.
4. INCLUDE THE KEY OF THE 0NF (UNNORMALIZED) RELATION IN THE NEW RELATION.
5. DETERMINE THE KEY OF THE NEW RELATION(S).
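The steps above can be sketched in Python by flattening a nested record into one row per student and course. The nested `unf` structure and the `to_1nf` helper are illustrative assumptions, and only a few of the columns are carried along to keep the example short.

```python
# Unnormalized records: each student carries a repeating group of courses.
unf = [
    {"student_id": 12, "name": "Aruni",
     "courses": [("CA", 55), ("AA", 78)]},
    {"student_id": 17, "name": "Dasun",
     "courses": [("CA", 80), ("DM", 65), ("CS", 73)]},
]

def to_1nf(records):
    """Emit one flat row per (student, course) pair so every cell is single-valued."""
    flat = []
    for rec in records:
        for course_id, mark in rec["courses"]:
            flat.append({
                "student_id": rec["student_id"],  # key of the original relation
                "name": rec["name"],
                "course_id": course_id,
                "mark": mark,
            })
    return flat

rows_1nf = to_1nf(unf)
print(len(rows_1nf))
```

Each output row repeats the student's key, which is what makes (student_id, course_id) usable as a composite key afterwards.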
FIRST NORMALIZATION CONT.

• A KEY IN SQL IS A SINGLE COLUMN OR A COMBINATION OF COLUMNS WHOSE VALUES
UNIQUELY IDENTIFY EACH ROW (TUPLE) IN A TABLE. KEYS PREVENT DUPLICATE ROWS
AND HELP ESTABLISH RELATIONSHIPS BETWEEN TABLES IN THE DATABASE.
• COMPOSITE KEY: A COMPOSITE KEY IS A KEY (HERE THE PRIMARY KEY) COMPOSED OF
MULTIPLE COLUMNS THAT TOGETHER IDENTIFY A RECORD UNIQUELY (STUDENT_ID AND
COURSE_ID: STUDENT_ID IS THE SAME FOR ALL THE COURSES A STUDENT FOLLOWS, SO
COURSE_ID IS ALSO NEEDED TO IDENTIFY THE RECORD UNIQUELY).
• A KEY ATTRIBUTE IS A COLUMN THAT IS PART OF A KEY, FOR EXAMPLE STUDENT_ID
OR COURSE_ID.
• A NON-KEY ATTRIBUTE IS A COLUMN THAT IS NOT PART OF ANY KEY AND CANNOT BE
USED TO IDENTIFY A RECORD UNIQUELY, FOR EXAMPLE NAME OR AGE.
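Whether a set of columns can act as a key can be checked against sample data with a small uniqueness test. This is a sketch with an assumed helper name `is_key`; passing the test on sample rows is necessary but not sufficient, because a real key must hold for all valid data.

```python
def is_key(rows, columns):
    """Return True if the given columns take a different value combination on every row."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False
        seen.add(value)
    return True

rows = [
    {"student_id": 12, "course_id": "CA", "mark": 55},
    {"student_id": 12, "course_id": "AA", "mark": 78},
    {"student_id": 17, "course_id": "CA", "mark": 80},
    {"student_id": 17, "course_id": "DM", "mark": 65},
    {"student_id": 17, "course_id": "CS", "mark": 73},
]

print(is_key(rows, ["student_id"]))               # student_id repeats across courses
print(is_key(rows, ["student_id", "course_id"]))  # the composite key is unique
```

In this sample `mark` alone also happens to be unique, which shows why a key has to be chosen from the meaning of the data and not from one snapshot of it.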
BY FILLING THE EMPTY CELLS WITH DATA

Student_ID  Name   University  Major   Course_ID  Course_title    Lec_name  Lec_loc  Mark
12          Aruni  Colombo     Acct    CA         Comp. Archi     Roshan    X01      55
12          Aruni  Colombo     Acct    AA         Accounting      Balu      Y02      78
17          Dasun  Moratuwa    Com.sc  CA         Comp. Archi     Anjala    X01      80
17          Dasun  Moratuwa    Com.sc  DM         Data mining     Ravindu   Z05      65
17          Dasun  Moratuwa    Com.sc  CS         Comp. Security  Solomn    L03      73
BY SEPARATING THE TABLE INTO TWO TABLES
• A TABLE CONTAINING THE SINGLE-VALUED ATTRIBUTES WITH A KEY.

Student_ID  Name  University  Major

STUDENT(STUDENT_ID, NAME, UNIVERSITY, MAJOR)

• A TABLE CONTAINING THE MULTI-VALUED ATTRIBUTES WITH A COMPOSITE KEY.

Student_ID  Course_ID  Course_title  Lec_name  Lec_loc  Mark

FOLLOWER(STUDENT_ID, COURSE_ID, COURSE_TITLE, LEC_NAME, LEC_LOC, MARK)
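The split into STUDENT and FOLLOWER amounts to projecting the flat rows onto two column lists and dropping duplicates. The `project` helper below is an illustrative assumption, and only some of the columns are used to keep the sketch short.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate tuples but keeping order."""
    seen, out = set(), []
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value not in seen:
            seen.add(value)
            out.append(dict(zip(columns, value)))
    return out

rows_1nf = [
    {"student_id": 12, "name": "Aruni", "course_id": "CA", "mark": 55},
    {"student_id": 12, "name": "Aruni", "course_id": "AA", "mark": 78},
    {"student_id": 17, "name": "Dasun", "course_id": "CA", "mark": 80},
    {"student_id": 17, "name": "Dasun", "course_id": "DM", "mark": 65},
    {"student_id": 17, "name": "Dasun", "course_id": "CS", "mark": 73},
]

student = project(rows_1nf, ["student_id", "name"])
follower = project(rows_1nf, ["student_id", "course_id", "mark"])
print(len(student), len(follower))
```

The student details collapse to one row per student, while the follower rows keep one entry per student and course.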


SECOND NORMAL FORM(2NF)

• A RELATION IS IN 2NF IF IT IS IN 1NF AND EVERY NON-KEY ATTRIBUTE IS FULLY
FUNCTIONALLY DEPENDENT ON THE PRIMARY KEY, THAT IS, NO NON-KEY ATTRIBUTE IS
FUNCTIONALLY DEPENDENT ON JUST A PART OF THE KEY (PARTIAL DEPENDENCY).
1. IDENTIFY PARTIAL DEPENDENCIES.
2. SPLIT INTO A SET OF RELATIONS WHERE EACH RELATION HAS ITS OWN UNIQUE
IDENTIFIER (KEY).
• RULE 1: BE IN 1NF.
• RULE 2: NO NON-KEY ATTRIBUTE IS FUNCTIONALLY DEPENDENT ON A PROPER SUBSET OF
ANY CANDIDATE KEY. A RELATION WITH A SINGLE-COLUMN KEY HAS NO PARTIAL
DEPENDENCIES ON THAT KEY.
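Partial dependencies can be spotted mechanically once the functional dependencies are written down. The sketch below assumes the dependencies of the FOLLOWER relation are given as pairs of Python sets and that the primary key is the only candidate key; `partial_dependencies` is a hypothetical helper name.

```python
# Declared functional dependencies for FOLLOWER, as (determinant, dependants).
fds = [
    ({"student_id", "course_id"}, {"mark"}),
    ({"course_id"}, {"course_title", "lec_name", "lec_loc"}),
]
primary_key = {"student_id", "course_id"}

def partial_dependencies(fds, key):
    """Return FDs whose determinant is a proper subset of the key
    and that determine at least one non-key attribute."""
    found = []
    for lhs, rhs in fds:
        non_key = rhs - key
        if lhs < key and non_key:  # '<' is the proper-subset test on sets
            found.append((lhs, non_key))
    return found

violations = partial_dependencies(fds, primary_key)
print(violations)
```

Here the course attributes depend on `course_id` alone, so they are the ones to move into a relation of their own.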
FUNCTIONAL DEPENDENCY
• THE VALUE OF ONE ATTRIBUTE IN A TUPLE DETERMINES THE VALUE OF ANOTHER
ATTRIBUTE IN THE SAME TUPLE.
EG: A AND B ARE ATTRIBUTES OF A RELATION CALLED R
R(A,B)
B IS FUNCTIONALLY DEPENDENT ON A (A -> B).
• DETERMINANT: THE ATTRIBUTE(S) ON THE LEFT-HAND SIDE OF A
FUNCTIONAL DEPENDENCY, WHICH DETERMINE THE VALUE OF OTHER
ATTRIBUTES IN A TUPLE.
• DEPENDANT: THE ATTRIBUTE(S) ON THE RIGHT-HAND SIDE OF A
FUNCTIONAL DEPENDENCY, WHICH DEPEND ON THE DETERMINANT.
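A functional dependency can be tested against sample rows: whenever two rows agree on the determinant they must agree on the dependant. This is a minimal sketch with an assumed helper name `fd_holds`; sample data can refute a dependency but cannot prove it for all future data.

```python
def fd_holds(rows, determinant, dependant):
    """Check determinant -> dependant on sample rows:
    equal determinant values must always give equal dependant values."""
    mapping = {}
    for row in rows:
        lhs = tuple(row[a] for a in determinant)
        rhs = tuple(row[b] for b in dependant)
        # setdefault stores the first dependant value seen for this determinant.
        if mapping.setdefault(lhs, rhs) != rhs:
            return False
    return True

rows = [
    {"course_id": "CA", "course_title": "Comp. Archi", "student_id": 12, "mark": 55},
    {"course_id": "AA", "course_title": "Accounting", "student_id": 12, "mark": 78},
    {"course_id": "CA", "course_title": "Comp. Archi", "student_id": 17, "mark": 80},
]

print(fd_holds(rows, ["course_id"], ["course_title"]))  # same course, same title
print(fd_holds(rows, ["course_id"], ["mark"]))          # CA has two different marks
```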
FUNCTIONAL DEPENDENCY
• STUDENT_ID -> NAME, UNIVERSITY, MAJOR
• COURSE_ID -> COURSE_TITLE, LEC_NAME, LEC_LOCATION
• STUDENT_ID, COURSE_ID -> MARK
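From these dependencies, an attribute closure shows which attribute sets determine the whole relation. The sketch reads the second dependency as COURSE_ID determining the course title, lecturer name and lecturer location; the `closure` function name and the lower-case attribute names are assumptions made for illustration.

```python
def closure(attributes, fds):
    """Compute the set of attributes determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole determinant is already in the result.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"student_id"}, {"name", "university", "major"}),
    ({"course_id"}, {"course_title", "lec_name", "lec_location"}),
    ({"student_id", "course_id"}, {"mark"}),
]
all_attributes = {
    "student_id", "name", "university", "major",
    "course_id", "course_title", "lec_name", "lec_location", "mark",
}

print(closure({"student_id"}, fds) == all_attributes)
print(closure({"student_id", "course_id"}, fds) == all_attributes)
```

Only the pair (student_id, course_id) reaches every attribute, which is why it serves as the composite key of the flat table.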

Splitting into relations

• Student(Student_ID, Name, University, Major)
• Course(Course_ID, Course_title, Lec_name, Lec_location)
• Follower(Student_ID, Course_ID, Mark)
Table Name : Student

Student_ID  Name   University  Major
12          Aruni  Colombo     Acct
17          Dasun  Moratuwa    ComSc

Table Name : Course

Course_ID  Course_title       Lec_name  Lec_location
CA         Comp.Architecture  A         X01
AA         Accounting         B         Y02
DM         Data Mining        C         Z05
CS         Comp.Security      D         L03

Table Name : Follower

Student_ID  Course_ID  Mark
12          CA         55
12          AA         78
17          CA         80
17          DM         65
17          CS         73
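The three 2NF relations can be tried out with Python's built-in `sqlite3` module. This is a sketch only: the column types and foreign-key clauses are assumptions, since the slides give just the attribute names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, university TEXT, major TEXT);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, course_title TEXT, lec_name TEXT, lec_location TEXT);
    CREATE TABLE follower (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  TEXT    NOT NULL REFERENCES course(course_id),
        mark       INTEGER,
        PRIMARY KEY (student_id, course_id)
    );
    """
)
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", [
    (12, "Aruni", "Colombo", "Acct"),
    (17, "Dasun", "Moratuwa", "ComSc"),
])
conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?)", [
    ("CA", "Comp.Architecture", "A", "X01"),
    ("AA", "Accounting", "B", "Y02"),
    ("DM", "Data Mining", "C", "Z05"),
    ("CS", "Comp.Security", "D", "L03"),
])
conn.executemany("INSERT INTO follower VALUES (?, ?, ?)", [
    (12, "CA", 55), (12, "AA", 78), (17, "CA", 80), (17, "DM", 65), (17, "CS", 73),
])

# Joining the three relations rebuilds the wide rows without storing them redundantly.
joined = conn.execute(
    """
    SELECT s.name, c.course_title, f.mark
    FROM follower AS f
    JOIN student AS s ON s.student_id = f.student_id
    JOIN course  AS c ON c.course_id  = f.course_id
    ORDER BY f.student_id, f.course_id
    """
).fetchall()
print(joined)
```

Each course title is now stored once, so renaming a course is a single-row update in `course`.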
THIRD NORMAL FORM(3NF)

• A RELATION IS IN 3NF IF IT IS IN 2NF AND NO TRANSITIVE DEPENDENCIES EXIST.
• IDENTIFY TRANSITIVE DEPENDENCIES.
• BASED ON THEM, SPLIT THE RELATION INTO TWO RELATIONS.

1. CHECK EACH NON-KEY ATTRIBUTE FOR DEPENDENCY ON OTHER NON-KEY ATTRIBUTES.
2. REMOVE ANY ATTRIBUTE THAT DEPENDS ON ANOTHER NON-KEY ATTRIBUTE FROM THE RELATION.
3. CREATE A NEW RELATION COMPRISING THAT ATTRIBUTE AND THE NON-KEY ATTRIBUTE IT
DEPENDS ON.
4. DETERMINE THE KEY OF THE NEW RELATION.
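Step 1 can be sketched as a scan over declared functional dependencies, looking for a non-key determinant of a non-key attribute. The sketch assumes the COURSE relation has `course_id` as its only candidate key; `transitive_dependencies` is a hypothetical helper name.

```python
# Declared functional dependencies for COURSE, as (determinant, dependants).
fds = [
    ({"course_id"}, {"course_title", "lec_name"}),
    ({"lec_name"}, {"lec_location"}),
]
key = {"course_id"}

def transitive_dependencies(fds, key):
    """Return FDs whose determinant contains no key attribute
    and that determine at least one non-key attribute."""
    found = []
    for lhs, rhs in fds:
        non_key_targets = rhs - key
        if not (lhs & key) and non_key_targets:
            found.append((lhs, non_key_targets))
    return found

transitive = transitive_dependencies(fds, key)
print(transitive)
```

The dependency it reports names both the attribute to remove and the non-key attribute that becomes the key of the new relation.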
TRANSITIVE DEPENDENCY

• TRANSITIVE DEPENDENCY: THIS OCCURS WHEN A NON-KEY ATTRIBUTE IS FUNCTIONALLY
DEPENDENT ON ANOTHER NON-KEY ATTRIBUTE. IT IS AN INDIRECT DEPENDENCY WITHIN
THE SAME TABLE: IF A -> B AND B -> C, THEN C DEPENDS ON A TRANSITIVELY
THROUGH B.
• Course_ID -> Course_title, Lec_name
• Lec_name -> Lec_location
SPLITTING INTO TWO RELATIONS
• LECTURER(LEC_NAME, LEC_LOCATION)
• COURSE(COURSE_ID, COURSE_TITLE, LEC_NAME)

Table Name - Course

Course_ID  Course_title  Lec_Name
CA         Com.Archi     A
AA         Acct          B
DM         Data.Mi       C
CS         Com.sec       D

Table Name - Lecturer

Lec_Name  Lec_Location
A         X01
B         Y02
C         Z05
D         L03
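With the lecturer location held in its own relation, a change of location touches one row only. The following `sqlite3` sketch assumes column types and a foreign key that the slides do not specify, and `X09` is a made-up room used purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE lecturer (lec_name TEXT PRIMARY KEY, lec_location TEXT);
    CREATE TABLE course (
        course_id    TEXT PRIMARY KEY,
        course_title TEXT,
        lec_name     TEXT REFERENCES lecturer(lec_name)
    );
    """
)
conn.executemany("INSERT INTO lecturer VALUES (?, ?)",
                 [("A", "X01"), ("B", "Y02"), ("C", "Z05"), ("D", "L03")])
conn.executemany("INSERT INTO course VALUES (?, ?, ?)", [
    ("CA", "Com.Archi", "A"), ("AA", "Acct", "B"),
    ("DM", "Data.Mi", "C"), ("CS", "Com.sec", "D"),
])

# Moving lecturer A is a single-row update in the lecturer relation.
cur = conn.execute("UPDATE lecturer SET lec_location = ? WHERE lec_name = ?", ("X09", "A"))
rows_changed = cur.rowcount

# Every course taught by A sees the new location through the join.
location_for_ca = conn.execute(
    """
    SELECT l.lec_location
    FROM course AS c JOIN lecturer AS l ON l.lec_name = c.lec_name
    WHERE c.course_id = 'CA'
    """
).fetchone()[0]
print(rows_changed, location_for_ca)
```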
SUMMARY

Table with repeating groups (UNF)
    -> Remove repeating groups or multi-valued attributes
1st Normal Form (1NF)
    -> Remove partial dependencies
2nd Normal Form (2NF)
    -> Remove transitive dependencies
3rd Normal Form (3NF)
    -> Remove remaining anomalies and make every determinant a candidate key
Boyce-Codd Normal Form (BCNF)


1ST NORMAL FORM

Project number  Project name  Project start

Project number  Employee number  Employee name  Job class  Cost hours  Hours on project  Total charge

2ND NORMAL FORM
• Composite key
• Functional dependency
• Partial dependency
• Table having a many-to-many relationship

Project number  Employee number  Hours on project

Employee name  Job class  Cost hours

Note: Total charge can be omitted as it is a derived attribute.
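A derived attribute can be computed when needed instead of being stored. The sketch below assumes that total charge is the hourly cost multiplied by the hours on the project, which the slides imply but do not state; the function name and the figures are hypothetical.

```python
def total_charge(cost_per_hour, hours_on_project):
    """Derive the charge on demand instead of storing it as a column."""
    return cost_per_hour * hours_on_project

# Hypothetical figures, not taken from the slides.
example = total_charge(cost_per_hour=50, hours_on_project=12)
print(example)
```

Leaving the column out means it can never disagree with the values it is calculated from.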


3RD NORMAL FORM

Employee number  Employee name  Hours on project  Job class

Job class  Cost of hours

END
