Normalisation in DataBase
Normalization
A large database defined as a single relation may result in data duplication. This
repetition of data may:
• Make relations very large.
• Make data hard to maintain and update, since each change involves searching many
records in the relation.
• Waste disk space and other resources.
• Increase the likelihood of errors and inconsistencies.
Normalization
To handle these problems, we analyze the relations that contain redundant data and
decompose them into smaller, simpler, well-structured relations that satisfy
desirable properties.
Normalization
• Normalization is the process of organizing the data in a database.
• Normalization divides a larger table into smaller tables and links them using relationships.
• Normal forms are used to reduce redundancy in database tables.
Data modification anomalies
Data modification anomalies can be categorized into three types:
• Insertion Anomaly: A new tuple cannot be inserted into a relation because some of
the required data is missing.
• Deletion Anomaly: Deleting data results in the unintended loss of other important
data.
• Update Anomaly: Updating a single data value requires multiple rows of data to be
updated.
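The three anomalies can be illustrated with a small sketch. It assumes a single unnormalized relation of (teacher_id, subject, teacher_age) rows, modeled on the TEACHER example used for 2NF, and assumes the key is the pair (teacher_id, subject). The helper names and teacher 90 are hypothetical and only serve the illustration.

```python
# One unnormalized relation: (teacher_id, subject, teacher_age).
# Assumption: the key is the pair (teacher_id, subject).
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]


def rows_touched_by_age_update(rows, teacher_id):
    """Update anomaly: count the rows that must change to update one age."""
    return sum(1 for tid, _, _ in rows if tid == teacher_id)


def ages_after_delete(rows, teacher_id, subject):
    """Deletion anomaly: return the ages still known after removing one row."""
    remaining = [r for r in rows if not (r[0] == teacher_id and r[1] == subject)]
    return {tid: age for tid, _, age in remaining}


def can_insert(row):
    """Insertion anomaly: no part of the key (teacher_id, subject) may be missing."""
    teacher_id, subject, _ = row
    return teacher_id is not None and subject is not None


# Teacher 25's age is stored twice, so one change touches two rows.
print(rows_touched_by_age_update(teacher, 25))
# Deleting teacher 47's only subject also loses teacher 47's age.
print(47 in ages_after_delete(teacher, 47, "English"))
# A hypothetical new teacher 90 with no subject yet cannot be stored.
print(can_insert((90, None, 41)))
```

Each helper only inspects the data; none of them modifies the table, so the sketch can be rerun freely.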
Types of Normal Forms:
• Normalization works through a series of stages called normal forms. The normal forms
apply to individual relations. A relation is said to be in a particular normal form if it
satisfies the constraints of that form.
Normal Form    Description
1NF            A relation is in 1NF if every attribute contains only atomic values.
2NF            A relation is in 2NF if it is in 1NF and all non-key attributes are fully
               functionally dependent on the primary key.
3NF            A relation is in 3NF if it is in 2NF and no transitive dependency exists.
4NF            A relation is in 4NF if it is in Boyce-Codd normal form and has no
               multi-valued dependency.
5NF            A relation is in 5NF if it is in 4NF and does not contain any join
               dependency; joining should be lossless.
Advantages of Normalization
• Normalization helps to minimize data redundancy.
Disadvantages of Normalization
• You cannot start building the database before knowing what the user needs.
• Performance can degrade when relations are normalized to higher normal forms such as
4NF and 5NF, because queries then need more joins.
First Normal Form (1NF)
• A relation is in 1NF if every attribute contains only atomic values.
• An attribute of a table cannot hold multiple values; it must hold a single value.
• First normal form disallows multi-valued attributes, composite attributes, and their
combinations.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385, 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389, 8589830302 Punjab
The decomposition of the EMPLOYEE table into 1NF is shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
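The conversion to 1NF can be sketched in a few lines. This is a minimal illustration that assumes the multi-valued EMP_PHONE attribute is stored as one comma-separated string; the function name `to_1nf` is hypothetical.

```python
def to_1nf(rows, multi_valued_index, sep=","):
    """Emit one row per atomic value of the multi-valued column."""
    result = []
    for row in rows:
        for value in row[multi_valued_index].split(sep):
            new_row = list(row)
            new_row[multi_valued_index] = value.strip()
            result.append(tuple(new_row))
    return result


# (EMP_ID, EMP_NAME, EMP_PHONE, EMP_STATE); EMP_PHONE may hold several values.
employee = [
    (14, "John", "7272826385, 9064738238", "UP"),
    (20, "Harry", "8574783832", "Bihar"),
    (12, "Sam", "7390372389, 8589830302", "Punjab"),
]

employee_1nf = to_1nf(employee, 2)
for row in employee_1nf:
    print(row)
```

Every row of the result holds exactly one phone number, at the cost of repeating EMP_NAME and EMP_STATE; that repetition is what the higher normal forms then address.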
Second Normal Form (2NF)
• For a relation to be in 2NF, it must first be in 1NF.
• In second normal form, all non-key attributes are fully functionally dependent
on the primary key.
• Example: Suppose a school stores data about teachers and the subjects
they teach. In a school, a teacher can teach more than one subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
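• In the TEACHER table, the key is the combination of TEACHER_ID and SUBJECT, but
TEACHER_AGE depends on TEACHER_ID alone, so the table is not in 2NF.
• To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table: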
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
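The decomposition above amounts to two projections of the TEACHER table. The sketch below reproduces it and then rejoins the two tables on TEACHER_ID to check that no information was lost; the helper name `project` is hypothetical.

```python
def project(rows, columns, attrs):
    """Project rows onto attrs, dropping duplicates but keeping first-seen order."""
    indexes = [columns.index(a) for a in attrs]
    seen = []
    for row in rows:
        projected = tuple(row[i] for i in indexes)
        if projected not in seen:
            seen.append(projected)
    return seen


columns = ["TEACHER_ID", "SUBJECT", "TEACHER_AGE"]
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

# TEACHER_AGE depends only on TEACHER_ID, so it moves to its own table.
teacher_detail = project(teacher, columns, ["TEACHER_ID", "TEACHER_AGE"])
teacher_subject = project(teacher, columns, ["TEACHER_ID", "SUBJECT"])

# Rejoin on TEACHER_ID to confirm the decomposition is lossless.
age_of = dict(teacher_detail)
rejoined = [(tid, subject, age_of[tid]) for tid, subject in teacher_subject]

print(teacher_detail)
print(rejoined == teacher)
```

Each teacher's age is now stored once in `teacher_detail`, so an age change touches a single row.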
Third Normal Form (3NF)
• A relation is in 3NF if it is in 2NF and contains no transitive dependency of a
non-prime attribute on a candidate key.
• 3NF is used to reduce data duplication. It is also used to help preserve data
integrity.
• If a relation in 2NF has no transitive dependency for non-prime attributes, then
it is in third normal form.
• A relation is in third normal form if at least one of the following conditions holds
for every non-trivial functional dependency X → Y:
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
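The two conditions can be checked mechanically. The sketch below is one possible approach, not a definitive implementation: it computes attribute closures, finds candidate keys by brute force, and reports each listed dependency X → Y that satisfies neither condition. The function names are hypothetical, and the functional dependencies used in the example (EMP_ID → EMP_NAME, EMP_ZIP and EMP_ZIP → EMP_STATE, EMP_CITY) are assumptions made for illustration; the slides do not state them.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute force: minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


def violations_3nf(attributes, fds):
    """Return the listed dependencies X -> Y that break both 3NF conditions."""
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        dependents = rhs - lhs  # ignore the trivial part of the dependency
        if not dependents:
            continue
        is_super_key = closure(lhs, fds) == attributes
        if not is_super_key and not dependents <= prime:
            bad.append((lhs, rhs))
    return bad


# Assumed dependencies for the Employee_detail columns (not given in the slides).
attributes = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
fds = [
    ({"EMP_ID"}, {"EMP_NAME", "EMP_ZIP"}),
    ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"}),
]

keys = candidate_keys(attributes, fds)
violations = violations_3nf(attributes, fds)
print(keys)
print(violations)
```

Under these assumed dependencies, EMP_ZIP → EMP_STATE, EMP_CITY is reported because EMP_ZIP is not a super key and the dependent attributes are non-prime. The brute-force key search is exponential in the number of attributes, which is acceptable only for small teaching examples.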
Example:
Employee_detail table:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY
Employee_detail table:
• Super key in the table :
EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007