0% found this document useful (0 votes)
2 views68 pages

Module2(ADBMS) (5)

The document discusses database design, focusing on normalization to reduce data redundancy and eliminate anomalies such as insertion, deletion, and updation issues. It outlines various normal forms, including Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF), and explains functional dependencies, including full, partial, and multivalued dependencies. The document emphasizes the importance of proper database structure for efficiency and data integrity.

Uploaded by

Haf hafeefa
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views68 pages

Module2(ADBMS) (5)

The document discusses database design, focusing on normalization to reduce data redundancy and eliminate anomalies such as insertion, deletion, and updation issues. It outlines various normal forms, including Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF), and explains functional dependencies, including full, partial, and multivalued dependencies. The document emphasizes the importance of proper database structure for efficiency and data integrity.

Uploaded by

Haf hafeefa
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 68

MODULE II

DATABASE DESIGN
● Database normalization is the process of organizing the attributes of the
database to reduce or eliminate data redundancy (having the same data
but at different places).
● Data redundancy unnecessarily increases the size of the database as the
same data is repeated in many places. Inconsistency problems also arise
during insert, delete, and update operations.
● In the relational model, there exist standard methods to quantify how
efficient a databases is. These methods are called normal forms and
there are algorithms to covert a given database into normal forms.
● Normalization generally involves splitting a table into multiple ones
which must be linked each time a query is made requiring data from the
The primary objective for normalizing the relations is to eliminate the below
anomalies.

● Insertion Anomalies: Insertion anomalies occur when it is not possible to


insert data into a database because the required fields are missing or
because the data is incomplete. For example, if a database requires that
every record has a primary key, but no value is provided for a particular
record, it cannot be inserted into the database.
● Deletion anomalies: Deletion anomalies occur when deleting a record from
a database and can result in the unintentional loss of data. For example, if a
database contains information about customers and orders, deleting a
customer record may also delete all the orders associated with that
customer.
● Updation anomalies: Updation anomalies occur when modifying data in a
Redundant Data Explanation:

● The branch, hod, and office_tel columns have the same values (CSE, Mr. X, 53337) for all students.
● This repetition indicates data redundancy because the information about the branch, HOD, and office contact
number is stored multiple times.

How to Fix It?

This can be normalized into two tables:

1. Students Table (rollno, name, branch_id)


2. Branch Table (branch_id, branch, hod, office_tel)

This structure will avoid redundancy and improve database efficiency.


Insertion Anomaly: If a tuple is inserted in referencing relation and referencing attribute value is
not present in referenced attribute, it will not allow insertion in referencing relation.
OR
An insertion anomaly occurs when adding a new row to a table leads to inconsistencies.
Example:If we try to insert a record into the STUDENT_COURSE table with STUD_NO = 7, it will not
be allowed because there is no corresponding STUD_NO = 7 in the STUDENT table.
Deletion and Updation Anomaly: If a tuple is deleted or updated from
referenced relation and the referenced attribute value is used by
referencing attribute in referencing relation, it will not allow deleting the
tuple from referenced relation.
Example: If we want to update a record from STUDENT_COURSE with
STUD_NO =1, We have to update it in both rows of the table.

If we try to delete a record from the STUDENT table with STUD_NO = 1, it


will not be allowed because there are corresponding records in the
STUDENT_COURSE table referencing STUD_NO = 1.
Deleting the record would violate the foreign key constraint, which
ensures data consistency between the two tables.
x=2 then y=1
x=2 then y=? (x cannot be determined the y value). In this case we
can say there is no functional dependency.That is if X value is same ,Y
value also should same.

How to define functional dependency?

FD X -> Y,then

If t1.X = t2.X (if t1.X != t2.X) ,then no need of


checking Y value.

Then t1.Y= t2.Y


Full Functional Dependency and Partial Functional Dependency

Fully Functional Dependency :


If X and Y are an attribute set of a relation, Y is fully functional dependent
on X, if Y is functionally dependent on X but not on any proper
subset of X.
Example –
In the relation ABC->D, attribute D is fully functionally dependent on ABC
and not on any proper subset of ABC. That means that subsets of ABC like
AB, BC, A, B, etc cannot determine D.
Ie, AB->D , not valid
BC-> D, not valid
Example

From the table, we can clearly see that neither


supplier_id nor item_id can uniquely determine the
price but both supplier_id and item_id together can
do so. So we can say that price is fully functionally
dependent on { supplier_id, item_id }. This
summarizes and gives our fully functional
dependency −
{ supplier_id , item_id } -> price
Partial Functional Dependency :
A functional dependency X->Y is a partial dependency if Y is functionally dependent on X
and Y can be determined by any proper subset of X.

For example, we have a relationship AC->B, A->D, and D->B.

Now if we compute the closure of {A+}=ADB

Here A is alone capable of determining B, which means B is partially dependent on AC.

Let us take another example –

Here, both the attributes name and roll_no


alone are able to uniquely identify a course.
Hence the relationship is partially
dependent.
2. Non-trivial functional dependency
In Non-trivial functional dependency, the dependent is strictly not a subset of the
determinant. i.e. If X → Y and Y is not a subset of X, then it is called Non-trivial
functional dependency.
Example 1 :

● Id -> Name
● Name -> DOB

Here, roll_no → name is a non-trivial functional dependency, since the dependent name
is not a subset of determinant roll_no. Similarly, {roll_no, name} → age is also a non-
3.Multivalued Functional Dependency

● In Multivalued functional dependency, entities of the dependent set are not dependent on each
other. i.e. If a → {b, c} and there exists no functional dependency between b and c, then it is
called a multivalued functional dependency.
In this table:

● X: bike_model
● Y: color
● Z: manuf_year

For each bike model (bike_model):

1. There is a group of colors (color) and a group of manufacturing years


(manuf_year).
2. The colors do not depend on the manufacturing year, and the manufacturing
year does not depend on the colors. They are independent.
3. The sets of color and manuf_year are linked only to bike_model.

That’s what makes it a multivalued dependency.


In this case these two columns are said to be multivalued dependent on
bike_model.
4. Transitive Functional Dependency
● In relational databases, transitive dependency occurs when one attribute indirectly depends
on another through an intermediate attribute. This problem means that the value of one
column can be determined indirectly from the values of two other columns.
Inference Rules(Properties of Functional Dependency)

● Inference rules( Properties of Functional Dependency)in databases are also


known as Armstrong’s Axioms in Functional Dependency. These rules
govern the functional dependencies in a relational database. From
inference rules a new functional dependency can be derived using other
FDs. These rules were introduced by William W. Armstrong.
● Reflexive Rule: According to this rule, if B is a subset of A then A logically
determines B. Formally, B ⊆ A then A → B.
● Example: Let us take an example of the Address (A) of a house, which
contains so many parameters like House no, Street no, City etc. These all
are the subsets of A. Thus, address (A) → House no. (B).
● Augmentation Rule: It is also known as Partial dependency. According to
this rule, If A logically determines B, then adding any extra attribute
doesn't change the basic functional dependency.
● Example: A → B, then adding any extra attribute let say C will give AC → BC
and doesn't make any change.
● Transitive rule: Transitive rule states that if A determines B and B
determines C, then it can be said that A indirectly determines B.
● Example: If A → B and B → C then A → C.
Secondary Rules –
These rules can be derived from the axioms.

● Union Rule: Union rule states that If A determines B and C, then A


determines BC.
○ Example: If A → B and A → C then A → BC.
● Decomposition Rule: It is perfectly reverse of the above Union rule.
According to this rule, If A determined BC then it can be decomposed as
A → B and A → C.
○ Example: If A → BC then A → B and A → C.
● Pseudo Transitive Rule: According to this rule, If A determined B and
BC determines D then AC determines D.
Minimal set of functional dependencies
Third Normal Form (3NF)

A relation is in the third normal form, if there is no transitive dependency for non-prime
attributes as well as it is in the second normal form. A relation is in 3NF if at least one of the
following conditions holds in every non-trivial functional dependency X –> Y.

● X is a super key.
● Y is a prime attribute (each element of Y is part of some candidate key).

In other words,
A relation that is in First and Second Normal Form and in which no non-primary-key
attribute is transitively dependent on the primary key, then it is in Third Normal Form
(3NF).

If A->B and B->C are two FDs then A->C is called transitive dependency. The
normalization of 2NF relations to 3NF involves the removal of transitive dependencies.

If a transitive dependency exists, remove the transitively dependent attribute(s) from


● First take the right side of every FDs.

ie, BCDEF
● The attribute that do not present in the right side is A.So A will be a key
or part of a key(No other attributes determines attribute A).
● Take A closure ,ie {A,B,C,D,E,F}
● A closure include all the attributes in a relation.That is A determines every
attributes.
● Thus conclude that A is a candidate key.
1.Find the candidate keys of the below relation.
Boyce-Codd Normal Form (BCNF)
● Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form
(3NF) that ensures a more simplified and efficient database design.
● It enforces that every non-trivial functional dependency must have a superkey
on its left-hand side.
● This approach addresses potential issues with candidate keys and ensures the
database is free from redundancy.
● BCNF eliminates redundancy more effectively than 3NF by strictly requiring
that all functional dependencies originate from super-keys.
● Rules for BCNF

● Rule 1: The table should be in the 3rd Normal Form.


● Rule 2: X should be a super-key for every functional dependency (FD)
X−>Y in a given relation.
● Note: To test whether a relation is in BCNF, identify all the determinants and
make sure that they are candidate keys.
● To determine the highest normal form of a given relation R with functional dependencies, the
first step is to check whether the BCNF condition holds. If R is found to be in BCNF, it can be
safely deduced that the relation is also in 3NF, 2NF, and 1NF.
Check whether the below table is in BCNF
Fourth Normal Form (4NF)
● The Fourth Normal Form (4NF) is a level of database
normalization where there are no non-trivial multivalued
dependencies other than a candidate key.
● Properties
● A relation R is in 4NF if and only if the following conditions are
satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any Multi-valued
Dependency.
● A table with a multivalued dependency violates the
normalization standard of the Fourth Normal Form (4NF)
Multivalued Dependency (MVD)

● In Database Management Systems (DBMS), multivalued


dependency (MVD) deals with complex attribute relationships in
which an attribute may have many independent values while yet
depending on another attribute or group of attributes.
● It improves database structure and consistency and is essential for
data integrity and database normalization.
● MVD or multivalued dependency means that for a single
value of attribute ‘a’ multiple values of attribute ‘b’ exist.
We write it as,
a --> --> b
● It is read as a is multi-valued dependent on b.
● Suppose a person named Geeks is working on 2 projects Microsoft and Oracle
and has 2 hobbies namely Reading and Music.
● This can be expressed in a tabular format in the following way.
What is Multivalued Dependency?

When one attribute in a database depends on another attribute and has many
independent values, it is said to have multivalued dependency (MVD). It supports
maintaining data accuracy and managing intricate data interactions.
Multi Valued Dependency (MVD)
We can say that multivalued dependency exists if the following conditions are met.

Conditions for MVD

Any attribute say a multiple define another attribute b; if any legal relation r(R), for all
pairs of tuples t1 and t2 in r, such that,
t1[a] = t2[a]
Then there exists t3 and t4 in r such that.
t1[a] = t2[a] = t3[a] = t4[a]
t1[b] = t3[b]; t2[b] = t4[b]
t1 = t4; t2 = t3
● Then multivalued (MVD) dependency exists. To check the MVD in given table, we
apply the conditions stated above and we check it with the values in the given
table.
● The table above, student with s_id 1 has opted for two courses, Science and Maths,
and has two hobbies, Cricket and Hockey.
● There is no relationship between the columns course and hobby. They are
independent of each other.
● So there is multi-value dependency, which leads to un-necessary repetition of data
and other anomalies as well.
To make the above relation satisfy the 4th normal form, we can decompose the
table into 2 tables.

Now this relation satisfies the fourth normal form.

You might also like