
Unit II Database Design Completed

The document covers key concepts in database design, including ER diagrams, functional dependencies, and normalization techniques such as non-loss decomposition. It explains the components of ER diagrams, the significance of functional dependencies in maintaining data quality, and the advantages of non-loss decomposition for reducing redundancy and improving data integrity. Additionally, it outlines various types of functional dependencies and their implications in database management systems.

Uploaded by

prathaplisa
Copyright
© © All Rights Reserved
UNIT II DATABASE DESIGN

ER Diagrams – Functional Dependencies – Non-Loss Decomposition Functional
Dependencies – First Normal Form – Second Normal Form – Third Normal Form –
Dependency Preservation – Boyce/Codd Normal Form – Multi-Valued Dependencies and
Fourth Normal Form – Join Dependencies and Fifth Normal Form

2.1 ER DIAGRAMS
 ER diagrams, short for Entity-Relationship diagrams, are a visual representation of the
entities and relationships within a database.
 They are widely used in database design to model the structure of a database, including
its tables, attributes, and the relationships between them.
Main components of an ER diagram:
1. Entities:
 These are the objects or concepts in the real world that are represented in the
database.
 Each entity is typically depicted as a rectangle in the diagram.
 For example, in a university database, entities might include Student, Course,
and Instructor.
2. Attributes:
 Attributes describe the properties or characteristics of an entity.
 They are represented as ovals connected to their respective entities.
 For instance, attributes of a Student entity might include Student ID, Name, and
Date of Birth.
3. Relationships:
 Relationships represent the associations between entities.
 They are typically depicted as lines connecting two entities, with a diamond
shape indicating the relationship type.
 For example, a relationship between Student and Course entities might be
"Enrolls In", indicating that a student can enroll in multiple courses.
4. Cardinality:
 Cardinality describes the number of instances of one entity that can be associated
with another entity through a relationship.

Prepared By: SHURITHI.S, AP/CYBERSECURITY, Mahendra Engineering College 1


• It is often denoted using symbols such as "1" for one instance or "N" for many
instances.
• For example, in a "One-to-Many" relationship between Instructor and Course, one
instructor can teach many courses, but each course is taught by only one
instructor.
5. Primary Key:
 Primary keys are unique identifiers for each record in a database table.
 In an ER diagram, they are typically underlined to indicate their significance.
ER diagrams provide a clear and concise way to visualize the structure of a database,
making it easier for database designers to communicate and understand the database schema.
They serve as a blueprint for creating the actual database tables and relationships in a
relational database management system (RDBMS).
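
As an illustration of how a diagram becomes tables, the Student and Course entities and the "Enrolls In" relationship described above could be mapped as in the sketch below. It uses Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made for this example, and "Enrolls In" is assumed to be many-to-many, which is why it gets a table of its own.

```python
import sqlite3

# Hypothetical schema: all table/column names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    student_id    INTEGER PRIMARY KEY,  -- the underlined attribute in the diagram
    name          TEXT NOT NULL,
    date_of_birth TEXT
);
CREATE TABLE course (
    course_id INTEGER PRIMARY KEY,
    title     TEXT NOT NULL
);
-- A many-to-many relationship is stored as its own table of key pairs.
CREATE TABLE enrolls_in (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")
conn.execute("INSERT INTO student VALUES (1, 'Asha', '2004-05-17')")
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "DBMS"), (11, "Networks")])
conn.executemany("INSERT INTO enrolls_in VALUES (?, ?)", [(1, 10), (1, 11)])

# Follow the relationship: which courses is student 1 enrolled in?
courses_for_student = [
    row[0]
    for row in conn.execute(
        "SELECT c.title FROM course c "
        "JOIN enrolls_in e ON e.course_id = c.course_id "
        "WHERE e.student_id = ? ORDER BY c.title",
        (1,),
    )
]
print(courses_for_student)
```

Each entity becomes a table, each attribute a column, and the underlined primary key becomes the PRIMARY KEY constraint.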

E-R DIAGRAM FOR LIBRARY MANAGEMENT SYSTEM

E-R DIAGRAM FOR HOSPITAL MANAGEMENT SYSTEM

E-R DIAGRAM FOR STUDENT MANAGEMENT SYSTEM

2.2 FUNCTIONAL DEPENDENCIES


• A functional dependency (FD) is a constraint between two sets of attributes in a
relation: the value of one attribute (or set of attributes) determines the value of another.
• Functional dependencies help to maintain the quality of data in the database. They
play a vital role in telling a good database design from a bad one.
A functional dependency is denoted by an arrow "→". The functional dependency of Y
on X is written X → Y and read as "X functionally determines Y". The following
example illustrates the idea.
Example:
Employee number   Employee Name   Salary   City
1                 Dana            50000    San Francisco
2                 Francis         38000    London
3                 Andrew          25000    Tokyo

In this example, if we know the value of Employee number, we can obtain the Employee
Name, City, and Salary. We can therefore say that City, Employee Name, and Salary are
functionally dependent on Employee number.
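
A minimal sketch of how such a dependency can be checked against stored rows is given below. The function name fd_holds and the dictionary keys are assumptions made for this example. Note that sample data can only disprove a dependency: an FD is a rule about every legal state of the table, not just the rows present today.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows with the same determinant must agree on the dependent part.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"emp_no": 1, "name": "Dana", "salary": 50000, "city": "San Francisco"},
    {"emp_no": 2, "name": "Francis", "salary": 38000, "city": "London"},
    {"emp_no": 3, "name": "Andrew", "salary": 25000, "city": "Tokyo"},
]

ok = fd_holds(employees, ["emp_no"], ["name", "salary", "city"])

# Hypothetical extra row that reuses employee number 1 with another city.
conflicting = employees + [
    {"emp_no": 1, "name": "Dana", "salary": 50000, "city": "Paris"}
]
broken = fd_holds(conflicting, ["emp_no"], ["city"])
print(ok, broken)
```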
Key terms
Key terms for functional dependencies in a database:
1. Axiom:
Axioms are a set of inference rules used to infer all the functional dependencies
that hold on a relational database.
2. Decomposition:
As an inference rule, if X → YZ holds, then X → Y and X → Z also hold. As a design
guideline, it suggests that if a table appears to contain two entities
which are determined by the same primary key, then you should consider breaking
it up into two different tables.
3. Dependent:
It is displayed on the right side of the functional dependency diagram.
4. Determinant:
It is displayed on the left side of the functional dependency diagram.
5. Union:
As an inference rule, if X → Y and X → Z hold, then X → YZ also holds. As a design
guideline, it suggests that if two tables are separate and the primary key is the
same, you should consider putting them together.

Rules of Functional Dependencies
Below are the three most important rules for functional dependencies in a database:
1. Reflexive rule:
If X is a set of attributes and Y is a subset of X, then X -> Y holds.
2. Augmentation rule:
If x -> y holds and c is a set of attributes, then xc -> yc also holds. That is, adding
the same attributes to both sides does not change the basic dependency.
3. Transitivity rule:
This rule is very similar to the transitive rule in algebra: if x -> y holds and y
-> z holds, then x -> z also holds. Here x -> y is read as "x functionally determines y".
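
In practice the three rules are rarely applied by hand one at a time. A common shortcut is to compute the attribute closure: the set of all attributes determined by a given set under the known dependencies. The sketch below is one way to write it. The names closure and implies are chosen for this example, and the Company/CEO/Age dependencies are used only as sample input.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds.

    fds is a list of (lhs, rhs) pairs, each side being a set of attribute names.
    """
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)


fds = [({"Company"}, {"CEO"}), ({"CEO"}, {"Age"})]
company_closure = closure({"Company"}, fds)
print(sorted(company_closure))
print(implies(fds, {"Company"}, {"Age"}))   # derived by transitivity
print(implies(fds, {"Age"}, {"Company"}))   # not derivable
```

A dependency X -> Y follows from the given set exactly when Y is contained in the closure of X, so this one loop covers reflexivity, augmentation, and transitivity together.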
Types of Functional Dependencies
The main types of dependency discussed in this section are:
• Multivalued Dependency
• Trivial Functional Dependency
• Non-Trivial Functional Dependency
• Transitive Dependency
• Fully Functional Dependency
• Partial Functional Dependency
Multivalued Dependency
• A multivalued dependency occurs when a single table contains two or more
multivalued attributes that are independent of each other.
• A multivalued dependency is a full constraint between two sets of attributes in a
relation.
• It requires that certain tuples be present in a relation. Consider the following
example.
Example:
Car_model   Maf_year   Color
H001        2017       Metallic
H001        2017       Green
H005        2018       Metallic
H005        2018       Blue
H010        2015       Metallic
H033        2012       Gray

In this example, maf_year and color are independent of each other but both depend on
car_model. These two columns are said to be multivalue dependent on car_model.
A multivalued dependency is written with a double arrow:
car_model ->> maf_year
car_model ->> color
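
The requirement that "certain tuples be present" can be made concrete. For X ->> Y to hold, whenever two rows agree on X, the row that combines the Y value of the first with the remaining attributes of the second must also be in the table. The sketch below checks this on the car table. The function name mvd_holds and the extra rows used to break the dependency are assumptions made for this example.

```python
def mvd_holds(rows, x, y, all_attrs):
    """Check the multivalued dependency x ->> y on a list of dict rows."""
    z = [a for a in all_attrs if a not in x and a not in y]

    def proj(row, attrs):
        return tuple(row[a] for a in attrs)

    present = {(proj(r, x), proj(r, y), proj(r, z)) for r in rows}
    for x1, y1, _ in present:
        for x2, _, z2 in present:
            # Rows agreeing on x must allow y and z values to be swapped.
            if x1 == x2 and (x1, y1, z2) not in present:
                return False
    return True


attrs = ["car_model", "maf_year", "color"]
cars = [
    {"car_model": "H001", "maf_year": 2017, "color": "Metallic"},
    {"car_model": "H001", "maf_year": 2017, "color": "Green"},
    {"car_model": "H005", "maf_year": 2018, "color": "Metallic"},
    {"car_model": "H005", "maf_year": 2018, "color": "Blue"},
    {"car_model": "H010", "maf_year": 2015, "color": "Metallic"},
    {"car_model": "H033", "maf_year": 2012, "color": "Gray"},
]
holds = mvd_holds(cars, ["car_model"], ["color"], attrs)

# Hypothetical row: H001 in a new year but only in Red. The combinations
# (2018, Metallic), (2018, Green) and (2017, Red) are missing, so it fails.
extra = cars + [{"car_model": "H001", "maf_year": 2018, "color": "Red"}]
fails = mvd_holds(extra, ["car_model"], ["color"], attrs)
print(holds, fails)
```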
Trivial Functional Dependency
• A functional dependency is called trivial if the dependent attributes are already
included in the determinant.
• So, X -> Y is a trivial functional dependency if Y is a subset of X. The following
example illustrates this.
For example:
Emp_id   Emp_name
AS555    Harry
AS811    George
AS999    Kevin

Consider this table with two columns, Emp_id and Emp_name.

{Emp_id, Emp_name} -> Emp_id is a trivial functional dependency, as Emp_id is a subset of
{Emp_id, Emp_name}.

Non Trivial Functional Dependency


• A functional dependency A -> B is non-trivial when it holds and B is not a
subset of A.
• In other words, in a relation, if attribute set B is not a subset of attribute set A,
then A -> B is considered a non-trivial dependency.
Example:
Company     CEO             Age
Microsoft   Satya Nadella   51
Google      Sundar Pichai   46
Apple       Tim Cook        57

{Company} -> {CEO} (if we know the Company, we know the CEO's name)
CEO is not a subset of Company, and hence this is a non-trivial functional dependency.

Transitive Dependency

• A transitive dependency is a type of functional dependency that is formed
indirectly by two other functional dependencies.
• The following example illustrates this.

Example:

Company     CEO             Age
Microsoft   Satya Nadella   51
Google      Sundar Pichai   46
Alibaba     Jack Ma         54

{Company} -> {CEO} (if we know the company, we know its CEO's name)
{CEO} -> {Age} (if we know the CEO, we know the CEO's age)
Therefore, according to the rule of transitive dependency:
{Company} -> {Age} should hold. This makes sense because if we know the company
name, we can know its CEO's age.
Fully Functional Dependency
• In a full functional dependency, an attribute or set of attributes is determined by
the whole of a set of attributes and not by any proper subset of it.
• If a relation R has attributes X, Y, Z, and {X, Y} -> Z holds while neither X -> Z
nor Y -> Z holds, then Z is fully functionally dependent on {X, Y}.
Partial Functional Dependency
• In a partial functional dependency, a non-key attribute depends on a part of the
composite key rather than the whole key.
• If a relation R has attributes X, Y, Z, where {X, Y} is the composite key and Z
is a non-key attribute, then X -> Z is a partial functional dependency.
Advantages of Functional Dependencies
 Data Normalization
 Query Optimization
 Consistency of Data
 Data Quality Improvement
 Maintain the quality of data in the database.

 Expresses the facts about the database design.
 Helps in clearly defining the meanings and constraints of databases.
 Helps to identify bad designs.
 Removes data redundancy
 Without functional dependency, it's impossible to find candidate keys and normalize the
database.
Disadvantages of Functional Dependencies
 Complexity
 Normalization Overhead
 Performance Impact
 Update Anomalies
 Denormalization Trade-offs
 Dependency Maintenance
 Over-specification
 Learning Curve
2.3 NON-LOSS DECOMPOSITION FUNCTIONAL DEPENDENCIES
• Non-loss (lossless) decomposition, in the context of functional dependencies (FDs) in
database normalization, refers to the process of breaking down a relation into smaller
relations (tables) without losing any of the information held in the original relation:
joining the smaller relations gives back exactly the original relation.
• Functional dependencies describe relationships between attributes in a relation, where the
value of one or more attributes uniquely determines the value of another attribute. They
are used to decide whether a given decomposition is non-loss.
• When decomposing a relation, it is also desirable that all functional dependencies are
preserved in the resulting smaller relations. This separate property is called
dependency preservation.
• Dependency preservation ensures that the original constraints and relationships between
attributes can still be enforced after decomposition.
• Non-loss decomposition is typically achieved through normalization to forms such as
Boyce-Codd Normal Form (BCNF) or Third Normal Form (3NF).
• These normal forms ensure that each attribute in a relation is functionally
dependent on the key and that there are no non-trivial functional dependencies
between non-key attributes.
• By decomposing relations into smaller, well-structured forms without losing
information, non-loss decomposition helps reduce data redundancy and
maintain data integrity in a database schema.
Lossless join decomposition is a decomposition of a relation R into relations R1 and R2 such
that a natural join of R1 and R2 returns the original relation R. This
is effective in removing redundancy from databases while preserving the original data.
• Decompositions into 1NF, 2NF, 3NF, and BCNF can always be carried out as lossless join
decompositions.
• In lossless decomposition, the criterion for selecting the common attribute is that it
must be a candidate key or super key in either relation R1, R2, or both.
• Decomposition of a relation R into R1 and R2 is a lossless-join decomposition if at least
one of the following functional dependencies is in F+ (the closure of the functional
dependencies):
R1 ∩ R2 → R1

OR

R1 ∩ R2 → R2
Example of Lossless Decomposition
• Employee (Employee_Id, Ename, Salary, Department_Id, Dname)
can be decomposed using lossless decomposition as
• Employee_desc (Employee_Id, Ename, Salary, Department_Id)
• Department_desc (Department_Id, Dname)
Here the common attribute Department_Id is a key of Department_desc, assuming that
Department_Id determines Dname.
Alternatively, the following decomposition is lossy: the two tables have no common
attribute, so they cannot be joined to get back the original data.
• Employee_desc (Employee_Id, Ename, Salary)
• Department_desc (Department_Id, Dname)
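
The test R1 ∩ R2 → R1 or R1 ∩ R2 → R2 can be sketched in a few lines using attribute closure. The function names below are chosen for this example. The functional dependencies are assumptions as well, since the text does not list them for the Employee relation: Employee_Id is taken to determine Ename, Salary, and Department_Id, and Department_Id is taken to determine Dname.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: the common attributes must determine R1 or R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# Assumed dependencies for the Employee relation (not stated in the text).
fds = [
    ({"Employee_Id"}, {"Ename", "Salary", "Department_Id"}),
    ({"Department_Id"}, {"Dname"}),
]

good = is_lossless(
    {"Employee_Id", "Ename", "Salary", "Department_Id"},
    {"Department_Id", "Dname"},
    fds,
)
bad = is_lossless(
    {"Employee_Id", "Ename", "Salary"},
    {"Department_Id", "Dname"},
    fds,
)
print(good, bad)
```

In the first decomposition the common attribute Department_Id determines all of Department_desc, so the test passes. In the second there is no common attribute at all, so neither condition can hold.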

• In a database management system (DBMS), a lossless decomposition is a process of
decomposing a relation schema into multiple relations in such a way that it preserves
the information contained in the original relation.
• Specifically, a lossless decomposition is one in which the original relation can be
reconstructed by joining the decomposed relations.

• To test whether a decomposition is lossless, we need the functional dependencies
implied by the given ones. These can be derived using a set of inference rules known
as Armstrong's axioms.
• These rules make it possible to check that the decomposed relations retain all the
information present in the original relation.
• Two rules that are frequently used in this context are the reflexivity axiom and the
decomposition rule.
• The reflexivity axiom states that if a set of attributes Y is a subset of another set of
attributes X, then X → Y holds; that is, the larger set determines the smaller set.
• The decomposition rule states that if X → YZ holds, then X → Y and X → Z also hold.
It should not be confused with the decomposition of a relation R into two relations
R1 and R2, which is lossless when R can be reconstructed by taking the natural join
of R1 and R2.
 There are several algorithms available for performing lossless decomposition in DBMS,
such as the BCNF (Boyce-Codd Normal Form) decomposition and the 3NF (Third
Normal Form) decomposition.
 These algorithms use a set of rules to decompose a relation into multiple relations while
ensuring that the original relation can be reconstructed without any loss of information.
Non-Loss Decomposition Functional Dependencies advantages
1. Reduced Redundancy:
 By decomposing a relation into smaller relations while preserving functional
dependencies, non-loss decomposition helps eliminate or minimize data
redundancy.
 Redundant data can lead to inconsistencies and anomalies, such as update
anomalies, which can compromise data integrity.
 Non-loss decomposition ensures that each piece of data is stored in only one
place, reducing the risk of redundancy.
2. Improved Data Integrity:
 Functional dependencies define relationships between attributes in a relation,
ensuring that data remains consistent and accurate.
 By preserving these dependencies during decomposition, non-loss decomposition
helps maintain data integrity in the resulting smaller relations.

 This ensures that data remains accurate and reflects the intended meaning of the
information stored in the database.
3. Simplified Data Maintenance:
 Smaller relations resulting from non-loss decomposition are typically easier to
maintain than larger, denormalized relations.
 Updates, inserts, and deletes can be performed more efficiently, as they only
affect the relevant smaller relations.
 This simplifies data maintenance tasks and reduces the likelihood of errors or
inconsistencies in the database.
4. Enhanced Query Performance:
 Well-structured smaller relations resulting from non-loss decomposition can
improve query performance.
 With properly defined indexes and reduced data redundancy, queries can be
executed more efficiently, leading to faster response times and improved overall
system performance.
 Additionally, query optimization techniques can be applied more effectively to
smaller relations, further enhancing performance.
5. Flexibility and Adaptability:
 Non-loss decomposition provides a flexible and adaptable database schema that
can easily accommodate changes in data requirements or business rules.
 As functional dependencies are preserved, the database schema remains resilient
to changes, allowing for modifications without compromising data integrity.
 This flexibility is particularly valuable in dynamic environments where
requirements may evolve over time.
6. Simplified Schema Design:
 Non-loss decomposition encourages a normalized database schema design, where
relations are organized based on functional dependencies.
• This results in a more intuitive and easier-to-understand schema structure, making
it simpler to design, implement, and maintain the database.
 A well-designed schema promotes better data organization and facilitates
efficient data retrieval and manipulation.

7. Support for Data Analysis and Reporting:
 A normalized database schema resulting from non-loss decomposition provides a
solid foundation for data analysis and reporting.
 With well-defined relationships between attributes and reduced redundancy,
analysts can perform complex queries and generate accurate reports with
confidence.
 This enables informed decision-making and enhances the value of the data stored
in the database.
Disadvantages
1. Increased Complexity:
 Non-loss decomposition often leads to a larger number of relations in the database
schema.
 Managing these numerous relations can introduce complexity in terms of schema
design, maintenance, and querying.
 Developers and database administrators need to carefully handle the interrelation
between these decomposed relations, which can become challenging as the
database grows in size and complexity.
2. Query Performance Overhead:
 Decomposing a relation into smaller relations necessitates more join operations
when querying data across multiple tables.
 This increased number of joins can potentially degrade query performance,
especially for complex queries involving multiple decomposed relations.
 Optimizing such queries becomes more challenging, requiring careful
consideration of indexing, join strategies, and query execution plans.
3. Storage Overhead:
 Non-loss decomposition may lead to increased storage requirements due to the
proliferation of smaller relations and associated indexes.
 While normalization reduces redundancy, it can also result in more storage
overhead for maintaining primary and foreign keys, indexes, and additional
metadata.

 This can be a concern in environments where storage resources are limited or
costly.
4. Update Anomalies:
 Non-loss decomposition may introduce complexities in maintaining data
consistency during updates.
 Changes to data that span multiple decomposed relations may require careful
coordination to ensure atomicity and consistency, thus increasing the risk of
update anomalies.
 Managing such dependencies and enforcing integrity constraints becomes more
crucial but can also be more challenging.
5. Dependency Maintenance:
 As the database evolves, maintaining the integrity of functional dependencies
across decomposed relations becomes more complex.
 Any modifications to the schema or data model may require careful analysis and
adjustments to ensure that all dependencies are still preserved.
 This maintenance overhead can increase over time, particularly in large and
dynamic databases.
6. Join Overhead:
 While decomposition helps in reducing data redundancy, it can lead to an increase
in join operations, especially in situations where denormalization might be
preferable for performance reasons.
 This can result in higher computational overhead during query processing,
impacting overall system performance.
7. Indexing Complexity:
 Managing indexes across multiple decomposed relations can be complex.
 Developers need to carefully select and maintain indexes to optimize query
performance while minimizing storage overhead.
 This requires a deep understanding of the query workload and access patterns, as
well as ongoing monitoring and adjustment of indexes to ensure optimal
performance.
8. Learning Curve:

 Implementing non-loss decomposition effectively requires a solid understanding
of database normalization principles and functional dependencies.
 For developers and administrators who are not well-versed in these concepts,
there can be a steep learning curve involved in designing, implementing, and
maintaining a normalized database schema.
Conditions Required in Lossless Decomposition in DBMS
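A good decomposition is expected to satisfy the following properties. Strictly, only the
lossless-join property is required for a decomposition to be lossless; the others are
additional design goals: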
 Dependency Preservation:
 The dependencies between the attributes of the original relation must be preserved
in the decomposed relations.
 In other words, all functional dependencies in the original relation must be
represented in the decomposed relations.
 Join Preservation:
 The decomposition should not result in the loss of any information that could be
obtained by joining the decomposed relations back together.
 This means that the original relation should be reconstructed without any loss of
data.
• Minimal Redundancy:
• The decomposed relations should not contain any redundant data.
• This means that each non-key attribute should be represented in only one relation
(only the attributes used for joining are repeated) and that there should be no
unnecessary duplication of information.
 Lossless-join Property:
 The decomposed relations should have the property that when joined together,
they produce the original relation.
 This means that there should be no loss of information when the relations are
joined back together.
Example for Lossless Decomposition in DBMS
Consider the following table:
Employee (EmpID, EmpName, EmpAddress, DeptID, DeptName, Salary)
Functional Dependencies:
EmpID → EmpName, EmpAddress
DeptID → DeptName, Salary
Using lossless decomposition, we can break down the Employee table into two smaller tables:
Table 1: EmployeeDetails (EmpID, EmpName, EmpAddress, DeptID)
EmpID   EmpName   EmpAddress    DeptID
1       John      123 Main St   101
2       Jane      456 Oak St    102
3       Bob       789 Elm St    101

Table 2: DepartmentDetails (DeptID, DeptName, Salary)
DeptID   DeptName    Salary
101      Sales       50000
102      Marketing   60000

Explanation:
• In the above example, we have an Employee table with six attributes: EmpID, EmpName,
EmpAddress, DeptID, DeptName, and Salary.
• We have identified the functional dependencies in this table and found that EmpID
determines EmpName and EmpAddress, while DeptID determines DeptName and Salary.
• Using lossless decomposition, we have broken down the Employee table into two smaller
tables: EmployeeDetails and DepartmentDetails. The EmployeeDetails table contains the
attributes EmpID, EmpName, EmpAddress, and DeptID, while the DepartmentDetails
table contains the attributes DeptID, DeptName, and Salary.
• The department name and salary are now stored once per department instead of once per
employee, and updates can be made to the smaller tables rather than to the entire
original table.
• We can also combine these tables using the common attribute DeptID to retrieve the
original Employee table. This works because DeptID → DeptName, Salary makes DeptID a
key of DepartmentDetails.
• Thus, lossless decomposition has preserved all the information in the original table while
reducing redundancy.
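
The claim that the join gives back the original table can be checked directly. The sketch below stores each table as a Python set of tuples. The rows of the original Employee table are the ones obtained by combining the two tables shown above. The last part, which drops DeptID from the employee side, is an extra illustration of a lossy split and is not part of the original example.

```python
# Original Employee rows: (EmpID, EmpName, EmpAddress, DeptID, DeptName, Salary)
employee = {
    (1, "John", "123 Main St", 101, "Sales", 50000),
    (2, "Jane", "456 Oak St", 102, "Marketing", 60000),
    (3, "Bob", "789 Elm St", 101, "Sales", 50000),
}

# Decompose by projection; a set removes duplicate department rows.
employee_details = {(e, n, a, d) for (e, n, a, d, _dn, _s) in employee}
department_details = {(d, dn, s) for (_e, _n, _a, d, dn, s) in employee}

# Natural join on the common attribute DeptID.
rejoined = {
    (e, n, a, d, dn, s)
    for (e, n, a, d) in employee_details
    for (d2, dn, s) in department_details
    if d == d2
}
print(rejoined == employee)

# Lossy variant (illustration only): without DeptID there is nothing to join
# on, so every employee pairs with every department.
employee_only = {(e, n, a) for (e, n, a, _d) in employee_details}
cross = {
    (e, n, a, d, dn, s)
    for (e, n, a) in employee_only
    for (d, dn, s) in department_details
}
print(len(cross), cross == employee)
```

The lossy variant produces extra rows rather than fewer. The "loss" is loss of information: it is no longer possible to tell which employee belongs to which department.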

Non-Loss Decomposition Functional Dependencies drawbacks
1. Increased Complexity:
 Decomposing relations into smaller ones while preserving functional
dependencies can increase the complexity of the database schema.
 This complexity arises from managing the relationships between decomposed
relations, ensuring data integrity, and handling the increased number of tables in
the database.
 It can make the schema harder to understand and maintain, particularly for
developers unfamiliar with the design rationale.
2. Query Performance Overhead:
 Non-loss decomposition often leads to an increase in the number of join
operations required to retrieve data.
 This can potentially degrade query performance, especially for complex queries
involving multiple decomposed relations.
 The overhead of performing joins and accessing multiple tables can result in
slower query execution times and decreased system responsiveness.
3. Storage Overhead:
 Storing data in multiple smaller relations instead of a single larger one can lead to
increased storage overhead.
 While normalization reduces redundancy, it also introduces additional metadata,
such as primary and foreign keys, indexes, and constraints, for each decomposed
relation.
 This can result in higher disk space utilization, particularly in databases with
large volumes of data.
4. Update Anomalies:
 Non-loss decomposition may introduce update anomalies, where updating data in
one decomposed relation requires updates to be made in multiple related relations
to maintain consistency.
 This can increase the complexity of update operations and the risk of data
inconsistencies if updates are not performed correctly or atomically.

 Managing data consistency across decomposed relations can be challenging,
especially in environments with concurrent transactions or distributed databases.
5. Indexing Overhead:
 Maintaining appropriate indexes on multiple decomposed relations can be
complex.
 Developers need to carefully select and manage indexes to ensure optimal query
performance while minimizing storage overhead.
 This requires a deep understanding of the data access patterns and query
requirements, as well as ongoing monitoring and adjustment of indexes to
maintain performance.
6. Data Distribution Challenges:
 In distributed database systems, non-loss decomposition can pose challenges
related to data distribution.
 Distributing data across multiple nodes while ensuring data consistency and
minimizing network overhead can be difficult, particularly when dealing with
decomposed relations.
 Ensuring data locality and minimizing data movement across nodes are crucial
considerations in distributed database design.
7. Learning Curve:
 Implementing non-loss decomposition effectively requires a solid understanding
of database normalization principles and functional dependencies.
 Developers and database administrators may need to invest time and effort in
learning these concepts and applying them correctly to database design.
 Additionally, maintaining a decomposed database schema requires ongoing
education and training to ensure that best practices are followed and potential
issues are addressed proactively.
2.4 NORMALIZATION
Normalization is the process of minimizing redundancy in a relation or set of relations.
Redundancy in a relation may cause insertion, deletion, and update anomalies, so normal
forms are used to eliminate or reduce redundancy in database tables.

Levels of Normalization
There are various levels of normalization. These are some of them:
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce-Codd Normal Form (BCNF)
5. Fourth Normal Form (4NF)
6. Fifth Normal Form (5NF)
2.4 FIRST NORMAL FORM (1NF)
 First Normal Form (1NF) is a fundamental concept in database normalization, which
establishes the most basic requirements for a well-structured relational database schema.
 It serves as the foundation upon which higher normal forms are built.
 This is the most basic level of normalization. In 1NF, each table cell should contain only
a single value, and each column should have a unique name.
 The first normal form helps to eliminate duplicate data and simplify queries.
Rules for First Normal Form
Rule 1: Single Valued Attributes
Rule 2: Attribute Domain should not change
Rule 3: Unique name for Attributes/Columns
Rule 4: Order doesn't matter
Key characteristics and requirements of First Normal Form:
1. Atomic Values:
 Each attribute (or column) in a relation must hold atomic values, meaning that it
cannot be further divided into smaller pieces of data.
 This ensures that each attribute contains only a single value from the domain of its
respective data type.
2. Unique Column Names:
 Each column in a relation must have a unique name.
 This prevents ambiguity and ensures that each attribute is uniquely identifiable
within the relation.
3. Unique Rows:

 Each row in a relation must be unique, meaning that there should be no duplicate
rows.
 This uniqueness is typically enforced through the presence of a primary key,
which uniquely identifies each row in the relation.
4. Ordering of Rows:
 The order in which rows appear in a relation is generally considered irrelevant in
First Normal Form.
 The relational model treats relations as unordered sets of tuples, meaning that the
DBMS is free to store and retrieve rows in any order, unless an explicit ordering
is specified in a query.
First Normal Form ensures that the database schema is free from certain structural problems,
such as repeating groups and multi-valued cells. It promotes data integrity, simplifies data
management, and lays the groundwork for further normalization.
EXAMPLE
First Normal Form rule defines that all the attributes in a relation must have atomic domains. The
values in an atomic domain are indivisible units.

We re-arrange the relation (table) as below, to convert it to First Normal Form.

Each attribute must contain only a single value from its pre-defined domain.
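
The re-arrangement into First Normal Form can be sketched in code. The rows below are hypothetical and stand in for any table where one cell holds several values. The function name to_1nf and the comma separator are likewise assumptions made for this example.

```python
def to_1nf(rows, multi_attr, sep=","):
    """Split a multi-valued cell so that every row holds one atomic value."""
    flat = []
    for row in rows:
        for value in row[multi_attr].split(sep):
            new_row = dict(row)              # copy the other attributes as-is
            new_row[multi_attr] = value.strip()
            flat.append(new_row)
    return flat


# Hypothetical unnormalized data: the "phones" cell holds a list of values.
unnormalized = [
    {"student_id": 1, "name": "Asha", "phones": "98400-11111, 98400-22222"},
    {"student_id": 2, "name": "Ravi", "phones": "98400-33333"},
]

flat = to_1nf(unnormalized, "phones")
for row in flat:
    print(row)
```

After the split, student_id alone no longer identifies a row; in this sketch the pair (student_id, phones) does.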

2.5 SECOND NORMAL FORM


• If a relation is in First Normal Form and every non-primary-key attribute is fully
functionally dependent on the primary key, then the relation is in Second Normal Form
(2NF).
• The Second Normal Form (2NF) is based on the concept of full functional dependency.
• The Second Normal Form applies to relations with composite keys, that is, relations with
a primary key composed of two or more attributes.
• A relation in 1NF with a single-attribute primary key is automatically in at least 2NF.
A relation that is not in 2NF may suffer from update anomalies.
• To be in the second normal form, a relation must be in the first normal form and
must not contain any partial dependency.
• A relation is in 2NF if it has no partial dependency, i.e., no non-prime attribute
(an attribute that is not part of any candidate key) is dependent on any proper subset of
any candidate key of the table.
Rules for Second Normal Form
Rule 1: It should be in the first normal form.
Rule 2: It should not have any partial dependencies. This means that all non-key attributes are
fully functionally dependent on the primary key.
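
Rule 2 can be checked mechanically once the key and the functional dependencies are written down. The sketch below looks for non-prime attributes that are determined by a proper subset of a composite key. The relation (StudentID, CourseID, StudentName, Grade), its dependencies, and the function names are hypothetical, and the sketch assumes the relation has a single candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, non_prime, fds):
    """List (key_part, attribute) pairs that violate 2NF."""
    found = []
    key = sorted(key)
    for size in range(1, len(key)):          # proper, non-empty subsets only
        for part in combinations(key, size):
            for attr in sorted(closure(part, fds) & set(non_prime)):
                found.append((part, attr))
    return found


# Hypothetical relation: Enrollment(StudentID, CourseID, StudentName, Grade)
fds = [
    ({"StudentID"}, {"StudentName"}),
    ({"StudentID", "CourseID"}, {"Grade"}),
]
violations = partial_dependencies(
    {"StudentID", "CourseID"}, {"StudentName", "Grade"}, fds
)
print(violations)
```

Here StudentName depends on StudentID alone, so the relation is not in 2NF. Grade depends on the whole key and causes no violation.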
Key characteristics and requirements of Second Normal Form
1. First Normal Form (1NF) Compliance:
 Before a relation can satisfy 2NF, it must first comply with the requirements of
First Normal Form (1NF).
 This means that every attribute in the relation must contain atomic values, and
each row must be uniquely identifiable.
2. Elimination of Partial Dependencies:
 Second Normal Form aims to eliminate partial dependencies within a relation.
 A partial dependency occurs when a non-prime attribute (an attribute that is not part
of any candidate key) is functionally dependent on only a portion of the primary key,
rather than the entire primary key.
 To satisfy 2NF, all non-prime attributes must be fully functionally dependent on
the entire primary key.
3. Functional Dependency:
 A functional dependency exists when the value of one attribute uniquely
determines the value of another attribute within a relation.

 In the context of 2NF, all non-prime attributes should be functionally dependent
on the entire primary key.
 This ensures that each non-prime attribute is related to the entire primary key,
rather than just a portion of it.
4. Primary Key Determination:
 Determining the primary key of a relation is crucial for ensuring 2NF compliance.
The primary key uniquely identifies each row in the relation.
 It should be minimal (no attribute can be removed without losing uniqueness), because
partial dependencies are defined relative to the attributes of the key.
 Once the primary key is fixed, each non-prime attribute can be checked for full
functional dependence on it.
5. Normalization Process:
 Achieving Second Normal Form often involves decomposing a relation into
multiple smaller relations to remove partial dependencies.
 This decomposition process ensures that each relation represents a single entity
and that all attributes are fully functionally dependent on the primary key.
 By normalizing the database schema to 2NF, you can minimize redundancy,
improve data integrity, and facilitate efficient querying and maintenance.
EXAMPLE
Second Normal Form (2NF) can be illustrated using a table representing information about
students and the courses they are enrolled in:
Example Before Normalization (Not in 2NF):
Student ID   Student Name   Course ID   Course Name   Instructor
1            John Doe       101         Math          Dr. Smith
1            John Doe       102         Physics       Dr. Johnson
2            Jane Smith     101         Math          Dr. Smith
3            Alice Brown    103         Biology       Dr. Williams
3            Alice Brown    104         Chemistry     Dr. Davis

In this example, the primary key is the combination of "Student ID" and "Course ID". The
"Course Name" and "Instructor" columns depend only on "Course ID", and "Student Name" depends
only on "Student ID". These are partial dependencies, so the table violates 2NF.
Example After Normalization (In 2NF):
To bring the table into 2NF, we need to split it into three separate tables: one for students,
one for student-course enrollment information, and one for course details, including instructors.
Students Table
Student ID   Student Name
1            John Doe
2            Jane Smith
3            Alice Brown
Student Courses Table
Student ID   Course ID
1            101
1            102
2            101
3            103
3            104
Courses Table
Course ID   Course Name   Instructor
101         Math          Dr. Smith
102         Physics       Dr. Johnson
103         Biology       Dr. Williams
104         Chemistry     Dr. Davis

The "Course Name" and "Instructor" attributes are fully functionally dependent on the primary
key of the "Courses" table, which consists of the "Course ID", and "Student Name" is fully
dependent on "Student ID" in the "Students" table. Each table represents a single entity, and
there are no partial dependencies within any table, satisfying the requirements of Second
Normal Form (2NF).
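As a rough sketch of how partial dependencies can be spotted, the Python below tests which non-key columns of the un-normalized enrollment table are determined by only part of the composite key. The helper names fd_holds and partial_dependencies are illustrative assumptions. Checking sample rows can only suggest a dependency; whether it really holds is a design decision about the data.

```python
from itertools import combinations

COLUMNS = ("student_id", "student_name", "course_id", "course_name", "instructor")
ROWS = [
    (1, "John Doe", 101, "Math", "Dr. Smith"),
    (1, "John Doe", 102, "Physics", "Dr. Johnson"),
    (2, "Jane Smith", 101, "Math", "Dr. Smith"),
    (3, "Alice Brown", 103, "Biology", "Dr. Williams"),
    (3, "Alice Brown", 104, "Chemistry", "Dr. Davis"),
]


def fd_holds(rows, columns, lhs, rhs):
    """True if the sample rows never pair one lhs value with two rhs values."""
    seen = {}
    for row in rows:
        record = dict(zip(columns, row))
        key = tuple(record[c] for c in lhs)
        value = tuple(record[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_dependencies(rows, columns, primary_key):
    """List (key_subset, attribute) pairs where a proper subset of the key suffices."""
    non_key = [c for c in columns if c not in primary_key]
    found = []
    for size in range(1, len(primary_key)):
        for subset in combinations(primary_key, size):
            for attr in non_key:
                if fd_holds(rows, columns, subset, (attr,)):
                    found.append((subset, attr))
    return found


violations = partial_dependencies(ROWS, COLUMNS, ("student_id", "course_id"))
for subset, attr in violations:
    print(subset, "->", attr)
```

Each reported pair names an attribute that belongs in a table keyed by that part of the key alone.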
2.6 THIRD NORMAL FORM
 Third Normal Form (3NF) is a crucial concept in database normalization, building upon
the principles of First Normal Form (1NF) and Second Normal Form (2NF).
 It aims to further reduce data redundancy and dependency by eliminating transitive
dependencies within a relation.
 3NF builds on 2NF by requiring that all non-key attributes are independent of each other.
 This means that each column should be directly related to the primary key, and not to any
other columns in the same table.

Rules for Third Normal Form
Rule 1: It should be in second normal form
Rule 2: It should not have any transitive dependencies for non-prime attributes
Key characteristics and requirements of Third Normal Form:
1. Elimination of Transitive Dependencies:
 Third Normal Form aims to eliminate transitive dependencies within a relation.
 A transitive dependency occurs when a non-prime attribute is functionally
dependent on another non-prime attribute, rather than directly on the primary key.
 To satisfy 3NF, all non-prime attributes must be dependent only on the primary
key, not on any other non-prime attribute.
2. Functional Dependency:
 Similar to Second Normal Form, functional dependencies play a crucial role in
3NF.
 Each attribute in the relation should be functionally dependent on the primary
key, ensuring that every non-prime attribute is related directly to the primary key
and not indirectly through another non-prime attribute.
3. Elimination of Redundancy:
 By removing transitive dependencies, Third Normal Form helps minimize
redundancy in the database schema.
 Redundancy occurs when the same information is stored multiple times in the
database, leading to wasted storage space and potential inconsistencies.
 Achieving 3NF reduces redundancy and promotes data integrity.
4. Normalization Process:
 Achieving Third Normal Form often involves further decomposition of relations
to remove transitive dependencies.
 This decomposition process ensures that each attribute is functionally dependent
only on the primary key, leading to a well-structured and normalized database
schema.
 By normalizing the database to 3NF, you can improve data integrity, minimize
redundancy, and facilitate efficient querying and maintenance.

Third Normal Form (3NF) builds upon the requirements of Second Normal Form (2NF)
by eliminating transitive dependencies within a relation. By ensuring that all non-prime attributes
are dependent only on the primary key, 3NF helps create a more efficient, well-structured, and
normalized database schema.
A transitive dependency is an indirect functional dependency: if X → Y and Y → Z hold
(and Y does not determine X), then Z depends on X transitively through Y.
For a table to have no transitive dependencies, no non-prime attribute may determine
another non-prime attribute. In a table in 3NF, a non-prime attribute can be determined
only by a superkey.
EXAMPLE
Transitive Dependency Example:
Consider a table representing information about employees, their departments, and the
locations of those departments:

Employee ID   Employee Name   Department ID   Department Name   Department Location
1             John Doe        101             HR                New York
2             Jane Smith      102             Marketing         Los Angeles
3             Alice Brown     101             HR                New York
In this example, the primary key is "Employee ID". "Employee ID" determines "Department
ID", and "Department ID" in turn determines "Department Name" and "Department Location".
Therefore, "Department Name" and "Department Location" are transitively dependent on
"Employee ID" through "Department ID".
Example of Third Normal Form:
To bring the table into 3NF, we need to eliminate the transitive dependency. This
involves splitting the table into two separate tables: one for department details and another for
employee information.
Departments Table
Department ID   Department Name   Department Location
101             HR                New York
102             Marketing         Los Angeles

Employees Table
Employee ID   Employee Name   Department ID
1             John Doe        101
2             Jane Smith      102
3             Alice Brown     101
Each table represents a single entity, and there are no transitive dependencies. Each non-
prime attribute depends directly on the primary key of its table, satisfying the requirements of
Third Normal Form (3NF). This results in a more efficient and well-structured database schema.
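The split can be checked with a small Python sketch: project the original employee rows onto the two new tables, then join them back on the department ID. The helpers project and natural_join are illustrative names, not part of any library, and the column names are shortened forms of the headings in the example.

```python
EMPLOYEE_ROWS = [
    {"emp_id": 1, "emp_name": "John Doe", "dept_id": 101,
     "dept_name": "HR", "dept_location": "New York"},
    {"emp_id": 2, "emp_name": "Jane Smith", "dept_id": 102,
     "dept_name": "Marketing", "dept_location": "Los Angeles"},
    {"emp_id": 3, "emp_name": "Alice Brown", "dept_id": 101,
     "dept_name": "HR", "dept_location": "New York"},
]


def project(rows, columns):
    """Relational projection: keep the chosen columns and drop duplicate rows."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]


def natural_join(left, right):
    """Join two row lists on every column name they have in common."""
    shared = [c for c in left[0] if c in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[c] == r[c] for c in shared)
    ]


employees = project(EMPLOYEE_ROWS, ["emp_id", "emp_name", "dept_id"])
departments = project(EMPLOYEE_ROWS, ["dept_id", "dept_name", "dept_location"])
rejoined = natural_join(employees, departments)

print(len(departments), "department rows instead of", len(EMPLOYEE_ROWS))
print(rejoined == EMPLOYEE_ROWS)
```

The HR department's name and location are now stored once, and the join recovers exactly the original rows, so nothing was lost by the decomposition.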
2.7 DEPENDENCY PRESERVATION
 Dependency Preserving Decomposition is a technique used in DBMS to decompose a
relation into smaller relations while preserving the functional dependencies between the
attributes.
 The goal is to reduce redundancy while keeping every functional dependency checkable
within a single relation, so that constraints can be enforced without joins.
 In this technique, the original relation is decomposed into smaller relations in such a way
that the resulting relations preserve the functional dependencies of the original relation.
 This is important because if a decomposition loses one of the original functional
dependencies, that dependency can be checked only by joining relations, and unchecked
updates can lead to data inconsistencies and anomalies.
 The two standard decomposition algorithms are the Boyce-Codd Normal Form (BCNF)
decomposition and the Third Normal Form (3NF) synthesis algorithm.
 Both are based on the concept of functional dependencies and use them to
identify the attributes that should be grouped together to form smaller relations.
 The BCNF decomposition algorithm decomposes a relation into smaller relations in such
a way that each resulting relation is in BCNF. It is always lossless, but it does not
always preserve every functional dependency.
 BCNF is a stricter normal form than 3NF; a relation in 3NF can fail BCNF only when it
has overlapping candidate keys.
 The 3NF synthesis algorithm decomposes a relation into smaller relations in such a way
that each resulting relation is in 3NF, and the result is always lossless and
dependency preserving.
 3NF is a normal form that ensures that no non-prime attribute is transitively dependent
on a key of the relation.

Dependency preserving decomposition is an important technique in DBMS for improving
database efficiency while maintaining data consistency and integrity. It is important to choose
the right decomposition algorithm based on the specific requirements of the database to achieve
the desired results.
For example, in a relation representing student information, if Student_ID uniquely
determines Student_Name, then this is a functional dependency. When decomposing this relation
into smaller ones, such as separating student names into a separate table, you must ensure that
this functional dependency is preserved. This means that even after decomposition, knowing the
Student_ID should still allow you to determine the corresponding Student_Name accurately.
EXAMPLE
Let's consider a simple example to illustrate dependency preservation in DBMS through
normalization.
We have a relation (table) called Employee with the following attributes:
Employee_ID (Primary Key)
Employee_Name
Department_ID
Department_Name
In this relation, we observe the following functional dependencies:
Employee_ID → Employee_Name (Each Employee_ID uniquely determines the
Employee_Name)
Department_ID → Department_Name (Each Department_ID uniquely determines the
Department_Name)
Now, let's normalize this relation to 3NF (Third Normal Form). The process involves
identifying and removing transitive dependencies.
Step 1: First Normal Form (1NF)
Ensure that each attribute contains atomic values.
Our relation is already in 1NF because each attribute contains atomic values.
Step 2: Second Normal Form (2NF)
Remove partial dependencies.
In our relation, there are no partial dependencies since all non-prime attributes (Employee_Name
and Department_Name) are fully functionally dependent on the whole primary key.

Step 3: Third Normal Form (3NF)
Remove transitive dependencies.
We have a transitive dependency in the relation: Employee_ID → Department_ID and
Department_ID → Department_Name. Department_Name therefore depends on the primary key only
through Department_ID, which is a non-prime attribute. To remove
this dependency, we decompose the relation into two smaller relations:
Employee_Details:
Employee_ID (Primary Key)
Employee_Name
Department_ID (Foreign Key)
Department_Details:
Department_ID (Primary Key)
Department_Name
Let's ensure dependency preservation:
Employee_Details preserves the dependency Employee_ID → Employee_Name.
Department_Details preserves the dependency Department_ID → Department_Name
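The preservation check can also be done mechanically. The sketch below uses the standard attribute-closure test: a dependency X → Y is preserved if Y can be reached from X using only dependencies that are visible inside the individual sub-relations. The function names are illustrative assumptions, and the dependency Employee_ID → Department_ID is included because Employee_ID is the primary key.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves_dependencies(fds, decomposition):
    """True if every FD can be derived from the FDs projected onto the sub-relations."""
    for lhs, rhs in fds:
        reachable = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                # Attributes derivable while staying inside this sub-relation.
                gained = closure(reachable & schema, fds) & schema
                if not gained <= reachable:
                    reachable |= gained
                    changed = True
        if not rhs <= reachable:
            return False
    return True


FDS = [
    (frozenset({"Employee_ID"}), frozenset({"Employee_Name", "Department_ID"})),
    (frozenset({"Department_ID"}), frozenset({"Department_Name"})),
]
GOOD = [
    frozenset({"Employee_ID", "Employee_Name", "Department_ID"}),  # Employee_Details
    frozenset({"Department_ID", "Department_Name"}),               # Department_Details
]
# Hypothetical alternative split that separates Department_ID from Department_Name.
BAD = [
    frozenset({"Employee_ID", "Employee_Name", "Department_Name"}),
    frozenset({"Employee_ID", "Department_ID"}),
]

good_preserved = preserves_dependencies(FDS, GOOD)
bad_preserved = preserves_dependencies(FDS, BAD)
print(good_preserved, bad_preserved)
```

The second split is still lossless, because Employee_ID is a key of both parts, but Department_ID → Department_Name can no longer be checked inside any single table.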
The Principles Of Dependency Preservation In DBMS
1. Atomicity: Each attribute (column) in a relation should contain atomic values. This
means that the values cannot be divided further.
2. Functional Dependency: Functional dependencies describe the relationships between
attributes in a relation. A functional dependency exists when knowing the value of one
attribute uniquely determines the value of another attribute.
3. Normalization: The process of normalization aims to organize the attributes and tables
of a relational database to minimize redundancy and dependency. This process involves
decomposing larger relations into smaller ones while ensuring that dependencies are
preserved.
4. Normalization Forms: Normalization forms provide guidelines for structuring relations
to minimize redundancy and dependency. The most common normalization forms
include First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form
(3NF), and Boyce-Codd Normal Form (BCNF), each addressing specific types of
dependencies.

5. Preservation of Dependencies: During the normalization process, it's essential to
preserve the functional dependencies that exist in the original relation. This ensures that
the relationships between attributes remain intact even after decomposition.
6. Lossless Decomposition: Decomposing a relation into smaller relations should not result
in the loss of information. A decomposition is considered lossless if it allows us to
reconstruct the original relation through a join operation without losing any information.
7. Data Integrity: Dependency preservation contributes to maintaining data integrity,
ensuring that the database remains consistent and accurate over time. By preserving
dependencies, we minimize the risk of anomalies such as insertion, update, and deletion
anomalies.
2.8 BOYCE/CODD NORMAL FORM
The Boyce-Codd Normal Form (BCNF) is a specific condition in relational database
normalization that ensures the elimination of certain types of anomalies by refining the structure
of relations (tables).
In simpler terms, a relation is in BCNF if and only if, for every non-trivial
functional dependency X → Y that holds in the relation (where X and Y are sets of
attributes), X is a superkey. Here, a superkey is a set of attributes that
uniquely identifies each tuple (row) in the relation.
BCNF ensures that every determinant (X) of a non-trivial functional dependency is a
superkey, that is, it contains a candidate key. This removes the redundancy and anomalies
that could arise from dependencies on attributes that do not form a key.
Rules for BCNF
Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a superkey for every non-trivial functional dependency (FD) X → Y in a given relation.
 To determine the highest normal form of a given relation R with functional
dependencies, the first step is to check whether the BCNF condition holds.
 If R is found to be in BCNF, it can be safely deduced that the relation is also in 3NF,
2NF, and 1NF as the hierarchy shows.
 The 1NF has the least restrictive constraint – it only requires a relation R to have
atomic values in each tuple.
 The 2NF has a slightly more restrictive constraint.

 The 3NF has a more restrictive constraint than the first two normal forms but is less
restrictive than the BCNF. In this manner, the restriction increases as we traverse down
the hierarchy.
Examples
Stu_Course
Stu_ID   Stu_Branch                                Stu_Course             Branch_Number   Stu_Course_No
101      Computer Science & Engineering            DBMS                   B_001           201
101      Computer Science & Engineering            Computer Networks      B_001           202
102      Electronics & Communication Engineering   VLSI Technology        B_003           401
102      Electronics & Communication Engineering   Mobile Communication   B_003           402

The functional dependencies of the above table are:

Stu_ID → Stu_Branch
Stu_Course → {Branch_Number, Stu_Course_No}

The only candidate key of the above table is {Stu_ID, Stu_Course}.


To bring this table into BCNF, we have to decompose it into further tables. Here is the
full procedure through which we transform this table into BCNF. We divide the main
table into three tables: the Stu_Branch table, the Stu_Course table, and a table mapping
Stu_ID to Stu_Course_No.

Stu_Branch Table
Stu_ID   Stu_Branch
101      Computer Science & Engineering
102      Electronics & Communication Engineering

Candidate Key for this table: Stu_ID.


Stu_Course Table
Stu_Course Branch_Number Stu_Course_No

DBMS B_001 201

Computer Networks B_001 202

VLSI Technology B_003 401

Mobile Communication B_003 402

Candidate Key for this table: Stu_Course.

Stu_ID to Stu_Course_No Table


Stu_ID Stu_Course_No

101 201

101 202

102 401

102 402

Candidate Key for this table: {Stu_ID, Stu_Course_No}.


After this decomposition, every table is in BCNF: in each table, the left-hand side X of
every non-trivial functional dependency X → Y is a superkey.
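The superkey condition can be tested with an attribute-closure computation. The Python below is a minimal sketch under one simplifying assumption: for each table it considers only the listed dependencies whose attributes all lie inside that table, rather than computing the full projection of the dependency set. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the non-trivial FDs inside schema whose left side is not a superkey."""
    local = [(lhs, rhs) for lhs, rhs in fds if (lhs | rhs) <= schema]
    return [
        (lhs, rhs)
        for lhs, rhs in local
        if not rhs <= lhs and not closure(lhs, local) >= schema
    ]


FDS = [
    (frozenset({"Stu_ID"}), frozenset({"Stu_Branch"})),
    (frozenset({"Stu_Course"}), frozenset({"Branch_Number", "Stu_Course_No"})),
]
ORIGINAL = frozenset(
    {"Stu_ID", "Stu_Branch", "Stu_Course", "Branch_Number", "Stu_Course_No"}
)
DECOMPOSED = [
    frozenset({"Stu_ID", "Stu_Branch"}),
    frozenset({"Stu_Course", "Branch_Number", "Stu_Course_No"}),
    frozenset({"Stu_ID", "Stu_Course_No"}),
]

original_violations = bcnf_violations(ORIGINAL, FDS)
decomposed_violations = [bcnf_violations(table, FDS) for table in DECOMPOSED]
print(len(original_violations), decomposed_violations)
```

Both dependencies violate BCNF in the original table because neither Stu_ID nor Stu_Course alone determines all five attributes, while none of the three smaller tables has a violating dependency.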

2.9 MULTIVALUED DEPENDENCY
 A multivalued dependency occurs when two attributes in a table are independent of each
other but both depend on a third attribute.
 A multivalued dependency therefore always involves at least three attributes: two that
are mutually independent and a third on which both depend.
Example: Suppose a bike manufacturer produces each model in two colors (white
and black) every year.

BIKE_MODEL MANUF_YEAR COLOR

M2001 2008 White

M2001 2008 Black

M3001 2013 White

M3001 2013 Black

M4006 2017 White

M4006 2017 Black


Here the columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and
independent of each other.
In this case, these two columns are said to be multivalued dependent on
BIKE_MODEL.
The representation of these dependencies is shown below:
BIKE_MODEL →→ MANUF_YEAR
BIKE_MODEL →→ COLOR

This can be read as "BIKE_MODEL multidetermines MANUF_YEAR" and "BIKE_MODEL
multidetermines COLOR".
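A multivalued dependency X →→ Y can be tested on sample rows: for each value of X, the rows must contain every combination of the Y values and the values of the remaining attributes seen for that X. The sketch below applies this to some of the bike rows. The function name mvd_holds and the M5000 counterexample are illustrative assumptions.

```python
from itertools import product


def mvd_holds(rows, columns, lhs, rhs):
    """Check the multivalued dependency lhs ->> rhs on sample rows."""
    position = {name: i for i, name in enumerate(columns)}
    rest = [c for c in columns if c not in lhs and c not in rhs]

    def pick(row, names):
        return tuple(row[position[n]] for n in names)

    # Group the (rhs, rest) value pairs by their lhs value.
    groups = {}
    for row in rows:
        groups.setdefault(pick(row, lhs), set()).add((pick(row, rhs), pick(row, rest)))

    # Within each group, every rhs value must appear with every rest value.
    for pairs in groups.values():
        rhs_values = {r for r, _ in pairs}
        rest_values = {z for _, z in pairs}
        if pairs != set(product(rhs_values, rest_values)):
            return False
    return True


COLUMNS = ("BIKE_MODEL", "MANUF_YEAR", "COLOR")
BIKES = [
    ("M3001", 2013, "White"),
    ("M3001", 2013, "Black"),
    ("M4006", 2017, "White"),
    ("M4006", 2017, "Black"),
]
# Hypothetical rows where color and year are NOT independent for a model.
MIXED = [("M5000", 2013, "White"), ("M5000", 2014, "Black")]

bikes_ok = mvd_holds(BIKES, COLUMNS, ("BIKE_MODEL",), ("COLOR",))
mixed_ok = mvd_holds(MIXED, COLUMNS, ("BIKE_MODEL",), ("COLOR",))
print(bikes_ok, mixed_ok)
```

In the counterexample the model appears in White only for 2013 and in Black only for 2014, so the two attributes are not independent and the dependency fails.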
Functionality and Features
 Identifying and managing complex relationships
 Describing a relationship between three or more columns in a database table
 Guiding the process of data normalization
 Ensuring efficient data storage by limiting data redundancy

 Facilitating effective querying by making database structure easier to understand
 Helping in maintaining data consistency and integrity
Benefits and Use Cases
 Being the backbone of 4NF, it helps in eliminating data redundancy
 It supports the creation of comprehensive and reliable data models
 It encourages efficient query processing and performance
Challenges and Limitations
 It can become complex to manage when there are many MVDs
 Incorrectly defined MVDs may lead to data inconsistency
 It might not be ideal for small-scale databases or simple data models
FOURTH NORMAL FORM (4NF)
 The Fourth Normal Form (4NF) is a level of database normalization in which every
non-trivial multivalued dependency X →→ Y has a superkey X as its determinant.
 It builds on the first three normal forms (1NF, 2NF, and 3NF) and the Boyce-Codd
Normal Form (BCNF).
 It states that, in addition to meeting the requirements of BCNF, a relation must not
store two or more independent multivalued facts about the same entity.
Properties
A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. It should not have any non-trivial multivalued dependency whose determinant is not a superkey.
A table with a multivalued dependency violates the normalization standard of the Fourth
Normal Form (4NF) because it creates unnecessary redundancies and can contribute to
inconsistent data. To bring this up to 4NF, it is necessary to break this information into two
tables.
Example: Consider a class database with two relations: R1 contains
student ID (SID) and student name (SNAME), and R2 contains course ID (CID) and course
name (CNAME).

Table R1

SID SNAME

S1 A

S2 B

Table R2
CID CNAME

C1 C

C2 D

Taking their cross product produces a table with multivalued dependencies.


Table R1 X R2
SID SNAME CID CNAME

S1 A C1 C

S1 A C2 D

S2 B C1 C

S2 B C2 D

Multivalued dependencies (MVD) are:


SID →→ CID; SID →→ CNAME; SNAME →→ CNAME
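The effect of these dependencies, and of the 4NF fix, can be sketched in Python: build the cross product, then split it back into a student table and a course table and confirm that nothing is lost. The variable names are illustrative.

```python
from itertools import product

R1 = [("S1", "A"), ("S2", "B")]  # (SID, SNAME)
R2 = [("C1", "C"), ("C2", "D")]  # (CID, CNAME)

# Cross product R1 X R2: every student is paired with every course.
combined = [student + course for student, course in product(R1, R2)]

# 4NF decomposition: project the combined table back onto its independent parts.
students = sorted({row[:2] for row in combined})
courses = sorted({row[2:] for row in combined})

# Re-combining the two projections reproduces the combined table exactly.
rebuilt = [student + course for student, course in product(students, courses)]

print(len(combined), "combined rows;", len(students) + len(courses), "rows after the split")
print(rebuilt == combined)
```

The combined table repeats each student name once per course and each course name once per student; the two smaller tables store each fact once.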
2.10 JOIN DEPENDENCY
 Join dependency is a further generalization of multivalued dependency.
 If the join of R1 and R2 over C is equal to relation R, then we can say that a
join dependency (JD) exists, where R1(A, B, C) and R2(C, D) are a decomposition
of a given relation R(A, B, C, D).
 Equivalently, R1 and R2 are a lossless decomposition of R. A JD ⋈ {R1, R2, …, Rn}
is said to hold over a relation R if R1, R2, …, Rn is a lossless-join decomposition of R.

 *((A, B, C), (C, D)) is a JD of R if the natural join of R1 and R2 over their common
attribute C is equal to the relation R.
 Here, *(R1, R2, R3, …) indicates that the relations R1, R2, R3 and so on are a JD of
R. Let R be a relation schema and R1, R2, R3, …, Rn be a decomposition of R. An instance
r(R) is said to satisfy the join dependency *(R1, R2, …, Rn) if and only if

π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r) = r
Example:
Table R1
Company Product

C1 Pendrive

C1 mic

C2 speaker

C2 speaker

Company->->Product
Table R2
Agent Company

Aman C1

Aman C2

Mohan C1

Agent->->Company

Table R3
Agent Product

Aman Pendrive

Aman mic

Aman speaker

Mohan speaker

Agent->->Product
Table R1⋈R2⋈R3
Company Product Agent

C1 Pendrive Aman

C1 mic Aman

C2 speaker Aman

C1 speaker Aman

Agent->->Product

FIFTH NORMAL FORM / PROJECT-JOIN NORMAL FORM (5NF)


A relation R is in Fifth Normal Form (also called Project-Join Normal Form) if and only if
every non-trivial join dependency in R is implied by the candidate keys of R. Any
decomposition of a relation must have the lossless-join property, which ensures that no
spurious or extra tuples are generated when the relations are reunited through a natural join.
Properties
A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should be already in 4NF.

2. It cannot be decomposed further without loss, that is, it has no join dependency that is not implied by its candidate keys.

Example – Consider the above schema with the rule: "if a company makes a product and an
agent is an agent for that company, then he always sells that product for the company". Under
these circumstances, the ACP (Agent, Company, Product) table is as follows:
Table ACP

Agent Company Product

A1 PQR Nut

A1 PQR Bolt

A1 XYZ Nut

A1 XYZ Bolt

A2 PQR Nut

The relation ACP is decomposed into three relations, R1, R2 and R3, shown below. The natural
join of all three relations reproduces ACP.
Table R1
Agent Company

A1 PQR

A1 XYZ

A2 PQR

Table R2
Agent Product

A1 Nut

A1 Bolt

A2 Nut

Table R3
Company Product

PQR Nut

PQR Bolt

XYZ Nut

XYZ Bolt

The natural join of R1 and R3 over 'Company', followed by the natural join of the
result (R13) with R2 over 'Agent' and 'Product', yields Table ACP.
Hence, in this example, the redundancies are eliminated, and the decomposition of
ACP is a lossless-join decomposition. Therefore, the decomposed relations satisfy 5NF, as
the decomposition does not violate the lossless-join property.
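The two-step join described above can be traced in Python using the rows of Table ACP. The set names are illustrative; each tuple is written in (Agent, Company, Product) order.

```python
ACP = {
    ("A1", "PQR", "Nut"),
    ("A1", "PQR", "Bolt"),
    ("A1", "XYZ", "Nut"),
    ("A1", "XYZ", "Bolt"),
    ("A2", "PQR", "Nut"),
}

# Project ACP onto the three pairs of attributes.
r1 = {(agent, company) for agent, company, _ in ACP}      # Agent, Company
r2 = {(agent, product) for agent, _, product in ACP}      # Agent, Product
r3 = {(company, product) for _, company, product in ACP}  # Company, Product

# Step 1: natural join of R1 and R3 over Company.
r13 = {
    (agent, company, product)
    for agent, company in r1
    for other_company, product in r3
    if company == other_company
}

# Step 2: natural join of R13 with R2 over Agent and Product.
rejoined = {row for row in r13 if (row[0], row[2]) in r2}

print(sorted(r13 - ACP))
print(rejoined == ACP)
```

The intermediate result R13 contains one spurious tuple, (A2, PQR, Bolt); the second join removes it because R2 has no (A2, Bolt) row, so the three-way join returns exactly ACP.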
