
Unit II Database Design 1

Uploaded by

anithaselvi92
Copyright
© © All Rights Reserved

UNIT II DATABASE DESIGN

Entity-Relationship model – E-R Diagrams – Enhanced-ER Model – ER-to-Relational Mapping – Functional
Dependencies – Non-loss Decomposition – First, Second, Third Normal Forms, Dependency Preservation
– Boyce/Codd Normal Form – Multi-valued Dependencies and Fourth Normal Form – Join Dependencies
and Fifth Normal Form.

Entity-Relationship model: pg no: 262

The Entity-Relationship (ER) model is a conceptual framework used to describe the
structure of databases in terms of entities, their attributes, and the relationships
between them. It is a popular tool in database design because it helps in
understanding the organization of data and how different pieces of data relate to
each other.

Entity: pg no: 262
Entity set: pg no: 262
Attributes: pg no: 263
Values: pg no: 263
Relationship: pg no: 264
Relationship instance: pg no: 264
Descriptive attributes: pg no: 265

Attribute types: pg no: 267 and fig: 7.4

• Simple and composite
• Single-valued and multivalued
• Derived
Weak entity set:
A weak entity set is an entity set that does not have enough attributes of its own
to form a primary key. Instead, it depends on a related entity set, called the
identifying or owner entity set. Its primary key is formed from the primary key of
the owner entity set together with its own partial key (discriminator). Weak entities
typically represent concepts or objects that depend on another entity for their identity.
E-R Diagrams: pg no: 274

• Basic structure: pg no: 275
• Mapping cardinality: pg no: 276

Sample:
ER-to-Relational Mapping:

Binary Relationship with 1:1 cardinality with total participation of an entity

First, convert each entity and relationship to a table. The Person table corresponds to the Person
entity, with key Per-Id. Similarly, the Passport table corresponds to the Passport entity, with key
Pass-No. The Has table represents the relationship between Person and Passport (which person has
which passport), so it takes attribute Per-Id from Person and Pass-No from Passport.

Person
Per-Id   Other Person Attribute
PR1      –
PR2      –
PR3      –

Has
Per-Id   Pass-No
PR1      PS1
PR2      PS2

Passport
Pass-No  Other Passport Attribute
PS1      –
PS2      –

Binary Relationship with 1:1 cardinality and partial participation of both entities :

First, convert each entity and relationship to a table. The Male table corresponds to the Male entity,
with key M-Id. Similarly, the Female table corresponds to the Female entity, with key F-Id. The Marry
table represents the relationship between Male and Female (which male marries which female), so it
takes attribute M-Id from Male and F-Id from Female.

Male
M-Id   Other Male Attribute
M1     –
M2     –
M3     –

Marry
M-Id   F-Id
M1     F2
M2     F1

Female
F-Id   Other Female Attribute
F1     –
F2     –
F3     –

Binary Relationship with n: 1 cardinality

First, convert each entity and relationship to a table. The Student table corresponds to the Student
entity, with key S-Id. Similarly, the Elective_Course table corresponds to the Elective_Course entity,
with key E-Id. The Enrolls table represents the relationship between Student and Elective_Course
(which student enrolls in which course), so it takes attribute S-Id from Student and E-Id from
Elective_Course.

Student
S-Id   Other Student Attribute
S1     –
S2     –
S3     –
S4     –

Enrolls
S-Id   E-Id
S1     E1
S2     E2
S3     E1
S4     E1

Elective_Course
E-Id   Other Elective Course Attribute
E1     –
E2     –
E3     –

Binary Relationship with m: n cardinality

First, convert each entity and relationship to a table. The Student table corresponds to the Student
entity, with key S-Id. Similarly, the Compulsory_Courses table corresponds to the Compulsory_Courses
entity, with key C-Id. The Enrolls table represents the relationship between Student and
Compulsory_Courses (which student enrolls in which course), so it takes attribute S-Id from Student
and C-Id from Compulsory_Courses.

Student
S-Id   Other Student Attribute
S1     –
S2     –
S3     –
S4     –

Enrolls
S-Id   C-Id
S1     C1
S1     C2
S3     C1
S4     C3
S4     C2
S3     C3

Compulsory_Courses
C-Id   Other Compulsory Course Attribute
C1     –
C2     –
C3     –
C4     –
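
The m:n mapping above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the column names s_id and c_id are hypothetical stand-ins for S-Id and C-Id (SQL identifiers cannot contain hyphens), and the "other attributes" columns are left out.

```python
import sqlite3

# In-memory database; s_id and c_id are assumed names for S-Id and C-Id.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (s_id TEXT PRIMARY KEY);
CREATE TABLE compulsory_courses (c_id TEXT PRIMARY KEY);
-- The relationship table takes the key of each side; together they form its key.
CREATE TABLE enrolls (
    s_id TEXT NOT NULL REFERENCES student(s_id),
    c_id TEXT NOT NULL REFERENCES compulsory_courses(c_id),
    PRIMARY KEY (s_id, c_id)
);
""")
conn.executemany("INSERT INTO student VALUES (?)",
                 [("S1",), ("S2",), ("S3",), ("S4",)])
conn.executemany("INSERT INTO compulsory_courses VALUES (?)",
                 [("C1",), ("C2",), ("C3",), ("C4",)])
conn.executemany("INSERT INTO enrolls VALUES (?, ?)",
                 [("S1", "C1"), ("S1", "C2"), ("S3", "C1"),
                  ("S4", "C3"), ("S4", "C2"), ("S3", "C3")])

# One student can have many courses, and one course many students.
courses_of_s1 = [row[0] for row in conn.execute(
    "SELECT c_id FROM enrolls WHERE s_id = ? ORDER BY c_id", ("S1",))]
students_in_c1 = [row[0] for row in conn.execute(
    "SELECT s_id FROM enrolls WHERE c_id = ? ORDER BY s_id", ("C1",))]
print(courses_of_s1)   # ['C1', 'C2']
print(students_in_c1)  # ['S1', 'S3']
```

The composite primary key on enrolls is the design choice that makes the relationship many-to-many: neither s_id nor c_id alone has to be unique.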

Binary Relationship with weak entity

First, convert each entity and relationship to a table. The Employee table corresponds to the Employee
entity, with key E-Id. Similarly, the Dependents table corresponds to the Dependents entity, with key
D-Name and E-Id. The Has table represents the relationship between Employee and Dependents (which
employee has which dependents), so it takes attribute E-Id from Employee and D-Name from Dependents.

Employee
E-Id   Other Employee Attribute
E1     –
E2     –
E3     –

Has
E-Id   D-Name
E1     RAM
E1     SRINI
E2     RAM
E3     ASHISH

Dependents
D-Name   E-Id   Other Dependents Attribute
RAM      E1     –
SRINI    E1     –
RAM      E2     –
ASHISH   E3     –
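
A minimal sqlite3 sketch of the weak entity mapping, under assumptions: the names e_id and d_name are hypothetical stand-ins for E-Id and D-Name, and the separate Has table is omitted because the Dependents table already carries the owner's key. The ON DELETE CASCADE clause is one possible way to express that a dependent cannot exist without its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to take effect
conn.executescript("""
CREATE TABLE employee (e_id TEXT PRIMARY KEY);
-- Weak entity: its key combines the partial key d_name with the owner's key e_id.
CREATE TABLE dependents (
    d_name TEXT NOT NULL,
    e_id   TEXT NOT NULL REFERENCES employee(e_id) ON DELETE CASCADE,
    PRIMARY KEY (d_name, e_id)
);
""")
conn.executemany("INSERT INTO employee VALUES (?)", [("E1",), ("E2",), ("E3",)])
# Two dependents named RAM are allowed because they belong to different employees.
conn.executemany("INSERT INTO dependents VALUES (?, ?)",
                 [("RAM", "E1"), ("SRINI", "E1"), ("RAM", "E2"), ("ASHISH", "E3")])

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE e_id = 'E1'")
remaining = conn.execute(
    "SELECT d_name, e_id FROM dependents ORDER BY d_name, e_id").fetchall()
print(remaining)  # [('ASHISH', 'E3'), ('RAM', 'E2')]
```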

Functional Dependencies: pg no: 329

Functional dependencies describe the relationships between attributes in a relation (table) and
help ensure data integrity and guide normalization. A functional dependency means that the value
of one set of attributes determines the value of another set. For example, if we have
attributes A and B in a relation, we say that B is functionally dependent on A if
each value of A is associated with exactly one value of B.
If X and Y are sets of attributes in a relation R, we say that Y is functionally
dependent on X (denoted as X -> Y) if any two tuples of R that agree on X also
agree on Y.
Example:

StudentID -> Name, Age, Department

This functional dependency means that for every StudentID, there is only one
corresponding Name, Age, and Department. In other words, the StudentID uniquely
determines the Name, Age, and Department of a student.
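
The definition can be checked against a concrete table instance. The sketch below is illustrative only: the function name fd_holds and the sample rows are assumptions, not part of the notes. Note that a check on sample data can refute a dependency but cannot prove that it holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this instance.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    The dependency fails as soon as two rows agree on lhs but differ on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows for the StudentID example.
students = [
    {"StudentID": 1, "Name": "Alice", "Age": 20, "Department": "CS"},
    {"StudentID": 2, "Name": "Bob", "Age": 21, "Department": "CS"},
    {"StudentID": 3, "Name": "Alice", "Age": 22, "Department": "EE"},
]

print(fd_holds(students, ["StudentID"], ["Name", "Age", "Department"]))  # True
print(fd_holds(students, ["Department"], ["Name"]))  # False: CS has Alice and Bob
print(fd_holds(students, ["Name"], ["Age"]))         # False: two Alices differ in Age
```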

Types:

Trivial Functional Dependencies: These are dependencies where the dependent attribute(s) are a
subset of the determinant attribute(s). That is, X -> Y is trivial if Y is a subset of X; for example,
{StudentID, Name} -> Name.
Non-Trivial Functional Dependencies: These are dependencies where the dependent attribute(s) are
not a subset of the determinant attribute(s). They convey significant relationships between attributes in
the database.

Full Functional Dependencies: Y is fully functionally dependent on X if X -> Y holds and removing any
attribute from X would cause the dependency to no longer hold. In other words, Y is fully dependent
on X if X -> Y holds and, for every proper subset X' of X, X' -> Y does not hold.

Partial Functional Dependencies: These are dependencies where an attribute is functionally dependent
on only a part of a composite key (multiple attributes acting as the primary key).

Transitive Functional Dependencies: Y is transitively dependent on X if there exists a set of
attributes Z such that X -> Z and Z -> Y hold, Z does not determine X, and Y is not a subset of Z.

Multivalued Dependencies (MVDs): A multivalued dependency X ->> Y states that the set of Y values
associated with a given X value is independent of the values of the remaining attributes. MVDs
generalize functional dependencies and are the basis for normalization beyond Boyce-Codd Normal
Form, namely Fourth Normal Form (4NF).

Decomposition

Decomposition in the context of database management refers to the process of breaking down a single
relation (table) into multiple smaller relations. This process is typically carried out during database
normalization to reduce redundancy, improve data integrity, and facilitate efficient data management.

Types :

Non loss decomposition

Lossy decomposition:

Lossy decomposition, also known as lossy join decomposition, refers to a decomposition in which the
natural join of the decomposed relations does not reproduce the original relation exactly. The join
typically contains spurious tuples that were not in the original relation, so the original information
cannot be recovered. With a lossless decomposition, by contrast, the original relation can be fully
reconstructed.

Non loss decomposition:

Non-loss decomposition, also known as lossless decomposition, is a crucial property in database
normalization. When you decompose a relation (table) into multiple smaller relations to eliminate
redundancy, you want to ensure that you can reconstruct the original relation without
any loss of information.
In other words, non-loss decomposition guarantees that if you join the decomposed relations together,
you will obtain the original relation exactly as it was before decomposition.

Suppose we have a relation (table) called Employee with the following attributes:

EmployeeID (Primary Key)

Name

Department

ManagerID (Foreign Key referencing EmployeeID)

Salary

Now, let's say we want to decompose this relation into smaller relations while ensuring non-loss
decomposition.

Step 1: Identify Functional Dependencies (FDs):

EmployeeID -> Name, Department, ManagerID, Salary

Note that ManagerID -> EmployeeID does not hold, because one manager can manage several employees.

Step 2: Choose a Normal Form: Let's aim for Third Normal Form (3NF) for this example. With the FD
above, Employee is already in 3NF, so the decomposition below serves only to illustrate the
losslessness check.

Step 3: Perform Decomposition: We can decompose the Employee relation into two smaller relations
that share the key EmployeeID:

Relation 1 (Employee_Details):

EmployeeID (Primary Key)

Name

Department

Salary

Relation 2 (Management):

EmployeeID (Primary Key, Foreign Key referencing EmployeeID in Employee_Details)

ManagerID (Foreign Key referencing EmployeeID in Employee_Details)

Step 4: Verify Losslessness: To ensure non-loss decomposition, we need to verify that we can
reconstruct the original Employee relation by joining the decomposed relations together. A
decomposition into two relations is lossless if their common attributes functionally determine all
attributes of at least one of them. Here the common attribute EmployeeID is a key of both relations,
so joining on it is lossless:

SELECT * FROM Employee_Details JOIN Management ON Employee_Details.EmployeeID =
Management.EmployeeID;

This query returns every tuple of the original Employee relation, with no missing and no spurious
tuples. Because the join condition is written explicitly, EmployeeID appears twice in the result; a
natural join would return it once.
Step 5: Enforce Referential Integrity: Ensure that referential integrity is maintained between the
decomposed relations by defining appropriate foreign key constraints and ensuring that each foreign
key references a valid primary key.
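
The losslessness test for a two-way decomposition can also be done at the schema level, without data: the split is lossless if the common attributes functionally determine all attributes of one of the two relations. The sketch below assumes that the only functional dependency is EmployeeID -> Name, Department, ManagerID, Salary; the helper names closure and is_lossless_binary are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the common attributes determine all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

# Assumed dependency for the Employee example.
fds = [({"EmployeeID"}, {"Name", "Department", "ManagerID", "Salary"})]

employee_details = {"EmployeeID", "Name", "Department", "Salary"}
management = {"EmployeeID", "ManagerID"}
print(is_lossless_binary(employee_details, management, fds))  # True

# A split that shares only Department is lossy under the same dependency.
by_department = {"EmployeeID", "Name", "Department"}
rest = {"Department", "ManagerID", "Salary"}
print(is_lossless_binary(by_department, rest, fds))  # False
```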

First normal form:

First Normal Form (1NF) is the initial step in the normalization process of a
relational database schema.

Atomic Values: Each attribute (column) in a relation must contain atomic (indivisible)
values. This means that the values cannot be further divided into smaller pieces. If
an attribute contains multiple values or composite values, it violates 1NF. To
conform to 1NF, attributes should be atomic.

Unique Column Names: Each column in a relation must have a unique name. This
ensures that attributes are distinctly identifiable within the relation.

No Repeating Groups: There should be no repeating groups of attributes within a tuple
(row). Each tuple should represent a single entity, and each attribute within that
tuple should have a single value. If a relation contains repeating groups of
attributes, it violates 1NF.

StudentID   Name      Courses
1           Alice     Math, Science, English
2           Bob       History, Math
3           Charlie   Science, Spanish

This table violates 1NF because the "Courses" column contains multiple values.

First, let's create a new table for courses:

Courses Table:

CourseID Course
1 Math
2 Science
3 English
4 History
5 Spanish
Next, let's remove the multi-valued Courses column from the original table:

Modified Students Table:

StudentID Name
1 Alice
2 Bob
3 Charlie

Now, we need an additional table to represent the relationship between
students and courses:

StudentCourses Table:

StudentID CourseID
1 1
1 2
1 3
2 4
2 1
3 2
3 5
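
The conversion above can be sketched in a few lines of Python. This is an illustration only: the variable names are hypothetical, and course IDs are assumed to be assigned in order of first appearance, which happens to reproduce the numbering used in the tables above.

```python
# The unnormalized rows: the third field holds several values at once.
unnormalized = [
    (1, "Alice", "Math, Science, English"),
    (2, "Bob", "History, Math"),
    (3, "Charlie", "Science, Spanish"),
]

students = []         # (StudentID, Name)
course_ids = {}       # Course -> CourseID, numbered in order of first appearance
student_courses = []  # (StudentID, CourseID), one atomic pair per row

for student_id, name, courses in unnormalized:
    students.append((student_id, name))
    for course in courses.split(","):
        course = course.strip()
        course_id = course_ids.setdefault(course, len(course_ids) + 1)
        student_courses.append((student_id, course_id))

print(course_ids)
# {'Math': 1, 'Science': 2, 'English': 3, 'History': 4, 'Spanish': 5}
print(student_courses)
# [(1, 1), (1, 2), (1, 3), (2, 4), (2, 1), (3, 2), (3, 5)]
```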
In this normalized structure, each table represents a single entity, and each
attribute contains atomic values.

Second Normal Form (2NF) is a level of database normalization that
builds upon the principles of First Normal Form (1NF). In 2NF, the database
table must fulfill the following two conditions:

1. It must be in First Normal Form (1NF):
• This means that the table must have a primary key, and each attribute
(column) must contain only atomic values, meaning it cannot be
further divided into smaller components.
2. All non-prime attributes are fully functionally dependent on the
primary key:
• This condition aims to eliminate partial dependencies. A partial
dependency occurs when a non-prime attribute depends on only a
portion of the primary key, rather than the entire primary key.
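+----------+------------+----------+------------+
| Employee | Project_ID | Project  | Department |
+----------+------------+----------+------------+
| John     | 101        | Project1 | HR         |
| John     | 102        | Project2 | HR         |
| Alice    | 101        | Project1 | Finance    |
| Alice    | 103        | Project3 | Finance    |
+----------+------------+----------+------------+
Converting to 2NF:
The key of this table is (Employee, Project_ID). Project depends only on Project_ID and, in the
sample rows, Department depends only on Employee (John is always in HR, Alice always in Finance).
Department cannot be stored per project, because project 101 appears with both HR and Finance.
Both are partial dependencies, so each is moved to its own table:
+----------+------------+
| Employee | Project_ID |
+----------+------------+
| John     | 101        |
| John     | 102        |
| Alice    | 101        |
| Alice    | 103        |
+----------+------------+

+------------+----------+
| Project_ID | Project  |
+------------+----------+
| 101        | Project1 |
| 102        | Project2 |
| 103        | Project3 |
+------------+----------+

+----------+------------+
| Employee | Department |
+----------+------------+
| John     | HR         |
| Alice    | Finance    |
+----------+------------+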
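
Partial dependencies in the sample rows of the original four-column table can be found mechanically. The sketch below is illustrative and uses hypothetical helper names; it tests which single parts of the composite key (Employee, Project_ID) already determine a non-key attribute. On this data, Project is determined by Project_ID alone and Department by Employee alone, while Project_ID does not determine Department (project 101 appears with both HR and Finance). A check on sample rows only suggests dependencies; it does not prove them for all data.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

def partial_dependencies(rows, key, non_key):
    """List (key_part, attribute) pairs where a proper part of the key determines the attribute."""
    found = []
    for attr in non_key:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                if fd_holds(rows, part, [attr]):
                    found.append((part, attr))
    return found

columns = ("Employee", "Project_ID", "Project", "Department")
rows = [dict(zip(columns, values)) for values in [
    ("John", 101, "Project1", "HR"),
    ("John", 102, "Project2", "HR"),
    ("Alice", 101, "Project1", "Finance"),
    ("Alice", 103, "Project3", "Finance"),
]]

partials = partial_dependencies(rows, ["Employee", "Project_ID"], ["Project", "Department"])
print(partials)
# [(('Project_ID',), 'Project'), (('Employee',), 'Department')]
print(fd_holds(rows, ["Project_ID"], ["Department"]))  # False
```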

Multi-valued dependencies
Multi-valued dependencies (MVDs) are a type of constraint in relational database
theory that describe certain relationships between attributes within a table. They
generalize functional dependencies: instead of a single value, a set of values is
determined. They are used to ensure the integrity of data in a normalized database schema.
For example, if student ID 1 is enrolled in courses with IDs 101 and 102, and each
course has its own fixed set of textbook ISBNs, then we have a multi-valued
dependency Course ID ->> Textbook ISBN:

A (Student ID)   B (Course ID)   C (Textbook ISBN)
1                101             ISBN1, ISBN2
1                102             ISBN3, ISBN4

Each row above stands for one tuple per listed ISBN.

Here, the value of B (Course ID) determines the set of values of C (Textbook ISBNs),
independently of the student (A). This dependency holds regardless of which students
are enrolled in the course: every student enrolled in course 101 is paired with both
ISBN1 and ISBN2.
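
A multi-valued dependency can be tested on a table instance: X ->> Y holds if, for each X value, the Y values and the values of the remaining attributes occur in every combination. The sketch below is an illustration with assumed names; student 2 is a hypothetical extra student added so that the check has something to compare.

```python
from itertools import product

def mvd_holds(rows, attrs, x, y):
    """Check X ->> Y on an instance; rows are tuples laid out in attrs order."""
    z = [a for a in attrs if a not in x and a not in y]
    index = {a: i for i, a in enumerate(attrs)}

    def pick(row, names):
        return tuple(row[index[a]] for a in names)

    # Group the (Y, Z) combinations seen for each X value.
    groups = {}
    for row in rows:
        groups.setdefault(pick(row, x), set()).add((pick(row, y), pick(row, z)))

    # Within each group, every Y value must appear with every Z value.
    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        z_values = {p[1] for p in pairs}
        if pairs != set(product(y_values, z_values)):
            return False
    return True

attrs = ("StudentID", "CourseID", "ISBN")
rows = [
    (1, 101, "ISBN1"), (1, 101, "ISBN2"),
    (2, 101, "ISBN1"), (2, 101, "ISBN2"),  # hypothetical second student
    (1, 102, "ISBN3"), (1, 102, "ISBN4"),
]
print(mvd_holds(rows, attrs, ["CourseID"], ["ISBN"]))  # True

# Dropping one tuple breaks the "every combination" requirement.
incomplete = [r for r in rows if r != (2, 101, "ISBN2")]
print(mvd_holds(incomplete, attrs, ["CourseID"], ["ISBN"]))  # False
```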

Third Normal Form (3NF) in database design is important for further improving
data integrity and reducing redundancy beyond First Normal Form (1NF) and
Second Normal Form (2NF).
Conditions:

1. It must be in Second Normal Form (2NF).

2. No Transitive Dependencies: There should be no transitive dependencies in
the schema. This means that every non-prime attribute should be directly
dependent on the primary key, not on another non-prime attribute.
+----------+------------+----------+------------+
| Employee | Project_ID | Project  | Department |
+----------+------------+----------+------------+
| John     | 101        | Project1 | HR         |
| John     | 102        | Project2 | HR         |
| Alice    | 101        | Project1 | Finance    |
| Alice    | 103        | Project3 | Finance    |
+----------+------------+----------+------------+

EmployeeProject Table:

+----------+------------+
| Employee | Project_ID |
+----------+------------+
| John     | 101        |
| John     | 102        |
| Alice    | 101        |
| Alice    | 103        |
+----------+------------+

Project Table:

+------------+----------+
| Project_ID | Project  |
+------------+----------+
| 101        | Project1 |
| 102        | Project2 |
| 103        | Project3 |
+------------+----------+

Department Table:

+----------+------------+
| Employee | Department |
+----------+------------+
| John     | HR         |
| Alice    | Finance    |
+----------+------------+

Now, all three tables satisfy 3NF:

• EmployeeProject table has a composite primary key (Employee, Project_ID).
• Project table has a primary key (Project_ID).
• Department table has a primary key (Employee), and Department is
directly dependent on it. In the sample rows, Department is determined by
Employee and not by Project_ID (project 101 appears with both HR and
Finance), so Employee must be the key of this table. No non-prime attribute
depends on another non-prime attribute, so no transitive dependency remains.
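
The 3NF condition can be checked from a list of functional dependencies: for each dependency X -> A, either X is a superkey or A is a prime attribute (part of some candidate key). The sketch below is a brute-force illustration for small schemas; the function names are hypothetical, and the two dependencies are assumptions read off the sample rows (Project_ID -> Project and Employee -> Department).

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds (a list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(key <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == all_attrs:
                keys.append(set(combo))
    return keys

def third_nf_violations(all_attrs, fds):
    """Return (determinant, attribute) pairs that break the 3NF condition."""
    prime = set().union(*candidate_keys(all_attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == all_attrs:
            continue  # determinant is a superkey
        for attr in sorted(rhs - lhs):
            if attr not in prime:
                bad.append((tuple(sorted(lhs)), attr))
    return bad

relation = {"Employee", "Project_ID", "Project", "Department"}
fds = [({"Project_ID"}, {"Project"}), ({"Employee"}, {"Department"})]

print(candidate_keys(relation, fds))       # one key: Employee together with Project_ID
print(third_nf_violations(relation, fds))
# [(('Project_ID',), 'Project'), (('Employee',), 'Department')]

# A decomposed table checked with only the dependency that applies to it.
print(third_nf_violations({"Project_ID", "Project"}, [({"Project_ID"}, {"Project"})]))  # []
```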

Boyce-Codd Normal Form (BCNF) is a further refinement of Third Normal
Form (3NF) and addresses certain types of anomalies not handled by 3NF. To
achieve BCNF, a relation must satisfy the following conditions:

1. It must be in Third Normal Form (3NF):
• The relation must already satisfy the conditions of 3NF, meaning:
• It should be in 1NF, ensuring that each attribute contains atomic
values, and there are no repeating groups.
• All non-prime attributes are fully functionally dependent on the
primary key, eliminating partial and transitive dependencies.
2. Every determinant must be a superkey:
• In every non-trivial functional dependency A → B (where A determines
B), A must be a superkey, i.e., A must be either a candidate key or a
superset of a candidate key.

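Employee_ID | Project_ID | Project_Name | Department
----------------------------------------------------
E001        | P101       | Project1     | HR
E001        | P102       | Project2     | HR
E002        | P101       | Project1     | Finance
E002        | P103       | Project3     | Finance

In this relation the only candidate key is {Employee_ID, Project_ID}, but the determinants of
Project_ID → Project_Name and, in the sample rows, Employee_ID → Department are not superkeys.
Decomposing on these dependencies gives:

Relation 1 (EmployeeProject):

Employee_ID | Project_ID
------------------------
E001        | P101
E001        | P102
E002        | P101
E002        | P103

• Here, {Employee_ID, Project_ID} is the candidate key.

Relation 2 (ProjectDetails):

Project_ID | Project_Name
-------------------------
P101       | Project1
P102       | Project2
P103       | Project3

• Here, Project_ID is the candidate key.

Relation 3 (EmployeeDepartment):

Employee_ID | Department
------------------------
E001        | HR
E002        | Finance

• Here, Employee_ID is the candidate key. Department cannot be stored with Project_ID,
because project P101 appears with both HR and Finance.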
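
A BCNF decomposition can be derived mechanically: find a set of attributes X whose closure inside the relation is neither X itself nor the whole relation, split on it, and repeat. The sketch below is a brute-force illustration for small schemas with hypothetical function names. The two dependencies are assumptions read off the sample rows: Project_ID → Project_Name and Employee_ID → Department (Project_ID cannot determine Department, since P101 appears with both HR and Finance).

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes determined by attrs under fds (a list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(attrs, fds):
    """Split attrs until no determinant inside a relation fails to be a superkey."""
    attrs = frozenset(attrs)
    for size in range(1, len(attrs)):
        for combo in combinations(sorted(attrs), size):
            x = set(combo)
            determined = closure(x, fds) & attrs
            if determined != x and determined != attrs:
                # x determines something but not everything: a BCNF violation.
                first = determined                # x together with what it determines
                second = x | (attrs - determined)  # x together with the rest
                return bcnf_decompose(first, fds) + bcnf_decompose(second, fds)
    return [attrs]

relation = {"Employee_ID", "Project_ID", "Project_Name", "Department"}
fds = [({"Project_ID"}, {"Project_Name"}), ({"Employee_ID"}, {"Department"})]

for table in bcnf_decompose(relation, fds):
    print(sorted(table))
# ['Department', 'Employee_ID']
# ['Project_ID', 'Project_Name']
# ['Employee_ID', 'Project_ID']
```

Each split is lossless because the shared attributes are a key of the first part. The algorithm does not guarantee dependency preservation in general, although in this example both dependencies survive in their own tables.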


4NF:

Fourth Normal Form (4NF) is a level of normalization for a relational database that goes
beyond Boyce-Codd Normal Form (BCNF). A relation is in 4NF if it is in BCNF and, for every
non-trivial multi-valued dependency X ->> Y that holds in it, X is a superkey. It is a less
common form of normalization and is usually applied in complex database designs where
there are multiple multi-valued dependencies that need to be addressed.

Consider a hypothetical table representing employees and the projects they
are assigned to:

Employee ID   Employee Name   Project ID   Project Name
1             Alice           101          Project X
1             Alice           102          Project Y
2             Bob             101          Project X
3             Charlie         102          Project Y
3             Charlie         103          Project Z

In this table:

• Employee ID and Project ID together form the composite primary key, because an
employee can work on multiple projects, and a project can have
multiple employees. Employee ID alone is not a key.
• Employee Name and Project Name are non-key attributes.

Now, let's say we notice the following dependencies: Employee
Name depends on Employee ID, and Project Name depends on Project ID.
Strictly speaking, these are functional dependencies on part of the key
(every functional dependency is also a multi-valued dependency), and their
determinants are not superkeys. As a result, names are repeated: in the
given table, Alice is stored with both Project X and Project Y, Bob is
stored with Project X, and Charlie is stored with Projects Y and Z.

To normalize this table to 4NF, we need to decompose it into smaller tables
to eliminate these dependencies. Here's how we could do it:

1. Employees Table:
Employee ID Employee Name
1 Alice
2 Bob
3 Charlie
2. Projects Table:
Project ID Project Name
101 Project X
102 Project Y
103 Project Z
3. Employee_Projects Table (to represent the many-to-many relationship
between employees and projects):
Employee ID Project ID
1 101
1 102
2 101
3 102
3 103

Now, the original table has been decomposed into three tables, each
representing a single entity or relationship. In every table, the only
non-trivial dependencies have a key as their determinant, so no table
contains a non-trivial multi-valued dependency whose determinant is not a
superkey. The design therefore satisfies Fourth Normal Form (4NF), and each
employee name and project name is stored only once.

5NF:

Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), is a level of
database normalization that deals with join dependencies. A relation is in 5NF if every
non-trivial join dependency that holds in it is implied by its candidate keys. While it is
less commonly discussed than lower normal forms like 3NF and 4NF, it addresses
redundancy that can only be removed by splitting a table into three or more projections.

To explain Fifth Normal Form (5NF), let's start with a table that contains
information about employees and the departments they work in:

Employee ID Employee Name Department ID Department Name Manager ID


1 Alice 101 HR 10
2 Bob 102 IT 11
3 Charlie 101 HR 10
4 Diana 103 Finance 12
5 Eric 102 IT 11

In this table:

• Employee ID is the primary key.
• Department ID identifies the employee's department.
• Manager ID is a foreign key referencing the Employee ID of the
manager.

Now, let's say we want to normalize this table to Fifth Normal Form (5NF).

In 5NF, we are concerned with join dependencies: a relation satisfies a join
dependency if it can be reconstructed exactly by joining certain of its
projections, and 5NF requires every such non-trivial dependency to follow from
the candidate keys. In this table, the redundancy comes from Department Name
and Manager ID being determined by Department ID rather than directly by the
key Employee ID. For example, in the HR department, both Alice and Charlie
report to the manager with Employee ID 10, so that fact is stored twice.

To normalize this table, we can decompose it into the following tables:

1. Employees Table:
Employee ID   Employee Name   Department ID
1             Alice           101
2             Bob             102
3             Charlie         101
4             Diana           103
5             Eric            102
2. Departments Table:
Department ID   Department Name
101             HR
102             IT
103             Finance
3. Management Table:
Department ID   Manager ID
101             10
102             11
103             12

Now, the original table has been decomposed into three tables, and the
repeated pairing of Department Name and Manager ID has been eliminated:
the name and the manager of each department are stored once. The original
table can be reconstructed exactly by joining the three tables on
Department ID. Each table has a single-attribute key and is in 3NF, so the
only join dependencies that hold in it are implied by that key, which is the
condition for Fifth Normal Form (5NF).
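
A join dependency can be tested on a table instance: project the rows onto each group of attributes, join the projections back together, and compare with the original rows. The sketch below is illustrative only; the helper names and the column names (emp_id, name, dept_id, dept_name, mgr_id) are assumptions standing in for the column headings of the employee table above.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicates removed; rows are dicts."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attribute names."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent representation of a list of dict rows."""
    return {tuple(sorted(row.items())) for row in rows}

def satisfies_join_dependency(rows, components):
    """True if joining the projections onto components reproduces rows exactly."""
    joined = project(rows, components[0])
    for attrs in components[1:]:
        joined = natural_join(joined, project(rows, attrs))
    return as_set(joined) == as_set(rows)

columns = ("emp_id", "name", "dept_id", "dept_name", "mgr_id")
employees = [dict(zip(columns, values)) for values in [
    (1, "Alice", 101, "HR", 10),
    (2, "Bob", 102, "IT", 11),
    (3, "Charlie", 101, "HR", 10),
    (4, "Diana", 103, "Finance", 12),
    (5, "Eric", 102, "IT", 11),
]]

# Splitting into employees, department names, and department managers is lossless here.
good = [("emp_id", "name", "dept_id"), ("dept_id", "dept_name"), ("dept_id", "mgr_id")]
print(satisfies_join_dependency(employees, good))  # True

# Components with no shared attribute join as a cross product, creating spurious rows.
bad = [("emp_id", "name"), ("dept_id", "dept_name", "mgr_id")]
print(satisfies_join_dependency(employees, bad))   # False
```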
Possible 2-mark questions

• Describe the main components of an E-R Diagram and their roles in
representing relationships between entities.
• Explain the additional features of the Enhanced-ER Model compared to
the traditional E-R model, and how they improve the representation of
complex relationships.
• Describe the process of converting an E-R Diagram into a set of
relational tables, including the mapping of entities, relationships, and
attributes.
• Define functional dependencies in the context of database design and
provide an example to illustrate the concept.
• Explain what non-loss decomposition is and why it is important in
database normalization.
• Briefly describe each of the first three normal forms (1NF, 2NF, 3NF)
and provide an example of a violation of each.
• Discuss the concept of dependency preservation in the context of
database normalization and why it is desirable.
• Define Boyce/Codd Normal Form (BCNF) and explain its significance in
ensuring data integrity in a relational database.
• Define multi-valued dependencies and explain how they are addressed
in the Fourth Normal Form (4NF).
• Explain what join dependencies are and how they are handled in the
Fifth Normal Form (5NF) to achieve further normalization.
