DBMS_UNIT4

Uploaded by

gedelaarjun333
Problems caused by redundancy in DBMS

Data redundancy is the unnecessary duplication of the same fact in more than one place in a database. It
usually results from poor schema design or a lack of normalization, and it causes the following problems:

1. Wasted Storage

Storing duplicate data consumes more storage than necessary.

2. Data Anomalies

These are inconsistencies that arise because the same fact is stored more than once.

o Update Anomalies: When the same piece of data is stored in multiple places, updating it in
one place but not the others leaves the copies inconsistent.

o Insertion Anomalies: A fact cannot be recorded unless unrelated data is supplied with it, or the
same data must be re-entered with every new row, which invites inconsistencies.

o Deletion Anomalies: Deleting a row to remove one fact unintentionally removes other facts
that were stored only in that row.

3. Increased Complexity

Querying and maintaining redundant data can be more complex.

4. Performance Issues

Duplicate data can slow down search, update, and insert operations.

5. Data Integrity Issues

If data is inconsistent across tables, it can lead to data integrity issues.
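
The update anomaly described above can be reproduced in a few lines. This is only a minimal sketch: the table, its column names and the new building code "B1" are illustrative assumptions, not data from a real system.

```python
# Hypothetical denormalized table: the building of a department is repeated
# on every student row (all names and values here are illustrative).
students = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
]

def conflicting_departments(rows):
    """Return the departments that appear with more than one building."""
    buildings = {}
    for row in rows:
        buildings.setdefault(row["dept_name"], set()).add(row["dept_building"])
    return {dept: sorted(b) for dept, b in buildings.items() if len(b) > 1}

before = conflicting_departments(students)

# Update anomaly: the CO department moves, but only one copy is changed.
students[0]["dept_building"] = "B1"
after = conflicting_departments(students)

print(before)  # {}
print(after)   # {'CO': ['A4', 'B1']}
```

If the building were stored once per department in its own table, the single update could not leave two conflicting copies behind.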

Decomposition In DBMS

Decomposition is the division of a table into multiple tables in order to keep the data consistent.
This section defines decomposition, describes its types, and lists its properties.

What is Decomposition in DBMS?

Decomposition is the process of dividing a table into multiple tables, or a relation into multiple
relations. It is performed to ensure consistency and to remove anomalies and duplicate data from
the database. A decomposition must be carried out so that no information is lost.

Types of Decomposition

There are two types of Decomposition:

o Lossless Decomposition

o Lossy Decomposition

Lossless Decomposition
A decomposition is lossless if the original relation R can be regained exactly by joining the
relations formed by the decomposition. It removes redundant data from the database while
retaining all of the information. A lossless decomposition ensures the following:

o No information is lost when the original relation is regained.

o The join of the sub-relations yields exactly the original relation, with no missing and no extra tuples.

Example:

There is a relation called R(A, B, C)

A B C

55 16 27

48 52 89

Now we decompose this relation into two sub relations R1 and R2

R1(A, B)

A B

55 16

48 52

R2(B, C)

B C

16 27

52 89

After performing the natural join on B, we get the original relation back

A B C

55 16 27

48 52 89

Lossy Decomposition

As the name suggests, a decomposition is lossy when the join of the sub-relations does not give
back the relation that was decomposed. The join contains extraneous (spurious) tuples, so the
original tuples can no longer be told apart from the extra ones.

Example:

We have a relation R(A, B, C)

A B C

1 2 1

2 5 3

3 3 3

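Now, we decompose it into sub-relations R1 and R2, which share only the attribute C

R1(A, C)

A C

1 1

2 3

3 3

R2(B, C)

B C

2 1

5 3

3 3

After performing the natural join on C, the value C = 3 matches two tuples on each side, so two
spurious tuples (2 3 3 and 3 5 3) appear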

A B C

1 2 1

2 5 3

2 3 3

3 5 3

3 3 3
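
The two examples above can be checked mechanically by projecting a relation, joining the projections, and comparing the result with the original. The following is a minimal sketch; the helper names `project`, `natural_join` and `is_lossless` are assumptions made for illustration. The second call projects onto {A, C} and {B, C}, which is the split whose join yields the five rows listed above.

```python
def project(rows, attrs):
    """Project onto attrs, dropping duplicate tuples."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]

def natural_join(left, right):
    """Natural join on every attribute name the two relations share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly `rows`."""
    rejoined = natural_join(project(rows, attrs1), project(rows, attrs2))
    as_set = lambda rel: {frozenset(t.items()) for t in rel}
    return as_set(rejoined) == as_set(rows)

r_first = [{"A": 55, "B": 16, "C": 27}, {"A": 48, "B": 52, "C": 89}]
r_second = [
    {"A": 1, "B": 2, "C": 1},
    {"A": 2, "B": 5, "C": 3},
    {"A": 3, "B": 3, "C": 3},
]

print(is_lossless(r_first, ["A", "B"], ["B", "C"]))   # True
print(is_lossless(r_second, ["A", "C"], ["B", "C"]))  # False

rejoined = natural_join(project(r_second, ["A", "C"]), project(r_second, ["B", "C"]))
print(len(rejoined))  # 5 rows: the 3 original tuples plus 2 spurious ones
```

A well-known sufficient condition for a two-way split to be lossless is that the shared attributes functionally determine all attributes of at least one of the two parts. In the lossy case the shared attribute C determines neither A nor B, because C = 3 appears with two different values of each.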

Properties of Decomposition

o Lossless: Every decomposition performed in a database management system should be
lossless: joining the sub-relations must give back the original relation without losing
information or adding spurious tuples.

o Dependency Preservation: The functional dependencies of the original relation should still
be enforceable after decomposition, ideally by checking each resulting relation on its own.
This keeps the database consistent and avoids joins for integrity checks.

o Lack of Data Redundancy: Data redundancy means duplicate or repeated data. The relations
produced by a decomposition should not store the same fact more than once.

Problems Related to Decomposition

1. Loss of Information

o When a relation is decomposed into two or more smaller relations and the original relation
can be perfectly reconstructed by taking the natural join of the decomposed relations, the
decomposition is lossless (non-loss). If not, it is a lossy decomposition, and the information
about which tuples were original is lost.

2. Loss of Functional Dependency

o Once tables are decomposed, certain functional dependencies might not be preserved,
which can make specific integrity constraints impossible to enforce without a join.

o Example: If the original table has the functional dependency `A → B` but no decomposed
table contains both `A` and `B`, the dependency cannot be checked within a single table. It is
preserved only if it is implied by the dependencies that do survive in the decomposed tables.

3. Increased Complexity

o Decomposition leads to an increase in the number of tables, which can complicate queries
and maintenance tasks. While tools and ORM (Object-Relational Mapping) libraries can
mitigate this to some extent, it still adds complexity.

4. Redundancy

o An incorrect decomposition might not eliminate redundancy, and in some cases can even
introduce new redundancies.

5. Performance Overhead

o An increased number of tables, while aiding normalization, can also lead to more complex
SQL queries involving multiple joins, which can introduce performance overheads.

Functional Dependency

A functional dependency occurs when one attribute (or set of attributes) uniquely determines another
attribute within a relation. It is a constraint that describes how attributes in a table relate to each
other. If attribute A functionally determines attribute B, we write A → B: any two rows that agree on A
must also agree on B.

Functional dependencies express constraints among the attributes of a relation mathematically and
are the basis for normalization in relational database systems.

Example:

roll_no name dept_name dept_building

42 abc CO A4

43 pqr IT A3

44 xyz CO A4

45 xyz IT A3

46 mno EC B2

47 jkl ME B2

From the above table we can conclude some valid functional dependencies:

o roll_no → {name, dept_name, dept_building}: roll_no determines the values of the fields
name, dept_name and dept_building, hence this is a valid functional dependency.

o roll_no → dept_name: since roll_no determines the whole set {name, dept_name,
dept_building}, it also determines the subset dept_name.

o dept_name → dept_building: every row with the same dept_name has the same
dept_building, so dept_name identifies the building. (Two different departments may still
share a building, as EC and ME do.)

o More valid functional dependencies: roll_no → name, {roll_no, name} → {dept_name,
dept_building}, etc.

Here are some invalid functional dependencies:

o name → dept_name: students with the same name can have different dept_name values
(xyz is in both CO and IT), hence this is not a valid functional dependency.

o dept_building → dept_name: there can be multiple departments in the same building.
In the above table, departments ME and EC are both in building B2, hence
dept_building → dept_name is an invalid functional dependency.

o More invalid functional dependencies: name → roll_no and dept_building → roll_no, both
contradicted by the table. {name, dept_name} → roll_no is also not a valid constraint,
because two students with the same name can join the same department, although these
six rows happen not to contradict it.
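
Whether a given table instance contradicts a functional dependency can be tested directly from the definition. The sketch below uses the six rows of the table above; the function name `fd_holds` is an assumption for illustration. Note that passing this test only means the rows do not contradict the dependency. It does not prove that the dependency is a rule of the schema.

```python
ROWS = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 45, "name": "xyz", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 46, "name": "mno", "dept_name": "EC", "dept_building": "B2"},
    {"roll_no": 47, "name": "jkl", "dept_name": "ME", "dept_building": "B2"},
]

def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(ROWS, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(ROWS, ["dept_name"], ["dept_building"]))                     # True
print(fd_holds(ROWS, ["name"], ["dept_name"]))                              # False
print(fd_holds(ROWS, ["dept_building"], ["dept_name"]))                     # False
print(fd_holds(ROWS, ["name", "dept_name"], ["roll_no"]))                   # True on these rows only
```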

Types of Functional Dependencies in DBMS

1. Trivial functional dependency

2. Non-Trivial functional dependency

3. Multivalued functional dependency

4. Transitive functional dependency

5. Fully functional dependency

6. Partial functional dependency

1. Trivial Functional Dependency

In a trivial functional dependency, the dependent is always a subset of the determinant, i.e., if
X → Y and Y is a subset of X, then X → Y is a trivial functional dependency.

Example:

roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, {roll_no, name} → name is a trivial functional dependency, since the dependent name is a
subset of determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an example of trivial
functional dependency.

2. Non-trivial Functional Dependency

In a non-trivial functional dependency, the dependent is not a subset of the determinant, i.e.,
if X → Y and Y is not a subset of X, then X → Y is a non-trivial functional dependency.

Example:

roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a
subset of determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional
dependency, since age is not a subset of {roll_no, name}.

3. Multivalued Functional Dependency

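A multivalued dependency, usually written X →→ Y, holds when each value of X is associated with a
set of Y values and that set is independent of the remaining attributes of the relation. In the
simplified form used here, a → {b, c} with no functional dependency between b and c means that b
and c each depend on a, but not on each other.

For example,

roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

45 abc 19

Here, roll_no → {name, age} holds, and the dependents name and age are not dependent on each
other: name → age fails because abc appears with ages 17 and 19, and age → name fails because
age 18 appears with pqr and xyz. Since each roll_no has exactly one name and one age, this is the
single-valued special case. The STUDENT (STU_ID, COURSE, HOBBY) table in the 4NF section shows a
dependency that is genuinely multivalued.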

4. Transitive Functional Dependency

In a transitive functional dependency, the dependent depends on the determinant indirectly, i.e., if
a → b and b → c, then by the axiom of transitivity a → c. This is a transitive functional dependency.

For example,

enrol_no name dept building_no

42 abc CO 4

43 pqr EC 2

44 xyz IT 1

45 abc EC 2

Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect functional
dependency, hence called Transitive functional dependency.
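
Transitive consequences like this one can be derived mechanically with the attribute closure: start from a set of attributes and keep applying dependencies until nothing new can be added. The sketch below is a minimal version under the two dependencies stated above; the function name `closure` and the tuple representation of a dependency are assumptions for illustration.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute derivable from `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole lhs is already derivable
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

FDS = [(("enrol_no",), ("dept",)), (("dept",), ("building_no",))]

print(sorted(closure({"enrol_no"}, FDS)))        # ['building_no', 'dept', 'enrol_no']
print("building_no" in closure({"dept"}, FDS))   # True
print("dept" in closure({"building_no"}, FDS))   # False
```

Because building_no is in the closure of enrol_no, the dependency enrol_no → building_no follows from the other two, which is exactly the transitive dependency described above.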

5. Fully Functional Dependency

In a full functional dependency X → Y, Y depends on the whole of X: removing any attribute from X
breaks the dependency. For example, if a relation R has attributes X, Y, Z and the single attribute X
determines both of the others (X → Y and X → Z), those dependencies are fully functional, because
nothing can be removed from a single-attribute determinant.

6. Partial Functional Dependency

In a partial functional dependency, a non-key attribute depends on only a part of a composite key
rather than on the whole key. For example, if a relation R has attributes X, Y, Z, where {X, Y} is the
composite key and Z is a non-key attribute, and X → Z holds, then X → Z is a partial functional
dependency.

Advantages of Functional Dependencies

Functional dependencies have numerous applications in database management systems.
Some of them are listed below:

1. Data Normalization

Data normalization is the process of organizing data in a database to minimize redundancy
and increase data integrity. Functional dependencies play an important part in it: they identify
the candidate keys and the primary key of a table, which in turn drives normalization.

2. Query Optimization

Functional dependencies show how tables are connected and which attributes must be projected
to retrieve the required data. An optimizer can use this to simplify queries and improve
performance.

3. Consistency of Data

Enforcing functional dependencies keeps the data consistent by ruling out redundant or
contradictory values. As long as the dependencies are enforced, a change to one attribute cannot
leave another set of attributes in an inconsistent state.

4. Data Quality Improvement

Enforced functional dependencies help keep the data accurate and up to date. This improves the
overall quality of the data and reduces errors and inaccuracies in later data analysis and
decision making.

First Normal Form (1NF)

o A relation is in 1NF if every attribute contains only atomic values.

o An attribute of a table cannot hold multiple values. It must hold only a single value.

o First normal form disallows multi-valued attributes, composite attributes, and their
combinations.

Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385, 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389, 8589830302 Punjab

The EMPLOYEE table converted into 1NF is shown below:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP

14 John 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389 Punjab

12 Sam 8589830302 Punjab
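
The conversion above amounts to emitting one row per phone number. A minimal sketch, assuming the multi-valued column arrives as a comma-separated string (the function name `to_1nf` is hypothetical):

```python
employees = [
    {"EMP_ID": 14, "EMP_NAME": "John", "EMP_PHONE": "7272826385, 9064738238", "EMP_STATE": "UP"},
    {"EMP_ID": 20, "EMP_NAME": "Harry", "EMP_PHONE": "8574783832", "EMP_STATE": "Bihar"},
    {"EMP_ID": 12, "EMP_NAME": "Sam", "EMP_PHONE": "7390372389, 8589830302", "EMP_STATE": "Punjab"},
]

def to_1nf(rows, column, separator=","):
    """Emit one row per value of the multi-valued `column`."""
    flat = []
    for row in rows:
        for value in row[column].split(separator):
            # Copy the row and overwrite the column with a single atomic value
            flat.append({**row, column: value.strip()})
    return flat

flat = to_1nf(employees, "EMP_PHONE")
for row in flat:
    print(row["EMP_ID"], row["EMP_NAME"], row["EMP_PHONE"], row["EMP_STATE"])
```

The result has five rows, and EMP_ID alone no longer identifies a row. The key of the flattened table is {EMP_ID, EMP_PHONE}.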


Second Normal Form (2NF)

o For 2NF, the relation must be in 1NF.

o In the second normal form, every non-key attribute is fully functionally dependent on each
candidate key, i.e., no non-key attribute depends on only a part of a composite key.

Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.

TEACHER table

TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30

25 Biology 30

47 English 35

83 Math 38

83 Computer 38

In the given table, the candidate key is {TEACHER_ID, SUBJECT}. The non-prime attribute TEACHER_AGE
depends on TEACHER_ID alone, which is a proper subset of that candidate key. That's why the table
violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL table:

TEACHER_ID TEACHER_AGE

25 30

47 35

83 38

TEACHER_SUBJECT table:

TEACHER_ID SUBJECT

25 Chemistry

25 Biology

47 English

83 Math

83 Computer
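
The 2NF decomposition is a pair of projections with duplicate rows removed. The sketch below rebuilds both tables from the TEACHER data and then joins them back on TEACHER_ID to confirm that nothing was lost; the helper name `project` is an assumption for illustration.

```python
COLUMNS = ("TEACHER_ID", "SUBJECT", "TEACHER_AGE")
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

def project(rows, columns, keep):
    """Keep only the `keep` columns, dropping duplicate rows but preserving order."""
    idx = [columns.index(c) for c in keep]
    out = []
    for row in rows:
        t = tuple(row[i] for i in idx)
        if t not in out:
            out.append(t)
    return out

teacher_detail = project(teacher, COLUMNS, ("TEACHER_ID", "TEACHER_AGE"))
teacher_subject = project(teacher, COLUMNS, ("TEACHER_ID", "SUBJECT"))

# Join back on TEACHER_ID to check that the decomposition is lossless.
rebuilt = [
    (tid, subject, age)
    for tid, subject in teacher_subject
    for tid2, age in teacher_detail
    if tid == tid2
]

print(teacher_detail)      # [(25, 30), (47, 35), (83, 38)]
print(rebuilt == teacher)  # True
```

Each teacher's age is now stored once instead of once per subject, which removes the update anomaly on TEACHER_AGE.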

Third Normal Form (3NF)

o A relation is in 3NF if it is in 2NF and contains no transitive dependency of a non-prime
attribute on a candidate key.

o 3NF is used to reduce data duplication. It is also used to achieve data integrity.

o If a relation is in 2NF and there is no transitive dependency for non-prime attributes, then
the relation is in third normal form.

A relation is in third normal form if it satisfies at least one of the following conditions for every
non-trivial functional dependency X → Y.

1. X is a super key.

2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.

Example:

EMPLOYEE_DETAIL table:

EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY

222 Harry 201010 UP Noida

333 Stephan 02228 US Boston

444 Lan 60007 US Chicago

555 Katharine 06389 UK Norwich

666 John 462007 MP Bhopal

Super keys in the table above:

1. {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on

Candidate key: {EMP_ID}

Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.

Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on EMP_ID. The
non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the super key
(EMP_ID), which violates the rule of third normal form.

That's why we need to move EMP_CITY and EMP_STATE to a new EMPLOYEE_ZIP table, with
EMP_ZIP as the primary key.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_ZIP

222 Harry 201010

333 Stephan 02228

444 Lan 60007

555 Katharine 06389

666 John 462007


EMPLOYEE_ZIP table:

EMP_ZIP EMP_STATE EMP_CITY

201010 UP Noida

02228 US Boston

60007 US Chicago

06389 UK Norwich

462007 MP Bhopal
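
The two 3NF conditions (X is a super key, or Y is prime) can be tested from a list of functional dependencies alone. The sketch below is one possible way to do it, not a definitive implementation: it finds candidate keys by brute force, which is only reasonable for small schemas, and it assumes EMP_ID → {EMP_NAME, EMP_ZIP} and EMP_ZIP → {EMP_STATE, EMP_CITY} as the dependencies of EMPLOYEE_DETAIL. All function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Every attribute derivable from `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute force: minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            is_super = closure(combo, fds) == set(attributes)
            if is_super and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys

def violations_3nf(attributes, fds):
    """(X, attribute) pairs where X is not a super key and the attribute is non-prime."""
    prime = {a for key in candidate_keys(attributes, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue  # condition 1: X is a super key
        for attr in sorted(set(rhs) - set(lhs)):  # ignore the trivial part
            if attr not in prime:                 # condition 2 fails as well
                bad.append((tuple(sorted(lhs)), attr))
    return bad

ATTRS = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
FDS = [
    (("EMP_ID",), ("EMP_NAME", "EMP_ZIP")),
    (("EMP_ZIP",), ("EMP_STATE", "EMP_CITY")),
]

print(candidate_keys(ATTRS, FDS))  # [('EMP_ID',)]
print(violations_3nf(ATTRS, FDS))  # [(('EMP_ZIP',), 'EMP_CITY'), (('EMP_ZIP',), 'EMP_STATE')]

# After the decomposition each table keeps only its own dependency.
print(violations_3nf({"EMP_ID", "EMP_NAME", "EMP_ZIP"}, FDS[:1]))      # []
print(violations_3nf({"EMP_ZIP", "EMP_STATE", "EMP_CITY"}, FDS[1:]))   # []
```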

Boyce Codd normal form (BCNF)

o BCNF is an advanced version of 3NF. It is stricter than 3NF.

o A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the table.

o For BCNF, the table should be in 3NF, and for every non-trivial FD, the LHS is a super key.

Example: Let's assume there is a company where employees work in more than one department.

EMPLOYEE table:

EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO

264 India Designing D394 283

264 India Testing D394 300

364 UK Stores D283 232

364 UK Developing D283 549


In the above table Functional dependencies are as follows:

1. EMP_ID → EMP_COUNTRY

2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate key: {EMP_ID, EMP_DEPT}

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a super key, yet each
appears on the left side of a functional dependency.

To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table:

EMP_ID EMP_COUNTRY

264 India

364 UK

EMP_DEPT table:

EMP_DEPT DEPT_TYPE EMP_DEPT_NO

Designing D394 283

Testing D394 300

Stores D283 232

Developing D283 549


EMP_DEPT_MAPPING table:

EMP_ID EMP_DEPT

264 Designing

264 Testing

364 Stores

364 Developing

Functional dependencies:

1. EMP_ID → EMP_COUNTRY

2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}

Now, every table is in BCNF because the left side of each functional dependency is a key of the
table that contains it.
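
The BCNF test is stricter than the 3NF test because it has no exception for prime attributes: every non-trivial dependency must have a super key on its left side. A minimal sketch using the two dependencies listed above (the function names are assumptions for illustration):

```python
def closure(attrs, fds):
    """Every attribute derivable from `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial dependencies whose left side is not a super key."""
    return [
        (sorted(lhs), sorted(rhs))
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]

FD_COUNTRY = (("EMP_ID",), ("EMP_COUNTRY",))
FD_DEPT = (("EMP_DEPT",), ("DEPT_TYPE", "EMP_DEPT_NO"))

employee = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
print(bcnf_violations(employee, [FD_COUNTRY, FD_DEPT]))
# [(['EMP_ID'], ['EMP_COUNTRY']), (['EMP_DEPT'], ['DEPT_TYPE', 'EMP_DEPT_NO'])]

# The three decomposed tables, each with the dependencies that apply to it.
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, [FD_COUNTRY]))              # []
print(bcnf_violations({"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}, [FD_DEPT]))  # []
print(bcnf_violations({"EMP_ID", "EMP_DEPT"}, []))                           # []
```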

Fourth normal form (4NF)

o A relation is in 4NF if it is in Boyce Codd normal form and has no non-trivial multi-valued
dependency, except where the left side is a super key.

o A dependency A →→ B is multi-valued if a single value of A is associated with a set of values
of B, and that set is independent of the other attributes of the relation.

Example

STUDENT

STU_ID COURSE HOBBY

21 Computer Dancing

21 Math Singing

34 Chemistry Dancing

74 Biology Cricket

59 Physics Hockey

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent attributes. Hence,
there is no relationship between COURSE and HOBBY.

In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math, and
two hobbies, Dancing and Singing. So there are multi-valued dependencies STU_ID →→ COURSE and
STU_ID →→ HOBBY, which lead to unnecessary repetition of data. Strictly, for these dependencies
to hold, the table must pair every course of student 21 with every hobby (four rows); the two
rows shown abbreviate that.

So to make the above table into 4NF, we can decompose it into two tables:

STUDENT_COURSE

STU_ID COURSE

21 Computer

21 Math

34 Chemistry

74 Biology

59 Physics

STUDENT_HOBBY

STU_ID HOBBY

21 Dancing

21 Singing

34 Dancing

74 Cricket

59 Hockey
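
A multi-valued dependency X →→ Y can be tested on a table instance: within each group of rows sharing the same X value, the (Y, rest) pairs must form the full cross product of the Y values and the remaining values. The sketch below (the function name `satisfies_mvd` is hypothetical) applies that test to the STUDENT rows exactly as printed and to the table obtained by joining STUDENT_COURSE with STUDENT_HOBBY.

```python
from itertools import product

def satisfies_mvd(rows, x, y, z):
    """Check X ->> Y on `rows` (dicts); `z` names the remaining attributes."""
    groups = {}
    for row in rows:
        key = tuple(row[a] for a in x)
        pair = (tuple(row[a] for a in y), tuple(row[a] for a in z))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Every Y value must be combined with every Z value for this X value
        if pairs != set(product(ys, zs)):
            return False
    return True

student = [
    {"STU_ID": 21, "COURSE": "Computer", "HOBBY": "Dancing"},
    {"STU_ID": 21, "COURSE": "Math", "HOBBY": "Singing"},
    {"STU_ID": 34, "COURSE": "Chemistry", "HOBBY": "Dancing"},
    {"STU_ID": 74, "COURSE": "Biology", "HOBBY": "Cricket"},
    {"STU_ID": 59, "COURSE": "Physics", "HOBBY": "Hockey"},
]

courses = {(r["STU_ID"], r["COURSE"]) for r in student}  # STUDENT_COURSE
hobbies = {(r["STU_ID"], r["HOBBY"]) for r in student}   # STUDENT_HOBBY
rejoined = [
    {"STU_ID": s, "COURSE": c, "HOBBY": h}
    for s, c in sorted(courses)
    for t, h in sorted(hobbies)
    if s == t
]

print(satisfies_mvd(student, ["STU_ID"], ["COURSE"], ["HOBBY"]))   # False
print(satisfies_mvd(rejoined, ["STU_ID"], ["COURSE"], ["HOBBY"]))  # True
print(len(rejoined))                                               # 7
```

The join produces four rows for student 21 instead of two. That is the full table the multi-valued dependency describes, and it shows how much repetition the single STUDENT table would need compared with the two 4NF tables.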

Fifth normal form (5NF)

o A relation is in 5NF if it is in 4NF and every non-trivial join dependency in it is implied by its
candidate keys, so it cannot be decomposed losslessly any further.

o 5NF is reached when the tables have been broken into as many smaller tables as possible
without losing information, in order to avoid redundancy.

o 5NF is also known as Project-join normal form (PJ/NF).

Example

SUBJECT LECTURER SEMESTER

Computer Anshika Semester 1

Computer John Semester 1

Math John Semester 1

Math Akash Semester 2

Chemistry Praveen Semester 1


In the above table, John takes both the Computer and Math classes for Semester 1, but he doesn't
take the Math class for Semester 2. In this case, the combination of all three fields is required to
identify valid data.

Suppose we add a new semester, Semester 3, but do not yet know the subject or who will be
taking it, so we would leave LECTURER and SUBJECT as NULL. But all three columns together act as
the primary key, so we can't leave the other two columns blank.

So to make the above table into 5NF, we can decompose it into three relations P1, P2 & P3:

P1

SEMESTER SUBJECT

Semester 1 Computer

Semester 1 Math

Semester 1 Chemistry

Semester 2 Math

P2

SUBJECT LECTURER

Computer Anshika

Computer John

Math John

Math Akash

Chemistry Praveen

P3

SEMESTER LECTURER

Semester 1 Anshika

Semester 1 John

Semester 2 Akash

Semester 1 Praveen
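
What makes this a 5NF example is that no two of the three projections are enough on their own: joining only P1 and P2 invents rows, and P3 is needed to filter them out again. The sketch below checks this on the table above. It is only an illustration, with relations written as Python sets of tuples in (SUBJECT, LECTURER, SEMESTER) order.

```python
table = {
    ("Computer", "Anshika", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Akash", "Semester 2"),
    ("Chemistry", "Praveen", "Semester 1"),
}

# The three projections; building them as sets removes duplicate rows.
p1 = {(sem, sub) for sub, lec, sem in table}  # SEMESTER, SUBJECT
p2 = {(sub, lec) for sub, lec, sem in table}  # SUBJECT, LECTURER
p3 = {(sem, lec) for sub, lec, sem in table}  # SEMESTER, LECTURER

# Join P1 and P2 on SUBJECT only.
two_way = {
    (sub, lec, sem)
    for sem, sub in p1
    for sub2, lec in p2
    if sub == sub2
}
spurious = two_way - table

# Joining with P3 as well keeps only rows whose (SEMESTER, LECTURER) pair exists.
three_way = {t for t in two_way if (t[2], t[1]) in p3}

print(sorted(spurious))
# [('Math', 'Akash', 'Semester 1'), ('Math', 'John', 'Semester 2')]
print(three_way == table)  # True
```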
