Open In App

How to find the highest normal form of a relation

Last Updated : 13 Jan, 2025
Comments
Improve
Suggest changes
Like Article
Like
Report

Normalization is the process of structuring data in a database by creating tables and defining relationships between them. This ensures data consistency, protection, and improves the database's efficiency and flexibility. Typically, every table in a relational database is assumed to be in the first normal form (1NF), which requires that all attributes contain atomic (indivisible) values, meaning no multiple values are allowed in a single row.

For a table to achieve the second normal form (2NF), it must eliminate any partial dependencies. To satisfy the third normal form (3NF), the table must also be free of transitive dependencies. Lastly, for a table to be in Boyce-Codd Normal Form (BCNF), every determinant in the functional dependencies must be a super-key.

To understand this topic, you should have a basic idea about Functional Dependency , Candidate keys and Normal forms .

Steps to find the highest normal form of relation: 

  1. Find all possible candidate keys of the relation.
  2. Divide all attributes into two categories: prime attributes and non-prime attributes.
  3. Check for 1st normal form then 2nd and so on. If it fails to satisfy the nth normal form condition, the highest normal form will be n-1.

Check for 1NF (First Normal Form) :

  • Verify that all columns contain only single values (no lists or arrays).
    • Example of non-atomic: {Math, Science}.

Check for 2NF (Second Normal Form) :

  • Identify the primary key (composite or single).
  • Check for partial dependencies (where non-prime attributes depend on part of a composite key instead of the whole key).
  • If partial dependencies exists, the relation is not in 2nf.

Check for 3NF (Third Normal Form) :

  • Identify all functional dependencies.
  • Check for transitive dependencies (non-prime attributes depend on other non-prime attributes).
  • If transitive dependencies exists, the relation is not in 3nf.

Check for BCNF (Boyce-Codd Normal Form) :

  • Identify all functional dependencies.
  • Check if the left-hand side (determinant) of every functional dependency is a super-key.
  • If not, the relation is not in BCNF.


Example 1. Find the highest normal form of a relation R(A,B,C,D,E) with FD set {A->D, B->A, BC->D, AC->BE} 

Step 1.   As we can see, (AC)+ ={A, C, B, E, D}  but none of its subsets can determine all attributes of relation, So AC will be the candidate key. A can be derived from B, so we can replace A in AC with B. So BC will also be a candidate key. So there will be two candidate keys {AC, BC}.

Step 2.  The prime attribute is those attribute which is part of candidate key {A, B, C} in this example and others will be non-prime {D, E} in this example.

Step 3.  The relation R is in 1st normal form as a relational DBMS does not allow multi-valued or composite attributes.

The relation is not in the 2nd Normal form because A->D is partial dependency (A which is a subset of candidate key AC is determining non-prime attribute D) and the 2nd normal form does not allow partial dependency.

So the highest normal form will be the 1st Normal Form.

Example 2. Find the highest normal form of a relation R(A,B,C,D,E) with FD set as {BC->D, AC->BE, B->E} 

Step 1.   As we can see, (AC)+ ={A,C,B,E,D}  but none of its subsets can determine all attributes of relation, So AC will be the candidate key. A or C can’t be derived from any other attribute of the relation, so there will be only 1 candidate key {AC}. 

Step 2.  The prime attribute is those attribute which is part of candidate key {A,C} in this example and others will be non-prime {B,D,E} in this example. 

Step 3.  The relation R is in 1st normal form as a relational DBMS does not allow multi-valued or composite attributes. 

The relation is in 2nd normal form because BC->D is in 2nd normal form (BC is not a proper subset of candidate key AC) and AC->BE is in 2nd normal form (AC is candidate key) and B->E is in 2nd normal form (B is not a proper subset of candidate key AC). 

The relation is not in 3rd normal form because in BC->D (neither BC is a super key nor D is a prime attribute) and in B->E (neither B is a super key nor E is a prime attribute) but to satisfy 3rd normal for, either LHS of an FD should be super key or RHS should be a prime attribute. 

So the highest normal form of relation will be the 2nd Normal form. 

Example 3. Find the highest normal form of a relation R(A,B,C,D,E) with FD set {B->A, A->C, BC->D, AC->BE} 

Step 1.   As we can see, (B)+ ={B,A,C,D,E}, so B will be candidate key. B can be derived from AC using AC->B (Decomposing AC->BE to AC->B and AC->E). So AC will be super key but (C)+ ={C} and (A)+ ={A,C,B,E,D}. So A (subset of AC) will be candidate key. So there will be two candidate keys {A,B}. 

Step 2.  The prime attribute is those attribute which is part of candidate key {A,B} in this example and others will be non-prime {C,D,E} in this example. 

Step 3.  The relation R is in 1st normal form as a relational DBMS does not allow multi-valued or composite attributes. 

The relation is in 2nd normal form because B->A is in 2nd normal form (B is a super key) and A->C is in 2nd normal form (A is super key) and BC->D is in 2nd normal form (BC is a super key) and AC->BE is in 2nd normal form (AC is a super key). 

The relation is in 3rd normal form because the LHS of all FD’s is super keys. The relation is in BCNF as all LHS of all FD’s are super keys. So the highest normal form is BCNF. 

For the video solution of the above examples Refer Here.


Next Article

Similar Reads