unit-3 DBMS

This document covers SQL queries, constraints, and triggers, detailing various SQL commands categorized into DDL, DML, DCL, DQL, and TCL. It explains the structure and purpose of each command, along with examples, and discusses schema refinement, integrity constraints, and the use of triggers in database management. Additionally, it outlines the basic form of SQL queries and the functionality of the UNION operator for combining result sets.

Uploaded by

saiipavann035
Copyright: © All Rights Reserved

UNIT -3

SQL: QUERIES, CONSTRAINTS, TRIGGERS: form of basic SQL query, UNION, INTERSECT,
and EXCEPT, Nested Queries, aggregation operators, NULL values, complex integrity constraints
in SQL, triggers and active databases. Schema Refinement: Problems caused by redundancy,
decompositions, problems related to decomposition, reasoning about functional dependencies, First,
Second, Third normal forms, BCNF, lossless join decomposition, multivalued dependencies, Fourth
normal form, Fifth normal form.

SQL commands: SQL commands are the statements used to define, query, modify, and control access to a database. They are grouped into five sublanguages, each serving a specific purpose:
• DDL – Data Definition Language
• DQL – Data Query Language
• DML – Data Manipulation Language
• DCL – Data Control Language
• TCL – Transaction Control Language

1. Data Definition Language (DDL) in SQL

DDL, or Data Definition Language, consists of the SQL commands used to define, alter, and delete database structures such as tables, indexes, and schemas. It deals with descriptions of the database schema and is used to create and modify the structure of database objects.
Common DDL Commands
Command | Description | Syntax
CREATE | Creates a database or its objects (tables, indexes, functions, views, stored procedures, and triggers) | CREATE TABLE table_name (column1 data_type, column2 data_type, ...);
DROP | Deletes objects from the database | DROP TABLE table_name;
ALTER | Alters the structure of the database | ALTER TABLE table_name ADD COLUMN column_name data_type;
TRUNCATE | Removes all records from a table, including all space allocated for the records | TRUNCATE TABLE table_name;
COMMENT | Adds comments to the data dictionary | COMMENT ON TABLE table_name IS 'comment_text';
RENAME | Renames an object existing in the database | RENAME TABLE old_table_name TO new_table_name;

Example of DDL
CREATE TABLE employees (
employee_id INT PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
hire_date DATE
);
In this example, a new table called employees is created with columns for employee ID,
first name, last name, and hire date.

2. Data Query Language (DQL) in SQL


DQL statements are used to query the data within schema objects. A DQL command returns a result relation based on the query passed to it, which lets an application get data out of the database and work with it. When a SELECT is run against one or more tables, the result is compiled into a temporary result table, which is displayed or returned to the calling program.
DQL Command
Command | Description | Syntax
SELECT | Retrieves data from the database | SELECT column1, column2, ... FROM table_name WHERE condition;

Example of DQL
SELECT first_name, last_name, hire_date
FROM employees
WHERE department = 'Sales'
ORDER BY hire_date DESC;
This query retrieves the first names, last names, and hire dates of employees in the ‘Sales’ department, sorted by hire date with the most recent first. It assumes that the employees table also has a department column.
3. Data Manipulation Language (DML) in SQL
The SQL commands that manipulate the data stored in the database belong to DML, or Data Manipulation Language; this category includes most everyday SQL statements. DML statements insert, update, and delete rows, while access to the data is controlled separately through DCL statements.
Common DML Commands
Command | Description | Syntax
INSERT | Insert data into a table | INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
UPDATE | Update existing data within a table | UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;
DELETE | Delete records from a database table | DELETE FROM table_name WHERE condition;
LOCK | Table control concurrency | LOCK TABLE table_name IN lock_mode;
CALL | Call a PL/SQL or Java subprogram | CALL procedure_name(arguments);
EXPLAIN PLAN | Describe the access path to data | EXPLAIN PLAN FOR SELECT * FROM table_name;
Example of DML
INSERT INTO employees (first_name, last_name, department)
VALUES ('Jane', 'Smith', 'HR');
This query inserts a new record into the employees table with the first name ‘Jane’, last
name ‘Smith’, and department ‘HR’.
4. Data Control Language (DCL) in SQL
DCL (Data Control Language) includes commands such
as GRANT and REVOKE which mainly deal with the rights, permissions, and other
controls of the database system. These commands are used to control access to data in the
database by granting or revoking permissions.
Common DCL Commands
Command | Description | Syntax
GRANT | Assigns new privileges to a user account, allowing access to specific database objects, actions, or functions. | GRANT privilege_type [(column_list)] ON [object_type] object_name TO user [WITH GRANT OPTION];
REVOKE | Removes previously granted privileges from a user account, taking away their access to certain database objects or actions. | REVOKE [GRANT OPTION FOR] privilege_type [(column_list)] ON [object_type] object_name FROM user [CASCADE];

Example of DCL
GRANT SELECT, UPDATE ON employees TO user_name;
This command grants the user user_name the permissions to select and update records in
the employees table.
5. Transaction Control Language (TCL) in SQL
Transactions group a set of tasks into a single execution unit. Each transaction begins with a specific task and ends when all the tasks in the group are successfully completed. If any of the tasks fails, the transaction fails. Therefore, a transaction has only two results: success or failure. TCL commands control when the changes made by a transaction are saved or undone.
Common TCL Commands
Command | Description | Syntax
BEGIN TRANSACTION | Starts a new transaction | BEGIN TRANSACTION [transaction_name];
COMMIT | Saves all changes made during the transaction | COMMIT;
ROLLBACK | Undoes all changes made during the transaction | ROLLBACK;
SAVEPOINT | Creates a savepoint within the current transaction | SAVEPOINT savepoint_name;

Example of TCL
BEGIN TRANSACTION;
UPDATE employees SET department = 'Marketing' WHERE department = 'Sales';
SAVEPOINT before_update;
UPDATE employees SET department = 'IT' WHERE department = 'HR';
ROLLBACK TO SAVEPOINT before_update;
COMMIT;
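In this example, a transaction is started, the first update is applied, and a savepoint is set. The second update is then undone by rolling back to the savepoint, so COMMIT saves only the first update (Sales to Marketing). The exact syntax varies by DBMS; SQL Server, for example, uses SAVE TRANSACTION instead of SAVEPOINT.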
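The savepoint behaviour described above can be tried out with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the two sample rows (Asha, Ravi) and the simplified two-column employees table are made up for illustration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the script issues BEGIN / COMMIT itself, as in the TCL example.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (name TEXT, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Asha", "Sales"), ("Ravi", "HR")],  # hypothetical sample rows
)

conn.execute("BEGIN")
conn.execute("UPDATE employees SET department = 'Marketing' WHERE department = 'Sales'")
conn.execute("SAVEPOINT before_update")
conn.execute("UPDATE employees SET department = 'IT' WHERE department = 'HR'")
conn.execute("ROLLBACK TO SAVEPOINT before_update")  # undoes only the HR -> IT update
conn.execute("COMMIT")

rows = conn.execute("SELECT name, department FROM employees ORDER BY name").fetchall()
print(rows)
```

After the commit, the Sales employee has moved to Marketing, while the HR employee is unchanged because that update came after the savepoint and was rolled back.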
Important SQL Commands and Clauses
1. SELECT: Used to retrieve data from a database.
2. INSERT: Used to add new data to a database.
3. UPDATE: Used to modify existing data in a database.
4. DELETE: Used to remove data from a database.
5. CREATE TABLE: Used to create a new table in a database.
6. ALTER TABLE: Used to modify the structure of an existing table.
7. DROP TABLE: Used to delete an entire table from a database.
8. WHERE: Used to filter rows based on a specified condition.
9. ORDER BY: Used to sort the result set in ascending or descending order.
10. JOIN: Used to combine rows from two or more tables based on a related column
between them.

QUERIES:
A query in a DBMS is a request made by a user or application to retrieve or manipulate
data stored in a database. This request is typically formulated using a structured query
language (SQL) or a query interface provided by the DBMS. The primary purpose of a
query is to specify precisely what data is needed and how it should be retrieved or
modified.
SQL (Structured Query Language)
A standardized programming language used to interact with relational databases. SQL
provides a set of commands for querying, updating, and managing databases.
Table
A fundamental component of a relational database, representing a collection of related data
organized into rows and columns. Each table in a database typically corresponds to a
specific entity or concept.
Field/Column
A single piece of data stored within a table, representing a specific attribute or
characteristic of the entities described by the table.
Record/Row
A complete set of data representing an individual instance or entity stored within a table.
Each row contains values for each field/column defined in the table schema.
Primary Key
A unique identifier for each record in a table, ensuring that each row can be uniquely identified and accessed. Primary keys are used to establish relationships between tables and enforce data integrity.
Query Language
The language used to communicate with a database management system. This language
allows users to perform operations such as data retrieval, manipulation, and schema
definition.
Major Commands in SQL with Examples
To illustrate the major SQL commands, let's use a SQLite database file named
`company.db`, which contains a table named `employees`. We'll demonstrate various SQL
commands with real changes to this database.
Example Database Structure
Table: employees
employee_id | name | age | department
1 | John Doe | 30 | HR
2 | Jane Smith | 35 | Finance
3 | Michael Lee | 40 | IT

SELECT Statement
The SELECT statement is used to retrieve data from one or more tables in a database.
Syntax
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Example
SELECT * FROM employees WHERE department = 'IT';
This query selects all columns from the "employees" table where the department is 'IT'.
Output
3|Michael Lee|40|IT
INSERT Statement
The INSERT statement is used to add new records into a table.
Syntax
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);

Example
INSERT INTO employees (name, age, department)
VALUES ('Sarah Johnson', 28, 'Marketing');
This query inserts a new employee record into the "employees" table with specified values.
Output
To check that the new data has been inserted, run a SELECT command:
SELECT * FROM employees;
The query returns the entire table, which now includes the new row:
1|John Doe|30|HR
2|Jane Smith|35|Finance
3|Michael Lee|40|IT
4|Sarah Johnson|28|Marketing
UPDATE Statement
The UPDATE statement is used to modify existing records in a table.
Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Example
UPDATE employees
SET department = 'Operations'
WHERE name = 'Michael Lee';
This query updates the department of the employee named 'Michael Lee' to 'Operations'.
Output
Run the SELECT command to see the updated table:
SELECT * FROM employees;
Michael's department is now set to Operations:
1|John Doe|30|HR
2|Jane Smith|35|Finance
3|Michael Lee|40|Operations
4|Sarah Johnson|28|Marketing
DELETE Statement
The DELETE statement is used to remove existing records from a table.
Syntax
DELETE FROM table_name
WHERE condition;
Example
DELETE FROM employees
WHERE age > 35;
This query deletes records from the "employees" table where the age is greater than 35.
Output
Execute the SELECT command to check the updated database:
SELECT * FROM employees;

Michael has been removed from the table, as he is the only employee with an age over 35.
1|John Doe|30|HR
2|Jane Smith|35|Finance
4|Sarah Johnson|28|Marketing
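The whole SELECT, INSERT, UPDATE, DELETE walkthrough can be replayed in one go. The following is a sketch that uses Python's built-in sqlite3 module with an in-memory database instead of the `company.db` file; the column types are an assumption, since the notes only show the table contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for company.db
conn.execute("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,  -- SQLite assigns the next id automatically
        name TEXT,
        age INTEGER,
        department TEXT
    )
""")
conn.executemany(
    "INSERT INTO employees (employee_id, name, age, department) VALUES (?, ?, ?, ?)",
    [(1, "John Doe", 30, "HR"), (2, "Jane Smith", 35, "Finance"), (3, "Michael Lee", 40, "IT")],
)

# SELECT: employees in the IT department
it_rows = conn.execute("SELECT * FROM employees WHERE department = 'IT'").fetchall()

# INSERT, UPDATE, DELETE in the same order as the walkthrough
conn.execute("INSERT INTO employees (name, age, department) VALUES ('Sarah Johnson', 28, 'Marketing')")
conn.execute("UPDATE employees SET department = 'Operations' WHERE name = 'Michael Lee'")
conn.execute("DELETE FROM employees WHERE age > 35")

final_rows = conn.execute("SELECT * FROM employees ORDER BY employee_id").fetchall()
print(it_rows)
print(final_rows)
```

The final table contains employees 1, 2, and 4, matching the last output shown above.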

CONSTRAINTS:
Types of Constraints:
• Domain Constraints: These define the permissible values or types of data that a column can hold.
  Examples: NOT NULL (prevents null values), CHECK (ensures values meet specific criteria), DEFAULT (sets a default value).
• Entity Integrity Constraints: These ensure that each record or row in a table is unique and identifiable.
  Examples: PRIMARY KEY (uniquely identifies each row), UNIQUE (ensures uniqueness for specific columns).
• Referential Integrity Constraints: These maintain consistency between related tables by ensuring that foreign key values exist in the primary key of the related table.
  Example: FOREIGN KEY (establishes relationships between tables).
• Informational Constraints: These are constraints that are declared but not enforced by the database manager; they can be used for optimization or documentation purposes.
  Example: in IBM Db2, a constraint declared with the NOT ENFORCED attribute.
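The enforced constraint types above can be seen in action with a short script. This is a sketch using Python's built-in sqlite3 module; the departments and staff tables, their columns, and the sample values are hypothetical and chosen only to trigger each constraint once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is enabled
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)")
conn.execute("""
    CREATE TABLE staff (
        staff_id INTEGER PRIMARY KEY,                     -- entity integrity
        name     TEXT NOT NULL,                           -- domain constraint
        age      INTEGER CHECK (age >= 18),               -- domain constraint
        status   TEXT DEFAULT 'active',                   -- domain constraint
        dept_id  INTEGER REFERENCES departments(dept_id)  -- referential integrity
    )
""")
conn.execute("INSERT INTO departments VALUES (1, 'HR')")
conn.execute("INSERT INTO staff (staff_id, name, age, dept_id) VALUES (1, 'Asha', 30, 1)")

def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

violations = {
    "NOT NULL":    is_rejected("INSERT INTO staff (staff_id, name, age, dept_id) VALUES (2, NULL, 25, 1)"),
    "CHECK":       is_rejected("INSERT INTO staff (staff_id, name, age, dept_id) VALUES (3, 'Ravi', 15, 1)"),
    "PRIMARY KEY": is_rejected("INSERT INTO staff (staff_id, name, age, dept_id) VALUES (1, 'Ravi', 25, 1)"),
    "UNIQUE":      is_rejected("INSERT INTO departments VALUES (2, 'HR')"),
    "FOREIGN KEY": is_rejected("INSERT INTO staff (staff_id, name, age, dept_id) VALUES (4, 'Ravi', 25, 99)"),
}
default_status = conn.execute("SELECT status FROM staff WHERE staff_id = 1").fetchone()[0]
print(violations, default_status)
```

Each invalid INSERT is rejected by the DBMS itself, and the row inserted without a status picks up the DEFAULT value.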

TRIGGERS:
SQL triggers are a critical feature in database management systems (DBMS) that provide
automatic execution of a set of SQL statements when specific database events, such
as INSERT, UPDATE, or DELETE operations, occur. Triggers are commonly used to
maintain data integrity, track changes, and enforce business rules automatically, without
needing manual input.

Syntax
CREATE TRIGGER trigger_name
{BEFORE | AFTER}
{INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
    -- SQL statements to execute when the trigger fires
END;

Types of SQL Triggers
Triggers can be categorized into different types based on the action they are associated
with:
1. DDL Triggers
Data Definition Language (DDL) events such as CREATE_TABLE, CREATE_VIEW, DROP_TABLE, DROP_VIEW, and ALTER_TABLE activate DDL triggers. They allow us to track changes to the structure of the database; for example, a DDL trigger can be written to prevent any table creation, alteration, or deletion in the database.
2. DML Triggers
Data Manipulation Language (DML) events, namely INSERT, UPDATE, and DELETE, set off DML triggers. DML triggers are used for data validation, ensuring that modifications to a table are made under controlled conditions.
3. Logon Triggers
These triggers fire in response to logon events and are useful for monitoring user sessions or restricting user access to the database. In SQL Server, a logon trigger fires after authentication succeeds but before the user session is established, so PRINT statement messages and any errors generated by the trigger are written to the SQL Server error log. Logon triggers do not fire if authentication fails. They can be used to track login activity or to limit the number of sessions that a given login can have, in order to audit and manage server sessions.
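As an illustration of a DML trigger that tracks changes, here is a minimal sketch using Python's built-in sqlite3 module. The audit_log table and the trigger name log_department_change are hypothetical; SQLite supports DML triggers only, so DDL and logon triggers are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT);
CREATE TABLE audit_log (employee_id INTEGER, old_department TEXT, new_department TEXT);

-- Fires once per updated row whenever the department column is changed
CREATE TRIGGER log_department_change
AFTER UPDATE OF department ON employees
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES (OLD.employee_id, OLD.department, NEW.department);
END;

INSERT INTO employees VALUES (1, 'Michael Lee', 'IT');
UPDATE employees SET department = 'Operations' WHERE employee_id = 1;
""")

audit_rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(audit_rows)
```

The UPDATE statement never mentions audit_log, yet one audit row appears: the trigger recorded the old and new department automatically.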

FORM OF BASIC SQL QUERY:


The basic form of an SQL query, specifically when retrieving data, is composed of a
combination of clauses. The most elementary form of an SQL query for data retrieval can be
represented as
Syntax
SELECT [DISTINCT] column1, column2, ...
FROM tablename
WHERE condition;

Let's break it down:

1. SELECT Clause: This is where you specify the columns you want to retrieve. Use an
asterisk (*) to retrieve all columns.
2. FROM Clause: This specifies from which table or tables you want to retrieve the data.
3. WHERE Clause (optional): This allows you to filter the results based on a condition.
4. DISTINCT keyword (optional): Indicates that the answer should not contain duplicates. If a query is written without DISTINCT, duplicate rows are not eliminated.

Here are the primary components of SQL queries:
• SELECT: Retrieves data from one or more tables.
• FROM: Specifies the table from which you're retrieving the data.
• WHERE: Filters the results based on a condition.
• GROUP BY: Groups rows that have the same values in specified columns.
• HAVING: Filters the result of a GROUP BY.
• ORDER BY: Sorts the results in ascending or descending order.
• JOIN: Combines rows from two or more tables based on related columns.
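The effect of DISTINCT and the order in which WHERE, GROUP BY, HAVING, and ORDER BY apply can be checked with a small experiment. This is a sketch using Python's built-in sqlite3 module; the employees table and its five rows are made up for the purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "HR", 40000), ("Ravi", "HR", 50000),
     ("Meena", "IT", 70000), ("Kiran", "IT", 60000), ("Divya", "Sales", 45000)],
)

# Without DISTINCT, every row contributes a value, so departments repeat.
all_departments = conn.execute(
    "SELECT department FROM employees ORDER BY department").fetchall()

# With DISTINCT, duplicates are eliminated.
distinct_departments = conn.execute(
    "SELECT DISTINCT department FROM employees ORDER BY department").fetchall()

# WHERE filters rows first, GROUP BY groups what is left,
# HAVING filters the groups, and ORDER BY sorts the final result.
big_departments = conn.execute("""
    SELECT department, COUNT(*) AS headcount
    FROM employees
    WHERE salary >= 45000
    GROUP BY department
    HAVING COUNT(*) >= 2
    ORDER BY department
""").fetchall()

print(len(all_departments), distinct_departments, big_departments)
```

Only IT survives the last query: HR and Sales each have just one employee earning at least 45000, so HAVING removes those groups.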

UNION:
The SQL UNION operator combines the result sets of two or more SELECT queries into a single result set. It is useful for bringing together data from multiple tables, especially when the tables have similar structures. By default, UNION removes duplicate rows, so the result set contains only distinct records.
Rules for SQL UNION
• Each SELECT statement within UNION must return the same number of columns.
• Corresponding columns must have the same or compatible data types.
• The columns in each SELECT statement must be in the same order.
Syntax:
The syntax of the SQL UNION operator is:
SELECT columnnames FROM table1
UNION
SELECT columnnames FROM table2;
Note: UNION removes duplicate rows from the result set, whereas UNION ALL retains all rows, including duplicates. Use UNION ALL when duplicates should be kept.
Examples of SQL UNION
Let’s look at an example of the UNION operator. We first create two tables, “Emp1” and “Emp2”.
Emp1 Table
Write the following SQL query to create the Emp1 table.
CREATE TABLE Emp1(
EmpID INT PRIMARY KEY,
Name VARCHAR(50),
Country VARCHAR(50),
Age INT,
mob INT
);
-- Insert some sample data into the Emp1 table
INSERT INTO Emp1 (EmpID, Name, Country, Age, mob)
VALUES (1, 'Shubham', 'India', 23, 738479734),
(2, 'Aman', 'Australia', 21, 436789555),
(3, 'Naveen', 'Sri lanka', 24, 34873847),
(4, 'Aditya', 'Austria', 21, 328440934),
(5, 'Nishant', 'Spain', 22, 73248679);

SELECT * FROM Emp1;

Output: the five rows inserted above.

Emp2 Table
Write the following SQL query to create the Emp2 table.
CREATE TABLE Emp2(
EmpID INT PRIMARY KEY,
Name VARCHAR(50),
Country VARCHAR(50),
Age INT,
mob INT
);
-- Insert some sample data into the Emp2 table
INSERT INTO Emp2 (EmpID, Name, Country, Age, mob)
VALUES (1, 'Tommy', 'England', 23, 738985734),
(2, 'Allen', 'France', 21, 43678055),
(3, 'Nancy', 'India', 24, 34873847),
(4, 'Adi', 'Ireland', 21, 320254934),
(5, 'Sandy', 'Spain', 22, 70248679);

SELECT * FROM Emp2;

Output: the five rows inserted above.

Example 1: SQL UNION Operator
In this example, we find the countries (unique values only) from both the “Emp1” and the “Emp2” tables:
Query:
SELECT Country FROM Emp1
UNION
SELECT Country FROM Emp2
ORDER BY Country;
Output:
Country
Australia
Austria
England
France
India
Ireland
Spain
Sri lanka

Example 2: SQL UNION ALL
In the example below, we find the countries (including duplicate values) from both the “Emp1” and the “Emp2” tables:
Query:
SELECT Country FROM Emp1
UNION ALL
SELECT Country FROM Emp2
ORDER BY Country;
Output:
Country
Australia
Austria
England
France
India
India
Ireland
Spain
Spain
Sri lanka

SQL UNION ALL With WHERE
You can use the WHERE clause with UNION ALL in SQL. The WHERE clause filters records and is added to each SELECT statement separately.
Example: SQL UNION ALL with WHERE
The following SQL statement returns the country and name of the employee named 'Aditya' in the “Emp1” table and of the employees based in Ireland in the “Emp2” table:
Query:
SELECT Country, Name FROM Emp1
WHERE Name='Aditya'
UNION ALL
SELECT Country, Name FROM Emp2
WHERE Country='Ireland'
ORDER BY Country;
Output:
Country | Name
Austria | Aditya
Ireland | Adi
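The difference between UNION and UNION ALL can be confirmed by running both queries. This is a sketch using Python's built-in sqlite3 module; to keep it short, the tables carry only the EmpID and Country columns of Emp1 and Emp2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
sample = {
    "Emp1": ["India", "Australia", "Sri lanka", "Austria", "Spain"],
    "Emp2": ["England", "France", "India", "Ireland", "Spain"],
}
for table, countries in sample.items():
    conn.execute(f"CREATE TABLE {table} (EmpID INTEGER PRIMARY KEY, Country TEXT)")
    conn.executemany(f"INSERT INTO {table} (Country) VALUES (?)", [(c,) for c in countries])

# UNION removes duplicate rows from the combined result.
union_rows = [row[0] for row in conn.execute(
    "SELECT Country FROM Emp1 UNION SELECT Country FROM Emp2 ORDER BY Country")]

# UNION ALL keeps every row from both queries.
union_all_rows = [row[0] for row in conn.execute(
    "SELECT Country FROM Emp1 UNION ALL SELECT Country FROM Emp2 ORDER BY Country")]

print(union_rows)
print(union_all_rows)
```

UNION returns 8 countries, while UNION ALL returns all 10 rows, with India and Spain listed twice.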

INTERSECT:
In SQL, the INTERSECT operator retrieves the records common to two SELECT queries: it returns only the rows that are present in both result sets. This makes INTERSECT useful whenever we need to find overlapping data between two or more queries. This section explains its syntax, key characteristics, and examples, including its use with conditions such as BETWEEN and LIKE.
What is SQL INTERSECT?
INTERSECT combines two SELECT statements, and the result is the intersection of their two result sets. In simple words, it returns only those rows that are common to both SELECT statements.
INTERSECT is a set operation in SQL, similar to UNION and EXCEPT. While UNION combines the results of two queries and removes duplicates, INTERSECT returns only the rows that exist in both queries, again without duplicates.

Key Characteristics of SQL INTERSECT:
• Returns only the common rows between two result sets.
• Ensures uniqueness by automatically removing duplicate rows.
• Requires that both SELECT statements have the same number of columns.
• The data types of corresponding columns in both queries must be compatible.
Syntax:
SELECT column1, column2, ...
FROM table1
WHERE condition
INTERSECT
SELECT column1, column2, ...
FROM table2
WHERE condition;

Examples of SQL INTERSECT


Let’s consider two tables: the Customers table, which holds customer details, and the Orders table, which contains information about customer purchases. By applying the INTERSECT operator, we can retrieve customers who exist in both tables, meaning those who have made purchases.
Customers Table (table image not available in this copy)
Orders Table (table image not available in this copy)

Example 1: Basic INTERSECT Query
In this example, we retrieve customers who exist in both
the Customers and Orders tables. The INTERSECT operator ensures that only those
customers who have placed an order appear in the result.
Query:
SELECT CustomerID
FROM Customers
INTERSECT
SELECT CustomerID
FROM Orders;
Output:
CustomerID
2
3
5
6
7
8
Explanation:
• The query returns only those customers who appear in both the Customers and Orders tables.
• If a customer exists in Customers but has never placed an order, they won’t appear in the result.
• Customer IDs 2, 3, 5, 6, 7, and 8 appear in both the Customers and Orders tables, so they form the result.
Example 2: Using INTERSECT with BETWEEN Operator
In this example, we apply the INTERSECT operator along with the BETWEEN condition to
filter records based on a specified range. The query retrieves customers
whose CustomerID falls between 3 and 8 and who have placed an order. The result
contains only the common CustomerID values that meet both conditions.
Query:
SELECT CustomerID
FROM Customers
WHERE CustomerID BETWEEN 3 AND 8
INTERSECT
SELECT CustomerID
FROM Orders;
Output:
CustomerID
3
5
6
7
8
Explanation:
• The first SELECT statement filters customers with CustomerID between 3 and 8.
• The INTERSECT operator ensures that only customers from this filtered set who have placed an order are included in the result.
• Customers 3, 5, 6, 7, and 8 fall within the specified range (3 to 8) and have placed orders.
Example 3: Using INTERSECT with LIKE Operator
In this example, we use the INTERSECT operator along with the LIKE operator to find
common customers whose FirstName starts with the letter ‘J’ in both
the Customers and Orders tables.
Query:
SELECT CustomerID
FROM Customers
WHERE FirstName LIKE 'J%'
INTERSECT
SELECT CustomerID
FROM Orders;
Output:
CustomerID

2
Explanation:
• The first SELECT statement finds customers in the Customers table whose first name starts with ‘J’.
• The INTERSECT operator ensures that only those customers who have placed an order are included in the result.
• The final output includes Customer 2 (Jane) only, as per the given example.
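The three INTERSECT examples can be reproduced with a small script. This is a sketch using Python's built-in sqlite3 module. Because the original Customers and Orders tables are not reproduced in these notes, the rows below are hypothetical; they are chosen only so that the results match the outputs described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT)")
conn.execute("CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER)")
# Hypothetical data: customers 1 and 4 have never placed an order.
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [
    (1, "John"), (2, "Jane"), (3, "Amit"), (4, "Priya"),
    (5, "Ravi"), (6, "Sara"), (7, "Tom"), (8, "Uma")])
conn.executemany("INSERT INTO Orders (CustomerID) VALUES (?)",
                 [(2,), (3,), (5,), (5,), (6,), (7,), (8,)])  # customer 5 ordered twice

def ids(sql):
    """Run a query and return its single column as a list."""
    return [row[0] for row in conn.execute(sql)]

both = ids("""SELECT CustomerID FROM Customers
              INTERSECT
              SELECT CustomerID FROM Orders
              ORDER BY CustomerID""")

in_range = ids("""SELECT CustomerID FROM Customers WHERE CustomerID BETWEEN 3 AND 8
                  INTERSECT
                  SELECT CustomerID FROM Orders
                  ORDER BY CustomerID""")

j_names = ids("""SELECT CustomerID FROM Customers WHERE FirstName LIKE 'J%'
                 INTERSECT
                 SELECT CustomerID FROM Orders
                 ORDER BY CustomerID""")

print(both, in_range, j_names)
```

Customer 5 appears only once in each result even though there are two orders for that customer, because INTERSECT removes duplicates.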

EXCEPT:

The SQL EXCEPT operator returns the rows from the first SELECT statement that are not present in the second SELECT statement. It is conceptually similar to the set difference (minus) operator in relational algebra, and it is particularly useful for finding records in one table that do not have corresponding records in another table.
Syntax:
SELECT column_name(s)
FROM table1
EXCEPT
SELECT column_name(s)
FROM table2;
Students Table
-- Create Students Table
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
Name VARCHAR(100),
Course VARCHAR(100)
);

-- Insert Data into Students Table


INSERT INTO Students (StudentID, Name, Course) VALUES
(1, 'Rohan', 'DBMS'),
(2, 'Kevin', 'OS'),
(3, 'Mansi', 'DBMS'),
(4, 'Mansi', 'ADA'),
(5, 'Rekha', 'ADA'),
(6, 'Megha', 'OS');
Output:
StudentID | Name | Course
1 | Rohan | DBMS
2 | Kevin | OS
3 | Mansi | DBMS
4 | Mansi | ADA
5 | Rekha | ADA
6 | Megha | OS

Teaching Assistant Table
-- Create Teaching Assistant Table
CREATE TABLE TA (
StudentID INT PRIMARY KEY,
Name VARCHAR(100),
Course VARCHAR(100)
);

-- Insert Data into TA Table


INSERT INTO TA (StudentID, Name, Course) VALUES
(1, 'Kevin', 'TOC'),
(2, 'Sita', 'IP'),
(3, 'Manik', 'AP'),
(4, 'Rekha', 'SNS');
Output:
StudentID | Name | Course
1 | Kevin | TOC
2 | Sita | IP
3 | Manik | AP
4 | Rekha | SNS

Example 1: Filter Student


We want to find all the students who are not teaching assistants.
Query:
SELECT Name
FROM Students
EXCEPT
SELECT Name
FROM TA;
Output:
Name
Mansi
Megha
Rohan
Explanation: The EXCEPT operator returns the names that are present in the Students table but not in the TA table. “Rohan”, “Mansi”, and “Megha” are returned because these students are not listed in the TA table; “Mansi” appears only once because EXCEPT removes duplicates.
Example 2: Retaining Duplicates with EXCEPT ALL
By default, EXCEPT removes duplicates from the result set. To retain duplicates, you can use EXCEPT ALL instead, in systems that support it (for example, PostgreSQL).
Query:
SELECT Name
FROM Students
EXCEPT ALL
SELECT Name
FROM TA;
Output:
Name
Rohan
Mansi
Mansi
Megha
Explanation: In this case, “Mansi” appears twice in the output because it appears twice in the Students table and is not in the TA table. EXCEPT ALL retains duplicates from the first result set.
SQL EXCEPT vs. SQL NOT IN
While both EXCEPT and NOT IN are used to exclude certain records, there are important
differences between them.
Feature | EXCEPT | NOT IN
Duplicate Handling | Removes duplicates from the result | Retains duplicates in the result
Performance | Often efficient for large datasets, although this depends on the DBMS | May be slower for large datasets, especially when checking multiple conditions
Use Case | When you need to find rows that exist in one result set but not the other | When you need to check a specific column’s values against a list
SQL Support | Not supported by MySQL before version 8.0.31; Oracle traditionally uses MINUS | Supported by most SQL databases
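The duplicate-handling row of the comparison can be verified directly, along with one further difference worth knowing: NOT IN returns no rows at all when the subquery produces a NULL. This is a sketch using Python's built-in sqlite3 module with the Name columns of the Students and TA tables; the extra TA row with a NULL name is added only for the demonstration. SQLite does not implement EXCEPT ALL, so only EXCEPT and NOT IN are compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE TA (StudentID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Students VALUES (?, ?)", [
    (1, "Rohan"), (2, "Kevin"), (3, "Mansi"), (4, "Mansi"), (5, "Rekha"), (6, "Megha")])
conn.executemany("INSERT INTO TA VALUES (?, ?)", [
    (1, "Kevin"), (2, "Sita"), (3, "Manik"), (4, "Rekha")])

EXCEPT_SQL = "SELECT Name FROM Students EXCEPT SELECT Name FROM TA ORDER BY Name"
NOT_IN_SQL = ("SELECT Name FROM Students "
              "WHERE Name NOT IN (SELECT Name FROM TA) ORDER BY StudentID")

def names(sql):
    return [row[0] for row in conn.execute(sql)]

except_rows = names(EXCEPT_SQL)   # duplicates removed
not_in_rows = names(NOT_IN_SQL)   # duplicates kept

# Add a TA row whose name is NULL (hypothetical) and run both queries again.
conn.execute("INSERT INTO TA VALUES (5, NULL)")
except_with_null = names(EXCEPT_SQL)
not_in_with_null = names(NOT_IN_SQL)  # every comparison is now UNKNOWN, so no row qualifies

print(except_rows, not_in_rows, except_with_null, not_in_with_null)
```

With a NULL in the TA names, EXCEPT still returns the same three students, while NOT IN returns an empty result, which is a common source of surprises.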

Nested Queries:
Nested queries in SQL are a powerful tool for retrieving data from databases in a structured manner. A nested query (or subquery) is a query executed within another query; the result of the inner query is used by the outer query, which makes complex tasks such as filtering, aggregation, and data extraction easier to express.
To better understand nested queries, we use the following sample tables: STUDENT, COURSE, and STUDENT_COURSE. These tables simulate a real-world scenario of students, courses, and their enrollment details. (The table images are not available in this copy.)
1. STUDENT Table
The STUDENT table stores information about students, including their unique ID, name, address, phone number, and age.
2. COURSE Table
The COURSE table stores information about the courses offered, such as the course ID and course name.
3. STUDENT_COURSE Table
This table maps students to the courses they have enrolled in, with columns for student ID (S_ID) and course ID (C_ID), which act as foreign keys.
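To make the idea concrete, here is a sketch of two nested queries over these three tables, run with Python's built-in sqlite3 module. Only the column names S_ID and C_ID come from the notes; the other columns (S_NAME, S_AGE, C_NAME) and all the rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (S_ID TEXT PRIMARY KEY, S_NAME TEXT, S_AGE INTEGER);
CREATE TABLE COURSE (C_ID TEXT PRIMARY KEY, C_NAME TEXT);
CREATE TABLE STUDENT_COURSE (
    S_ID TEXT REFERENCES STUDENT(S_ID),
    C_ID TEXT REFERENCES COURSE(C_ID)
);
-- Hypothetical sample data
INSERT INTO STUDENT VALUES ('S1','Ram',18), ('S2','Ramesh',18), ('S3','Sujit',20), ('S4','Suresh',18);
INSERT INTO COURSE VALUES ('C1','DSA'), ('C2','Programming'), ('C3','DBMS');
INSERT INTO STUDENT_COURSE VALUES
    ('S1','C1'), ('S1','C3'), ('S2','C1'), ('S3','C2'), ('S4','C2'), ('S4','C3');
""")

# Independent (non-correlated) nested query: the innermost query runs first,
# and each outer query uses the result of the one inside it.
enrolled = [row[0] for row in conn.execute("""
    SELECT S_NAME FROM STUDENT
    WHERE S_ID IN (
        SELECT S_ID FROM STUDENT_COURSE
        WHERE C_ID IN (SELECT C_ID FROM COURSE WHERE C_NAME IN ('DSA', 'DBMS'))
    )
    ORDER BY S_NAME
""")]

# Correlated nested query: the inner query refers to the current row (S) of the outer query.
in_c2 = [row[0] for row in conn.execute("""
    SELECT S_NAME FROM STUDENT S
    WHERE EXISTS (
        SELECT 1 FROM STUDENT_COURSE SC
        WHERE SC.S_ID = S.S_ID AND SC.C_ID = 'C2'
    )
    ORDER BY S_NAME
""")]

print(enrolled, in_c2)
```

The first query lists students enrolled in DSA or DBMS, and the second lists students enrolled in course C2.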

Aggregation operators:

In a DBMS, aggregation operators are used to perform operations on a group of values to return
a single summarizing value. The most common aggregation operators include COUNT, SUM,
AVG, MIN, and MAX.
Here are some examples of how you might use these operators:

COUNT
Returns the number of rows that match a specified criterion.
Syntax

COUNT(expression)

Example:

SELECT COUNT(*) FROM Employees;


This query would return the total number of rows in the Employees table.

SUM
Returns the total sum of a numeric column.
Syntax

SUM(expression)

Example:

SELECT SUM(salary) FROM Employees;


This query would return the total sum of the salary column values in the Employees table.

AVG
Returns the average value of a numeric column.

Syntax

AVG(expression)

Example:

SELECT AVG(salary) FROM Employees;


This query would return the average salary from the Employees table.

MIN
Returns the smallest value of the selected column.
Syntax

MIN(expression)

Example:

SELECT MIN(salary) FROM Employees;


This query would return the lowest salary from the Employees table.

MAX
Returns the largest value of the selected column.
Syntax

MAX(expression)

Example:

SELECT MAX(salary) FROM Employees;


This query would return the highest salary from the Employees table.
These aggregation operators are often used with the GROUP BY clause to group the result-set
by one or more columns. For example, to find the highest salary in each department, you could
write:
Example:
SELECT department_id, MAX(salary)
FROM Employees
GROUP BY department_id;

This query would return the highest salary for each department in the Employees table.
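All five aggregation operators, with and without GROUP BY, can be tried on a small table. This is a sketch using Python's built-in sqlite3 module; the Employees rows are made up, and one salary is deliberately left NULL to show that aggregate functions other than COUNT(*) skip NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (name TEXT, department_id INTEGER, salary INTEGER)")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", [
    ("Asha", 10, 40000), ("Ravi", 10, 50000),
    ("Meena", 20, 60000), ("Kiran", 20, None),  # Kiran's salary is NULL
])

# COUNT(*) counts rows; COUNT(salary), SUM, AVG, MIN and MAX ignore the NULL salary.
summary = conn.execute("""
    SELECT COUNT(*), COUNT(salary), SUM(salary), AVG(salary), MIN(salary), MAX(salary)
    FROM Employees
""").fetchone()

# With GROUP BY, the aggregate is computed once per department.
per_department = conn.execute("""
    SELECT department_id, MAX(salary)
    FROM Employees
    GROUP BY department_id
    ORDER BY department_id
""").fetchall()

print(summary)
print(per_department)
```

The average is 50000 rather than 37500, because it is computed over the three non-NULL salaries only.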

Null Values:
In SQL, some records in a table may not have values for every field; such fields are said to hold NULL values. This happens when data is unavailable at the time of entry or when the attribute does not apply to a specific record. To handle such scenarios, SQL provides a special placeholder value called NULL to represent unknown, unavailable, or inapplicable data.
Importance of NULL Value
It is essential to understand that a NULL value differs from a zero or an empty string: a NULL value represents missing or undefined data. Since it is often not possible to determine which meaning is intended, SQL does not distinguish between the different meanings of NULL. Typically, a NULL can have one of three interpretations:
1. Value Unknown: The value exists but is not known.
2. Value Not Available: The value exists but is intentionally withheld.
3. Attribute Not Applicable: The value is undefined for a specific record.
Principles of NULL values
• Setting a NULL value is appropriate when the actual value is unknown or when a value is not meaningful.
• A NULL value is not equivalent to a value of ZERO if the data type is a number and is not equivalent to spaces if the data type is a character.
• A NULL value can be inserted into columns of any data type.
• An arithmetic expression or comparison that involves NULL evaluates to NULL (UNKNOWN).
• If a column value is NULL, UNIQUE, FOREIGN KEY, and CHECK constraints on that column are generally not applied to it, although the details vary by DBMS.
Logical Behavior
SQL uses three-valued logic (3VL) : TRUE, FALSE, and UNKNOWN. Logical expressions
involving NULL return UNKNOWN.
 AND: Returns FALSE if one operand is FALSE; otherwise, returns UNKNOWN.
 OR: Returns TRUE if one operand is TRUE; otherwise, returns UNKNOWN.
 NOT: Negates the operand; UNKNOWN remains UNKNOWN.

Logical Behaviour of AND

Logical Behaviour of OR

24
SQL allows queries that check whether an attribute value is NULL. Rather than using
= or <> to compare an attribute value to NULL, SQL uses IS and IS NOT. This is
because SQL considers each NULL value as being distinct from every other NULL
value, so equality comparison is not appropriate.
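The three-valued logic described above can be observed directly. The following is a minimal sketch, assuming MySQL, where TRUE and FALSE are displayed as 1 and 0 and UNKNOWN is displayed as NULL. Other systems may format Boolean results differently, and some do not allow Boolean expressions in a SELECT list at all. The column aliases are made up for illustration.

```sql
SELECT
    NULL = NULL       AS eq_null,     -- NULL (UNKNOWN): = cannot test for NULL
    NULL IS NULL      AS is_null,     -- 1 (TRUE): IS NULL is the correct test
    (NULL AND FALSE)  AS and_false,   -- 0 (FALSE): one FALSE operand decides AND
    (NULL AND TRUE)   AS and_true,    -- NULL (UNKNOWN)
    (NULL OR TRUE)    AS or_true,     -- 1 (TRUE): one TRUE operand decides OR
    (NULL OR FALSE)   AS or_false,    -- NULL (UNKNOWN)
    (NOT NULL)        AS not_null;    -- NULL (UNKNOWN)
```

A WHERE clause keeps a row only when its condition is TRUE, so rows for which the condition is UNKNOWN are filtered out just like rows for which it is FALSE.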
Example: Employee Table
CREATE TABLE Employee (
Fname VARCHAR(50),
Lname VARCHAR(50),
SSN VARCHAR(11),
Phoneno VARCHAR(15),
Salary FLOAT
);

INSERT INTO Employee (Fname, Lname, SSN, Phoneno, Salary)
VALUES
('Shubham', 'Thakur', '123-45-6789', '9876543210', 50000.00),
('Aman', 'Chopra', '234-56-7890', NULL, 45000.00),
('Aditya', 'Arpan', NULL, '8765432109', 55000.00),
('Naveen', 'Patnaik', '345-67-8901', NULL, NULL),
('Nishant', 'Jain', '456-78-9012', '7654321098', 60000.00);

SELECT * FROM Employee;


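Output

Employee Table
Fname    Lname    SSN          Phoneno     Salary
Shubham  Thakur   123-45-6789  9876543210  50000
Aman     Chopra   234-56-7890  NULL        45000
Aditya   Arpan    NULL         8765432109  55000
Naveen   Patnaik  345-67-8901  NULL        NULL
Nishant  Jain     456-78-9012  7654321098  60000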

The IS NULL Operator


This query retrieves the Fname and Lname of employees whose SSN is NULL.
Since SSN is meant to identify each employee, rows with NULL in this column
indicate missing data. The query helps identify records that lack this essential
information.
Query:
SELECT Fname, Lname FROM Employee WHERE SSN IS NULL;

Output
Fname   Lname
Aditya  Arpan
IS NULL Operator

The IS NOT NULL Operator


This query counts the employees who have an SSN by excluding rows where SSN is
NULL. The WHERE clause removes the NULL rows first, and COUNT(*) then counts
every remaining row, so the result is the total number of employees with an SSN
present in the table.
Query
SELECT COUNT(*) AS Count FROM Employee WHERE SSN IS NOT NULL;
Output
Count
4
IS NOT NULL Operator

Updating NULL Values in a Table


We can update the NULL values present in a table using the UPDATE statement in
SQL. To do so, we use the IS NULL operator in the WHERE clause to select the
rows with NULL values and then set the new value using the SET keyword.
Let’s suppose that we want to update SSN in the row where it is NULL.
Query:
UPDATE Employee
SET SSN = '789-01-2345'
WHERE Fname = 'Aditya' AND Lname = 'Arpan' AND SSN IS NULL;

SELECT * FROM Employee;


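Output

Updating NULL Values in Table
Fname    Lname    SSN          Phoneno     Salary
Shubham  Thakur   123-45-6789  9876543210  50000
Aman     Chopra   234-56-7890  NULL        45000
Aditya   Arpan    789-01-2345  8765432109  55000
Naveen   Patnaik  345-67-8901  NULL        NULL
Nishant  Jain     456-78-9012  7654321098  60000
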
Integrity constraints in SQL:
Integrity constraints in SQL are rules that help ensure the accuracy and reliability of data in the
database. They ensure that certain conditions are met when data is inserted, updated, or deleted.
While primary key, unique, and foreign key constraints are commonly discussed and used, SQL
allows for more complex constraints through the use of CHECK and custom triggers. Here are
some examples of complex integrity constraints:

1. Using CHECK Constraints


Ensuring a range: You might want a column to only have values within a certain range.
Example:

CREATE TABLE Employees (
ID INT PRIMARY KEY,
Age INT CHECK (Age >= 18 AND Age <= 30)
);
Pattern matching: Ensure data in a column matches a particular format.
Example:

CREATE TABLE Students (
ID INT PRIMARY KEY,
Email VARCHAR(255) CHECK (Email LIKE '%@%.%')
);
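To see how these CHECK constraints behave, the sketch below tries a few inserts against the Employees and Students tables defined above. It assumes a system that enforces CHECK constraints; MySQL does so only from version 8.0.16 onward, while earlier versions parse the clause and ignore it. The sample values are made up for illustration.

```sql
-- Assumes the Employees and Students tables defined above already exist.
INSERT INTO Employees (ID, Age) VALUES (1, 25);    -- accepted: 25 lies within 18..30
INSERT INTO Employees (ID, Age) VALUES (2, 17);    -- rejected: the CHECK evaluates to FALSE
INSERT INTO Employees (ID, Age) VALUES (3, NULL);  -- accepted: the CHECK evaluates to UNKNOWN

INSERT INTO Students (ID, Email) VALUES (1, 'asha@example.com');  -- accepted
INSERT INTO Students (ID, Email) VALUES (2, 'not-an-email');      -- rejected: no '@' in the value
```

The third insert shows the link to three-valued logic: a CHECK constraint rejects a row only when its condition is FALSE, so a NULL age passes unless the column is also declared NOT NULL.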

2. Composite Primary and Foreign Keys


These are cases where the uniqueness or referential integrity constraint is applied over more than
one column.
Example:

CREATE TABLE OrderDetails (
OrderID INT,
ProductID INT,
Quantity INT,
PRIMARY KEY (OrderID, ProductID),
FOREIGN KEY (OrderID) REFERENCES Orders(OrderID),
FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);

3. Using Stored Procedures
Sometimes, instead of direct data manipulation on tables, using stored procedures can help
maintain more complex integrity constraints by wrapping logic inside the procedure. For
instance, you could have a procedure that checks several conditions before inserting a record.
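As an illustration of this idea, the following is a minimal sketch of such a procedure in MySQL syntax. The procedure name AddOrderDetail, its parameter names, and the rule that a quantity must be positive are assumptions made for the example; the OrderDetails table is the one defined above.

```sql
DELIMITER //
CREATE PROCEDURE AddOrderDetail(IN p_order INT, IN p_product INT, IN p_qty INT)
BEGIN
    -- Check the condition first; SIGNAL aborts the procedure with an error
    IF p_qty IS NULL OR p_qty <= 0 THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Quantity must be a positive number';
    END IF;

    -- Reached only when the check passed; key constraints still apply here
    INSERT INTO OrderDetails (OrderID, ProductID, Quantity)
    VALUES (p_order, p_product, p_qty);
END//
DELIMITER ;

-- Usage: the application calls the procedure instead of inserting directly
CALL AddOrderDetail(1, 10, 2);
```

This approach protects the data only if applications are required to go through the procedure; a direct INSERT on the table bypasses the check.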

4. Using TRIGGERS
A trigger is procedural code in a database that executes automatically in response to certain
events on a particular table or view. Essentially, triggers are special types of stored procedures
that run automatically when an INSERT, UPDATE, or DELETE operation occurs. They are
typically used to maintain the integrity of the data, automate data-related tasks, and extend the
database functionalities.
When implementing complex constraints, it's crucial to strike a balance. While they can ensure data
integrity, they can also add overhead to the database system and increase the complexity of the schema
and the operations performed on it. Proper documentation and understanding of each constraint's purpose
are essential.

Triggers and Active data bases:

Triggers and active databases are closely related concepts in the domain of DBMS.
Let's delve into what each of them means and how they are interconnected.

Triggers

A trigger is a predefined action that the database automatically executes in
response to certain events on a particular table or view. Triggers are typically used
to maintain the integrity of the data, automate data-related tasks, and extend the
database functionalities.

There are various types of triggers based on when they are executed:
BEFORE: Trigger is executed before the triggering event.
AFTER: Trigger is executed after the triggering event.
INSTEAD OF: Trigger is used to override the triggering event, primarily for views.
They can also be categorized by the triggering event:
INSERT: Trigger is executed when a new row is inserted.
UPDATE: Trigger is executed when a row is updated.
DELETE: Trigger is executed when a row is deleted.
Here's the basic syntax for creating a trigger in SQL, using MySQL as an example:

Syntax

CREATE TRIGGER trigger_name
trigger_time trigger_event
ON table_name FOR EACH ROW
trigger_body;

trigger_name: Name of the trigger.
trigger_time: BEFORE or AFTER. (MySQL does not provide INSTEAD OF triggers; systems
such as SQL Server and PostgreSQL do, mainly for views.)
trigger_event: INSERT, UPDATE, or DELETE.
table_name: The name of the table associated with the trigger.
trigger_body: The set of SQL statements to be executed.

Key Features of Triggers


1. Automatic Execution: Triggers run automatically in response to data modification
events. You don't have to explicitly call them.
2. Event-Driven: They are defined to execute before or after INSERT, UPDATE, and
DELETE events.
3. Transitional Access: Triggers can access the "old" (pre-modification) and "new" (post-
modification) values of the rows affected.

Example of a Trigger
Suppose we have an `Employees` table and we want to maintain an `AuditLog` table that keeps
a record of salary changes for employees.
Employees Table
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(255),
Salary DECIMAL(10, 2)
);

AuditLog Table
CREATE TABLE AuditLog (
LogID INT AUTO_INCREMENT PRIMARY KEY,
EmployeeID INT,
OldSalary DECIMAL(10, 2),
NewSalary DECIMAL(10, 2),
ChangeDate DATETIME
);

Now, let's create a trigger that automatically inserts a record into the `AuditLog` table whenever
there's an update to the `Salary` column in the `Employees` table.
Trigger
-- Run in the mysql client; DELIMITER lets the trigger body contain semicolons.
DELIMITER //
CREATE TRIGGER AfterSalaryUpdate
AFTER UPDATE ON Employees
FOR EACH ROW
BEGIN
    -- <=> is MySQL's NULL-safe equality, so a change to or from NULL is also logged
    IF NOT (OLD.Salary <=> NEW.Salary) THEN
        INSERT INTO AuditLog (EmployeeID, OldSalary, NewSalary, ChangeDate)
        VALUES (OLD.EmployeeID, OLD.Salary, NEW.Salary, NOW());
    END IF;
END//
DELIMITER ;
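The effect of the trigger can be checked with a short usage sketch. It assumes the Employees and AuditLog tables and the AfterSalaryUpdate trigger defined above; the employee row is made up for illustration.

```sql
INSERT INTO Employees (EmployeeID, Name, Salary) VALUES (1, 'Asha', 50000.00);

-- Salary changes, so the trigger writes one row to AuditLog
UPDATE Employees SET Salary = 55000.00 WHERE EmployeeID = 1;

-- Salary is unchanged, so the IF condition fails and nothing is logged
UPDATE Employees SET Name = 'Asha K' WHERE EmployeeID = 1;

SELECT EmployeeID, OldSalary, NewSalary FROM AuditLog;
-- One row: 1, 50000.00, 55000.00
```

Note that the application never mentions AuditLog; the log row is a side effect of the UPDATE, which is what makes the database "active".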

Schema Refinement:

• Schema refinement is a process of improving the structure of a database schema,
which is the logical representation of the data and its relationships.
• It aims to create a well-organized and efficient database design by addressing
potential issues like data redundancy and anomalies.
• The primary goal is to ensure data integrity, consistency, and ease of maintenance.
Why is Schema Refinement Important?
• Redundancy:
Storing the same data multiple times can lead to inconsistencies and wasted storage
space.
• Anomalies:
  • Insertion Anomaly: Difficulty in adding new data without also adding related data that
  is not yet available.
  • Update Anomaly: Changes to data in one place might not be reflected in other places
  where the same data is stored.
  • Deletion Anomaly: Deleting data can unintentionally remove other related data.
• Data Integrity:
Schema refinement helps maintain the accuracy and consistency of the data.
• Database Efficiency:
A well-designed schema can improve query performance and database maintenance.
Techniques for Schema Refinement
• Normalization: A systematic approach to organizing data in tables to minimize
redundancy and anomalies.
• Decomposition: Breaking down large tables into smaller, more manageable tables.
• Functional Dependencies: Identifying relationships between attributes to guide the
schema design.
• Integrity Constraints: Rules that enforce data consistency and validity.

Problems caused by redundancy:


Redundancy means having multiple copies of the same data in the database. This
problem arises when a database is not normalized. Suppose a table of student details
has the attributes student ID, student name, contact, college name, course opted, and
college rank.
Student_ID Name Contact College Course Rank

100 Himanshu 7300934851 GEU B.Tech 1

101 Ankit 7900734858 GEU B.Tech 1

102 Ayush 7300936759 GEU B.Tech 1

103 Ravi 7300901556 GEU B.Tech 1

It can be observed that the values of the attributes college name, college rank, and
course are repeated, which can lead to problems. Problems caused due to redundancy are:
• Insertion anomaly
• Deletion anomaly
• Updation anomaly
Insertion Anomaly
If the details of a student whose course has not yet been decided must be inserted,
the insertion is not possible until a course is decided for the student.
Student_ID Name Contact College Course Rank

100 Himanshu 7300934851 GEU - 1

(The Course value is not yet decided.)
This problem happens when the insertion of a data record is not possible without
adding some additional unrelated data to the record.
Deletion Anomaly
If the details of all students in this table are deleted, the details of the college are
deleted along with them, even though the college still exists. This anomaly happens
when the deletion of a data record results in losing some unrelated information that
was stored as part of the deleted record. In other words, it is not possible to delete
some information without losing some other information in the table as well.
Updation Anomaly
Suppose the rank of the college changes. The change then has to be made in every row
that stores the rank, which is time-consuming and computationally costly.
Student_ID Name Contact College Course Rank

100 Himanshu 7300934851 GEU B.Tech 1

101 Ankit 7900734858 GEU B.Tech 1

102 Ayush 7300936759 GEU B.Tech 1

103 Ravi 7300901556 GEU B.Tech 1

Every copy must be updated. If the update is missed in even one place, the
database will be in an inconsistent state.
Redundancy in a database occurs when the same data is stored in multiple places.
Redundancy can cause various problems such as data inconsistencies, higher storage
requirements, and slower data retrieval.

Problems Caused Due to Redundancy
• Data Inconsistency: When the same data is stored in multiple locations, a change
to one copy may not be reflected in the other copies. Incorrect data can then be
used in decision-making processes.
• Storage Requirements: Storing the same data in multiple places requires more
storage space, which leads to higher costs and can slow data retrieval.
• Update Anomalies: Every copy of a value must be changed together. A copy that
is missed leaves the database contradicting itself.
• Performance Issues: The database must spend more time updating multiple
copies of the same data, which can slow the overall performance of the database.
• Security Issues: Multiple copies of the same data mean more places where it can
be accessed and manipulated by unauthorized users. This can lead to data
breaches and compromise the confidentiality, integrity, and availability of the data.
• Maintenance Complexity: Multiple copies of the same data must be updated and
synchronized. This makes it more difficult to troubleshoot and resolve issues
and requires more time and resources to maintain the database.
• Data Duplication: Duplicate copies waste storage space, and different copies of
the data may hold different values or be out of sync.
• Data Integrity: When copies disagree, it becomes difficult to ensure that the data
is accurate and up-to-date.
• Usability Issues: Users may have difficulty accessing the correct version of the
data or may be confused by inconsistencies. This can lead to frustration and
decreased productivity, as users spend more time searching for the correct data
or correcting errors.
To prevent redundancy in a database, normalization techniques can be used.
Normalization is the process of organizing data in a database to eliminate
redundancy and improve data integrity. Normalization involves breaking down a
larger table into smaller tables and establishing relationships between them. This
reduces redundancy and makes the database more efficient and reliable.
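As a sketch of what this looks like for the student table discussed earlier, the college details can be moved into their own table. The table names, column types, and the column name College_Rank (used instead of Rank, which is a reserved word in recent MySQL versions) are assumptions made for the example.

```sql
-- College facts are stored once per college
CREATE TABLE College (
    College      VARCHAR(50) PRIMARY KEY,
    College_Rank INT
);

CREATE TABLE Student (
    Student_ID INT PRIMARY KEY,
    Name       VARCHAR(50),
    Contact    VARCHAR(15),
    College    VARCHAR(50),
    Course     VARCHAR(20),   -- may stay NULL until the course is decided
    FOREIGN KEY (College) REFERENCES College(College)
);

INSERT INTO College VALUES ('GEU', 1);
INSERT INTO Student VALUES
    (100, 'Himanshu', '7300934851', 'GEU', 'B.Tech'),
    (101, 'Ankit',    '7900734858', 'GEU', 'B.Tech');

-- A rank change now touches exactly one row, so no update anomaly arises
UPDATE College SET College_Rank = 2 WHERE College = 'GEU';

-- Deleting students no longer removes what is known about the college
DELETE FROM Student WHERE College = 'GEU';
SELECT College, College_Rank FROM College;   -- still returns GEU, 2
```

The original wide table can be recovered at any time by joining Student and College on the College column.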

Decompositions:

Decomposition refers to the division of a table into multiple tables to produce
consistency in the data. When we divide a table into multiple tables, or divide a
relation into multiple relations, the process is termed Decomposition in DBMS. It is
performed in a database management system when we need to ensure consistency
and remove anomalies and duplicate data present in the database. When we perform
decomposition in DBMS, we must ensure that no information or data is lost.

Types of Decomposition
There are two types of Decomposition:
• Lossless Decomposition
• Lossy Decomposition

Lossless Decomposition
A decomposition is lossless if the original relation R can be regained by joining
the relations formed after decomposition. It is used to remove the redundant data
from the database while retaining the useful information. A lossless decomposition
ensures the following things:
• While regaining the original relation, no information should be lost.
• If we perform the join operation on the sub-divided relations, we must get the
original relation.
Example:
There is a relation called R(A, B, C)
A B C

55 16 27

48 52 89

Now we decompose this relation into two sub relations R1 and R2


R1(A, B)
A B

55 16

48 52

R2(B, C)
B C

16 27

52 89

After performing the Join operation we get the same original relation
A B C

55 16 27

48 52 89

Lossy Decomposition
As the name suggests, in a lossy decomposition the join of the sub-relations does
not give back the same relation that was decomposed. The join produces some
extraneous tuples, and these extra tuples make it difficult for the user to identify
the original tuples.
Example:
We have a relation R(A, B, C)
A B C

1 2 1

2 5 3

3 3 3

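Now, we decompose it into sub-relations R1 and R2, whose only common attribute is C

R1(A, C)
A C

1 1

2 3

3 3

R2(B, C)
B C

2 1

5 3

3 3

Now, after performing the join operation on C

A B C

1 2 1

2 5 3

2 3 3

3 5 3

3 3 3

The tuples (2, 3, 3) and (3, 5, 3) are not present in the original relation. They are
spurious tuples, so this decomposition is lossy.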
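A lossy decomposition can be detected mechanically by joining the sub-relations and looking for tuples that are not in the original relation. The sketch below assumes the three-tuple relation R(A, B, C) from this example and decomposes it into its projections on (A, C) and (B, C), so that the join is on C. The table names R_AC and R_BC are hypothetical.

```sql
CREATE TABLE R (A INT, B INT, C INT);
INSERT INTO R VALUES (1, 2, 1), (2, 5, 3), (3, 3, 3);

-- Projections of R; DISTINCT keeps them duplicate-free, as relations must be
CREATE TABLE R_AC AS SELECT DISTINCT A, C FROM R;
CREATE TABLE R_BC AS SELECT DISTINCT B, C FROM R;

-- Join the projections on C, then keep only the tuples that R does not contain
SELECT j.A, j.B, j.C
FROM (SELECT R_AC.A, R_BC.B, R_AC.C
      FROM R_AC JOIN R_BC ON R_AC.C = R_BC.C) AS j
LEFT JOIN R ON R.A = j.A AND R.B = j.B AND R.C = j.C
WHERE R.A IS NULL
ORDER BY j.A;
-- Returns the spurious tuples (2, 3, 3) and (3, 5, 3)
```

An empty result would mean the join reproduces R exactly for this data. The value C = 3 appears with two different A values and two different B values, which is why the join pairs them up in ways the original relation never did.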

Problems related to decomposition:


Decomposition in the context of database design refers to the process of breaking down a single
table into multiple tables in order to eliminate redundancy, reduce data anomalies, and achieve
normalization. Decomposition is typically done using rules defined by normalization forms.
However, while decomposition can be helpful, it is not without challenges. Done incorrectly,
decomposition can lead to its own set of problems.

1. Loss of Information
• Lossless decomposition: When a relation is decomposed into two or more smaller relations, and
the original relation can be perfectly reconstructed by taking the natural join of the decomposed
relations, it is termed a lossless decomposition. If not, it is termed a "lossy decomposition."
• Example: Consider a table `R(A, B, C)` whose only dependency is `A → B`. Decomposing it
into `R1(A, B)` and `R2(B, C)` can be lossy, because the common attribute `B` determines neither
`A` nor `C`, so the natural join may produce tuples that were not in the original table.
Example: Consider a relation R(A,B,C) with the following data:

|A |B |C |
|----|----|----|
|1 |X |P |
|1 |Y |P |
|2 |Z |Q |

Suppose we decompose R into R1(A,B) and R2(A,C).


R1(A, B):

|A |B |
|----|----|
|1 |X |
|1 |Y |
|2 |Z |

R2(A, C):

|A |C |
|----|----|
|1 |P |
|2 |Q |

Now, if we take the natural join of R1 and R2 on attribute A, we get back the original relation R.
This works because A → C holds in R, so the common attribute A is a key of R2.
Therefore, this is a lossless decomposition.

2. Loss of Functional Dependency
• Once tables are decomposed, certain functional dependencies might not be preserved, which can
lead to the inability to enforce specific integrity constraints.
• Example: If you have the functional dependency `A → B` in the original table, but in the
decomposed tables, there is no table with both `A` and `B`, this functional dependency can't be
preserved.
Example: Let's consider a relation R with attributes A,B, and C and the following functional
dependencies:
A → B
B→C
Now, suppose we decompose R into two relations:
R1(A,B) with FD A → B
R2(B,C) with FD B → C
In this case, the decomposition is dependency-preserving because all the functional dependencies
of the original relation R can be found in the decomposed relations R1 and R2. We do not need
to join R1 and R2 to enforce or check any of the functional dependencies.

Note that A → C also holds in R, but it follows from A → B and B → C by transitivity, so it is
preserved as well. In contrast, if R were decomposed into R1(A,B) and R2(A,C), the dependency
B → C could not be checked in either relation without joining them, and that decomposition
would not be dependency-preserving.

3. Increased Complexity
• Decomposition leads to an increase in the number of tables, which can complicate queries and
maintenance tasks. While tools and ORM (Object-Relational Mapping) libraries can mitigate this
to some extent, it still adds complexity.

4. Redundancy
• Incorrect decomposition might not eliminate redundancy, and in some cases, can even introduce
new redundancies.

5. Performance Overhead
• An increased number of tables, while aiding normalization, can also lead to more complex SQL
queries involving multiple joins, which can introduce performance overheads.

Reasoning about functional dependency:


In relational database management, a functional dependency is a concept that specifies
the relationship between two sets of attributes, where one set determines the
value of the other. It is denoted as X → Y, where the attribute set on the left
side of the arrow, X, is called the Determinant, and Y is called the Dependent.

A functional dependency occurs when one attribute uniquely determines another
attribute within a relation. It is a constraint that describes how attributes in a table
relate to each other. If attribute A functionally determines attribute B, we write this
as A → B.
Functional dependencies are used to mathematically express relations among
database entities and are very important to understanding advanced concepts in
Relational Database Systems.
Example:
roll_no name dept_name dept_building

42 abc CO A4


43 pqr IT A3

44 xyz CO A4

45 xyz IT A3

46 mno EC B2

47 jkl ME B2

From the above table we can conclude some valid functional dependencies:
• roll_no → {name, dept_name, dept_building}: Here, roll_no can determine the
values of the fields name, dept_name and dept_building, hence it is a valid functional
dependency.
• roll_no → dept_name: Since roll_no can determine the whole set {name,
dept_name, dept_building}, it can determine its subset dept_name also.
• dept_name → dept_building: dept_name identifies the dept_building
uniquely, since every row with the same dept_name has the same
dept_building.
• More valid functional dependencies: roll_no → name, {roll_no, name} →
{dept_name, dept_building}, etc.
Here are some invalid functional dependencies:
• name → dept_name: Students with the same name can have different
dept_name (xyz appears in both CO and IT), hence this is not a valid functional
dependency.
• dept_building → dept_name: There can be multiple departments in the same
building. For example, in the above table departments ME and EC are in the same
building B2, hence dept_building → dept_name is an invalid functional
dependency.
• More invalid functional dependencies: name → roll_no, {name, dept_name} →
roll_no, dept_building → roll_no, etc.
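Whether a given set of rows violates a functional dependency X → Y can be checked with a grouping query: group by X and look for groups that contain more than one distinct Y. The sketch below loads the example table (the table name Student and the column types are assumptions) and tests two of the dependencies discussed above.

```sql
CREATE TABLE Student (
    roll_no       INT PRIMARY KEY,
    name          VARCHAR(20),
    dept_name     VARCHAR(10),
    dept_building VARCHAR(10)
);
INSERT INTO Student VALUES
    (42, 'abc', 'CO', 'A4'), (43, 'pqr', 'IT', 'A3'), (44, 'xyz', 'CO', 'A4'),
    (45, 'xyz', 'IT', 'A3'), (46, 'mno', 'EC', 'B2'), (47, 'jkl', 'ME', 'B2');

-- dept_name → dept_building: returns no rows, so the data does not violate it
SELECT dept_name
FROM Student
GROUP BY dept_name
HAVING COUNT(DISTINCT dept_building) > 1;

-- dept_building → dept_name: returns B2, which houses both EC and ME
SELECT dept_building
FROM Student
GROUP BY dept_building
HAVING COUNT(DISTINCT dept_name) > 1;
```

A query like this can only refute a dependency. An empty result shows that the current rows are consistent with it, not that the dependency is a rule of the application; that has to come from the meaning of the data.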
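Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-trivial functional dependency
3. Semi non-trivial functional dependency
4. Multivalued functional dependency
5. Transitive functional dependency
6. Fully functional dependency
7. Partial functional dependency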
1. Trivial Functional Dependency
In Trivial Functional Dependency, a dependent is always a subset of the
determinant. i.e. If X → Y and Y is the subset of X, then it is called trivial functional
dependency.
Symbolically: A→B is trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A→A & B→B
Example 1:
• ABC -> AB
• ABC -> A
• ABC -> ABC
Example 2:
roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, {roll_no, name} → name is a trivial functional dependency, since the
dependent name is a subset of determinant set {roll_no, name}. Similarly, roll_no →
roll_no is also an example of trivial functional dependency.
2. Non-trivial Functional Dependency
In Non-trivial functional dependency, the dependent is strictly not a subset of the
determinant. i.e. If X → Y and Y is not a subset of X, then it is called Non-trivial
functional dependency.
Example 1:
• Id -> Name
• Name -> DOB
Example 2:
roll_no name age


42 abc 17

43 pqr 18

44 xyz 18

Here, roll_no → name is a non-trivial functional dependency, since the dependent
name is not a subset of determinant roll_no. Similarly, {roll_no, name} → age is
also a non-trivial functional dependency, since age is not a subset of {roll_no,
name}.
3. Semi Non Trivial Functional Dependencies
A semi non-trivial functional dependency occurs when part of the dependent
(right-hand side) is included in the determinant (left-hand side), but not all
of it. This is a middle ground between trivial and completely non-trivial functional
dependencies. X -> Y is called semi non-trivial when X intersect Y is not empty
and Y is not a subset of X.
Example:
Consider the following table:
Student_ID Course_ID Course_Name

101 CSE101 Computer Science

102 CSE102 Data Structures

103 CSE101 Computer Science

Functional Dependency:
{Student_ID, Course_ID} → {Course_ID, Course_Name}
This is semi non-trivial because:
• Part of the dependent (Course_ID) is already included in the determinant
({Student_ID, Course_ID}).
• However, the dependency is not trivial, because Course_Name is not part of
the determinant.
4. Multivalued Functional Dependency
In a multivalued dependency, the attributes of the dependent set are independent
of each other. i.e. if a determines a set of values of b and a set of values of c, and
the values of b and c do not depend on each other, then b and c are multivalued
dependent on a. This is written a →→ b and a →→ c.
Example:
bike_model manuf_year color

tu1001 2007 Black

tu1001 2007 Red

tu2012 2008 Black

tu2012 2008 Red

tu2222 2009 Black

tu2222 2009 Red

In this table:
• X: bike_model
• Y: color
• Z: manuf_year
For each bike model (bike_model):
1. There is a group of colors (color) and a group of manufacturing years (manuf_year);
in this data each model happens to have a single year.
2. The colors do not depend on the manufacturing year, and the manufacturing year
does not depend on the colors. They are independent.
3. The sets of color and manuf_year are linked only to bike_model.
That’s what makes it a multivalued dependency.
In this case these two columns are said to be multivalued dependent on bike_model.
These dependencies can be represented like this:
bike_model →→ manuf_year
bike_model →→ color
5. Transitive Functional Dependency
In transitive functional dependency, dependent is indirectly dependent on
determinant. i.e. If a → b & b → c, then according to axiom of transitivity, a → c.
This is a transitive functional dependency.
Example:
enrol_no name dept building_no

42 abc CO 4

43 pqr EC 2

44 xyz IT 1

45 abc EC 2

Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an
indirect functional dependency, hence called Transitive functional dependency.
6. Fully Functional Dependency
In full functional dependency, an attribute or set of attributes Y depends on a set
of attributes X but not on any proper subset of X. If a relation R has attributes
X, Y, Z where {X, Y} → Z holds but neither X → Z nor Y → Z holds, then Z is fully
functionally dependent on {X, Y}.
7. Partial Functional Dependency
In partial functional dependency a non-key attribute depends on a part of the
composite key, rather than the whole key. If a relation R has attributes X, Y, Z
where X and Y form the composite key and Z is a non-key attribute, then X->Z is a
partial functional dependency.

First Normal Form (1NF):


Normalization in database management is the process of organizing data to
minimize redundancy and dependency, ensuring efficiency, consistency, and
integrity. This involves structuring data into smaller, logically related tables and
defining relationships between them to streamline data storage and retrieval.
Normal Forms are a set of guidelines in database normalization that define how to
structure data in tables to reduce redundancy and improve integrity. Each normal
form builds on the previous one, progressively organizing data more efficiently.
Levels of Normalization
There are various levels of normalization. These are some of them:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
First Normal Form
A relation is in first normal form if it does not contain any composite or
multi-valued attribute; a relation that contains one violates first normal form.
In other words, a relation is in first normal form if every attribute in that
relation is a single-valued attribute.
A table is in 1NF if:
• There are only Single Valued Attributes.
• Attribute Domain does not change.
• There is a unique name for every Attribute/Column.
• The order in which data is stored does not matter.
Rules for First Normal Form (1NF) in DBMS
To follow the First Normal Form (1NF) in a database, these simple rules must be
followed:
1. Every Column Should Have Single Values
Each column in a table must contain only one value in a cell. No cell should hold
multiple values. If a cell contains more than one value, the table does not follow
1NF.
• Example: A table with columns like [Writer 1], [Writer 2], and [Writer 3] for the
same book ID is not in 1NF because it repeats the same type of information
(writers). Instead, all writers should be listed in separate rows.
2. All Values in a Column Should Be of the Same Type
Each column must store the same type of data. You cannot mix different types of
information in the same column.
• Example: If a column is meant for dates of birth (DOB), you cannot use it to
store names. Each type of information should have its own column.
3. Every Column Must Have a Unique Name
Each column in the table must have a unique name. This avoids confusion when
retrieving, updating, or adding data.
• Example: If two columns have the same name, the database system may not
know which one to use.
4. The Order of Data Doesn’t Matter
In 1NF, the order in which data is stored in a table doesn’t affect how the table
works. You can organize the rows in any way without breaking the rules.
Example:
Consider the below COURSES Relation:

• In the above table, Courses is a multi-valued attribute, so the relation is not in
1NF. The table below is in 1NF, as it has no multi-valued attribute.
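The conversion to 1NF can be sketched in SQL as follows. The table names, columns, and rows here are hypothetical and are not taken from the COURSES relation referred to above; they only illustrate the general pattern of moving each value of a multi-valued attribute into its own row.

```sql
-- Not in 1NF: the Courses column packs several values into one cell
CREATE TABLE Courses_Unnormalized (
    Student_ID INT PRIMARY KEY,
    Name       VARCHAR(50),
    Courses    VARCHAR(100)   -- e.g. 'DBMS, OS'
);

-- In 1NF: one course per row; the key becomes the pair (Student_ID, Course)
CREATE TABLE Courses_1NF (
    Student_ID INT,
    Name       VARCHAR(50),
    Course     VARCHAR(50),
    PRIMARY KEY (Student_ID, Course)
);

INSERT INTO Courses_1NF VALUES
    (1, 'Asha', 'DBMS'),
    (1, 'Asha', 'OS'),
    (2, 'Ravi', 'DBMS');

-- Single-valued cells make questions like this a plain equality test
SELECT Student_ID FROM Courses_1NF WHERE Course = 'OS';
```

The 1NF table repeats the student name on every row, which is the kind of redundancy that the higher normal forms then remove.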

Second Normal Form (2NF):


Normalization is a structural method whereby tables are broken down in a controlled
manner with an aim of reducing data redundancy. It refers to the process of
arranging the attributes and relations of a database in order to minimize data
anomalies such as update, insert and delete anomalies. Normalization is usually a
sequence of steps which are also called normal forms (NF).
The First Normal Form (1NF) and Second Normal Form (2NF) are very important
towards the achievement of a normalized database. Where 1NF is centered on the
removal of multiple values in an attribute, 2NF is associated with the issue of partial
dependencies.
Second Normal Form
Second Normal Form (2NF) is based on the concept of fully functional dependency.
It is a way to organize a database table so that it reduces redundancy and ensures
data consistency. For a table to be in 2NF, it must first meet the requirements of
First Normal Form (1NF), meaning all columns should contain single, indivisible
values without any repeating groups. Additionally, the table should not have partial
dependencies. In other words,
A relation is in Second Normal Form (2NF) if it is in First Normal Form and every
non-prime attribute is fully functionally dependent on each candidate key.
Note – If a proper subset of a candidate key determines a non-prime attribute, it
is called a partial dependency. The normalization of 1NF relations to 2NF involves
the removal of partial dependencies. If a partial dependency exists, we remove the
partially dependent attribute(s) from the relation by placing them in a new relation
along with a copy of their determinant. Consider the examples given below.
Example-1: Consider the table below.

• There are many courses having the same course fee. Here, COURSE_FEE cannot
alone decide the value of COURSE_NO or STUD_NO.
• COURSE_FEE together with STUD_NO cannot decide the value of
COURSE_NO.
• COURSE_FEE together with COURSE_NO cannot decide the value of
STUD_NO.
• The candidate key for this table is {STUD_NO, COURSE_NO} because the
combination of these two columns uniquely identifies each row in the table.
• COURSE_FEE is a non-prime attribute because it is not part of the candidate
key {STUD_NO, COURSE_NO}.
• But, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on
COURSE_NO, which is a proper subset of the candidate key.
• Therefore, the non-prime attribute COURSE_FEE is dependent on a proper subset
of the candidate key, which is a partial dependency, and so this relation is not in
2NF.
To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Now, each table is in 2NF:


 The Course Table ensures that COURSE_FEE depends only on COURSE_NO.
 The Student-Course Table ensures there are no partial dependencies because it
only relates students to courses.
NOTE: 2NF reduces the amount of redundant data that is stored. For instance, if 100
students take course C1, we do not need to store its fee of 1000 in all 100 records;
we store it once in the second table as the course fee for C1.
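The partial-dependency test described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper names `closure` and `partial_dependencies` are made up for this example, and the code assumes the relation has a single candidate key, as in Example-1.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(candidate_key, fds, all_attrs):
    """List (key_subset, attribute) pairs where a proper subset of the
    candidate key determines a non-prime attribute."""
    key = set(candidate_key)
    non_prime = set(all_attrs) - key  # assumption: only one candidate key
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) & non_prime):
                found.append((subset, attr))
    return found

# Example-1: COURSE_NO -> COURSE_FEE with candidate key {STUD_NO, COURSE_NO}
attrs = {"STUD_NO", "COURSE_NO", "COURSE_FEE"}
fds = [(("COURSE_NO",), ("COURSE_FEE",))]
violations = partial_dependencies(("STUD_NO", "COURSE_NO"), fds, attrs)
print(violations)

# After the split, neither table has a partial dependency.
student_course = partial_dependencies(("STUD_NO", "COURSE_NO"), [], {"STUD_NO", "COURSE_NO"})
course = partial_dependencies(("COURSE_NO",), fds, {"COURSE_NO", "COURSE_FEE"})
```

A non-empty result means the relation is not in 2NF; each reported pair names the determinant to move into a new table together with the dependent attribute.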

Third Normal Form (3NF):


A relation is in Third Normal Form (3NF) if it is in Second Normal Form and no
non-prime attribute is transitively dependent on a candidate key. Equivalently, a
relation is in 3NF if at least one of the following conditions holds for every
non-trivial functional dependency X -> Y:
• X is a super key.
• Y is a prime attribute (each element of Y is part of some candidate key).
In other words, a relation that is in First and Second Normal Form, and in which no
non-prime attribute is transitively dependent on the primary key, is in Third Normal
Form (3NF).
Note:
If A -> B and B -> C are two FDs, then A -> C is called a transitive dependency.
Normalizing a 2NF relation to 3NF means removing transitive dependencies: we move
the transitively dependent attribute(s) into a new relation along with a copy of the
determinant. Consider the examples given below.
Example : Consider the below Relation,

In the relation CANDIDATE given above:

• Functional dependency set: {CAND_NO -> CAND_NAME, CAND_NO -> CAND_STATE,
CAND_STATE -> CAND_COUNTRY, CAND_NO -> CAND_AGE}
• So, the candidate key here is {CAND_NO}.
• In this relation, CAND_NO -> CAND_STATE and CAND_STATE -> CAND_COUNTRY both
hold. Thus, CAND_COUNTRY depends transitively on CAND_NO. This transitive
dependency violates 3NF. To convert the relation into third normal form, we
decompose CANDIDATE (CAND_NO, CAND_NAME, CAND_STATE, CAND_COUNTRY,
CAND_AGE) as:
CANDIDATE (CAND_NO, CAND_NAME, CAND_STATE, CAND_AGE)
STATE_COUNTRY (STATE, COUNTRY)
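As a rough sketch, the two 3NF conditions (X is a super key, or Y is prime) can be checked for the CANDIDATE relation in Python. The helper names are hypothetical, the prime attributes are passed in by hand rather than computed, and the STATE_COUNTRY columns keep their CAND_ prefix here only so the same attribute names can be reused.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(all_attrs, fds, prime_attrs):
    """Return (lhs, attribute) pairs that break both 3NF conditions:
    lhs is not a super key and the attribute is not prime."""
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(all_attrs)
        for attr in rhs:
            if attr in lhs:
                continue  # trivial part of the dependency
            if not is_superkey and attr not in prime_attrs:
                violations.append((lhs, attr))
    return violations

candidate_attrs = {"CAND_NO", "CAND_NAME", "CAND_STATE", "CAND_COUNTRY", "CAND_AGE"}
candidate_fds = [
    (("CAND_NO",), ("CAND_NAME",)),
    (("CAND_NO",), ("CAND_STATE",)),
    (("CAND_STATE",), ("CAND_COUNTRY",)),
    (("CAND_NO",), ("CAND_AGE",)),
]
before = third_nf_violations(candidate_attrs, candidate_fds, {"CAND_NO"})
print(before)

# After decomposition: CANDIDATE without the country, plus STATE_COUNTRY.
after_candidate = third_nf_violations(
    candidate_attrs - {"CAND_COUNTRY"},
    [fd for fd in candidate_fds if fd[0] == ("CAND_NO",)],
    {"CAND_NO"},
)
after_state = third_nf_violations(
    {"CAND_STATE", "CAND_COUNTRY"},
    [(("CAND_STATE",), ("CAND_COUNTRY",))],
    {"CAND_STATE"},
)
```

The only dependency reported before the split is CAND_STATE -> CAND_COUNTRY; both decomposed relations come back with no violations.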

Example 2: Consider Relation R(A, B, C, D, E)


A -> BC,
CD -> E,
B -> D,
E -> A
The candidate keys of the above relation are {A, E, CD, BC}. Every attribute on the
right side of every functional dependency is prime. Therefore, the above relation is
in 3NF.
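The candidate keys of Example 2 can be confirmed by brute force. The sketch below assumes single-letter attribute names, so each side of an FD can be written as a plain string; `closure` and `candidate_keys` are illustrative helper names, not part of any library.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(all_attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a smaller key is contained in it, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys

# R(A, B, C, D, E) with A -> BC, CD -> E, B -> D, E -> A
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
prime = set().union(*keys)
print(sorted("".join(sorted(k)) for k in keys))
print(sorted(prime))
```

Because every attribute turns out to be prime, every FD satisfies the second 3NF condition regardless of its left side.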

Boyce-Codd Normal Form (BCNF):


Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form (3NF). It
requires that every non-trivial functional dependency has a superkey on its
left-hand side. This closes the gap that 3NF leaves open when a relation has
overlapping candidate keys, and it removes the redundancy that functional
dependencies can cause.
BCNF eliminates redundancy more effectively than 3NF because it requires all
functional dependencies to originate from super-keys, with no exception for prime
attributes on the right-hand side.
BCNF matters for schema design in systems where consistency and efficiency are
important, particularly when a relation has many candidate keys (as one often finds
with a delivery system).

Rules for BCNF
Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a super-key for every functional dependency (FD) X−>Y in a
given relation.
Note: To test whether a relation is in BCNF, we identify all the determinants and
make sure that each of them is a candidate key (or, more generally, a superkey).
To determine the highest normal form of a relation R with a given set of functional
dependencies, first check whether the BCNF condition holds. If R is in BCNF, it is
also in 3NF, 2NF, and 1NF. 1NF has the least restrictive constraint: it only
requires every attribute value in each tuple of R to be atomic. 2NF is slightly more
restrictive, 3NF is more restrictive than the first two normal forms, and BCNF is
more restrictive still. The restriction increases with each step from 1NF to BCNF.
The following examples illustrate the properties of BCNF.
Example 1
Consider a relation R with attributes (student, teacher, subject).

FD: {(student, teacher) -> subject, (student, subject) -> teacher, teacher -> subject}
• Candidate keys are (student, teacher) and (student, subject).
• The above relation is in 3NF because every attribute is prime (each one belongs
to some candidate key). A relation R is in BCNF if, for every non-trivial FD
X -> Y, X is a superkey.
• The above relation is not in BCNF because, in the FD teacher -> subject, teacher
is not a key. As a result, the relation suffers from anomalies.
• For example, if we delete the student Tahira, we also lose the information
that N.Gupta teaches C. This happens because teacher is a determinant
but not a candidate key.

To remove the violation, R is divided into two relations R1(Teacher, Subject) and
R2(Student, Teacher).
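The BCNF rule from this example can be sketched as a short check in Python. The function name `bcnf_violations` is hypothetical, and the check only looks at the FDs that are listed explicitly.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= set(all_attrs)
    ]

r_attrs = {"student", "teacher", "subject"}
r_fds = [
    (("student", "teacher"), ("subject",)),
    (("student", "subject"), ("teacher",)),
    (("teacher",), ("subject",)),
]
r_bad = bcnf_violations(r_attrs, r_fds)
print(r_bad)

# The decomposed relations: R1(teacher, subject) and R2(student, teacher).
r1_bad = bcnf_violations({"teacher", "subject"}, [(("teacher",), ("subject",))])
r2_bad = bcnf_violations({"student", "teacher"}, [])
```

Only teacher -> subject is reported for R, and both R1 and R2 come back clean, which matches the reasoning in the bullets above.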


Example 2
Consider a student table that records each student's branch and courses.
Stu_ID  Stu_Branch                               Stu_Course            Branch_Number  Stu_Course_No
101     Computer Science & Engineering           DBMS                  B_001          201
101     Computer Science & Engineering           Computer Networks     B_001          202
102     Electronics & Communication Engineering  VLSI Technology       B_003          401
102     Electronics & Communication Engineering  Mobile Communication  B_003          402

The functional dependencies of the above table are:
Stu_ID -> Stu_Branch
Stu_Course -> {Branch_Number, Stu_Course_No}
The candidate key of the above table is {Stu_ID, Stu_Course}.
Why is this Table Not in BCNF?
The table is not in BCNF because neither Stu_ID nor Stu_Course alone is a Super
Key. The rules above require that, for every functional dependency X -> Y, X is a
Super Key. Both dependencies fail this test, so the table is not in BCNF.
How to Satisfy BCNF?
To bring this table into BCNF, we decompose it into smaller tables. We divide the
main table into three tables: Stu_Branch, Stu_Course, and Stu_Enroll.
Stu_Branch Table
Stu_ID  Stu_Branch
101     Computer Science & Engineering
102     Electronics & Communication Engineering

Candidate Key for this table: Stu_ID.

Stu_Course Table
Stu_Course            Branch_Number  Stu_Course_No
DBMS                  B_001          201
Computer Networks     B_001          202
VLSI Technology       B_003          401
Mobile Communication  B_003          402

Candidate Key for this table: Stu_Course.

Stu_Enroll Table
Stu_ID  Stu_Course_No
101     201
101     202
102     401
102     402

Candidate Key for this table: {Stu_ID, Stu_Course_No}.


After this decomposition, every table is in BCNF: in each functional dependency
X -> Y that holds on a table, X is a Super Key of that table.
Example 3
Find the highest normal form of a relation R(A, B, C, D, E) with FD set as:
{ BC->D, AC->BE, B->E }
Explanation:
• Step-1: (AC)+ = {A, C, B, E, D}, but no proper subset of AC determines all
attributes of the relation, so AC is a candidate key. Neither A nor C can be
derived from any other attribute of the relation, so {AC} is the only candidate
key.
• Step-2: Prime attributes are those that are part of a candidate key, here {A, C}.
The others, {B, D, E}, are non-prime.
• Step-3: The relation R is in 1st normal form because a relational DBMS does not
allow multi-valued or composite attributes.
The relation is in 2nd normal form: BC->D does not violate it (BC is not a proper
subset of the candidate key AC), AC->BE does not violate it (AC is the candidate
key), and B->E does not violate it (B is not a proper subset of the candidate key
AC).
The relation is not in 3rd normal form: in BC->D, BC is not a super key and D is not
a prime attribute, and in B->E, B is not a super key and E is not a prime attribute.
To satisfy 3rd normal form, either the LHS of each FD must be a super key or the
RHS must be a prime attribute. So the highest normal form of the relation is the
2nd Normal form.
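The three steps above can be combined into one small routine. The following Python sketch assumes single-letter attributes written as strings and assumes the relation is already in 1NF; `highest_normal_form` is a name chosen for this illustration. It tests BCNF and 3NF on the listed FDs and tests 2NF through closures of proper subsets of each candidate key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(all_attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys

def highest_normal_form(all_attrs, fds):
    """Return 'BCNF', '3NF', '2NF' or '1NF' (1NF is assumed to hold)."""
    all_attrs = set(all_attrs)
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)

    def is_superkey(lhs):
        return closure(lhs, fds) >= all_attrs

    # Split every FD into single-attribute right sides, dropping trivial parts.
    pairs = [(lhs, a) for lhs, rhs in fds for a in rhs if a not in lhs]
    if all(is_superkey(lhs) for lhs, _ in pairs):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in pairs):
        return "3NF"
    # 2NF fails if a proper subset of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(sorted(key), size):
                if closure(subset, fds) - set(subset) - prime:
                    return "1NF"
    return "2NF"

example_3 = highest_normal_form("ABCDE", [("BC", "D"), ("AC", "BE"), ("B", "E")])
print(example_3)
```

For the FD set { BC->D, AC->BE, B->E } the routine reports 2NF, in agreement with the manual analysis.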
Note: A prime attribute cannot be transitively dependent on a key in a BCNF relation.
Consider these functional dependencies of some relation R:
AB -> C
C -> B
AB -> B
From these dependencies, the candidate keys of R are AB and AC. The prime attribute
B depends transitively on the key AB through C. The first and the third FD satisfy
BCNF because both have a candidate key on their left side (the third is also
trivial). The second FD, C -> B, violates BCNF because C is not a super key, but it
satisfies 3NF because its right side is a prime attribute. So, the highest normal
form of R is 3NF, as all three FDs satisfy the conditions for 3NF.
Example 4
Consider relation R(A, B, C) with:
A -> BC,
B -> A
A and B are both super keys, so the above relation is in BCNF.
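Note: A BCNF decomposition that also preserves all dependencies is not always
possible; a BCNF decomposition that satisfies the lossless join condition, however,
always exists. For example, take relation R (V, W, X, Y, Z) with functional
dependencies:
V, W -> X
Y, Z -> X
W -> Y
Decomposing R into BCNF by repeatedly splitting on a violating dependency (first on
W -> Y, then on V, W -> X) gives (W, Y), (V, W, X), and (V, W, Z), and the dependency
Y, Z -> X can no longer be enforced from these relations alone.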

Lossless join Decomposition:


Lossless join decomposition is a decomposition of a relation R into relations R1 and
R2 such that the natural join of R1 and R2 returns exactly the original relation R.
It removes redundancy from a database while preserving the original data.
In other words, with a lossless decomposition it is always possible to reconstruct
the relation R from the decomposed tables R1 and R2 by using joins.
Decomposition into 2NF, 3NF, and BCNF can always be carried out as a lossless join
decomposition.

In a lossless decomposition, R1 and R2 must share a common attribute (or set of
attributes), and that common attribute must be a candidate key or super key of R1,
R2, or both.
Decomposition of a relation R into R1 and R2 is a lossless-join decomposition if at
least one of the following functional dependencies is in F+ (the closure of the
functional dependencies):
R1 ∩ R2 → R1
OR
R1 ∩ R2 → R2
Example of Lossless Decomposition
• Employee (Employee_Id, Ename, Salary, Department_Id, Dname)
can be decomposed losslessly as
• Employee_desc (Employee_Id, Ename, Salary, Department_Id)
• Department_desc (Department_Id, Dname)
By contrast, the following decomposition is lossy: the two tables share no common
attribute, so they cannot be joined to get back the original data.
• Employee_desc (Employee_Id, Ename, Salary)
• Department_desc (Department_Id, Dname)
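The test R1 ∩ R2 → R1 or R1 ∩ R2 → R2 can be sketched with an attribute closure. The functional dependencies used below for Employee are an assumption made for this illustration (the notes do not list them): Employee_Id determines Ename, Salary, and Department_Id, and Department_Id determines Dname.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary lossless-join test: the common attributes must determine
    all of R1 or all of R2."""
    common = set(r1) & set(r2)
    if not common:
        return False  # nothing to join on
    reachable = closure(common, fds)
    return set(r1) <= reachable or set(r2) <= reachable

# Assumed FDs for Employee (not stated in the notes).
fds = [
    (("Employee_Id",), ("Ename", "Salary", "Department_Id")),
    (("Department_Id",), ("Dname",)),
]
department_desc = {"Department_Id", "Dname"}

good = is_lossless({"Employee_Id", "Ename", "Salary", "Department_Id"}, department_desc, fds)
bad = is_lossless({"Employee_Id", "Ename", "Salary"}, department_desc, fds)
print(good, bad)
```

In the first decomposition the common attribute Department_Id is a key of Department_desc, so the test passes; in the second there is no common attribute at all, so it fails.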
In a database management system (DBMS), a lossless decomposition splits a relation
schema into multiple relations in such a way that the information contained in the
original relation is preserved: the original relation can be reconstructed by
joining the decomposed relations.
Checking the lossless condition requires knowing which functional dependencies are
in F+. Armstrong's axioms are the inference rules used to derive them. Two rules
that are used often are reflexivity and decomposition.
The reflexivity axiom states that if a set of attributes Y is a subset of a set of
attributes X, then X -> Y. The decomposition rule states that if X -> YZ holds,
then X -> Y and X -> Z both hold.
Several algorithms produce lossless decompositions, such as the BCNF (Boyce-Codd
Normal Form) decomposition algorithm and the 3NF (Third Normal Form) decomposition
algorithm. These algorithms use the functional dependencies to split a relation
into multiple relations while ensuring that the original relation can be
reconstructed without any loss of information.

Multivalued Dependency:
In Database Management Systems (DBMS), a multivalued dependency (MVD) describes a
relationship in which one attribute determines a set of values of another attribute,
independently of the remaining attributes. Recognizing MVDs is essential for
normalization beyond BCNF and helps keep data consistent.
A multivalued dependency means that for a single value of attribute ‘a’ multiple
values of attribute ‘b’ exist. We write it as,
a ->-> b
It is read as “a multidetermines b” (b is multivalued dependent on a). Suppose a
person named Geeks is working on 2 projects, Microsoft and Oracle, and has 2
hobbies, namely Reading and Music. This can be expressed in a tabular format in the
following way.

Example

Project and Hobby are multivalued attributes as they have more than one value for a
single person i.e., Geeks.
What is Multivalued Dependency?
An attribute has a multivalued dependency (MVD) on another attribute when each value
of the determining attribute is associated with a set of values that is independent
of the other attributes in the table. Identifying MVDs helps keep data accurate when
a table mixes several independent facts.
Multi Valued Dependency (MVD)
We can say that multivalued dependency exists if the following conditions are met.
Conditions for MVD

An attribute a multidetermines another attribute b if, in any legal relation r(R),
for all pairs of tuples t1 and t2 in r such that
t1[a] = t2[a]
there exist tuples t3 and t4 in r such that
t1[a] = t2[a] = t3[a] = t4[a]
t1[b] = t3[b]; t2[b] = t4[b]
t1[c] = t4[c]; t2[c] = t3[c]
where c = R - (a ∪ b) is the set of remaining attributes. To check for an MVD in a
given table, we apply the conditions stated above to the values in the table.

Example

Condition-1 for MVD


t1[a] = t2[a] = t3[a] = t4[a]
Finding from table,
t1[a] = t2[a] = t3[a] = t4[a] = Geeks
So, condition 1 is Satisfied.
Condition-2 for MVD
t1[b] = t3[b]
And
t2[b] = t4[b]
Finding from table,
t1[b] = t3[b] = MS
And
t2[b] = t4[b] = Oracle
So, condition 2 is Satisfied.
Condition-3 for MVD
Let c = R - (a ∪ b), where R is the set of attributes in the relational table.
t1[c] = t4[c]
And
t2[c] = t3[c]
Finding from table,
t1[c] = t4[c] = Reading
And
t2[c] = t3[c] = Music
So, condition 3 is Satisfied. All conditions are satisfied, therefore,
a ->-> b
According to the table, this gives
name ->-> project
And for,
a ->-> c
We get,
name ->-> hobby
Hence, MVDs exist in the above table, and they can be stated as
name ->-> project
name ->-> hobby
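The three conditions can be checked mechanically on a table instance. The sketch below is one possible way to do it in Python; the rows are built from the Geeks example described in the text (projects MS and Oracle, hobbies Reading and Music), and `satisfies_mvd` is a hypothetical helper name.

```python
def satisfies_mvd(rows, a, b):
    """Check a ->-> b on a table instance given as a list of dicts.
    For every pair of rows that agree on a, a row combining the b-values
    of the first with the remaining values of the second must exist."""
    attrs = set(rows[0])
    c = sorted(attrs - set(a) - set(b))  # the remaining attributes

    def project(row, cols):
        return tuple(row[col] for col in cols)

    present = {(project(r, a), project(r, b), project(r, c)) for r in rows}
    for a1, b1, _ in present:
        for a2, _, c2 in present:
            if a1 == a2 and (a1, b1, c2) not in present:
                return False
    return True

rows = [
    {"name": "Geeks", "project": "MS", "hobby": "Reading"},
    {"name": "Geeks", "project": "MS", "hobby": "Music"},
    {"name": "Geeks", "project": "Oracle", "hobby": "Reading"},
    {"name": "Geeks", "project": "Oracle", "hobby": "Music"},
]
project_mvd = satisfies_mvd(rows, ["name"], ["project"])
hobby_mvd = satisfies_mvd(rows, ["name"], ["hobby"])
incomplete = satisfies_mvd(rows[:3], ["name"], ["project"])
print(project_mvd, hobby_mvd, incomplete)
```

Dropping any one of the four rows breaks the dependency, because the table then no longer contains every combination of project and hobby for Geeks.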

4th and 5th Normal Form :


Two of the highest levels of database normalization are the fourth normal form
(4NF) and the fifth normal form (5NF). Multivalued dependencies are handled by
4NF, whereas join dependencies are handled by 5NF.
A multivalued dependency arises when two or more independent relationships are kept
in a single relation: the presence of one or more rows in a table implies the
presence of one or more other rows in that same table. Put another way, two
attributes (or columns) in a table are independent of one another, but both depend
on a third attribute. A multivalued dependency always requires at least three
attributes because it consists of at least two attributes that are dependent on a
third.
For a dependency A ->-> B, if multiple values of B exist for a single value of A,
then the table may have a multivalued dependency. The table should have at least 3
attributes (A, B, C), and B and C should be independent of each other, for A ->-> B
to be a multivalued dependency.
Example:
Person  Mobile     Food_Likes
Mahesh  9893/9424  Burger/Pizza
Ramesh  9191       Pizza

Person ->-> Mobile,
Person ->-> Food_Likes
This is read as “person multidetermines mobile” and “person multidetermines
food_likes.”
Note that a functional dependency is a special case of multivalued dependency. In a
functional dependency X -> Y, every x determines exactly one y, never more than
one.
Fourth Normal Form (4NF)
The Fourth Normal Form (4NF) is a level of database normalization in which every
non-trivial multivalued dependency has a superkey on its left-hand side. It builds
on the first three normal forms (1NF, 2NF, and 3NF) and the Boyce-Codd Normal Form
(BCNF): in addition to meeting the requirements of BCNF, a relation in 4NF must not
store independent multivalued facts about the same key.
Properties
A relation R is in 4NF if and only if the following conditions are satisfied:
1. It should be in the Boyce-Codd Normal Form (BCNF).
2. The table should not have any non-trivial multivalued dependency whose left-hand
side is not a superkey.
A table with such a multivalued dependency violates the Fourth Normal Form (4NF)
because it creates unnecessary redundancy and can lead to inconsistent data. To
bring it up to 4NF, it is necessary to break this information into two tables.
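As an illustration of breaking the information into two tables, the Python sketch below takes the Person example, written out with one mobile number and one food per row, and splits it on the two MVDs. The variable names are illustrative only.

```python
# The Person example with atomic values: one row per (person, mobile, food) combination.
person = {
    ("Mahesh", "9893", "Burger"),
    ("Mahesh", "9893", "Pizza"),
    ("Mahesh", "9424", "Burger"),
    ("Mahesh", "9424", "Pizza"),
    ("Ramesh", "9191", "Pizza"),
}

# 4NF decomposition: one table per multivalued dependency.
person_mobile = {(p, mobile) for p, mobile, _ in person}
person_food = {(p, food) for p, _, food in person}

# Natural join on Person rebuilds the original rows.
rejoined = {
    (p, mobile, food)
    for p, mobile in person_mobile
    for q, food in person_food
    if p == q
}

print(len(person), len(person_mobile), len(person_food))
print(rejoined == person)
```

Each fact (a mobile number or a liked food) is now stored once per person, and the join shows that no information is lost by the split.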
Example: Consider a class database with two relations: R1 contains student ID (SID)
and student name (SNAME), and R2 contains course ID (CID) and course name (CNAME).
Table R1
SID  SNAME
S1   A
S2   B
Table R2
CID  CNAME
C1   C
C2   D

Taking their cross-product results in multivalued dependencies.

Table R1 X R2
SID  SNAME  CID  CNAME
S1   A      C1   C
S1   A      C2   D
S2   B      C1   C
S2   B      C2   D

Multivalued dependencies (MVD) are:
SID->->CID; SID->->CNAME; SNAME->->CNAME
Join Dependency
A join dependency is a further generalization of a multivalued dependency. Suppose
a relation R (A, B, C, D) is decomposed into R1(A, B, C) and R2(C, D). If the join
of R1 and R2 over C is equal to R, then a join dependency (JD) exists;
equivalently, R1 and R2 are a lossless decomposition of R. In general, a JD
⋈ {R1, R2, …, Rn} holds over a relation R if R1, R2, …, Rn is a lossless-join
decomposition of R. The notation *(R1, R2, …, Rn) is also used, so
*((A, B, C), (C, D)) is a JD of R if the join of the two projections is equal to R.
Formally, let R be a relation schema and R1, R2, …, Rn be a decomposition of R. An
instance r(R) satisfies the join dependency if and only if

π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r) = r

Example:
Table R1
Company  Product
C1       Pendrive
C1       mic
C2       speaker
C2       speaker
Company->->Product
Table R2
Agent  Company
Aman   C1
Aman   C2
Mohan  C1
Agent->->Company
Table R3
Agent  Product
Aman   Pendrive
Aman   Mic
Aman   speaker
Mohan  speaker
Agent->->Product
Table R1⋈R2⋈R3
Company  Product   Agent
C1       Pendrive  Aman
C1       mic       Aman
C2       speaker   Aman
C1       speaker   Aman
Agent->->Product
Fifth Normal Form/Projected Normal Form (5NF)
A relation R is in Fifth Normal Form if and only if every join dependency in R
is implied by the candidate keys of R. A relation decomposed into smaller relations
must have the lossless join property, which ensures that no spurious or extra tuples
are generated when the relations are reunited through a natural join.

Properties
A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should be already in 4NF.
2. It cannot be decomposed further without loss (every remaining join dependency is
implied by the candidate keys).
Example – Consider the above schema, with the rule “if a company makes a product,
an agent is an agent for that company, and the agent sells that product, then the
agent always sells that product for the company”. Under this rule, the ACP table is:
Table ACP
Agent  Company  Product
A1     PQR      Nut
A1     PQR      Bolt
A1     XYZ      Nut
A1     XYZ      Bolt
A2     PQR      Nut

The relation ACP is decomposed into 3 relations, R1, R2, and R3:
Table R1
Agent  Company
A1     PQR
A1     XYZ
A2     PQR

Table R2
Agent  Product
A1     Nut
A1     Bolt
A2     Nut

Table R3
Company  Product
PQR      Nut
PQR      Bolt
XYZ      Nut
XYZ      Bolt

The natural join of R1 and R3 over ‘Company’, followed by the natural join of the
result R13 with R2 over ‘Agent’ and ‘Product’, gives back Table ACP.
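The two-step join described above can be replayed in Python on the ACP rows. This is only a sketch of the check, with tuples ordered as (Agent, Company, Product) and variable names mirroring the table names in the text.

```python
# Table ACP as (Agent, Company, Product) tuples.
acp = {
    ("A1", "PQR", "Nut"),
    ("A1", "PQR", "Bolt"),
    ("A1", "XYZ", "Nut"),
    ("A1", "XYZ", "Bolt"),
    ("A2", "PQR", "Nut"),
}

# The three projections.
r1 = {(agent, company) for agent, company, _ in acp}      # Agent, Company
r2 = {(agent, product) for agent, _, product in acp}      # Agent, Product
r3 = {(company, product) for _, company, product in acp}  # Company, Product

# Step 1: natural join of R1 and R3 over Company.
r13 = {
    (agent, company, product)
    for agent, company in r1
    for other_company, product in r3
    if company == other_company
}

# Step 2: natural join of R13 and R2 over Agent and Product.
rejoined = {row for row in r13 if (row[0], row[2]) in r2}

print(len(r13), len(rejoined))
print(rejoined == acp)
```

The intermediate result R13 contains the spurious row (A2, PQR, Bolt), so joining only two of the projections is lossy; the join with R2 removes that row and restores ACP exactly.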
