Relational Algebra

Uploaded by

kthpjm95l

Relational Database Integrity

Integrity means maintaining the consistency of data. Integrity constraints in a database ensure that changes made by authorized users do not compromise data consistency, so they protect the database against accidental damage.
There are primarily two integrity constraints:
1. Entity Integrity Constraint – Primary Key
2. Referential Integrity Constraint – Foreign Key

The Keys
Candidate Key:
In a relation R, a candidate key for R is a subset of the attributes of R that has the following two properties:
(1) Uniqueness - No two distinct tuples in R have the same value for the candidate key.
(2) Irreducibility - No proper subset of the candidate key has the uniqueness property.

Because duplicate tuples are not allowed in a relation, the set of all attributes is always unique, so every relation has at least one candidate key. (Example: Email Id) A candidate key may also be a composite key.

Let us summarize the properties of a candidate key.

Properties of a candidate key


• A candidate key must be unique and irreducible.
• A candidate key may involve one or more attributes. A candidate key that involves more than one attribute is said to be composite.
Candidate keys are important because they provide the basic tuple-level identification mechanism in a relational system.
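
The two properties can be tried out on sample rows. The following is a minimal sketch, assuming the Employee rows that appear later in this document; the function names `is_unique` and `is_candidate_key` are made up for illustration. A check on sample data can only refute a key, never prove one, because a key is a property of the schema and not of one particular set of rows.

```python
from itertools import combinations

# Rows of the Employee table used later in this document.
ATTRS = ("emp_id", "ename", "location")
ROWS = [
    (100, "Akshaya", "Pune"),
    (101, "Ananya", "Delhi"),
    (102, "Anvita", "Hyderabad"),
    (103, "Deepa", "Chennai"),
]

def is_unique(rows, attrs, subset):
    """True if no two rows agree on every attribute in `subset`."""
    idx = [attrs.index(a) for a in subset]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs, subset):
    """Uniqueness plus irreducibility, judged on these sample rows only."""
    if not is_unique(rows, attrs, subset):
        return False
    # Irreducible: no proper, non-empty subset may itself be unique.
    return not any(
        is_unique(rows, attrs, smaller)
        for size in range(1, len(subset))
        for smaller in combinations(subset, size)
    )

print(is_candidate_key(ROWS, ATTRS, ("emp_id",)))          # unique and irreducible
print(is_candidate_key(ROWS, ATTRS, ("emp_id", "ename")))  # unique but reducible
```

In this small sample `ename` also happens to be unique, which is why real keys are chosen from the meaning of the data rather than from one snapshot.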

Primary Key
A primary key, also called a primary keyword, is the candidate key chosen to identify each record in a table uniquely. It is a unique identifier, such as a driver license number, telephone number (including area code), or vehicle identification number (VIN). Each table in a relational database must have one and only one primary key.

Foreign Keys
A Foreign Key is a column or a combination of columns whose values match a Primary Key in a different table. The relationship between two tables is established by matching the Primary Key in one table with a Foreign Key in the second table.

Entity Integrity (PK)

Requirement: All primary key entries are unique, and no part of a primary key may be null.
Purpose: Each row will have a unique identity, and foreign key values can properly reference primary key values.
Example: No student can have a duplicate number, nor can the number be null.

Referential Integrity (FK)

Requirement: A foreign key may have either a null entry, as long as it is not a part of its table's primary key, or an entry that matches the primary key value in a table to which it is related.
Purpose: It is possible for an attribute NOT to have a corresponding value, but it is impossible to have an invalid entry. The enforcement of referential integrity makes it impossible to delete a row in one table whose primary key has mandatory matching foreign key values in another table.
Example: A customer might not yet be assigned a sales representative, but it is impossible to have an invalid entry.
Example:
Employee Table
Emp_id (PK)   Ename     Location
100           Akshaya   Pune
101           Ananya    Delhi
102           Anvita    Hyderabad
103           Deepa     Chennai

Emp_id must not be null and must not contain duplicate values.

Department Table
Dept_id (PK)   Dname        Emp_id (FK)
10             Marketing    100
20             Sales        101
30             Admin        103
40             Operations   102

Dept_id is the primary key of the department table, and Emp_id is a foreign key in it. Every Emp_id value appearing in the department table must already exist as an Emp_id in the employee table. If the value 104 is entered in the department table's Emp_id, an error is raised, because Emp_id 104 is not available in the Employee table.
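
The behaviour described above can be reproduced with an in-memory SQLite database. This is only a sketch under assumptions: the choice of SQLite, the helper `try_insert`, and the extra rows (department 50 "Research", department 60 "Legal", employee "Duplicate") are illustrative and not part of the original example. Other RDBMSs enforce the same rules with their own error messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("""
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        ename    TEXT NOT NULL,
        location TEXT
    )""")
conn.execute("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        dname   TEXT NOT NULL,
        emp_id  INTEGER REFERENCES employee(emp_id)
    )""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (100, "Akshaya", "Pune"), (101, "Ananya", "Delhi"),
    (102, "Anvita", "Hyderabad"), (103, "Deepa", "Chennai"),
])
conn.executemany("INSERT INTO department VALUES (?, ?, ?)", [
    (10, "Marketing", 100), (20, "Sales", 101),
    (30, "Admin", 103), (40, "Operations", 102),
])

def try_insert(sql, params):
    """Run one INSERT and report whether an integrity constraint rejected it."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

# Referential integrity: emp_id 104 does not exist in employee.
fk_result = try_insert("INSERT INTO department VALUES (?, ?, ?)", (50, "Research", 104))
# Entity integrity: emp_id 100 is already taken.
pk_result = try_insert("INSERT INTO employee VALUES (?, ?, ?)", (100, "Duplicate", "Pune"))
# A null foreign key is allowed: the department has no employee assigned yet.
null_fk_result = try_insert("INSERT INTO department VALUES (?, ?, ?)", (60, "Legal", None))

print(fk_result, pk_result, null_fk_result, sep="\n")
```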

Operations in Referential Integrity

Delete
• Deleting a row from the department table is straightforward, because no other table references it.
• Deleting a row from the employee table fails when the department table still holds a matching emp_id value. To delete such a row, the RDBMS provides the ON DELETE CASCADE option, which also deletes the matching rows in the department table.
Insert
• Inserting a row into the employee table is straightforward (Ex: 103, Deepa, Chennai).
• Inserting a row into the department table fails if the corresponding emp_id value is not available in the employee table.
Update (Modify)
• Modifying a non-key value in the employee table is straightforward.
• If an emp_id value is changed in the employee table, the corresponding value in the department table is updated automatically when the foreign key is declared with the ON UPDATE CASCADE option.
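
The cascade options can be sketched with SQLite as follows. The database engine and the new emp_id value 200 are assumptions made for the illustration; the tables and rows are the ones from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, ename TEXT, location TEXT)"
)
conn.execute("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        dname   TEXT,
        emp_id  INTEGER REFERENCES employee(emp_id)
                ON DELETE CASCADE ON UPDATE CASCADE
    )""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (100, "Akshaya", "Pune"), (101, "Ananya", "Delhi"),
    (102, "Anvita", "Hyderabad"), (103, "Deepa", "Chennai"),
])
conn.executemany("INSERT INTO department VALUES (?, ?, ?)", [
    (10, "Marketing", 100), (20, "Sales", 101),
    (30, "Admin", 103), (40, "Operations", 102),
])

# ON UPDATE CASCADE: changing the parent key rewrites the matching child value.
conn.execute("UPDATE employee SET emp_id = 200 WHERE emp_id = 100")
after_update = conn.execute(
    "SELECT emp_id FROM department WHERE dept_id = 10"
).fetchone()[0]

# ON DELETE CASCADE: deleting the parent row removes the matching child row.
conn.execute("DELETE FROM employee WHERE emp_id = 101")
remaining = [row[0] for row in conn.execute(
    "SELECT dept_id FROM department ORDER BY dept_id"
)]

print(after_update)  # emp_id now stored for the Marketing department
print(remaining)     # departments left after employee 101 is deleted
```

Without the two CASCADE clauses, both the UPDATE and the DELETE would be rejected while matching department rows exist.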
Redundancy and Associated Problems

Redundancy means having multiple copies of the same data in the database. This problem arises when a database is not normalized. Suppose a table of student details has the attributes student Id, student name, college name, college rank, and course opted.

In such a table, the values of the attributes college name, college rank, and course are repeated across rows, which can lead to problems.

The problems caused by redundancy are:

• Insertion anomaly
• Deletion anomaly
• Update anomaly

Insertion Anomaly

If the details of a student whose course has not yet been decided have to be inserted, the insertion is not possible until a course is decided for that student.

This problem occurs when a data record cannot be inserted without adding some additional, unrelated data to the record.

Deletion Anomaly

If the details of all the students of a college are deleted from this table, the details of the college are deleted as well, which should not happen.
This anomaly occurs when deleting a data record also loses unrelated information that was stored as part of the deleted record.
Update Anomaly

If the rank of a college changes, the change has to be made in every row that stores it, which is time-consuming and computationally costly.

If the update is not applied in all of those places, the database is left in an inconsistent state.
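
A small sketch can make the update and deletion anomalies concrete. All rows below are hypothetical (the original table is not reproduced in this document), and the helper names `ranks_by_college` and `inconsistent_colleges` are invented for the illustration.

```python
# Hypothetical rows: (student_id, student_name, college_name, college_rank, course)
students = [
    (1, "Asha", "ABC College", 5, "Physics"),
    (2, "Ravi", "ABC College", 5, "Maths"),
    (3, "Meena", "XYZ College", 9, "Biology"),
]

def ranks_by_college(rows):
    """Collect every rank stored for each college."""
    ranks = {}
    for _, _, college, rank, _ in rows:
        ranks.setdefault(college, set()).add(rank)
    return ranks

def inconsistent_colleges(rows):
    """Colleges that are stored with more than one rank."""
    return sorted(c for c, r in ranks_by_college(rows).items() if len(r) > 1)

# Update anomaly: the new rank is written to only one of the two ABC College rows.
partially_updated = list(students)
partially_updated[0] = (1, "Asha", "ABC College", 4, "Physics")

# Deletion anomaly: removing the only XYZ College student also removes its rank.
after_delete = [row for row in students if row[0] != 3]

print(inconsistent_colleges(students))
print(inconsistent_colleges(partially_updated))
print("XYZ College" in ranks_by_college(after_delete))
```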

Single-Valued Dependencies

A database is a collection of related information, and it is therefore inevitable that some items of information in the database depend on other items of information. The information is either single-valued or multivalued. The name of a person or his date of birth are single-valued facts; the qualifications of a person or the subjects that an instructor teaches are multivalued facts. We will deal only with single-valued facts and discuss the concept of functional dependency.

Let us describe this concept logically.

A functional dependency (FD) describes the relationship of one attribute to another attribute in a database management system (DBMS). Functional dependencies help you to maintain the quality of data in the database. A functional dependency is denoted by an arrow →. The functional dependency of Y on X is represented by X → Y. Functional dependencies play a vital role in telling a good database design from a bad one.

Example:
Employee number   Employee Name   Salary   City
1                 Dana            50000    San Francisco
2                 Francis         38000    London
3                 Andrew          25000    Tokyo

In this example, if we know the value of Employee number, we can obtain Employee Name, City, Salary, etc. We can therefore say that City, Employee Name, and Salary are functionally dependent on Employee number.

Employee number -> Employee name, Salary, City
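
Whether a set of rows is consistent with a functional dependency can be checked mechanically. The sketch below uses the three employee rows from the example; the function name `fd_holds` and the conflicting fourth row are assumptions added for illustration.

```python
ATTRS = ("emp_no", "name", "salary", "city")
ROWS = [
    (1, "Dana", 50000, "San Francisco"),
    (2, "Francis", 38000, "London"),
    (3, "Andrew", 25000, "Tokyo"),
]

def fd_holds(rows, attrs, lhs, rhs):
    """True if rows that agree on `lhs` always agree on `rhs`."""
    li = [attrs.index(a) for a in lhs]
    ri = [attrs.index(a) for a in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in li)
        value = tuple(row[i] for i in ri)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical extra row: the same employee number with a different city.
conflicting = ROWS + [(1, "Dana", 50000, "Paris")]

print(fd_holds(ROWS, ATTRS, ["emp_no"], ["name", "salary", "city"]))
print(fd_holds(conflicting, ATTRS, ["emp_no"], ["name", "salary", "city"]))
```

As with keys, sample rows can show that a dependency is violated but cannot prove that it holds for every possible state of the table.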

Here are some key terms for functional dependency:

Decomposition: A rule that suggests that if a table appears to contain two entities determined by the same primary key, you should consider breaking it up into two different tables.

Dependent: It is displayed on the right side of the functional dependency diagram.

Determinant: It is displayed on the left side of the functional dependency diagram.

Union: It suggests that if two tables are separate and their PK is the same, you should consider putting them together.

Rules of Functional Dependencies


The three most important rules for functional dependency are given below:

• Reflexive rule - If X is a set of attributes and Y is a subset of X, then X -> Y holds.
• Augmentation rule - When x -> y holds and c is a set of attributes, cx -> cy also holds. That is, adding the same attributes to both sides does not change the basic dependency.
• Transitivity rule - This rule is very similar to the transitive rule in algebra: if x -> y holds and y -> z holds, then x -> z also holds. x -> y is read as "x functionally determines y".
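
One common way to apply these rules in practice is to compute the closure of an attribute set, that is, everything it determines. The sketch below is illustrative only: the attribute names x, y, z, c and the function name `closure` are assumptions, and the loop simply applies the given dependencies repeatedly until nothing new is added.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)  # reflexive rule: a set determines each of its members
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, the right side is too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Two given dependencies: x -> y and y -> z.
FDS = [(("x",), ("y",)), (("y",), ("z",))]

print(sorted(closure({"x"}, FDS)))       # transitivity: x reaches z through y
print(sorted(closure({"c", "x"}, FDS)))  # augmentation: cx determines cy (and cz)
print(sorted(closure({"z"}, FDS)))       # z determines nothing beyond itself
```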

Normalization

In 1972, Codd presented three normal forms (1NF, 2NF, and 3NF). These were based on functional dependencies among the attributes of a relation. Later, Boyce and Codd proposed another normal form called the Boyce-Codd normal form (BCNF). The fourth and fifth normal forms are based on multivalued and join dependencies and were proposed later.

Normalization is a process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies.

Normalization works through a series of stages called Normal Forms.

Normalization results in decomposition of the original relation. It should be noted that decomposition of a relation always has to be based on principles, such as functional dependence, that ensure that the original relation may be reconstructed from the decomposed relations if and when necessary. Careless decomposition of a relation can result in loss of information.

Rules of Data Normalization

The following are the basic rules for the Normalization process:

1. Eliminate Repeating Groups: Make a separate relation for each set of related attributes, and give each relation a primary key.

2. Eliminate Redundant Data: If an attribute depends on only part of a multiattribute key, remove it to a separate relation.

3. Eliminate Columns Not Dependent on Key: If attributes do not contribute to a description of the key, remove them to a separate relation.

4. Isolate Independent Multiple Relationships: No relation may contain two or more 1:n or n:m relationships that are not directly related.

5. Isolate Semantically Related Multiple Relationships: There may be practical constraints on information that justify separating logically related many-to-many relationships.

Normal Forms

First Normal Form (1NF) – Table format, no repeating groups, and a primary key identified

Second Normal Form (2NF) – 1NF and no partial dependencies

Third Normal Form (3NF) – 2NF and no transitive dependencies

Boyce-Codd Normal Form (BCNF) – Every determinant is a candidate key

Consider the following table as an example:

Proj_num  Proj_name           Emp_num  Emp_name  Job_class  Chg/hour  Hours
15        Aerospace Building  101      Anitha    Ele.Eng    280.00    20
                              103      Beena     Cs.Eng     350.00    25
                              105      Chetana   Mech.Eng   200.00    18
16        Aircraft Designing  103      Beena     Csc.Eng    230.00    40
                              100      Deepa     Analyst    400.00    30
                              104      Divya     Designer   420.00    50

Here Proj_num and Emp_num together are considered the primary key.

Consider the following deficiencies:

1. The project number has null entries.
2. Table entries can have data inconsistencies (Ex: Cs.Eng or Csc.Eng).
3. The table displays data redundancies, which lead to:
a) Update Anomalies – Modifying a value for an emp_num requires many alterations, one for each row in which that employee appears.
b) Insertion Anomalies – If an employee is not yet assigned to any project, some project entry has to be made up to complete the employee data entry.
c) Deletion Anomalies – If an employee leaves the company and the employee details are deleted, the project details may get deleted as well. To save the project details, dummy employee details have to be created.

Conversion to first normal form:

Converting to 1NF involves eliminating repeating groups, identifying the primary key, and identifying all the dependencies. In the above table, one proj_num occurrence can reference a group of related entries, which is a repeating group.
Step1: Eliminate the repeating groups.
1. Start by presenting the data in a tabular format.
2. Make sure each cell has a single value and there are no repeating groups.
3. Eliminate the nulls by making sure that each repeating group attribute contains a data value.
Proj_num  Proj_name           Emp_num  Emp_name  Job_class  Chg/hour  Hours
15        Aerospace Building  101      Anitha    Ele.Eng    280.00    20
15        Aerospace Building  103      Beena     Cs.Eng     350.00    25
15        Aerospace Building  105      Chetana   Mech.Eng   200.00    18
16        Aircraft Designing  103      Beena     Csc.Eng    230.00    40
16        Aircraft Designing  100      Deepa     Analyst    400.00    30
16        Aircraft Designing  104      Divya     Designer   420.00    50

Step2: Identify the primary key.

Here proj_num alone does not uniquely identify a row. The combination of proj_num and emp_num is taken to uniquely identify a row.
Example: proj_num = 15 and emp_num = 103 will give
15        Aerospace Building  103      Beena     Cs.Eng     350.00    25

Step3: Identify all dependencies:

The primary key determines every other column:
proj_num, emp_num -> proj_name, emp_name, job_class, chg_hour, hours
In addition, some columns depend on only part of the key or on a nonkey attribute:
proj_num -> proj_name
emp_num -> emp_name, job_class, chg_hour
job_class -> chg_hour
Dependencies can be depicted in a diagram, and such a diagram is called a dependency diagram.

[Dependency diagram over Proj_num, Proj_name, Emp_num, Emp_name, Job_class, Chg/hour, Hours, with the partial dependencies and the transitive dependency marked]

In the above diagram,

1. The PK attributes are bold and underlined.
2. Arrows above the attributes indicate all desirable dependencies.
3. a) Partial dependencies: proj_name depends only on proj_num. Likewise, the employee information (emp_name, job_class, chg_hour) depends only on emp_num.
b) Transitive dependency: chg_hour depends on job_class, and neither is a primary key attribute. This is known as a transitive dependency.
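
The distinction between these kinds of dependency can be sketched in a few lines. This is a simplified illustration: the function name `classify` and its labels are assumptions, and "transitive" here only means that the determinant consists entirely of nonkey attributes.

```python
KEY = {"proj_num", "emp_num"}

# Dependencies identified in Step3, as (determinant, dependents).
FDS = [
    ({"proj_num", "emp_num"},
     {"proj_name", "emp_name", "job_class", "chg_hour", "hours"}),
    ({"proj_num"}, {"proj_name"}),
    ({"emp_num"}, {"emp_name", "job_class", "chg_hour"}),
    ({"job_class"}, {"chg_hour"}),
]

def classify(determinant, key):
    """Label a dependency by how its determinant relates to the primary key."""
    determinant, key = set(determinant), set(key)
    if determinant >= key:
        return "full"        # depends on the whole key
    if determinant < key:
        return "partial"     # depends on only part of the key
    if not determinant & key:
        return "transitive"  # determinant is made of nonkey attributes
    return "mixed"

labels = [classify(lhs, KEY) for lhs, _ in FDS]
for (lhs, rhs), label in zip(FDS, labels):
    print(sorted(lhs), "->", sorted(rhs), ":", label)
```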
Conversion to second normal form:
1. Converting to 2NF is needed only when the 1NF table has a composite primary key.
2. If the 1NF table has a single-attribute PK, then it is automatically in 2NF.

Step1: Write each key component on a separate line.

Example:
1. Proj_num
2. Emp_num
3. Proj_num, emp_num
Each component becomes the key of a new table, so three tables are created from this (Project, Employee and Assignment).

Step2: Assign corresponding dependent attributes


Project (proj_num, proj_name)
Employee (emp_num, emp_name, job_class, chg_hour)
Assignment (proj_num, emp_num, assign_hours)
Project

Proj_num Proj_name

Employee

Emp_num Emp_name Job_class Chg/hour

Assignment

Proj_num Emp_num Assign_hours
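
The decomposition can be sketched by projecting the 1NF rows onto each of the three column lists and dropping duplicates. The helper name `project` is an assumption, and the `hours` column of the 1NF table plays the role of Assign_hours.

```python
ATTRS = ("proj_num", "proj_name", "emp_num", "emp_name",
         "job_class", "chg_hour", "hours")
ROWS = [
    (15, "Aerospace Building", 101, "Anitha", "Ele.Eng", 280.00, 20),
    (15, "Aerospace Building", 103, "Beena", "Cs.Eng", 350.00, 25),
    (15, "Aerospace Building", 105, "Chetana", "Mech.Eng", 200.00, 18),
    (16, "Aircraft Designing", 103, "Beena", "Csc.Eng", 230.00, 40),
    (16, "Aircraft Designing", 100, "Deepa", "Analyst", 400.00, 30),
    (16, "Aircraft Designing", 104, "Divya", "Designer", 420.00, 50),
]

def project(rows, attrs, wanted):
    """Keep only the `wanted` columns and drop duplicate rows."""
    idx = [attrs.index(a) for a in wanted]
    return sorted({tuple(row[i] for i in idx) for row in rows})

project_table = project(ROWS, ATTRS, ("proj_num", "proj_name"))
employee_table = project(ROWS, ATTRS, ("emp_num", "emp_name", "job_class", "chg_hour"))
assignment_table = project(ROWS, ATTRS, ("proj_num", "emp_num", "hours"))

print(project_table)  # the project name is now stored once per project
# Employee 103 still appears twice, because the source rows disagree on the
# job class (Cs.Eng vs Csc.Eng) and the charge per hour.
print([row for row in employee_table if row[0] == 103])
```

The duplicate entry for employee 103 is the data inconsistency listed among the deficiencies earlier; it has to be corrected before emp_num can act as the primary key of the Employee table.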

Conversion to third normal form:


Step 1: Identify each new determinant.
For every transitive dependency, write its determinant as the PK of a new table.
Example: job_class
Step 2: Identify the dependent attributes.
Identify the attributes that depend on each determinant found in Step 1.
Example: job_class -> chg_hour
Step 3: Remove the dependent attributes from the transitive dependencies.
Remove each dependent attribute of a transitive dependency from its original table, leaving the determinant behind as a foreign key. This gives the following tables:
Project

Proj_num Proj_name

Employee
Emp_num Emp_name Job_class

Job

Job_Class Chg_hour

Assignment

Proj_num Emp_num Assign_hours

Boyce Codd normal form:


1. A table is in BCNF when every determinant in the table is a candidate
key.
2. When a table contains only one candidate key, the 3NF and BCNF are
equivalent.
3. A table is in 3NF when it is in 2NF and there are no transitive
dependencies.

Consider a table with the attributes A, B, C, D, where A and B form the primary key:

A + B -> C, D
C -> B, which indicates that a nonkey attribute determines part of the primary key.
• The above table satisfies 3NF but not BCNF.
• To convert this table to BCNF:
1. Change the primary key to A and C, so the table becomes

A C B D

• The table is now only in 1NF, because it contains a partial dependency (C -> B).
• Use the normal form steps to decompose it into tables that are in BCNF:

(A, C, D): in 3NF and BCNF

(C, B): in 3NF and BCNF
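
The BCNF condition can be tested mechanically for a relation this small. The sketch below is one possible approach, not a standard library routine: the names `closure` and `bcnf_violations` are assumptions, and the check enumerates every attribute subset, which is only practical for a handful of attributes.

```python
from itertools import combinations

# Dependencies of the example: A + B -> C, D and C -> B.
FDS = [({"A", "B"}, {"C", "D"}), ({"C"}, {"B"})]

def closure(attrs, fds):
    """Every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Determinants inside `relation` that are not superkeys of it."""
    relation = set(relation)
    violations = []
    for size in range(1, len(relation)):
        for combo in combinations(sorted(relation), size):
            lhs = set(combo)
            determined = closure(lhs, fds) & relation
            # Violation: lhs determines something new, but not the whole relation.
            if lhs < determined < relation:
                violations.append("".join(combo))
    return violations

print(bcnf_violations("ABCD", FDS))  # determinants that break BCNF
print(bcnf_violations("ACD", FDS))   # first decomposed table
print(bcnf_violations("CB", FDS))    # second decomposed table
```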

Desirable Properties of Decomposition


Decomposition is the process of splitting a relation into its projections, which will not be disjoint.
Desirable properties of decomposition are:
• Attribute preservation
• Lossless-join decomposition
• Dependency preservation
Attribute Preservation
This is a simple and obvious requirement: all the attributes of the relation being decomposed must be preserved.
Using functional dependencies, the algorithms decompose the universal relation schema R into a set of relation schemas D = {R1, R2, ..., Rn} that forms the relational database schema, where D is called the decomposition of R.
Every attribute in R must appear in at least one relation schema Ri in the decomposition, i.e., no attribute is lost. This is called the attribute preservation condition of decomposition.

Lossless Decomposition
• Decomposition must be lossless. It means that no information should be lost from the relation that is decomposed.
• It guarantees that joining the decomposed relations yields exactly the original relation.

Let E be a relational schema with instance e, decomposed into E1, E2, E3, ..., En. The decomposition is lossless if the natural join of the projections of e onto E1, E2, ..., En gives back exactly e.
Example: <Employee_Department> Table

Eid    Ename   Age   City        Salary   Deptid   DeptName
E001   ABC     29    Pune        20000    D001     Finance
E002   PQR     30    Pune        30000    D002     Production
E003   LMN     25    Mumbai      5000     D003     Sales
E004   XYZ     24    Mumbai      4000     D004     Marketing
E005   STU     32    Bangalore   25000    D005     Human Resource

Relation 1 : <Employee> Table
Eid    Ename   Age   City        Salary
E001   ABC     29    Pune        20000
E002   PQR     30    Pune        30000
E003   LMN     25    Mumbai      5000
E004   XYZ     24    Mumbai      4000
E005   STU     32    Bangalore   25000

Relation 2 : <Department> Table
Deptid   Eid    DeptName
D001     E001   Finance
D002     E002   Production
D003     E003   Sales
D004     E004   Marketing
D005     E005   Human Resource

• The above decomposition is a lossless-join decomposition, because the two relations share the common field 'Eid', which is the key of the <Employee> table, so joining them on 'Eid' reconstructs the original relation.
• If the <Employee> table contains (Eid, Ename, Age, City, Salary) and the <Department> table contains only (Deptid, DeptName), then the two relations cannot be joined back correctly, because there is no common column between them. This is a lossy join decomposition.
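
Both cases can be checked by projecting the original rows and joining the projections again. The sketch below uses the rows of the <Employee_Department> table; the helper names `project` and `natural_join` are assumptions written for this illustration.

```python
ATTRS = ("eid", "ename", "age", "city", "salary", "deptid", "deptname")
ROWS = [
    ("E001", "ABC", 29, "Pune", 20000, "D001", "Finance"),
    ("E002", "PQR", 30, "Pune", 30000, "D002", "Production"),
    ("E003", "LMN", 25, "Mumbai", 5000, "D003", "Sales"),
    ("E004", "XYZ", 24, "Mumbai", 4000, "D004", "Marketing"),
    ("E005", "STU", 32, "Bangalore", 25000, "D005", "Human Resource"),
]
EMP = ("eid", "ename", "age", "city", "salary")
DEPT = ("deptid", "eid", "deptname")
DEPT_NO_EID = ("deptid", "deptname")

def project(rows, attrs, wanted):
    """Keep only the `wanted` columns and drop duplicate rows."""
    idx = [attrs.index(a) for a in wanted]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(rows1, attrs1, rows2, attrs2):
    """Join on all shared column names; with none shared this is a cross product."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for r1 in rows1:
        for r2 in rows2:
            if all(r1[attrs1.index(a)] == r2[attrs2.index(a)] for a in common):
                joined.add(r1 + tuple(r2[attrs2.index(a)] for a in extra))
    return joined

employee = project(ROWS, ATTRS, EMP)

# Lossless: Department keeps Eid, so the join gives back the original rows.
lossless = natural_join(employee, EMP, project(ROWS, ATTRS, DEPT), DEPT)
# Lossy: without Eid every employee is paired with every department.
lossy = natural_join(employee, EMP, project(ROWS, ATTRS, DEPT_NO_EID), DEPT_NO_EID)

print(lossless == set(ROWS))
print(len(lossy), "rows, including spurious ones:", lossy != set(ROWS))
```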

Dependency Preservation
• A dependency is an important constraint on the database.
• Every functional dependency of the original relation should be satisfied by at least one decomposed table, or be implied by the dependencies that are.
• If A → B holds, then B is functionally dependent on A. The dependency is easier to check when both attribute sets appear in the same relation.
• A decomposition has this property only if the functional dependencies are maintained across the decomposed relations.
