UNIT-III Relational Algebra
Introduction, CODD Rules, relational data model, concept of key, relational integrity, relational algebra,
relational algebra operations, advantages of relational algebra, limitations of relational algebra, relational
calculus: tuple relational calculus, domain relational Calculus (DRC), QBE
…………………………………………………………………………………………………………………………….
Q) CODD Rules
The relational data model is the primary data model and is used widely around the world for data storage and processing. The model is simple, and it has all the properties and capabilities required to process data with storage efficiency. The relational model was proposed by E.F. Codd to model data in the form of relations or tables. Codd's rules, listed below, describe what a database system must satisfy to be regarded as relational.
1. Information Rule: All data stored in a relational database must be a value of some cell of a table. Everything in a database must be stored in table format.
2. Guaranteed Access Rule: Every data element (value) must be accessible by combination of table name,
primary-key (row value), and attribute-name (column value).
3. Systematic Treatment of NULL values: The NULL values in a database must be given a systematic and
uniform treatment. NULL value in database must only correspond to missing, unknown or not applicable
values.
4. Active Online Catalogue: Structure of database must be stored in an online catalogue, known as data
dictionary, which can be accessed by authorized users.
5. Comprehensive Data Sub-language Rule: A database must be accessible through a language that supports data definition, data manipulation and transaction management operations. If the database allows access to data without this language, it is considered a violation.
6. View Updating Rule: All views that are theoretically updatable should also be updatable by the system.
7. High-level Insert, Update and Delete Rule: The relational model should support insert, update and delete operations on sets of rows, not just on a single row at a time. Set operations such as union, intersection and minus should also be supported.
8. Physical data independence: Any modification in the physical location of a table should not enforce
modification at application level. The data stored in a database must be independent of the applications that
access the database.
9. Logical data independence: Any modification to the logical or conceptual schema of a table should not force a modification at the application level. For example, merging two tables into one should not affect the applications that access them. This is difficult to achieve.
10. Integrity Independence: Integrity constraints modified at database level should not enforce modification
at application level. This rule makes a database independent of the front-end application and its interface.
11. Distribution Independence: Distribution of data over various locations should not be visible to end-users.
This rule has been regarded as the foundation of distributed database systems.
12. Non-Subversion Rule: Low-level access to data should not be able to bypass the integrity rules to change data.
1 © www.tutorialtpoint.net all rights reserved Prepared by D. VENKATA REDDY M.Tech(CS), UGC NET, APSET QUALIFIED
Q) Relational model
The relational model uses a collection of tables to represent both data and the relationships among those data. Tables are logical models. The model is a combination of three components: the structural, integrity and manipulative parts.
Structural part: The structural part defines the database as a collection of relations.
Integrity part: Database integrity is maintained in the relational model using primary keys and foreign keys.
Manipulative part: Relational algebra and relational calculus are the tools used to manipulate data in the database. The relational model therefore has a strong mathematical background.
Features of Relational model:
Attribute: Each column in a table is called an attribute. Attributes are the properties that define a relation, e.g., Sid, Sname, DOB.
Degree: The total number of attributes in a relation is called the degree of the relation.
Tuple: A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance: A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have duplicate tuples.
Relation schema: A relation schema describes the relation name (table name) and the names of its attributes.
Relation key: Each row has one or more attributes, known as the relation key, which identify the row in the relation (table) uniquely.
Attribute domain: Every attribute has some pre-defined value scope, known as attribute domain.
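The terms above can be illustrated with a minimal Python sketch. The Student rows are made up for illustration; only the attribute names Sid, Sname and DOB come from the text, and "cardinality" (the number of tuples) is a standard term that the list above does not define.

```python
# Hypothetical Student relation; the attribute names follow the Sid, Sname, DOB example.
attributes = ("Sid", "Sname", "DOB")   # relation schema: the attribute names

# A relation instance is a finite set of tuples, so a repeated row is stored only once.
student = {
    (1, "Asha", "2001-04-12"),
    (2, "Ravi", "2000-11-30"),
    (2, "Ravi", "2000-11-30"),   # duplicate tuple, collapses into the row above
}

degree = len(attributes)      # number of attributes in the relation
cardinality = len(student)    # number of tuples in this instance

print(degree, cardinality)
```

Using a Python set for the instance mirrors the rule that relation instances do not have duplicate tuples.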
Q) Concept of Key
A key in a DBMS is an attribute or a set of attributes that helps to uniquely identify a tuple (or row) in a relation (or table). Keys are also used to establish relationships between the different tables and columns of a relational database. Individual values in a key are called key values.
A key is used in the definitions of various kinds of integrity constraints. A table in a database represents a collection of records or events for a particular relation. There can be thousands of such records, some of which may be duplicated.
There should be a way to identify each record separately and uniquely, i.e. without duplicates. Keys provide this.
1. Super Key-
A super key is a set of attributes that can identify each tuple uniquely in the given relation.
Example-
Given below are the examples of super keys since each set can uniquely identify each student in the Student
table-
All the attributes in a super key are definitely sufficient to identify each tuple uniquely in the given relation
but all of them may not be necessary.
2. Candidate Key-
A minimal set of attributes that can identify each tuple uniquely in the given relation is called a candidate key.
Example-
Given below are the examples of candidate keys since each set consists of minimal attributes required to
identify each student uniquely in the Student table-
( name , address )
NOTES-
All the attributes in a candidate key are sufficient as well as necessary to identify each tuple uniquely.
Removing any attribute from a candidate key means the remaining attributes no longer identify each tuple uniquely.
The attributes that appear in some candidate key are called prime attributes.
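The difference between a super key and a candidate key can be sketched in Python. The rows and the attribute names roll, name and address are hypothetical. Note also that a key is a property of the schema: checking one instance, as done here, only shows which attribute sets are not ruled out by the data at hand.

```python
from itertools import combinations

# Hypothetical Student instance; attribute names and rows are illustrative only.
ATTRS = ("roll", "name", "address")
ROWS = [
    (1, "Asha", "Guntur"),
    (2, "Ravi", "Guntur"),
    (3, "Asha", "Nellore"),
]

def is_super_key(attr_set):
    """True if no two rows agree on every attribute in attr_set."""
    idx = [ATTRS.index(a) for a in attr_set]
    seen = {tuple(row[i] for i in idx) for row in ROWS}
    return len(seen) == len(ROWS)

def candidate_keys():
    """Super keys that contain no smaller super key (checked by increasing size)."""
    keys = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if is_super_key(combo) and not any(set(k) <= set(combo) for k in keys):
                keys.append(combo)
    return keys

print(is_super_key(("roll", "name")))   # a super key, but not minimal
print(candidate_keys())
```

For this data, (roll, name) is a super key but not a candidate key because roll alone is already sufficient, while (name, address) is a candidate key because neither name nor address is unique on its own.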
3. Primary Key-
A primary key is a candidate key that the database designer selects while designing the database. In other words, the candidate key that the database designer implements is called the primary key.
NOTES-
The value of a primary key should not be changed once it is assigned, i.e. updating it is avoided.
5. Foreign Key-
An attribute ‘X’ is called a foreign key to some other attribute ‘Y’ when its values are dependent on the values of attribute ‘Y’.
The attribute ‘X’ can assume only those values which are assumed by the attribute ‘Y’.
Here, the relation in which attribute ‘Y’ is present is called the referenced relation.
The relation in which attribute ‘X’ is present is called the referencing relation.
The attribute ‘Y’ might be present in the same table or in some other table.
Here, t_dept can take only those values which are present in dept_no in Department table since only those
departments actually exist.
NOTES-
A foreign key references the primary key of the referenced table.
A foreign key can take only those values which are present in the primary key of the referenced relation.
Foreign key may have a name other than that of a primary key.
Foreign key can take the NULL value.
There is no restriction on a foreign key to be unique.
In fact, foreign key is not unique most of the time.
Referenced relation may also be called as the master table or primary table.
Referencing relation may also be called as the foreign table.
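The foreign key notes can be tried out with Python's built-in sqlite3 module. This is only a sketch: the Teacher table and the columns t_id, t_name and dept_name are assumptions made for illustration, while Department, dept_no and t_dept follow the names used in the text. SQLite enforces foreign keys only after the PRAGMA shown below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE Department (dept_no INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""
    CREATE TABLE Teacher (
        t_id   INTEGER PRIMARY KEY,
        t_name TEXT,
        t_dept INTEGER REFERENCES Department(dept_no)
    )
""")
conn.execute("INSERT INTO Department VALUES (10, 'CSE')")

conn.execute("INSERT INTO Teacher VALUES (1, 'Rao', 10)")       # matches an existing dept_no
conn.execute("INSERT INTO Teacher VALUES (2, 'Devi', 10)")      # foreign key values may repeat
conn.execute("INSERT INTO Teacher VALUES (3, 'Kumar', NULL)")   # NULL is allowed

try:
    conn.execute("INSERT INTO Teacher VALUES (4, 'Latha', 99)")  # there is no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

Here Department is the referenced (master) table and Teacher is the referencing table; only the last insert is refused.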
Q) Relational integrity
Data integrity constraints refer to the accuracy and correctness of data in the database. Data integrity provides a mechanism to maintain data consistency for operations like insert, update and delete. The different types of data integrity constraints are
1. Domain constraint
2. Key constraint
3. Entity Integrity constraint
4. Referential Integrity constraint
5. Null integrity constraint
1. Domain constraint
It specifies that the value taken by an attribute must be an atomic value from its domain.
Here, value ‘A’ is not allowed since only integer values can be taken by the age attribute.
2. Key constraint
Key constraint specifies that in any relation-
All the values of primary key must be unique.
The value of primary key must not be null.
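This relation does not satisfy the key constraint because the values of its primary key are not all unique.
3. Entity Integrity constraint
The entity integrity constraint specifies that no attribute of a primary key may contain a NULL value.
This relation does not satisfy the entity integrity constraint because its primary key contains a NULL value.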
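The two checks described above (unique primary key values, and no NULL in the primary key) can be sketched in Python as follows. The rows, the column layout (stu_id, name, age) and the function name key_violations are hypothetical; None stands in for NULL.

```python
# Hypothetical Student rows as (stu_id, name, age); stu_id plays the role of the primary key.
def key_violations(rows, key_index=0):
    """Return (has_duplicate_key, has_null_key) for a list of row tuples."""
    keys = [row[key_index] for row in rows]
    non_null = [k for k in keys if k is not None]
    has_duplicate = len(non_null) != len(set(non_null))   # key constraint broken
    has_null = len(non_null) != len(keys)                 # entity integrity broken
    return has_duplicate, has_null

ok_rows = [("S001", "Akshay", 20), ("S002", "Abhishek", 21)]
duplicate_rows = [("S001", "Akshay", 20), ("S001", "Abhishek", 21)]
null_rows = [("S001", "Akshay", 20), (None, "Abhishek", 21)]

print(key_violations(ok_rows))
print(key_violations(duplicate_rows))
print(key_violations(null_rows))
```

A real DBMS performs these checks itself when a PRIMARY KEY is declared; the sketch only makes the two conditions explicit.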
5. Null integrity constraint
Null implies that the data value is temporarily not known. Consider the relation person, whose attributes are name, age and salary. The age of a person can’t be null.
Q) Relational algebra
Relational algebra is a procedural query language that takes relations as input and generates a relation as output. Relational algebra mainly provides the theoretical foundation for relational databases and SQL.
a) Projection (π): Projection is used to project the required columns of data from a relation. Duplicate rows in the result are removed.
Relation r:
π A, C (r)
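Projection on the attributes A and C can be sketched in Python. The relation r below is hypothetical (the original table is not reproduced here), and project is an illustrative helper, not a library function.

```python
# Hypothetical relation r over the attributes A, B, C.
ATTRS = ("A", "B", "C")
r = {
    ("a", 1, "x"),
    ("b", 1, "x"),
    ("b", 2, "y"),
    ("a", 3, "x"),
}

def project(relation, attrs, wanted):
    """Keep only the wanted columns; building a set drops duplicate rows."""
    idx = [attrs.index(name) for name in wanted]
    return {tuple(row[i] for i in idx) for row in relation}

result = project(r, ATTRS, ("A", "C"))
print(sorted(result))
```

The four input rows give only three output rows, because ("a", 1, "x") and ("a", 3, "x") become the same tuple once column B is dropped.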
b) Selection (σ): Selection is used to select the tuples (rows) of a relation that satisfy a given condition.
Relation r
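A minimal Python sketch of selection is given below. The relation r, its attributes A, B, C and the condition are hypothetical, and select is an illustrative helper.

```python
# Hypothetical relation r over the attributes A, B, C.
ATTRS = ("A", "B", "C")
r = {("a", 1, "x"), ("b", 1, "x"), ("b", 2, "y"), ("a", 3, "x")}

def select(relation, attrs, predicate):
    """Keep the rows for which predicate(row as a dict) is true; columns are unchanged."""
    return {row for row in relation if predicate(dict(zip(attrs, row)))}

# Selection with the condition A = 'b' AND B > 1
result = select(r, ATTRS, lambda t: t["A"] == "b" and t["B"] > 1)
print(result)
```

Selection filters rows and keeps every column, whereas projection keeps every row (minus duplicates) and filters columns.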
c) Union (∪)
The union operation in relational algebra is the same as the union operation in set theory. The only constraint is that, for the union of two relations, both relations must have the same set of attributes.
Relations r, s:
r ∪ s
d) Set Difference (−)
Set difference in relational algebra is the same as the set difference operation in set theory, with the constraint that both relations must have the same set of attributes.
Relations r, s:
r − s
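Because relations are sets of tuples, union and set difference map directly onto Python's set operators. The relations r and s below are hypothetical and share the same two attributes.

```python
# Hypothetical relations r and s over the same attributes (A, B).
r = {("a", 1), ("b", 2), ("c", 3)}
s = {("b", 2), ("d", 4)}

union = r | s         # tuples in r, in s, or in both (duplicates removed)
difference = r - s    # tuples in r that are not in s

print(sorted(union))
print(sorted(difference))
```

The tuple ("b", 2) appears in both relations, so it occurs once in the union and is removed from the difference.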
e) Set Intersection (∩)
Suppose there are two relations R and S. The set intersection operation contains all tuples that are in both R and S. It is denoted by ∩.
f) Cartesian Product (×)
The cross product of two relations R and S, written R × S, contains all the attributes of R followed by all the attributes of S. Each record of R is paired with every record of S.
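Intersection and cross product can be sketched with Python sets and itertools.product. The relations R and S are hypothetical.

```python
from itertools import product

# Hypothetical relations R and S, each with two attributes.
R = {("a", 1), ("b", 2)}
S = {("b", 2), ("c", 3)}

intersection = R & S                         # tuples present in both R and S
cross = {x + y for x, y in product(R, S)}    # each R tuple followed by each S tuple

print(intersection)
print(len(cross))
```

With 2 tuples in R and 2 in S, the cross product has 2 × 2 = 4 tuples, each with 2 + 2 = 4 attributes.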
g) Rename (ρ)
Rename is a unary operation used for renaming the attributes of a relation. ρ (a/b) R renames the attribute ‘b’ of relation R to ‘a’.
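As a small sketch, renaming changes only the attribute names of a relation and leaves its tuples untouched. The helper rename and the sample data are hypothetical.

```python
def rename(attrs, old, new):
    """Return the attribute names with `old` replaced by `new`."""
    return tuple(new if name == old else name for name in attrs)

attrs = ("b", "c")             # attribute names of a hypothetical relation R
rows = {(1, "x"), (2, "y")}    # the tuples are not affected by renaming

renamed = rename(attrs, old="b", new="a")
print(renamed)
```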
2. Join operations
A Join operation combines related tuples from different relations, if and only if a given join condition is
satisfied. It is denoted by ⋈.
Natural Join
The natural join is a binary operation that combines two relations into one by matching tuples on their common attributes; each common column appears only once in the resulting relation.
Relations: EMPLOYEE, DEPARTMENT
EMPLOYEE ⋈ DEPARTMENT
Equi Join
A special case of conditional join where the condition contains only equality.
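A natural join can be sketched in Python by pairing rows that agree on every common attribute. The EMPLOYEE and DEPARTMENT rows and their attribute names are hypothetical, and natural_join is an illustrative helper. Since the only comparison used is equality on dept_no, the same example also shows what an equi join condition looks like.

```python
# Hypothetical EMPLOYEE and DEPARTMENT rows as dictionaries (attribute name -> value).
EMPLOYEE = [
    {"emp_name": "Asha", "dept_no": 10},
    {"emp_name": "Ravi", "dept_no": 20},
    {"emp_name": "Devi", "dept_no": 30},   # no matching department
]
DEPARTMENT = [
    {"dept_no": 10, "dept_name": "CSE"},
    {"dept_no": 20, "dept_name": "ECE"},
]

def natural_join(left, right):
    """Pair rows that agree on every common attribute; common columns appear once."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]

result = natural_join(EMPLOYEE, DEPARTMENT)
print(result)
```

Devi does not appear in the result because department 30 has no row in DEPARTMENT.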
Here are the different types of the JOINs in SQL:
1. INNER JOIN: Returns records that have matching values in both tables.
2. OUTER JOIN:
In an outer join, matched pairs are retained, and rows without a match are also kept, with the columns of the other table set to NULL.
a) LEFT OUTER JOIN: Returns all records from the left table, and the matched records from the right table.
b) RIGHT OUTER JOIN: Returns all records from the right table, and the matched records from the left table.
c) FULL OUTER JOIN: Returns all records from both tables, whether or not they have a match in the other table.
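The difference between an inner join and a left outer join can be seen with Python's sqlite3 module. The tables and rows below are hypothetical. Only INNER and LEFT OUTER JOIN are shown, because older SQLite versions do not support RIGHT and FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_name TEXT, dept_no INTEGER);
    CREATE TABLE department (dept_no INTEGER, dept_name TEXT);
    INSERT INTO employee VALUES ('Asha', 10), ('Ravi', 20), ('Devi', 30);
    INSERT INTO department VALUES (10, 'CSE'), (20, 'ECE'), (40, 'MECH');
""")

inner = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e INNER JOIN department d ON e.dept_no = d.dept_no
    ORDER BY e.emp_name
""").fetchall()

left = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee e LEFT OUTER JOIN department d ON e.dept_no = d.dept_no
    ORDER BY e.emp_name
""").fetchall()

print(inner)
print(left)
```

The inner join drops Devi (department 30 does not exist), while the left outer join keeps Devi with a NULL department name, which Python shows as None.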
Q) Advantages of relational algebra
Relational algebra has a solid mathematical background, which is the basis of many interesting developments and theorems. If we have two expressions for the same operation and the expressions are proved to be equivalent, then a query optimizer can automatically substitute the more efficient form. Moreover, relational algebra is a high-level language that talks in terms of properties of sets of tuples and not in terms of for-loops.
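The idea of substituting an equivalent but cheaper expression can be illustrated with one well-known equivalence: a selection whose condition mentions only attributes of r can be applied before a cross product with s instead of after it. The sketch below uses hypothetical relations and helper functions; it does not show how any particular optimizer works.

```python
from itertools import product

r = {(1, "a"), (2, "b"), (3, "c")}   # hypothetical relation r(id, x)
s = {(10,), (20,)}                   # hypothetical relation s(y)

def cross(left, right):
    """Cross product: each left tuple followed by each right tuple."""
    return {x + y for x, y in product(left, right)}

def select(relation, predicate):
    """Keep the tuples that satisfy the predicate."""
    return {row for row in relation if predicate(row)}

def on_r(row):
    """Condition that looks only at the first attribute, which belongs to r."""
    return row[0] >= 2

late = select(cross(r, s), on_r)     # builds all 6 pairs, then filters
early = cross(select(r, on_r), s)    # filters r first, then builds only 4 pairs

print(late == early)
```

Both expressions give the same relation, but the second one builds fewer intermediate tuples, which is the kind of saving an optimizer looks for.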
Q) Limitations of relational algebra
Relational algebra cannot do arithmetic. For example, if we want to know the price of 10 litres of petrol assuming a 10% increase in the price of petrol, this can’t be done using relational algebra.
Relational algebra cannot sort results. For example, if we want to arrange the product names in increasing order of their price, this can’t be done using relational algebra.
Q) Relational calculus
Relational calculus is a non-procedural query language. In a non-procedural query language, the user is not concerned with the details of how to obtain the end results.
Relational calculus tells what to do but never explains how to do it.
Tuple Relational Calculus (TRC)
The tuple relational calculus is used to select tuples from a relation. In TRC, the filtering variable ranges over the tuples of a relation.
The result can have one or more tuples.
In this form of relational calculus, we define a tuple variable and specify the table (relation) name in which the tuple is to be searched for, along with a condition.
We can also specify a column name using the dot (.) operator with the tuple variable to get only a certain attribute (column) in the result.
A tuple variable is just a name and can be anything; generally a single letter is used, so let's say T is a tuple variable.
To specify the name of the relation (table) in which we want to look for data, we write Relation(T), where T is our tuple variable.
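For example, if our table is Student, we would write Student(T).
Notation:
{T | P (T)} or {T | Condition (T)}
Where T is the tuple variable and P(T) is the condition used to fetch T.
For example:
{ T.name | Author(T) AND T.article = 'database' }
OUTPUT: This query selects tuples from the Author relation. It returns the 'name' of each author who has written an article on 'database'.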
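A TRC expression reads much like a Python set comprehension: a tuple variable ranges over a relation, and a condition decides which tuples contribute to the result. The sketch below assumes a hypothetical Author relation with the attributes name and article; the rows are made up.

```python
from collections import namedtuple

# Hypothetical Author relation with the attributes name and article.
Author = namedtuple("Author", ["name", "article"])
author = {
    Author("Asha", "database"),
    Author("Ravi", "networks"),
    Author("Devi", "database"),
}

# T ranges over the tuples of author; T.name uses the dot operator to pick one attribute.
names = {T.name for T in author if T.article == "database"}

print(sorted(names))
```

As in TRC, the comprehension states which tuples are wanted and leaves the order of evaluation to the language.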
TRC (tuple relational calculus) expressions can be quantified. In TRC, we can use the existential (∃) and universal (∀) quantifiers.
For example:
{ R | ∃T ∈ Author (T.article = 'database' AND R.name = T.name) }
Output: This query will yield the same result as the previous one.
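Domain Relational Calculus (DRC)
Notation:
{ a1, a2, a3, …, an | P (a1, a2, a3, …, an) }
Where a1, a2, …, an are attributes and P is a formula built from those attributes.
For example:
{ <article, page, subject> | <article, page, subject> ∈ computers ∧ subject = 'database' }
Output: This query will yield the article, page, and subject from the relation computers, where the subject is database.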
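In a domain-style query, each variable stands for one attribute value rather than for a whole tuple. This can be imitated in Python by unpacking every row into separate variables. The relation computers and its rows are hypothetical.

```python
# Hypothetical relation with the attributes (article, page, subject).
computers = {
    ("Indexing", 12, "database"),
    ("Routing", 40, "networks"),
    ("Normalization", 27, "database"),
}

# article, page and subject each range over one attribute's values.
result = {
    (article, page, subject)
    for (article, page, subject) in computers
    if subject == "database"
}

print(sorted(result))
```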
Difference between Tuple Relational Calculus (TRC) and Domain Relational Calculus (DRC):
1. Variables: In TRC, the variables represent the tuples from a specified relation. In DRC, the variables represent the values drawn from a specified domain.
2. Basic element: A tuple is a single element of a relation; in database terms, it is a row. A domain is equivalent to a column data type together with any constraints on the values of the data.
3. Filtering: In TRC, the filtering variable uses the tuples of a relation. In DRC, filtering is done based on the domain of the attributes.
4. Notation: TRC uses {T | P (T)} or {T | Condition (T)}. DRC uses { a1, a2, a3, …, an | P (a1, a2, a3, …, an) }.
5. Example: TRC: {T | EMPLOYEE (T) AND T.DEPT_ID = 10}. DRC: { <a1, …, an> | <a1, …, an> ∈ EMPLOYEE AND DEPT_ID = 10 }, where DEPT_ID is one of a1, …, an.
Q) QBE
Queries that we fire on a database must be correct and well structured, which means they must follow a proper syntax. If the syntax of a query is wrong, we get an error, and our application or calculation stops. QBE was introduced to overcome this problem. QBE stands for Query By Example, and it was developed in the mid-1970s by Moshe Zloof at IBM.
It is a graphical query language: we get a user interface and fill in some required fields to get the result.
In SQL we get an error if the query is not correct. In QBE, if the query is wrong, either we get a wrong answer or the query is not executed, but we never get an error.
Note-:
In QBE we don’t write complete queries as in SQL or other database languages. QBE presents some blanks, and we just need to fill in those blanks to get the required result.
Example
Consider the example where a table ‘SAC’ is present in the database with Name, Phone_Number, and Branch fields, and we want to get the name of the SAC representative who belongs to the MCA branch. If we write this query in SQL, we have to write it like
SELECT NAME
FROM SAC
WHERE BRANCH = 'MCA'
This gives the correct result. In QBE, however, there is a Branch field on the screen; we just fill it with “MCA” and click the SEARCH button to get the required result.
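The fill-in-the-blanks idea can be imitated loosely in Python with sqlite3: the user supplies values only for the fields of interest, and a helper turns the filled fields into a query. This is a sketch of the idea, not of how QBE is implemented. The function query_by_example and the sample rows are hypothetical; the table and field names follow the SAC example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SAC (Name TEXT, Phone_Number TEXT, Branch TEXT)")
conn.executemany(
    "INSERT INTO SAC VALUES (?, ?, ?)",
    [("Asha", "9000000001", "MCA"), ("Ravi", "9000000002", "MBA")],
)

COLUMNS = ("Name", "Phone_Number", "Branch")

def query_by_example(example, show):
    """Build a parameterized SELECT from the fields the user has filled in."""
    filled = {col: val for col, val in example.items() if val is not None}
    # Column names are checked against a fixed list; values are bound as parameters.
    if show not in COLUMNS or not set(filled) <= set(COLUMNS):
        raise ValueError("unknown column")
    where = " AND ".join(f"{col} = ?" for col in filled)
    sql = f"SELECT {show} FROM SAC" + (f" WHERE {where}" if where else "")
    return conn.execute(sql, tuple(filled.values())).fetchall()

# The user fills only the Branch field with "MCA" and asks for the Name column.
rows = query_by_example({"Name": None, "Phone_Number": None, "Branch": "MCA"}, show="Name")
print(rows)
```

The user never types the SELECT statement; it is generated from the form, which is why syntax errors do not arise on the user's side.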