Unit 3
E.F. Codd proposed the relational model to represent data in the form of relations,
or tables. After designing the conceptual model of the database using an ER
diagram, we need to convert the conceptual model into a relational model, which
can be implemented in any RDBMS such as Oracle or MySQL.
This unit describes what the relational model is.
The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each
column has a unique name. Tables are also known as relations. The relational
model is an example of a record-based model. Record-based models are so
named because the database is structured in fixed-format records of several
types. Each table contains records of a particular type. Each record type
defines a fixed number of fields, or attributes. The columns of the table
correspond to the attributes of the record type. The relational data model is the
most widely used data model, and a vast majority of current database systems
are based on the relational model.
What is the Relational Model?
The relational model represents how data is stored in Relational Databases. A
relational database consists of a collection of tables, each of which is assigned
a unique name. Consider a relation STUDENT with attributes ROLL_NO,
NAME, ADDRESS, PHONE, and AGE shown in the table.
Table Student
ROLL_NO  NAME    ADDRESS  PHONE       AGE
1        RAM     DELHI    9455123451  18
2        RAMESH  GURGAON  9652431543  18
3        SUJIT   ROHTAK   9156253131  20
4        SURESH  DELHI                18
Important Terminologies
Attribute: Attributes are the properties that define an entity,
e.g., ROLL_NO, NAME, ADDRESS.
Relation Schema: A relation schema defines the structure of the relation
and represents the name of the relation with its attributes, e.g., STUDENT
(ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the relation schema
for STUDENT. If a schema has more than one relation, it is called a relational
schema.
Tuple: Each row in the relation is known as a tuple. The above relation
contains 4 tuples, one of which is shown as:
1 RAM DELHI 9455123451 18
NULL Values: A value that is not known or unavailable is called a NULL
value. It is represented by a blank space, e.g., PHONE of the STUDENT having
ROLL_NO 4 is NULL.
Relation Key: Keys are attributes, or sets of attributes, that identify the rows of a
relation uniquely and relate tables to one another. They are of the following types:
Primary Key
Candidate Key
Super Key
Foreign Key
Alternate Key
Composite Key
Constraints in Relational Model
While designing the relational model, we define some conditions which must
hold for the data present in the database. These conditions are called constraints.
They are checked before any operation (insertion, deletion, or updation) is
performed on the database. If any of the constraints is violated, the operation
fails.
Domain Constraints
These are attribute-level constraints. An attribute can only take values that lie
inside its domain. e.g., if a constraint AGE>0 is applied to the STUDENT
relation, inserting a negative value of AGE will fail.
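The AGE>0 rule can be tried out with Python's built-in sqlite3 module. This is only a sketch: the reduced STUDENT schema, the helper try_insert, and the sample rows are assumptions made for illustration, and a CHECK clause is one possible way to express a domain constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced STUDENT schema; CHECK expresses the domain rule AGE > 0.
conn.execute(
    "CREATE TABLE STUDENT ("
    " ROLL_NO INTEGER PRIMARY KEY,"
    " NAME TEXT NOT NULL,"
    " AGE INTEGER CHECK (AGE > 0))"
)
conn.execute("INSERT INTO STUDENT VALUES (1, 'RAM', 18)")


def try_insert(connection, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        connection.execute("INSERT INTO STUDENT VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


negative_age_accepted = try_insert(conn, (2, "RAMESH", -5))
student_count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
```

The row with the negative age is rejected, so only the first student remains in the table.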
Key Integrity
Every relation in the database should have at least one set of attributes that
identifies a tuple uniquely. Such a set of attributes is called a key. e.g., ROLL_NO
in STUDENT is a key: no two students can have the same roll number. So a key
has two properties:
It should be unique for all tuples.
It can’t have NULL values.
Referential Integrity
When an attribute of a relation can only take values from an attribute of
the same relation or of another relation, the restriction is called referential
integrity. Let us suppose we have 2 relations:
Table Student
ROLL_NO  NAME    ADDRESS  PHONE       AGE  BRANCH_CODE
1        RAM     DELHI    9455123451  18   CS
2        RAMESH  GURGAON  9652431543  18   CS
3        SUJIT   ROHTAK   9156253131  20   ECE
4        SURESH  DELHI                18   IT
Table Branch
BRANCH_CODE  BRANCH_NAME
CS           COMPUTER SCIENCE
IT           INFORMATION TECHNOLOGY
ECE          ELECTRONICS AND COMMUNICATION ENGINEERING
CV           CIVIL ENGINEERING
BRANCH_CODE of STUDENT can only take values that are present in
BRANCH_CODE of BRANCH; this is called a referential integrity constraint. The
relation which references another relation is called the REFERENCING
RELATION (STUDENT in this case) and the relation to which other relations
refer is called the REFERENCED RELATION (BRANCH in this case).
Anomalies in the Relational Model
An anomaly is an irregularity or something which deviates from the expected or
normal state. When designing databases, we identify three types of
anomalies: Insert, Update, and Delete.
Insertion Anomaly in Referencing Relation
We can’t insert a row in the REFERENCING RELATION if the referencing attribute’s
value is not present among the referenced attribute’s values. e.g., insertion of a
student with BRANCH_CODE ‘ME’ in the STUDENT relation will result in an error
because ‘ME’ is not present in BRANCH_CODE of BRANCH.
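The rejected insertion can be reproduced with Python's built-in sqlite3 module. The sketch below assumes a reduced STUDENT schema and a made-up student (ROLL_NO 5, MOHIT); note that SQLite only checks foreign keys after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute(
    "CREATE TABLE STUDENT ("
    " ROLL_NO INTEGER PRIMARY KEY,"
    " NAME TEXT,"
    " BRANCH_CODE TEXT REFERENCES BRANCH(BRANCH_CODE))"
)
conn.executemany(
    "INSERT INTO BRANCH VALUES (?, ?)",
    [("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY")],
)
conn.execute("INSERT INTO STUDENT VALUES (1, 'RAM', 'CS')")  # 'CS' exists in BRANCH

try:
    # Hypothetical student whose branch 'ME' is missing from BRANCH.
    conn.execute("INSERT INTO STUDENT VALUES (5, 'MOHIT', 'ME')")
    me_insert_error = None
except sqlite3.IntegrityError as exc:
    me_insert_error = str(exc)

student_count = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
```

The second insert raises an integrity error, so STUDENT still holds a single row.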
Deletion/Updation Anomaly in Referenced Relation
We can’t delete or update a row of the REFERENCED RELATION if the value of its
REFERENCED ATTRIBUTE is used as a value of the REFERENCING
ATTRIBUTE. e.g., if we try to delete the tuple from BRANCH having
BRANCH_CODE ‘CS’, it will result in an error because ‘CS’ is referenced by
BRANCH_CODE of STUDENT, but if we try to delete the row from BRANCH
with BRANCH_CODE ‘CV’, it will be deleted as the value is not used by the
referencing relation. This can be handled by the following methods:
On Delete Cascade
It deletes the tuples from the REFERENCING RELATION if the value used by the
REFERENCING ATTRIBUTE is deleted from the REFERENCED RELATION. e.g.,
if we delete the row from BRANCH with BRANCH_CODE ‘CS’, the rows in the
STUDENT relation with BRANCH_CODE ‘CS’ (ROLL_NO 1 and 2 in this case)
will be deleted.
On Update Cascade
It updates the REFERENCING ATTRIBUTE in the REFERENCING RELATION if
the attribute value used by the REFERENCING ATTRIBUTE is updated in the
REFERENCED RELATION. e.g., if we update the row of BRANCH with
BRANCH_CODE ‘CS’ to ‘CSE’, the rows in the STUDENT relation with
BRANCH_CODE ‘CS’ (ROLL_NO 1 and 2 in this case) will be updated to
BRANCH_CODE ‘CSE’.
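Both cascade rules can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: the STUDENT schema is reduced to three columns, and the update is run before the delete so that both effects are visible on the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute(
    "CREATE TABLE STUDENT ("
    " ROLL_NO INTEGER PRIMARY KEY,"
    " NAME TEXT,"
    " BRANCH_CODE TEXT REFERENCES BRANCH(BRANCH_CODE)"
    "   ON DELETE CASCADE ON UPDATE CASCADE)"
)
conn.executemany(
    "INSERT INTO BRANCH VALUES (?, ?)",
    [
        ("CS", "COMPUTER SCIENCE"),
        ("IT", "INFORMATION TECHNOLOGY"),
        ("ECE", "ELECTRONICS AND COMMUNICATION ENGINEERING"),
        ("CV", "CIVIL ENGINEERING"),
    ],
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?)",
    [(1, "RAM", "CS"), (2, "RAMESH", "CS"), (3, "SUJIT", "ECE"), (4, "SURESH", "IT")],
)

# On Update Cascade: renaming 'CS' to 'CSE' in BRANCH rewrites the STUDENT rows.
conn.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = conn.execute(
    "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO"
).fetchall()

# On Delete Cascade: deleting the branch removes the students that reference it.
conn.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = conn.execute("SELECT ROLL_NO FROM STUDENT ORDER BY ROLL_NO").fetchall()
```

After the update, ROLL_NO 1 and 2 carry 'CSE'; after the delete, only ROLL_NO 3 and 4 remain.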
Super Keys
Any set of attributes that allows us to identify unique rows (tuples) in a given
relation is known as a super key. A super key from which no attribute can be
removed without losing uniqueness (a minimal super key) is known as a
candidate key, and one of the candidate keys is chosen as the primary key. If
the primary key is a combination of two or more attributes, we call it a composite
key.
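The difference between super keys and candidate keys can be checked mechanically on sample data. The sketch below uses the four STUDENT rows (PHONE and BRANCH_CODE omitted); the function names are made up for illustration. A caveat: a key is a property of the schema, so sample data can only show that a set of attributes is not a key. Here NAME happens to be unique in the sample, which does not make it a reliable key in general.

```python
from itertools import combinations

ATTRS = ("ROLL_NO", "NAME", "ADDRESS", "AGE")
ROWS = [
    (1, "RAM", "DELHI", 18),
    (2, "RAMESH", "GURGAON", 18),
    (3, "SUJIT", "ROHTAK", 20),
    (4, "SURESH", "DELHI", 18),
]


def is_super_key(attrs):
    """True if no two sample rows agree on all of the given attributes."""
    positions = [ATTRS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in positions) for row in ROWS]
    return len(set(projected)) == len(ROWS)


def candidate_keys():
    """Minimal super keys of the sample, found by trying smaller sets first."""
    found = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            contains_smaller_key = any(set(key) <= set(combo) for key in found)
            if is_super_key(combo) and not contains_smaller_key:
                found.append(combo)
    return found
```

For this sample, (ROLL_NO, AGE) is a super key but not a candidate key, because ROLL_NO alone is already unique.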
Codd Rules in Relational Model
Edgar F. Codd, who proposed the relational database model, stated a set of rules for it.
These are now known as Codd’s Rules. A database system has to follow these rules
to be considered fully relational.
Advantages of the Relational Model
Simple model: The relational model is simple and easy to use in comparison to
other data models.
Flexible: The relational model is more flexible than most other data models.
Secure: The relational model is more secure than most other data models.
Data Accuracy: Data is more accurate in the relational data model.
Data Integrity: The integrity of the data is maintained in the relational
model.
Operations can be Applied Easily: Operations are easier to perform in the
relational model.
Disadvantages of the Relational Model
The relational database model is not well suited to very large databases.
Sometimes, it becomes difficult to find the relation between tables.
Because of the complex structure, the response time for queries can be high.
Characteristics of the Relational Model
Data is represented in tables of rows and columns, called relations.
Data is stored in tables that have relationships between them; this arrangement
is called the relational model.
The relational model supports operations such as data definition, data
manipulation, and transaction management.
Each column has a distinct name and represents an attribute.
Each row represents a single entity.
When designing a relational database, we can impose restrictions such as which
values may be inserted into a relation and which kinds of modifications and
deletions are allowed on it. These restrictions are the constraints of the
relational database.
In models like Entity-Relationship models, we did not have such features. Database
Constraints can be categorized into 3 main categories:
1. Constraints that are inherent in the data model are called Implicit Constraints.
2. Constraints that are directly applied in the schemas of the data model, by specifying
them in the DDL (Data Definition Language), are called Schema-Based
Constraints or Explicit Constraints.
3. Constraints that cannot be directly applied in the schemas of the data model are
called Application-based or Semantic Constraints.
So here we are going to deal with Implicit constraints.
Relational Constraints
These are the restrictions or sets of rules imposed on the database contents. They validate
the quality of the database and ensure that operations such as data insertion and
updation are performed without affecting the integrity of
the data. They protect the database against threats and damage. Constraints on the
relational database are mainly of 4 types:
Domain constraints
Key constraints or Uniqueness Constraints
Entity Integrity constraints
Referential integrity constraints
Types of Relational Constraints
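1. Domain Constraints
Every value of an attribute must be an atomic value drawn from the domain of that
attribute.
Explanation: In the above relation, Name is a composite attribute and Phone is a
multivalued attribute, so it violates the domain constraint.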
2. Key Constraints or Uniqueness Constraints
These are called uniqueness constraints since they ensure that every tuple in the relation
is unique.
A relation can have multiple keys or candidate keys (minimal superkeys), out of which
we choose one as the primary key. There is no restriction on which candidate key is
chosen as the primary key, but it is suggested to go with
the candidate key with the fewest attributes.
Null values are not allowed in the primary key, hence the Not Null constraint is also part
of the key constraint.
Example:
EID Name Phone
01 Bikash 6000000009
02 Paul 9000090009
01 Tuhin 9234567892
Explanation: In the above table, EID is the primary key, and the first and the last tuples
have the same value in EID, i.e., 01, so the key constraint is violated.
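The duplicate EID can be seen being rejected with Python's built-in sqlite3 module. The table name EMPLOYEE and the text column types are assumptions for this sketch; the three rows are the ones from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# EMPLOYEE is a hypothetical name for the example table.
conn.execute("CREATE TABLE EMPLOYEE (EID TEXT PRIMARY KEY NOT NULL, NAME TEXT, PHONE TEXT)")

rows = [
    ("01", "Bikash", "6000000009"),
    ("02", "Paul", "9000090009"),
    ("01", "Tuhin", "9234567892"),  # repeats EID 01
]

rejected = []
for row in rows:
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row)  # the key constraint refused this tuple

stored_names = [name for (name,) in conn.execute("SELECT NAME FROM EMPLOYEE ORDER BY EID")]
```

Only the third tuple is rejected; the first two are stored.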
3. Entity Integrity Constraints
Entity Integrity constraints say that no primary key can take a NULL value, since
using the primary key we identify each tuple uniquely in a relation.
Example:
EID Name Phone
01 Bikash 9000900099
02 Paul 600000009
Explanation: In the above relation, EID is made the primary key, and the primary key
can’t take NULL values, but in the third tuple the primary key is NULL, so the
entity integrity constraint is violated.
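A NULL primary key can be tested with Python's built-in sqlite3 module. This sketch assumes a table named EMPLOYEE and an invented row with a NULL EID. One SQLite-specific detail matters here: for historical reasons SQLite accepts NULL in a non-INTEGER PRIMARY KEY column unless NOT NULL is declared, so the declaration is spelled out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is stated explicitly because SQLite would otherwise allow NULL
# in a TEXT PRIMARY KEY column.
conn.execute("CREATE TABLE EMPLOYEE (EID TEXT PRIMARY KEY NOT NULL, NAME TEXT, PHONE TEXT)")
conn.execute("INSERT INTO EMPLOYEE VALUES ('01', 'Bikash', '9000900099')")

try:
    # Hypothetical tuple with no primary key value.
    conn.execute("INSERT INTO EMPLOYEE VALUES (NULL, 'Unknown', NULL)")
    null_key_error = None
except sqlite3.IntegrityError as exc:
    null_key_error = str(exc)

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```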
4. Referential Integrity Constraints
The referential integrity constraint is specified between two relations or tables and
is used to maintain consistency among the tuples in the two relations.
This constraint is enforced through a foreign key: when the attributes in the foreign key
of relation R1 have the same domain(s) as the primary key of relation R2, the
foreign key of R1 is said to reference or refer to the primary key of relation R2.
The value of the foreign key in a tuple of relation R1 must either equal the value of the
primary key for some tuple in relation R2 or be NULL; it can’t take any other
value.
Example:
Table 1
EID  Name    DNO
01   Divine  12
02   Dino    22
04   Vivian  14
Table 2
DNO  Place
12   Jaipur
13   Mumbai
14   Delhi
Explanation: In the above tables, the DNO of Table 1 is the foreign key, and DNO in
Table 2 is the primary key. DNO = 22 in the foreign key of Table 1 is not allowed
because DNO = 22 is not defined in the primary key of table 2. Therefore, Referential
integrity constraints are violated here.
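The same check can be written directly in Python. This is a sketch, not a DBMS feature: the function name dangling_references and the list-of-tuples representation are assumptions. It reports every tuple whose foreign key is neither NULL (None) nor present in the referenced primary key.

```python
# Table 1 (referencing) and Table 2 (referenced) from the example.
employees = [("01", "Divine", 12), ("02", "Dino", 22), ("04", "Vivian", 14)]
departments = [(12, "Jaipur"), (13, "Mumbai"), (14, "Delhi")]


def dangling_references(referencing, fk_index, referenced, pk_index):
    """Return the referencing tuples whose foreign key matches no primary key."""
    valid_keys = {row[pk_index] for row in referenced}
    return [
        row
        for row in referencing
        if row[fk_index] is not None and row[fk_index] not in valid_keys
    ]


violations = dangling_references(employees, 2, departments, 0)
```

Only the tuple with DNO = 22 is reported, which matches the explanation above.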
Advantages of Relational Database Model
It is simpler than the hierarchical model and network model.
It is easy and simple to understand.
Its structure can be changed anytime upon requirement.
Data Integrity: The relational database model enforces data integrity through various
constraints such as primary keys, foreign keys, and unique constraints. This ensures
that the data in the database is accurate, consistent, and valid.
Flexibility: The relational database model is highly flexible and can handle a wide
range of data types and structures. It also allows for easy modification and updating of
the data without affecting other parts of the database.
Scalability: The relational database model can scale to handle large amounts of data
by adding more tables, indexes, or partitions to the database. This allows for better
performance and faster query response times.
Security: The relational database model provides robust security features to protect
the data in the database. These include user authentication, authorization, and
encryption of sensitive data.
Data consistency: The relational database model helps keep the data in the database
consistent across tables. For example, when cascading rules are defined on foreign
keys, a change to a referenced row is propagated to the related rows in other tables.
Query Optimization: Relational database systems typically provide a query optimizer that
can analyze and optimize SQL queries to improve their performance. This allows for
faster query response times and better scalability.
Disadvantages of the Relational Model
Some relational databases impose fixed limits, such as on field lengths, which can’t be extended.
It can become complex and hard to use.
Complexity: The relational model can be complex and difficult to understand,
particularly for users who are not familiar with SQL and database design principles.
This can make it challenging to set up and maintain a relational database.
Performance: The relational model can suffer from performance issues when dealing
with large data sets or complex queries. In particular, joins between tables can be
slow, and indexing strategies can be difficult to optimize.
Scalability: While the relational model is generally scalable, it can become difficult
to manage as the database grows in size. Adding new tables or indexes can be time-
consuming, and managing relationships between tables can become complex.
Introduction of Relational Algebra in DBMS
1. Selection(σ): It is used to select required tuples of a relation.
Example:
Table 1: Relation R
A B C
1 2 4
2 2 3
3 2 3
4 3 4
For the above relation, σ(C>3)R will select the tuples which have C greater than 3.
A B C
1 2 4
4 3 4
Note: The selection operator only selects the required tuples but does not display them.
For display, the data projection operator is used.
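Selection can be sketched in a few lines of Python, with a relation held as a list of dictionaries. That representation and the function name select are assumptions made for illustration.

```python
# Relation R from the example, one dictionary per tuple.
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]


def select(relation, predicate):
    """sigma: keep the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]


c_above_3 = select(R, lambda row: row["C"] > 3)
```

Here c_above_3 holds the two tuples with C = 4, as in the result table above.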
2. Projection(π): It is used to project required column data from a relation.
Example: Consider Table 1. Suppose we want columns B and C from Relation R.
π(B,C)R will show the following columns. Because a relation is a set, the duplicate
row (2, 3) appears only once.
B C
2 4
2 3
3 4
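A projection that also removes duplicate rows can be sketched as follows. The list-of-dictionaries representation and the function name project are assumptions; the order of first appearance is kept only to make the output easy to compare with the table above.

```python
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]


def project(relation, attributes):
    """pi: keep only the named attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation is a set, so keep each row once
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


b_and_c = project(R, ["B", "C"])
```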
3. Union(U): Union in relational algebra is the same union operation as in set theory.
Example: Consider the following tables of students having different optional subjects in
their course.
FRENCH
Student_Name Roll_Number
Ram 01
Mohan 02
Vivek 13
Geeta 17
GERMAN
Student_Name Roll_Number
Vivek 13
Geeta 17
Shyam 21
Rohan 25
π(Student_Name)FRENCH U π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Vivek
Geeta
Shyam
Rohan
Note: The only constraint in the union of two relations is that both relations must have
the same set of Attributes.
4. Set Difference(-): Set Difference in relational algebra is the same set difference
operation as in set theory.
Example: From the above table of FRENCH and GERMAN, Set Difference is used as
follows
π(Student_Name)FRENCH - π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Note: The only constraint in the Set Difference between two relations is that both
relations must have the same set of Attributes.
5. Set Intersection(∩): Set Intersection in relational algebra is the same set intersection
operation as in set theory.
Example: From the above table of FRENCH and GERMAN, the Set Intersection is used
as follows
π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN
Student_Name
Vivek
Geeta
Note: The only constraint in the Set Intersection between two relations is that both
relations must have the same set of Attributes.
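Because relations are sets of tuples, the three set operators map directly onto Python's set type. The sketch below stores FRENCH and GERMAN as sets of (Student_Name, Roll_Number) tuples; that representation and the helper project_names are assumptions for illustration.

```python
FRENCH = {("Ram", "01"), ("Mohan", "02"), ("Vivek", "13"), ("Geeta", "17")}
GERMAN = {("Vivek", "13"), ("Geeta", "17"), ("Shyam", "21"), ("Rohan", "25")}


def project_names(relation):
    """pi(Student_Name): keep only the first attribute of every tuple."""
    return {name for name, _roll_number in relation}


french_names = project_names(FRENCH)
german_names = project_names(GERMAN)

union = french_names | german_names         # students taking either subject
difference = french_names - german_names    # French but not German
intersection = french_names & german_names  # students taking both subjects
```

Both operands have the same single attribute, Student_Name, which is the compatibility condition stated in the notes above.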
6. Rename(ρ): Rename is a unary operation used for renaming attributes of a relation.
ρ(a/b)R will rename the attribute 'b' of the relation to 'a'.
7. Cross Product(X): The cross product is taken between two relations, say A and B. The
cross product A X B has all the attributes of A followed by all the
attributes of B, and each record of A is paired with every record of B.
Example:
A
Name Age Sex
Ram 14 M
Sona 15 F
Kim 20 M
B
ID Course
1 DS
2 DBMS
A X B
Name Age Sex ID Course
Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS
Note: If A has ‘n’ tuples and B has ‘m’ tuples then A X B will have ‘n*m’ tuples.
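The pairing of every record of A with every record of B can be sketched with itertools.product. The list-of-dictionaries representation and the function name cross_product are assumptions, and the merge relies on A and B having no attribute names in common.

```python
from itertools import product

A = [
    {"Name": "Ram", "Age": 14, "Sex": "M"},
    {"Name": "Sona", "Age": 15, "Sex": "F"},
    {"Name": "Kim", "Age": 20, "Sex": "M"},
]
B = [{"ID": 1, "Course": "DS"}, {"ID": 2, "Course": "DBMS"}]


def cross_product(left, right):
    """Pair every tuple of left with every tuple of right.

    Assumes the two relations share no attribute names; a shared name
    would be overwritten by the right-hand value.
    """
    return [{**l, **r} for l, r in product(left, right)]


a_x_b = cross_product(A, B)
```

The result has 3 * 2 = 6 tuples, in the same order as the A X B table above.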
Derived Operators
These are some of the derived operators, which are derived from the fundamental
operators.
1. Natural Join(⋈)
2. Conditional Join
1. Natural Join(⋈): Natural join is a binary operator. Natural join between two or more
relations will result in the set of all combinations of tuples that have equal values
for their common attributes.
Example:
EMP
Name ID Dept_Name
A 120 IT
B 125 HR
C 110 Sales
D 111 IT
DEPT
Dept_Name Manager
Sales Y
Production Z
IT A
EMP ⋈ DEPT
Name ID Dept_Name Manager
A 120 IT A
C 110 Sales Y
D 111 IT A
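A natural join can be sketched as a nested loop that keeps the pairs agreeing on every shared attribute. The list-of-dictionaries representation and the function name natural_join are assumptions, and the nested loop is chosen for clarity rather than speed.

```python
EMP = [
    {"Name": "A", "ID": 120, "Dept_Name": "IT"},
    {"Name": "B", "ID": 125, "Dept_Name": "HR"},
    {"Name": "C", "ID": 110, "Dept_Name": "Sales"},
    {"Name": "D", "ID": 111, "Dept_Name": "IT"},
]
DEPT = [
    {"Dept_Name": "Sales", "Manager": "Y"},
    {"Dept_Name": "Production", "Manager": "Z"},
    {"Dept_Name": "IT", "Manager": "A"},
]


def natural_join(left, right):
    """Combine the tuples that agree on every attribute the relations share."""
    if not left or not right:
        return []
    common = [attr for attr in left[0] if attr in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[attr] == r[attr] for attr in common)
    ]


emp_dept = natural_join(EMP, DEPT)
```

Employee B is missing from the result because no DEPT tuple has Dept_Name HR.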
2. Conditional Join: Conditional join works similarly to natural join. In natural join, the
default condition is equality between common attributes, while in conditional join we can
specify any condition such as greater than, less than, or not equal.
Example:
R
ID Sex Marks
1 F 45
2 F 55
3 F 60
S
ID Sex Marks
10 M 20
11 M 22
12 M 59
Result of the join (in every listed pair, R.Marks is greater than S.Marks)
R.ID R.Sex R.Marks S.ID S.Sex S.Marks
1 F 45 10 M 20
1 F 45 11 M 22
2 F 55 10 M 20
2 F 55 11 M 22
3 F 60 10 M 20
3 F 60 11 M 22
3 F 60 12 M 59
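A conditional join can be sketched as a cross product filtered by an arbitrary condition. The function name theta_join and the list-of-dictionaries representation are assumptions. The condition R.Marks > S.Marks is also an assumption: it is chosen because it reproduces exactly the seven rows listed in the example.

```python
R = [
    {"ID": 1, "Sex": "F", "Marks": 45},
    {"ID": 2, "Sex": "F", "Marks": 55},
    {"ID": 3, "Sex": "F", "Marks": 60},
]
S = [
    {"ID": 10, "Sex": "M", "Marks": 20},
    {"ID": 11, "Sex": "M", "Marks": 22},
    {"ID": 12, "Sex": "M", "Marks": 59},
]


def theta_join(left, left_name, right, right_name, condition):
    """Pair the tuples that satisfy the condition, prefixing attribute names
    with the relation name so that equal names (ID, Sex, Marks) do not clash."""
    result = []
    for l in left:
        for r in right:
            if condition(l, r):
                row = {f"{left_name}.{key}": value for key, value in l.items()}
                row.update({f"{right_name}.{key}": value for key, value in r.items()})
                result.append(row)
    return result


joined = theta_join(R, "R", S, "S", lambda l, r: l["Marks"] > r["Marks"])
id_pairs = [(row["R.ID"], row["S.ID"]) for row in joined]
```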
Relational Calculus
Whereas Relational Algebra is a procedural query language, Relational Calculus is a
non-procedural query language. It deals only with the end result: it
states what is wanted but never how to compute it.
There are two types of Relational Calculus
1. Tuple Relational Calculus(TRC)
2. Domain Relational Calculus(DRC)
A Tuple Relational Calculus query has the form
{ t | P(t) }
where,
t: the set of resulting tuples
P(t): the condition which is true for the given set of tuples.
A Domain Relational Calculus query has the form
{ < x1, x2, x3, ..., xn > | P (x1, x2, x3, ..., xn ) }
where, < x1, x2, x3, ..., xn > represents the resulting domain variables and P (x1, x2,
x3, ..., xn ) represents the condition or formula equivalent to the predicate
calculus.
Predicate Calculus Formula:
1. Set of all comparison operators
2. Set of connectives like and, or, not
3. Set of quantifiers
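The declarative flavour of tuple relational calculus can be imitated with Python set comprehensions. This is an analogy, not a calculus implementation: the condition plays the role of the predicate, and any() stands in for the existential quantifier. The EMP and DEPT data are borrowed from the natural join example, and the two queries are made up for illustration.

```python
EMP = [
    {"Name": "A", "ID": 120, "Dept_Name": "IT"},
    {"Name": "B", "ID": 125, "Dept_Name": "HR"},
    {"Name": "C", "ID": 110, "Dept_Name": "Sales"},
    {"Name": "D", "ID": 111, "Dept_Name": "IT"},
]
DEPT = [
    {"Dept_Name": "Sales", "Manager": "Y"},
    {"Dept_Name": "Production", "Manager": "Z"},
    {"Dept_Name": "IT", "Manager": "A"},
]

# { e.Name | EMP(e) and there exists d (DEPT(d) and d.Dept_Name = e.Dept_Name) }
employees_with_listed_dept = {
    e["Name"]
    for e in EMP
    if any(d["Dept_Name"] == e["Dept_Name"] for d in DEPT)
}

# { d.Dept_Name | DEPT(d) and not exists e (EMP(e) and e.Dept_Name = d.Dept_Name) }
departments_without_employees = {
    d["Dept_Name"]
    for d in DEPT
    if not any(e["Dept_Name"] == d["Dept_Name"] for e in EMP)
}
```

Each comprehension states which tuples are wanted and leaves the order of evaluation to the interpreter, which mirrors the non-procedural style described above.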
S.No.  Basis of Comparison  Relational Algebra            Relational Calculus
1.     Language Type        It is a procedural language.  It is a declarative (non-procedural) language.
performed.