Chapter 5 Part 1

The document discusses the relational model of database management systems (DBMS) as developed by Dr. E.F. Codd, highlighting its structure, components, and key concepts such as keys, integrity rules, and relational algebra operations. It explains the types of relations, including base tables and views, and outlines the importance of data integrity and the mechanisms to enforce it. Additionally, it covers various operations in relational algebra, including selection, projection, and set operations, essential for querying and manipulating data in relational databases.

Uploaded by

saimnaeem9020
Copyright
© © All Rights Reserved

Database Management Systems

Relational Model and Normalization

Chapter 5, Part 1
RELATIONAL MODEL
▪ Dr. E.F. Codd worked to improve the way DBMSs handle large volumes of data.
▪ He applied the rules of mathematics to solve the problems of earlier database models.
▪ Dr. Codd presented the paper “A Relational Model of Data for Large Shared Data Banks” in June 1970.
▪ In 1985 he published a set of 12 rules that a relational DBMS should satisfy.
▪ A DBMS that satisfies these rules is called a fully relational database management system (RDBMS).
▪ The term relation is derived from the set theory of mathematics.
RELATIONAL MODEL
▪ In a relational model, data is stored in relations.
▪ Relation is another term used for table.
▪ A table in a database has a unique name that identifies its contents.
▪ Each table is a grid of rows and columns; a value sits at the intersection of a row and a column.
▪ Every relation consists of tuples. Tuples are also called records or rows.
▪ An attribute is a named column of a relation. Attributes represent the characteristics
of an entity and are also called fields.
DATABASE TERMINOLOGY
RELATIONAL DATABASE
MANAGEMENT SYSTEM
▪ An RDBMS is a DBMS that is based on the relational model.
▪ It is a collection of software programs for creating, maintaining,
modifying and manipulating a relational database.
▪ An RDBMS that satisfies the 12 Rules of Dr. Codd is called a true
RDBMS.
COMPONENTS OF RELATIONAL DATABASE
MANAGEMENT SYSTEM
▪ An RDBMS has several components, each with a specific role in the
functioning of the overall system.
1. File Manager – Allocates space on the disk and manages the way data is organized
and represented in storage.
2. Database Manager – Acts as an interface between users and the data in the database.
3. Query Processor – Provides built-in support to query the database.
4. Data Dictionary – Stores information about data.
5. DML Pre-compiler – Data Manipulation Language (DML) is used to insert, delete
and modify the data in the database. The DML pre-compiler interprets these statements
and interacts with the query processor to generate appropriate code.
6. DDL Compiler – Data Definition Language (DDL) is used to define the database
structure or schema.
TYPES OF RELATION
▪ Base Tables – A base table is a table that exists in the database, created by a user
with SQL statements.
▪ Query Results – A query is a request made by a user to the DBMS to perform
operations on tables; the result of a query is itself a relation.
▪ Views – A view is a virtual table defined by a query over other tables.
KEYS
▪ A Key is an attribute or set of attributes that uniquely identifies a tuple in a relation.
▪ The keys are defined in tables to access or sequence the stored data quickly and
smoothly.
▪ They are also used to create relationships between different tables.
SUPER KEY
▪ A super key is an attribute or set of attributes in a relation that identifies a tuple
uniquely within a relation.
▪ A super key is the most general type of key.
▪ Example: Roll_No is a super key.
CANDIDATE KEY
▪ A candidate key is a super key that contains no extra attributes.
▪ It consists of the minimum possible number of attributes.
▪ Example: the super key {Roll_No, Name} is not a candidate key because it contains
the extra attribute Name; Roll_No alone is a candidate key.
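The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is only an illustration: the Student rows are hypothetical, and because a key is a property of every legal state of a relation, a check on one sample can refute a key but cannot prove one.

```python
from itertools import combinations

# Hypothetical Student relation; the rows are made up for illustration.
STUDENT = [
    {"Roll_No": 1, "Name": "Ali", "Age": 20},
    {"Roll_No": 2, "Name": "Sara", "Age": 21},
    {"Roll_No": 3, "Name": "Ali", "Age": 21},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


print(is_super_key(STUDENT, ("Roll_No", "Name")))      # super key with an extra attribute
print(is_candidate_key(STUDENT, ("Roll_No", "Name")))  # not minimal
print(is_candidate_key(STUDENT, ("Roll_No",)))         # minimal
```

On this sample, {Roll_No, Name} identifies every row but is not minimal, which matches the example above.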
PRIMARY KEY
▪ A primary key is a candidate key that is selected by the database designer to
identify tuples uniquely in a relation.
▪ A relation may contain many candidate keys.
▪ When the designer selects one of them to identify a tuple in the relation, it
becomes a primary key.
ALTERNATE KEY
▪ The candidate keys that are not selected by the database designer as the primary key
are called alternate keys.
COMPOSITE KEY
▪ A primary key that consists of two or more attributes is known as a composite key.
FOREIGN KEY
▪ A foreign key is an attribute or set of attributes in a relation (child table) whose
values match a primary key in another relation (parent table).
RELATIONAL DATA INTEGRITY
▪ Data integrity means reliability and accuracy of data.
▪ Integrity rules are designed to keep the data consistent and correct.
▪ These rules act like a check on the incoming data and are applied to maintain
quality of data.
▪ A DBMS provides several mechanisms to enforce the integrity of the data in a column
and so ensure the quality of data in a database.
▪ Data integrity falls into the following categories:
▪ Entity Integrity
▪ Domain Integrity
▪ Referential Integrity
ENTITY INTEGRITY
▪ This rule ensures that the primary key cannot contain null data.
▪ It is also called row integrity.
▪ If primary key is allowed to have null value, it is not possible to uniquely identify a
tuple in a relation.
DOMAIN INTEGRITY
▪ Set of values that can be stored in a column is called a domain.
▪ Domain integrity ensures restrictions on the values entered in a column.
▪ It specifies the validity of a specific data entry in a column.
▪ The data type of a column enforces domain integrity.
REFERENTIAL INTEGRITY
▪ It preserves the defined relationship between tables when records are added or deleted.
▪ It ensures that key values are consistent across the tables.
▪ A value cannot be inserted into a foreign key if it has no corresponding value in the primary key
field of the related (parent) table.
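The three integrity rules can be sketched as simple checks on a row before it is stored. This is a minimal illustration, not how any particular DBMS implements them: the DEPARTMENT and EMPLOYEE rows, the attribute names, and the use of Python types as domains are all assumptions, and a null foreign key is treated here as allowed.

```python
# Hypothetical parent and child tables; names and rows are illustrative.
DEPARTMENT = [{"DNo": 4, "DName": "Admin"}, {"DNo": 5, "DName": "Research"}]
EMPLOYEE = [
    {"EmpId": 1, "Name": "Asad", "DNo": 5},
    {"EmpId": 2, "Name": "Hina", "DNo": 4},
]


def violates_entity_integrity(row, primary_key):
    """A row breaks entity integrity if any primary-key attribute is null."""
    return any(row.get(attr) is None for attr in primary_key)


def violates_domain_integrity(row, domains):
    """A row breaks domain integrity if a non-null value has the wrong type."""
    return any(
        row.get(attr) is not None and not isinstance(row[attr], expected)
        for attr, expected in domains.items()
    )


def violates_referential_integrity(row, foreign_key, parent_rows, parent_key):
    """A non-null foreign key must match a primary-key value in the parent."""
    value = row.get(foreign_key)
    if value is None:  # assumption: a null foreign key is permitted
        return False
    return all(parent[parent_key] != value for parent in parent_rows)


no_key = {"EmpId": None, "Name": "Omar", "DNo": 4}
bad_type = {"EmpId": "three", "Name": "Omar", "DNo": 4}
orphan = {"EmpId": 3, "Name": "Omar", "DNo": 9}

print(violates_entity_integrity(no_key, ("EmpId",)))
print(violates_domain_integrity(bad_type, {"EmpId": int, "DNo": int}))
print(violates_referential_integrity(orphan, "DNo", DEPARTMENT, "DNo"))
```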
DATABASE LANGUAGES
▪ A data sublanguage consists of two parts:
▪ Data definition Language (DDL)
▪ Data Manipulation Language (DML)
▪ DDL is used to describe and name the entities, attributes, relationships,
associated integrity and security constraints.
▪ DML is used to insert, modify, delete and retrieve the data from database.
▪ Procedural DML
▪ Nonprocedural DML
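The split between DDL and DML can be seen in a short session against an in-memory SQLite database, using Python's standard sqlite3 module. The student table and its rows are hypothetical; SQL is used here as an example of a nonprocedural DML, since each statement says what data is wanted rather than how to fetch it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure (schema) of a table.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# DML: insert, modify, delete and retrieve data.
conn.executemany(
    "INSERT INTO student (roll_no, name) VALUES (?, ?)",
    [(1, "Ali"), (2, "Sara")],
)
conn.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Sarah", 2))
conn.execute("DELETE FROM student WHERE roll_no = ?", (1,))
rows = conn.execute("SELECT roll_no, name FROM student").fetchall()
conn.close()

print(rows)
```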
RELATIONAL ALGEBRA
▪ It is a procedural query language that processes one or more relations to define
another relation without changing original relation.
▪ There are two categories of operations in relational algebra.
▪ Unary Operations
▪ Selection
▪ Projection

▪ Binary Operations
▪ Union
▪ Set difference
▪ Cartesian Product
Unary Relational Operations
• SELECT Operation: used to select a subset of the tuples from
a relation that satisfy a selection condition. It is a filter that keeps
only those tuples that satisfy a qualifying condition.
Examples:
σ DNO = 4 (EMPLOYEE)
σ SALARY > 30,000 (EMPLOYEE)
– denoted by σ <selection condition> (R), where the symbol σ (sigma) is used to
denote the select operator, and the selection condition is a Boolean
expression specified on the attributes of relation R
SELECT Operation Properties
The SELECT operation σ <selection condition> (R) produces a relation S that has the
same schema as R.

The SELECT operation σ is commutative; i.e.,

σ <condition1> (σ <condition2> (R)) = σ <condition2> (σ <condition1> (R))

A cascaded SELECT operation may be applied in any order; i.e.,

σ <condition1> (σ <condition2> (σ <condition3> (R)))
= σ <condition2> (σ <condition3> (σ <condition1> (R)))

A cascaded SELECT operation may be replaced by a single selection with a
conjunction of all the conditions; i.e.,

σ <condition1> (σ <condition2> (σ <condition3> (R)))
= σ <condition1> AND <condition2> AND <condition3> (R)
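These identities can be checked on a small example. The sketch below models a relation as a list of dictionaries and SELECT as a filter; the EMPLOYEE rows are hypothetical, and only the attribute names DNO and SALARY come from the examples above.

```python
# Hypothetical EMPLOYEE rows; only the attribute names come from the slides.
EMPLOYEE = [
    {"SSN": "101", "DNO": 4, "SALARY": 25000},
    {"SSN": "102", "DNO": 4, "SALARY": 43000},
    {"SSN": "103", "DNO": 5, "SALARY": 38000},
    {"SSN": "104", "DNO": 5, "SALARY": 30000},
]


def select(rows, condition):
    """Keep the rows for which condition(row) is true; the schema is unchanged."""
    return [row for row in rows if condition(row)]


def in_dept_4(row):
    return row["DNO"] == 4


def high_paid(row):
    return row["SALARY"] > 30000


# Cascaded selections in either order, and one selection with AND.
one_order = select(select(EMPLOYEE, high_paid), in_dept_4)
other_order = select(select(EMPLOYEE, in_dept_4), high_paid)
combined = select(EMPLOYEE, lambda row: in_dept_4(row) and high_paid(row))

print(one_order == other_order == combined)
```

All three expressions keep only the employee with SSN 102, which is the commutativity and conjunction property stated above.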
Selection Condition
• Operators: <, ≤, ≥, >, =, ≠
• Simple selection condition:
– <attribute> operator <constant>
– <attribute> operator <attribute>
– <condition> AND <condition>
– <condition> OR <condition>
– NOT <condition>

Select Examples
Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1123  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps

σ Id>3000 OR Hobby=‘hiking’ (Person)

σ Id>3000 AND Id<3999 (Person)

σ NOT(Hobby=‘hiking’) (Person)
σ Hobby≠‘hiking’ (Person)
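The four selections above can be evaluated directly on the Person table. This is a minimal sketch that models the relation as a list of dictionaries and each selection condition as a Python function; the helper name select is an assumption made for illustration.

```python
PERSON = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "hiking"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]


def select(rows, condition):
    """Return the rows for which condition(row) is true."""
    return [row for row in rows if condition(row)]


a = select(PERSON, lambda r: r["Id"] > 3000 or r["Hobby"] == "hiking")
b = select(PERSON, lambda r: r["Id"] > 3000 and r["Id"] < 3999)
c = select(PERSON, lambda r: not r["Hobby"] == "hiking")
d = select(PERSON, lambda r: r["Hobby"] != "hiking")

print([r["Name"] for r in a])  # Mary and Bart
print(b)                       # no Id lies between 3000 and 3999
print(c == d)                  # NOT(=) and ≠ select the same tuples
```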
Unary Relational Operations (cont.)
• PROJECT Operation: selects certain columns from the table and
discards the others.
Example:
π LNAME, FNAME, SALARY (EMPLOYEE)
The general form of the project operation is π <attribute list> (R), where π is
the symbol used to represent the project operation and <attribute list> is the
desired list of attributes.
PROJECT Operation Properties

The number of tuples in the result of π <list> (R) is always less than or equal to the
number of tuples in R.

If the attribute list includes a key of R, then the number of tuples is equal to the number of
tuples in R.

π <list1> (π <list2> (R)) = π <list1> (R) as long as <list2> contains the attributes
in <list1>
The number of tuples in the result of π<list>(R) is always less than or equal to the number of tuples
in R.
Example: Consider a relation Student with attributes that include ID, Name and Age. In
π(Name, Age)(Student), duplicate tuples are removed, so the number of tuples in the result ≤ the
number of tuples in the original relation.
If the attribute list includes a key of R, then the number of tuples is equal to the number of tuples in
R.
If we project on a key (a set of attributes that uniquely identify tuples), the number of tuples
remains the same.
Example: In π(ID, Name)(Student), ID is a unique key, so no tuples are removed and the number of
tuples remains 3 (the same as in the original table).
π<list1>(π<list2>(R)) = π<list1>(R) as long as <list2> contains all attributes in <list1>.
If we first apply π<list2>(R) and then apply π<list1>, we get the same result as applying π<list1>(R)
directly, as long as <list2> includes all attributes in <list1>.
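The three PROJECT properties can be checked with a small sketch. The Student rows below are hypothetical (three tuples, two of which share the same Name and Age), and project is an illustrative helper that keeps the listed attributes and removes duplicate tuples.

```python
# Hypothetical Student relation with 3 tuples; values are made up.
STUDENT = [
    {"ID": 1, "Name": "Ali", "Age": 20},
    {"ID": 2, "Name": "Ali", "Age": 20},
    {"ID": 3, "Name": "Sara", "Age": 21},
]


def project(rows, attrs):
    """Keep only attrs and drop duplicate tuples, preserving first-seen order."""
    seen = set()
    result = []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


name_age = project(STUDENT, ("Name", "Age"))   # duplicates collapse
id_name = project(STUDENT, ("ID", "Name"))     # contains the key ID
nested = project(project(STUDENT, ("ID", "Name")), ("Name",))
direct = project(STUDENT, ("Name",))

print(len(name_age), len(id_name), nested == direct)
```

Projecting on (Name, Age) leaves two tuples, projecting on a list that contains the key ID leaves all three, and the nested projection equals the direct one.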
SELECT and PROJECT Operations

(a) σ (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000) (EMPLOYEE)

(b) π LNAME, FNAME, SALARY (EMPLOYEE)
(c) π SEX, SALARY (EMPLOYEE)
Relational Algebra Operations from Set
Theory
• The UNION, INTERSECTION, and MINUS Operations
• The CARTESIAN PRODUCT (or CROSS PRODUCT)
Operation
Set Operators
• A relation is a set of tuples, so set operations apply:
∪, ∩, − (set difference)
• Result of combining two relations with a set operator is a
relation => all elements are tuples with the same structure
• Both relations must be type-compatible, meaning they must
have the same number of attributes and matching data types.

UNION Operation

Denoted by R ∪ S
Result is a relation that includes all tuples that are either in R or in S or in both. Duplicate
tuples are eliminated.
Example: Retrieve the SSNs of all employees who either work in department 5 or directly
supervise an employee who works in department 5:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT1 ← π SSN (DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN (DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in either RESULT1 or RESULT2 or both.
The two operands must be “type compatible”.
UNION Operation
Type (Union) Compatibility

The operand relations R1(A1, A2, ..., An) and R2(B1, B2, ...,
Bn) must have the same number of attributes, and the
domains of corresponding attributes must be compatible,
i.e.
– dom(Ai) = dom(Bi) for i=1, 2, ..., n.

Example
Tables:
Person (SSN, Name, Address, Hobby)
Professor (Id, Name, Office, Phone)
are not union compatible.

But
π Name (Person) and π Name (Professor)
are union compatible so
π Name (Person) − π Name (Professor)
makes sense.
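The compatibility test in the Person and Professor example can be sketched as a check on schemas. Here a schema is a list of (attribute, domain) pairs; the domain labels and the sample names are assumptions made for illustration, since the slides give only the attribute names.

```python
# Schemas as (attribute, domain) pairs; the domain labels are assumed.
PERSON = [("SSN", "ssn"), ("Name", "name"), ("Address", "address"), ("Hobby", "hobby")]
PROFESSOR = [("Id", "id"), ("Name", "name"), ("Office", "office"), ("Phone", "phone")]
NAME_ONLY = [("Name", "name")]


def union_compatible(schema_r, schema_s):
    """Same number of attributes and equal domains, position by position."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )


def union(schema_r, rows_r, schema_s, rows_s):
    """R ∪ S as a set of tuples; refuses operands that are not compatible."""
    if not union_compatible(schema_r, schema_s):
        raise ValueError("relations are not union compatible")
    return set(rows_r) | set(rows_s)


# Hypothetical results of projecting Name from each table.
person_names = {("John",), ("Mary",), ("Bart",)}
professor_names = {("Mary",), ("Smith",)}

print(union_compatible(PERSON, PROFESSOR))
print(sorted(union(NAME_ONLY, person_names, NAME_ONLY, professor_names)))
print(sorted(person_names - professor_names))
```

The full tables fail the check because the domains of corresponding attributes differ, while the two Name projections pass it, so both the union and the difference are defined.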
UNION Example
STUDENT ∪ INSTRUCTOR:
Set Difference Operation
Set Difference (or MINUS) Operation
The result of this operation, denoted by R − S, is a relation that includes all
tuples that are in R but not in S. The two operands must be "type compatible".
Set Difference Example
S1
SID  SName   Age
473  Popeye  22
192  Jose    22
715  Alicia  28
914  Hal     24

S2
SID  SName   Age
202  Rusty   21
403  Marcia  20
914  Hal     24
192  Jose    22
881  Stimpy  19
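Because a relation is a set of tuples, Python's built-in set type is enough to work through this example. The sketch below uses the S1 and S2 rows from the tables above, with each tuple ordered as (SID, SName, Age).

```python
# Tuples are (SID, SName, Age), taken from the S1 and S2 tables.
S1 = {
    (473, "Popeye", 22),
    (192, "Jose", 22),
    (715, "Alicia", 28),
    (914, "Hal", 24),
}
S2 = {
    (202, "Rusty", 21),
    (403, "Marcia", 20),
    (914, "Hal", 24),
    (192, "Jose", 22),
    (881, "Stimpy", 19),
}

s1_minus_s2 = S1 - S2   # tuples in S1 but not in S2
s2_minus_s1 = S2 - S1   # tuples in S2 but not in S1
both = S1 & S2          # intersection
either = S1 | S2        # union, duplicates eliminated

print(sorted(s1_minus_s2))
print(sorted(s2_minus_s1))
print(len(both), len(either))
```

S1 − S2 contains Popeye and Alicia, while S2 − S1 contains Rusty, Marcia and Stimpy, so the two differences are not the same relation.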
Relational Algebra Operations From
Set Theory (cont.)
• Union and intersection are commutative operations:
R ∪ S = S ∪ R, and R ∩ S = S ∩ R

• Both union and intersection can be treated as n-ary operations applicable to any number
of relations as both are associative operations; that is
R ∪ (S ∪ T) = (R ∪ S) ∪ T, and
(R ∩ S) ∩ T = R ∩ (S ∩ T)

• The minus operation is not commutative; that is, in general
R − S ≠ S − R
Cartesian (Cross) Product
• If R and S are two relations, R × S is the set of all concatenated tuples <x,y>, where x is a
tuple in R and y is a tuple in S
– R and S need not be union compatible
• R × S is expensive to compute:
– Factor of two in the size of each row; quadratic in the number of rows

R
A   B
x1  x2
x3  x4

S
C   D
y1  y2
y3  y4

R × S
A   B   C   D
x1  x2  y1  y2
x1  x2  y3  y4
x3  x4  y1  y2
x3  x4  y3  y4
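The R × S table above can be reproduced with itertools.product from the Python standard library. In this sketch each relation is a list of tuples, and the function name cartesian_product is an illustrative choice.

```python
from itertools import product

R = [("x1", "x2"), ("x3", "x4")]  # schema (A, B)
S = [("y1", "y2"), ("y3", "y4")]  # schema (C, D)


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return [x + y for x, y in product(r, s)]


r_times_s = cartesian_product(R, S)
for row in r_times_s:
    print(row)
```

The result has len(R) × len(S) = 4 tuples, each with the attributes of both operands, which is why the operation grows quickly with the size of its inputs.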
Cartesian Product Example

• We want a list of COMPANY’s female employees’ dependents.
Binary Relational Operations: JOIN and
DIVISION
• The Theta JOIN Operation
• The EQUIJOIN and NATURAL JOIN variations of JOIN
• The DIVISION Operation
JOIN Operation
• A Cartesian product followed by a select is commonly used to identify and select
related tuples from two relations. This combination is called JOIN and is denoted by ⋈.
– This operation is important for any relational database with more than a single relation,
because it allows us to process relationships among relations.
– The general form of a join operation on two relations R(A1, A2, . . ., An) and
S(B1, B2, . . ., Bm) is:
R ⋈<join condition> S
where R and S can be any relations that result from general relational algebra expressions.
Theta JOIN

• This is based on a predicate added to a Cartesian product. In simple
terms, if you have joined two tables using CROSS JOIN, you
can add a filter to the result using one of the comparison operators.
See the example given. Note that it can be implemented using
SELECTION over a Cartesian product as well.
Equi Join
• This is the same as Theta Join, but the comparison operator is
equality. If the operator of the Theta Join is the equality
operator (=), the join is called an Equijoin instead of a
Theta Join.
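A theta join can be sketched as a selection over a Cartesian product, with an equijoin as the special case where the condition is an equality. The EMPLOYEE and DEPARTMENT rows and the MIN_SALARY attribute are hypothetical, and the sketch assumes the two relations share no attribute names.

```python
from itertools import product

# Hypothetical relations; attribute names do not overlap.
EMPLOYEE = [
    {"ENAME": "Asad", "DNO": 5, "SALARY": 40000},
    {"ENAME": "Hina", "DNO": 4, "SALARY": 25000},
]
DEPARTMENT = [
    {"DNUMBER": 4, "DNAME": "Admin", "MIN_SALARY": 20000},
    {"DNUMBER": 5, "DNAME": "Research", "MIN_SALARY": 35000},
]


def theta_join(r, s, condition):
    """Cartesian product of r and s, keeping pairs that satisfy condition."""
    return [{**x, **y} for x, y in product(r, s) if condition(x, y)]


# Equijoin: the comparison operator is "=".
equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["DNO"] == d["DNUMBER"])

# Theta join with ">": pair each employee with departments they out-earn.
theta = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["SALARY"] > d["MIN_SALARY"])

print([(row["ENAME"], row["DNAME"]) for row in equi])
print([(row["ENAME"], row["DNAME"]) for row in theta])
```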
Natural Join
• Natural Join is an Equijoin of two relations over all common attributes.
In other words, when two tables are joined, the join uses all common
columns, so an explicit predicate is not required. It also removes the
duplicate attributes from the result.
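A natural join can be sketched by finding the attribute names the two relations share and matching on all of them. The rows below are hypothetical; merging the two dictionaries keeps a single copy of each common attribute, which corresponds to removing the duplicate columns.

```python
# Hypothetical relations that share the attribute DNO.
EMPLOYEE = [
    {"ENAME": "Asad", "DNO": 5},
    {"ENAME": "Hina", "DNO": 4},
    {"ENAME": "Omar", "DNO": 7},
]
DEPARTMENT = [
    {"DNO": 4, "DNAME": "Admin"},
    {"DNO": 5, "DNAME": "Research"},
]


def natural_join(r, s):
    """Equijoin on every attribute name that r and s have in common."""
    if not r or not s:
        return []
    common = [attr for attr in r[0] if attr in s[0]]
    return [
        {**x, **y}
        for x in r
        for y in s
        if all(x[attr] == y[attr] for attr in common)
    ]


joined = natural_join(EMPLOYEE, DEPARTMENT)
for row in joined:
    print(row)
```

Omar does not appear in the result because no department has DNO 7; tuples without a match are dropped by this kind of join.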
Outer Join
• An outer join includes the tuples that meet the join condition and,
in addition, the tuples from the left relation, the right relation, or
both that do not meet it. The attributes that have no value for an
unmatched tuple are filled with nulls.
Based on whether the unmatched tuples are added from the left, the right or both
tables, the outer join is further divided into three types.
• Left Outer join
• Right Outer join
• Full outer join
Left Outer Join( ⟕)
• Left Outer Join is a type of join in which all the tuples from the left relation are
included, and tuples from the right relation are included only when they have a
matching value in the common attribute on which the join is being performed.
Notation: R1⟕R2 where R1 and R2 are relations.

Right Outer Join(⟖)


• Right Outer Join is a type of join in which all the tuples from the right
relation are included, and tuples from the left relation are
included only when they have a matching value in the common attribute on
which the join is being performed.

Notation: R1 ⟖ R2 where R1 and R2 are relations.


Full Outer Join(⟗)
• Full Outer Join is a type of join that includes all the tuples from the
left and right relations that have the same value on the
common attribute, together with all the remaining tuples from
both relations that have no match.
Notation: R1 ⟗ R2 where R1 and R2 are relations.
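The three outer joins can be sketched on top of one helper. The relations below are hypothetical, None stands in for a null value, and the sketch assumes the two relations share only the join attribute.

```python
# Hypothetical relations joined on the common attribute DNO.
EMP = [{"DNO": 4, "ENAME": "Hina"}, {"DNO": 7, "ENAME": "Omar"}]
DEPT = [{"DNO": 4, "DNAME": "Admin"}, {"DNO": 5, "DNAME": "Research"}]


def left_outer_join(r, s, attr):
    """All tuples of r; unmatched ones get None for the attributes of s."""
    s_attrs = [a for a in (s[0] if s else {}) if a != attr]
    result = []
    for x in r:
        matches = [y for y in s if y[attr] == x[attr]]
        if matches:
            result.extend({**x, **y} for y in matches)
        else:
            result.append({**x, **{a: None for a in s_attrs}})
    return result


def right_outer_join(r, s, attr):
    """All tuples of s; the mirror image of the left outer join."""
    return left_outer_join(s, r, attr)


def full_outer_join(r, s, attr):
    """Left outer join plus the tuples of s that matched nothing in r."""
    r_attrs = [a for a in (r[0] if r else {}) if a != attr]
    r_values = {x[attr] for x in r}
    result = left_outer_join(r, s, attr)
    for y in s:
        if y[attr] not in r_values:
            result.append({**{a: None for a in r_attrs}, **y})
    return result


left = left_outer_join(EMP, DEPT, "DNO")
right = right_outer_join(EMP, DEPT, "DNO")
full = full_outer_join(EMP, DEPT, "DNO")
print(left)
print(right)
print(full)
```

Omar (DNO 7) survives the left join with a null DNAME, Research (DNO 5) survives the right join with a null ENAME, and the full join keeps both.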
