Database Management Systems 2013-14

Database Management Systems Notes for Graduates

Database Management Systems

1. Introduction
Data: Input given to the computer in the form of numbers, letters etc.
is called data.
Database: A collection of data stored in a computer, containing
information relevant to an enterprise, is called a database.
DBMS: A database management system (DBMS) is a collection of
interrelated data and a set of programs to access and modify that data.
A DBMS provides a convenient and efficient way to store and retrieve
database information.
Applications of DBMS
1. Banking: It is helpful for storing and processing customer
information, accounts, loans and banking information.
2. Universities: For student information, course registrations, marks,
grades, teacher information etc.
3. Telecommunication: For keeping records of calls made, generating
monthly bills, maintaining balances on prepaid cards and storing
information about the communication networks.
4. Human Resources: For managing information about employees, their
payroll, access and generation of paycheques.
5. Airlines: Airlines were among the first to use databases in a
geographically distributed manner.
6. Credit card transactions
7. Finance: In the share market.
8. Sales: For customer, product and purchase information.
9. Research: For analyzing data using data warehouses.
10. Hospitals: To manage patients' disease history etc.
11. Manufacturing: For management of supply and tracking production of
items.
Purpose of DBMS
o Prior to database systems, file processing systems were used, in
which data were stored in operating system files.
o Application programs were developed to allow users to manipulate the
data and add new information.
The above conventional system has many disadvantages:
1. Data redundancy and inconsistencies:
As requirements arise, the software is updated. Different programmers
create the files and application programs over time, so the same data
may be duplicated in different files. This leads to higher storage and
access cost.
This redundancy may also lead to inconsistency, when information
in one file is changed and the same information in another file is
left unchanged.
2. Difficulty in accessing data:
The conventional file processing environment does not allow needed
data to be retrieved in a convenient and efficient manner, i.e. a
subset of the contents of the files cannot be retrieved easily.
3. Data isolation:
Since data is scattered in different files, possibly in different
formats, writing applications to retrieve this data may be difficult.
4. Integrity problems:
Data values stored in the database must satisfy certain consistency
constraints; these cannot be enforced properly without a DBMS.

Department of Computer Science, Govt Science College, Chitradurga Page 1


Example: The balance of certain bank accounts must not fall below a
certain minimum amount. Register numbers and account numbers must be
unique and cannot be null.
5. Atomicity Problems:
A computer system, like any other device, is subject to failure. It
is crucial that after a failure the database is restored to the
consistent state that existed prior to the failure.
Ex: Suppose a certain amount is to be transferred from A's account to
B's account. If the system fails after A is debited but before B is
credited, the database is left in an inconsistent state.
To avoid inconsistency, transactions must be atomic, i.e. either
both the credit and the debit should occur or neither should occur.
Atomicity is difficult to enforce in a file processing system.
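The transfer example can be sketched in Python with the standard sqlite3 module. This is only an illustrative sketch: the table name account, the helper names transfer and balances, and the starting balances are assumptions, and the system failure is simulated by raising an exception after the debit.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as a single atomic transaction."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # every change made inside it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure after the debit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit has already been rolled back


def balances(conn):
    """Return {acc_no: balance} for all accounts."""
    return dict(conn.execute("SELECT acc_no, balance FROM account"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 500)])
conn.commit()

transfer(conn, "A", "B", 300, fail_midway=True)
after_failure = balances(conn)   # neither the debit nor the credit is visible

transfer(conn, "A", "B", 300)
after_success = balances(conn)   # both the debit and the credit are visible
print(after_failure, after_success)
```

The failed transfer leaves both balances untouched, which is the "neither should occur" half of atomicity; the successful one applies the debit and the credit together.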
6. Concurrent Access Anomalies:
For the sake of overall performance and faster response, many
systems allow multiple users to update the data simultaneously. If a
file processing system is used, concurrent updates may result in
inconsistent data.
Ex: Consider account A with balance 1000, and two customers at
different branches who withdraw 400 and 200 respectively at the same
time. The programs at each terminal both read 1000 from account A and
update it to 600 and 800 respectively after the withdrawal. Depending
on which update is written last, the account will contain 600 or 800,
instead of the correct value 400. To guard against this possibility,
the system must supervise concurrent updates.
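The arithmetic of this anomaly can be replayed step by step. The sketch below is a plain Python simulation, not real concurrent code; the variable names are illustrative assumptions, and the interleaving is written out by hand in the order described above.

```python
# Unsupervised interleaving: both terminals read before either writes.
balance = 1000
read_1 = balance          # terminal 1 reads 1000
read_2 = balance          # terminal 2 also reads 1000
balance = read_1 - 400    # terminal 1 writes 600
balance = read_2 - 200    # terminal 2 overwrites it with 800
lost_update_result = balance

# Supervised (serial) execution: the second withdrawal sees the first.
balance = 1000
balance = balance - 400   # first withdrawal leaves 600
balance = balance - 200   # second withdrawal leaves 400
serial_result = balance

print(lost_update_result, serial_result)
```

The first withdrawal is lost in the interleaved run because the second terminal computes its result from a stale read.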
7. Security Problems:
Enforcing security on critical data, such as banking data, is very
difficult when file processing systems are used, because application
programs are added as and when needed.
View of Data:
A DBMS hides certain details of how the data is stored and maintained,
through different levels of abstraction.
Data Abstraction:
To retrieve data efficiently, database designers use complex data
structures to represent data in the database.
The DBMS developers hide this complexity from the users through
several levels of abstraction.

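Figure: The three levels of data abstraction. The view level (View 1,
View 2, ..., View n) is at the top, the logical level is below it and
the physical level is at the bottom.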

1. Physical level:
This is the lowest level of abstraction and describes how the data is
actually stored. The physical level describes complex low level structures
in detail.


2. Logical level:
The next level describes what data is stored in the database and what
relationships exist among the data. Database administrators, who must
decide what information to keep in the database, use the logical level
of abstraction. The logical level describes the entire database in terms
of a small number of relatively simple structures. The implementation of
these simple structures may involve complex physical-level structures.
3. View level:
The view level of abstraction exists to simplify user interaction with
the system. The view level describes only a part of the entire database.
The system may provide many views of the same database.
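As a small illustration of the view level, the sketch below uses Python's sqlite3 module to define a view that exposes only part of a table. The table employee, the view staff_directory and the salary figures are hypothetical and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full employee relation, including salary.
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT, designation TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(101, "Mahesh", "Principal", 90000), (104, "Naveen", "Professor", 70000)],
)

# View level: one view of the same data that hides the salary column.
conn.execute(
    "CREATE VIEW staff_directory AS SELECT id, name, designation FROM employee"
)

cursor = conn.execute("SELECT * FROM staff_directory ORDER BY id")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

A user who queries staff_directory sees the same employees but never the salary column, without knowing how the underlying table is stored.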

Instance and Schemas:


Schemas: The overall design of the database is called the schema.
Schemas are changed rarely.
Instances: The collection of information stored in the database at a
particular moment is called an instance of the database.
Corresponding to the levels of abstraction, a database system has
several schemas.
1. Physical schema: Database design at the physical level.
2. Logical schema: Database design at the logical level.
3. Subschemas: Schemas at the view level, describing the different
views of the database.
Data Model
It is a collection of conceptual tools for describing data, data
relationships, data semantics and consistency constraints.
A data model describes the design of a database at the physical,
logical and view levels.
1. Relational model:
The relational model uses a collection of tables to represent both
data and the relationships among those data. Each table contains records
of a particular type. Each record type defines a fixed number of fields
or attributes. The columns of the table correspond to the attributes of
the record type. The relational model is the most widely used data
model.
2. Entity Relationship model:
The E-R model consists of a collection of basic objects called entities
and relationships among these objects. An entity is a thing or object in
the real world that is distinguishable from other objects. The E-R model
is widely used in database design.
3. Object-Based Data Model:
The object-oriented data model is an extension of the E-R model with
notions of encapsulation, methods and object identity. This is a new and
evolving technique in DBMS.
4. Semi-structured data model:
In this model we can specify different sets of attributes for
individual data items of the same type.
Ex: Employee records can have different attributes for different
persons. XML [Extensible Markup Language] is widely used to represent
semi-structured data.
The network data model and the hierarchical data model preceded the
relational data model. These models complicate the task of modeling data
by concentrating on the underlying implementation (physical level) and
so are now rarely used.


Database languages
A database system provides a Data Definition Language (DDL) to specify
the database schema and a Data Manipulation Language (DML) to express
database queries and updates. In practice the DDL and DML are parts of a
single language, the Structured Query Language (SQL).
Data Manipulation Language (DML):
DMLs are of two types:
1. Procedural DML: The user has to specify what data is needed and how
to get this data.
2. Non-procedural DML or Declarative DML: The user has to specify what
data is needed but need not specify how to get the data.
A DML enables users to access or manipulate data as organized by the
appropriate data model.
A query is a statement requesting the retrieval of information from
the database. The portion of a DML that involves information retrieval
is called a query language. In common practice the terms DML and query
language are used synonymously.
SQL (Structured Query Language)
SQL is the most widely used query language. The query processor
component of the DBMS translates DML queries into a sequence of actions
at the physical level.
Types of DML statements
1. Insertion of new information into a database.
Ex: INSERT INTO <table_name> VALUES (<val1>, <val2>, ...)
2. Deletion of information from the database.
Ex: DELETE FROM <table_name> WHERE <condition>
3. Modification of information stored in the database.
Ex: UPDATE <table_name> SET <col_name> = <value> WHERE <condition>
4. Retrieval of information.
Ex: SELECT <col1>, <col2>, ..., <coln> FROM <table1>, <table2>, ...
WHERE <condition>
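The four statement types can be tried out with Python's built-in sqlite3 module. The sketch below is illustrative only: the student table, its columns and the rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table is created with a DDL statement; the rest are DML statements.
conn.execute(
    "CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# 1. Insertion
conn.execute("INSERT INTO student VALUES (1, 'Asha', 72)")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 35)")
conn.execute("INSERT INTO student VALUES (3, 'Kiran', 58)")

# 3. Modification
conn.execute("UPDATE student SET marks = 40 WHERE regno = 2")

# 2. Deletion
conn.execute("DELETE FROM student WHERE regno = 3")

# 4. Retrieval
result = conn.execute(
    "SELECT regno, name, marks FROM student WHERE marks >= 40 ORDER BY regno"
).fetchall()
print(result)
```

The SELECT statement is declarative: it states which rows are wanted and leaves it to the DBMS to decide how to find them.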

Data Definition Language (DDL):


A database schema is specified by a set of definitions expressed in a
special language called the Data Definition Language (DDL). Like
statements of any other programming language, DDL statements take
instructions as input and generate output, which is stored in the data
dictionary. The data dictionary contains metadata, i.e. data about data.
The data dictionary is a special type of table that is accessed and
updated only by the DBMS, and is consulted whenever actual data is read
or modified.
The storage structure and access methods of the database are specified
by a set of DDL statements called the data storage and definition
language. These statements define the implementation details of the
database schema.
The data values stored in the database must satisfy certain consistency
constraints:
1. Domain constraints
The data type of the values that may be stored in a particular column
is defined by this constraint (the most elementary form of integrity
constraint).
2. Referential integrity
A value that appears in one relation for a given set of attributes
must also appear for a certain set of attributes in another relation.
This condition is called referential integrity. A database modification
that would violate this condition must be rejected.


3. Assertions
An assertion is a condition that the database must always satisfy: a
data value has to follow certain validity rules before it is inserted
into a particular column.
Ex: The balance in the account table of a bank database cannot be
negative.
Domain constraints and referential integrity are special forms of
assertions.
4. Authorization
The differentiation among database users, made to control the type of
access each is permitted, is expressed by authorization.
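Some of these constraints can be declared in DDL and are then enforced by the DBMS itself. The following is a minimal sketch using Python's sqlite3 module; the tables branch and account, the helper rejected and the sample values are assumptions for illustration. Note that SQLite applies declared column types loosely and enforces foreign keys only after PRAGMA foreign_keys = ON, so the sketch relies on NOT NULL, CHECK and REFERENCES clauses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE account (
        acc_no      TEXT PRIMARY KEY,                              -- unique, identifies the account
        branch_name TEXT NOT NULL REFERENCES branch(branch_name),  -- referential integrity
        balance     INTEGER NOT NULL CHECK (balance >= 0)          -- balance cannot be negative
    )
    """
)
conn.execute("INSERT INTO branch VALUES ('Chitradurga')")
conn.execute("INSERT INTO account VALUES ('A-101', 'Chitradurga', 500)")


def rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


negative_balance_rejected = rejected(
    "INSERT INTO account VALUES ('A-102', 'Chitradurga', -50)"
)
unknown_branch_rejected = rejected(
    "INSERT INTO account VALUES ('A-103', 'Nowhere', 100)"
)
duplicate_key_rejected = rejected(
    "INSERT INTO account VALUES ('A-101', 'Chitradurga', 10)"
)
account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(negative_balance_rejected, unknown_branch_rejected, duplicate_key_rejected)
```

All three invalid insertions are refused, so only the original account remains in the table.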

Data Storage and querying


The functional components of a database system can be classified into
the storage manager and the query processor.
Storage Manager:
It is the program module that provides the interface between the
low-level data stored in the database and the application programs and
queries submitted to the system.
The storage manager translates the various DML statements into
low-level file system commands. Since data movement between main memory
and disk storage is slow compared to the speed of the CPU, the storage
manager should also minimize data movement between main memory and disk.
The storage manager is responsible for interacting with the file manager
for storing, retrieving and updating data in the database.
Components of Storage Manager
1. Authorization and integrity manager
Checks the authority of the user to access data and tests for the
satisfaction of integrity constraints.
2. Transaction manager
Ensures that the database remains in a consistent state despite system
failures, and that concurrent transactions execute without conflicting.
3. File Manager
It manages the allocation of disk storage space and the data structures
used to represent information stored on the disk.
4. Buffer Manager:
It enables the database to handle data sizes that are much larger than
the size of main memory. It is responsible for fetching data from the
disk into main memory and for deciding what data to keep cached in main
memory.
Data structures used by storage manager
1. Data files: Store the database information itself.
2. Data Dictionary: Stores metadata about the structure of the database
and its objects.
3. Indices: Provide fast access to those data items that hold a
particular value.
Query processor
1. DDL interpreter:
It interprets the DDL statements and records the definitions in the
data dictionary.
2. DML compiler:
Translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine
understands. The DML compiler performs query optimization to pick the
lowest-cost evaluation plan among the alternatives.
3. Query evaluation engine:
Executes the low-level instructions generated by the DML compiler.
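Many systems let the user inspect the evaluation plan chosen by the optimizer. As one hedged illustration, SQLite offers the EXPLAIN QUERY PLAN statement; the sketch below runs it through Python's sqlite3 module. The table account and the index idx_account_branch are hypothetical, and the exact wording of the plan differs between SQLite versions, so the code only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT, branch_name TEXT, balance INTEGER)")
conn.execute("CREATE INDEX idx_account_branch ON account(branch_name)")

# Ask SQLite which plan it would use, without running the query itself.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT acc_no FROM account WHERE branch_name = 'Chitradurga'"
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "idx_account_branch" in plan_text
print(plan_text)
```

Among the alternatives (scanning the whole table or searching the index), the optimizer here is expected to prefer the index, since the query asks for rows with one particular branch_name value.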


Transaction Manager:
When several operations on the database form a single logical unit
of work, either all of them or none of them must take place; this
all-or-none requirement is called atomicity. The requirement that the
database state is still correct after the transaction is called
consistency.
After the successful execution of a transaction, the new values must
persist in the database despite the possibility of system failure. This
persistence requirement is called durability.
A transaction is a collection of operations that performs a single
logical function in a database application. Each transaction is a unit
of both atomicity and consistency. In the case of a transfer of funds
from account A to account B, one operation debits A and another credits
B; during the execution of the transaction it may be necessary to allow
temporary inconsistency.
Ensuring the atomicity and durability properties is the
responsibility of the transaction-management component. Because of
various types of failure, a transaction may not always complete its
execution successfully. In that case the failure recovery component
must detect the failure and restore the database to the state it was in
before the transaction started.
When several transactions update the database concurrently, the
consistency of data may not be preserved even though each individual
transaction is correct. The concurrency control manager controls the
interaction among the concurrent transactions to ensure the consistency
of the database.
Database Users and Administrators
Database Users
Based on the way users interact with the DBMS, they are classified into
4 types:
1. Naive users: They are unsophisticated users who interact with the
system by invoking one of the application programs that have been
written previously. The typical user interface for naive users is a
forms interface: the user fills in the appropriate fields of the form to
interact with the database.
Ex: ATM users, bank cashiers
2. Application programmers: They develop user interfaces using
development tools that enable an application programmer to construct
forms and reports with minimum effort.
3. Sophisticated users: These are the users who submit queries written
in a DML (query language) to explore the database.
4. Specialized users: These are users who write specialized database
applications, such as computer-aided design systems, which store complex
data such as graphics, audio, video etc.

Database Administrators (DBA)


A person who has central control over the database system is called
the DBA. Some of the functions of the DBA are:
1. Schema definition: Creating the original database schema using DDL.
2. Storage structure and access method definition.
3. Schema and physical organization modification: Changing the schema
and physical organization as per the requirements of the organization or
to improve performance.
4. Granting of authorization for data access: The DBA regulates which
parts of the database the various users can access.
5. Routine management:
a. Backing up the database periodically.
b. Monitoring free disk space and upgrading disk space if
required.
c. Monitoring jobs running on the database and their performance.


Database Architecture
Figure: System structure of a database system, from top to bottom.
Users and the interfaces they use:
  Naive users (tellers, agents, web users) : Application interfaces
  Application programmers                  : Application programs
  Sophisticated users (analysts)           : Query tools
  Database administrators                  : Administration tools
Query processor: compiler and linker, application program object code,
DML queries, DML compiler and organizer, DDL interpreter, query
evaluation engine.
Storage manager: buffer manager, file manager, authorization and
integrity manager, transaction manager.
Disk storage: data, indices, data dictionary, statistical data.

The architecture of a database system is influenced by the underlying
computer system on which the database runs. Database systems can be
either centralized or client-server.
Most databases nowadays run on client-server systems, that is, over a
distributed system. The database application is partitioned into two or
three parts.
Two-tier architecture:
The application has two components:
1. The application residing on the client machine, which invokes
2. the database system on the server machine.


Figure: Two-tier architecture. Client side: user and application.
Network. Server side: database system.

Three-tier architecture:
1. The client machine acts as a front end and communicates with the
application server through a forms interface.
2. The application server in turn communicates with the database system
to access data.

Figure: Three-tier architecture. Client side: user and application.
Network. Server side: application server and database system.


2. E-R Model
The E-R model is very useful in mapping the meanings and interactions
of real-world enterprises onto a conceptual schema. The E-R model
employs 3 basic notions: entity sets, relationship sets and attributes.
Entity Set:
An entity is a thing or object in the real world that is
distinguishable from all other objects. Ex: a person in an enterprise.
An entity set is a set of entities of the same type that share the same
properties (or) attributes.
Ex: employees in an organization.
Entity sets need not be disjoint.
Ex: employee & customer are entity sets.
A person may be an entity in both entity sets, in only one of them, or
in neither. An entity is represented by a set of attributes, i.e.
descriptive properties possessed by each member of the entity set. Each
entity may have its own value for each attribute.
Employee table
Id    Name            Address   Designation
101   Vivekananda     CTA       Principal
104   Nagaraj         DVG       Professor
108   Govindraju      CTA       Professor
109   Channakeshava   CTA       Asst Professor
...   ...             ...       ...

Ex: 1st entity (id: 101) (name: Vivekananda) (address: CTA)
(designation: Principal)

Relationship set:
A relationship is an association among several entities.
Ex: Govindraj is working in the Maths department.
A relationship set is a set of relationships of the same type. It is a
mathematical relation on n >= 2 entity sets.
If E1, E2, ..., En are entity sets, then a relationship set R is a
subset of
{(e1, e2, ..., en) | e1 ∈ E1, e2 ∈ E2, ..., en ∈ En}
where (e1, e2, ..., en) is a relationship.
The association between entity sets is referred to as participation:
the entity sets E1, E2, ..., En participate in relationship set R. A
relationship instance in an E-R schema represents an association between
the named entities in the real-world enterprise. The function that an
entity plays in a relationship is called the role of that entity.
If the same entity set participates in a relationship set more than
once, in different roles, it is called a recursive relationship set.
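The definition of a relationship set as a subset of the product of the participating entity sets can be checked directly with Python sets. This is only a sketch: the pair (Govindraj, Maths) comes from the example in the text, while the pair (Naveen, CS) and the name works_in are assumptions added for illustration.

```python
from itertools import product

# Two entity sets E1 and E2.
employees = {"Mahesh", "Naveen", "Govindraj"}
departments = {"Administrative", "CS", "Maths"}

# All tuples (e1, e2) with e1 in E1 and e2 in E2.
all_pairs = set(product(employees, departments))

# A relationship set picks out the pairs that are actually associated.
works_in = {("Govindraj", "Maths"), ("Naveen", "CS")}

is_relationship_set = works_in <= all_pairs   # subset test
print(len(all_pairs), is_relationship_set)
```

Each element of works_in is one relationship; the set as a whole is the relationship set.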

E1: Employee
Id    Name            Address   Designation
101   Mahesh          CTA       Principal
104   Naveen          DVG       Professor
108   Govindraj       CTA       Professor
109   Channakeshava   CTA       Asst Professor

E2: Department
Dept id   Dept name
01        Administrative
02        CS
03        Maths

E3: Committees
C_ID   Committee Name
01     Admission
02     Time table
03     NSS


A relationship may also have attributes, called descriptive attributes.
Ex: The relationship set depositor between the entity sets customer and
account may be associated with the attribute access date.

Figure: The depositor relationship set between depositors (Mahesh,
Sandesh, Murthy, Ravi) and accounts (A-101, A-102, A-105, A-115). Each
link is labelled with an access date (12-07-2011, 25-04-2010,
23-01-2010, 02-11-2001, 06-01-2008).

The relationship set depositor is an example of a binary relationship
set, i.e. one that involves two entity sets. Most of the relationship
sets in a database system are binary.
Attributes
The set of permitted values for an attribute is called the domain (or)
value set of that attribute.
Since an entity set may have several attributes, each entity can be
described by a set of (attribute name, data value) pairs, one pair for
each attribute of the entity set.
Ex: A particular employee entity may be described by the set
{(id, 101), (name, Mahesh), (address, CTA), (desg, Principal)}
Attributes can be characterized as
Simple & Composite attributes:
A simple attribute contains only one part and cannot be divided into
sub parts. Ex: emp-id.
A composite attribute can be divided into sub parts.
Ex: Name can be divided into first name, middle name and last name.

Name
  First Name, Middle Name, Last Name
Address
  Street
    Apartment_no, Street_no, Street_name
  City
  State
  Postal Code

A composite attribute may also appear as a hierarchy, as Address does
above. Using a composite attribute is a good choice if a user will wish
to refer to a component of that attribute.
Single & multi valued attributes:
An attribute that has only one value for a particular entity is called
a single-valued attribute. Ex: Employee id; an employee can have only
one id at a time.
Some attributes of an entity can have more than one value, or sometimes
none. Such attributes are called multivalued attributes. Ex: A person
can have more than one phone number, more than one car etc.


Derived Attribute:
The value of this type of attribute can be derived from the values of
other related attributes. Such attributes are called derived
attributes. Ex: Age can be derived from Date_of_Birth.
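A derived attribute is normally computed when needed rather than stored. The sketch below shows this for age in Python; the class Person, the method age and the date of birth used are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    date_of_birth: date   # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: completed years between date_of_birth and today."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


p = Person("Mahesh", date(1980, 7, 12))
age_before_birthday = p.age(date(2014, 7, 11))
age_on_birthday = p.age(date(2014, 7, 12))
print(age_before_birthday, age_on_birthday)
```

Because age is computed from date_of_birth each time, it can never become stale, which is the reason for not storing it as a separate attribute.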
Null values:
An attribute takes a null value when an entity does not have a value
for it. Null is not equivalent to zero; null means that the value is not
known or does not exist.

Constraints:
Mapping cardinalities, key constraints and participation constraints
are constraints to which the contents of a database must conform.
1. Mapping cardinalities:
It expresses the number of entities to which another entity can be
associated via a relationship set.
A) One-one: An entity in A is associated with at most one entity in B,
and an entity in B is associated with at most one entity in A.
(Figure: one-one mapping between entity sets A = {a1, a2, a3, a4} and
B = {b1, b2, b3, b4}.)

B) One-many: An entity in A can be associated with any number of
entities in B, and an entity in B can be associated with at most one
entity in A.
(Figure: one-many mapping between entity sets A = {a1, a2, a3, a4} and
B = {b1, b2, b3, b4}.)

C) Many-one: An entity in A is associated with at most one entity in B,
but an entity in B can be associated with any number of entities in A.
(Figure: many-one mapping between entity sets A = {a1, a2, a3, a4} and
B = {b1, b2, b3, b4}.)

D) Many-many: An entity in A is associated with any number of entities
in B, and an entity in B can be associated with any number of entities
in A.
(Figure: many-many mapping between entity sets A = {a1, a2, a3, a4} and
B = {b1, b2, b3, b4}.)
2. Key Constraints:
A key is a property of an entity set rather than of an individual
entity. Individual entities in a database are conceptually distinct.
Therefore, to distinguish them, the values of the attributes of an
entity must be such that they can uniquely identify that entity.
A key allows us to identify a set of attributes that suffices to
distinguish entities from each other. Keys also help to uniquely
identify relationships.
Keys on Entity Sets:
a) Super key: a set of one or more attributes that, taken
collectively, allows us to identify an entity uniquely in the entity
set.
Ex: (Customer_id), (customer_name, address, age), (Student_Name,
Class, Age).
b) Candidate key: a minimal super key, i.e. a super key such that no
proper subset of it is also a super key.
Ex: (Cus_id), (Cus_name, Street_no)
c) Primary key: the term used to denote the candidate key that is
chosen by the database designer to identify entities.
Properties:
a) A candidate key must be chosen such that it has a minimal set of
key attributes.
b) No two entities of a set are permitted to have the same value for
the key.
c) The primary key / candidate key must be chosen such that its value
changes very rarely or never.
(An address is likely to change, so it should be avoided.)
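The difference between a super key and a candidate key can be tested mechanically on sample data. The sketch below is a minimal Python illustration; the functions is_super_key and is_candidate_key and the customer rows are assumptions. Since a key is a property of the entity set as a whole, a check on one sample can only show that a set of attributes is not a key; it cannot prove that it always will be one.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


customers = [
    {"cus_id": 1, "cus_name": "Asha", "street_no": 12},
    {"cus_id": 2, "cus_name": "Asha", "street_no": 40},
    {"cus_id": 3, "cus_name": "Ravi", "street_no": 12},
]

id_is_super_key = is_super_key(customers, ("cus_id",))
name_is_super_key = is_super_key(customers, ("cus_name",))
id_name_is_candidate = is_candidate_key(customers, ("cus_id", "cus_name"))
name_street_is_candidate = is_candidate_key(customers, ("cus_name", "street_no"))
print(id_is_super_key, name_is_super_key, id_name_is_candidate, name_street_is_candidate)
```

Here (cus_id, cus_name) is a super key but not a candidate key, because cus_id alone already identifies each customer.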
Keys in Relationship sets:
Primary keys, candidate keys and super keys are defined for entity
sets; similar keys can be defined for relationship sets also.
Let R be a relationship set involving entity sets E1, E2, ..., En.
Let primary-key(Ei) denote the primary key of entity set Ei. Assume
that the attribute names of all primary keys are unique and that each
entity set participates only once in the relationship.
If the relationship set R has no attributes associated with it, then
the set of attributes
primary-key(E1) ∪ primary-key(E2) ∪ ... ∪ primary-key(En)
describes an individual relationship in set R.
If the relationship set R has attributes {a1, a2, ..., am} associated
with it, then the set of attributes
primary-key(E1) ∪ primary-key(E2) ∪ ... ∪ primary-key(En) ∪ {a1, a2, ..., am}
describes an individual relationship in set R.
In both cases
primary-key(E1) ∪ primary-key(E2) ∪ ... ∪ primary-key(En)
forms a super key for the relationship set.
In case the attribute names of the primary keys are not unique across
the entity sets, the attributes are renamed by combining the name of the
entity set with the name of the attribute to form a unique name.
In case an entity set participates more than once in the relationship
set, the role name is used instead of the entity set name to form a
unique attribute name.
3. Participation Constraints:
a) Total participation: If every entity in entity set E participates
in at least one relationship in relationship set R, the participation of
E in R is said to be total.
b) Partial participation: If only some entities in set E participate in
relationships in set R, the participation of E in R is said to be
partial.
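Whether participation is total or partial in a given instance can be checked by comparing the entity set with the entities that actually occur in the relationship set. The following Python sketch assumes a hypothetical depositor relationship between customers and accounts; the function name participation and the sample data are illustrative only.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity occurs at `position` in some relationship."""
    participating = {rel[position] for rel in relationship_set}
    return "total" if entity_set <= participating else "partial"


customers = {"Mahesh", "Sandesh", "Murthy"}
accounts = {"A-101", "A-102"}
depositor = {("Mahesh", "A-101"), ("Sandesh", "A-102"), ("Mahesh", "A-102")}

customer_participation = participation(customers, depositor, 0)
account_participation = participation(accounts, depositor, 1)
print(customer_participation, account_participation)
```

In this sample Murthy has no account, so customer participates only partially, while every account has at least one depositor.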
ER Diagrams:
E-R diagrams express the overall logical structure of a database
graphically.
The major components of an E-R diagram are:
1) Rectangle : Represents an entity set.
2) Ellipse : Represents an attribute.
3) Diamond : Represents a relationship set.
4) Lines : Link attributes to entity sets and entity sets to
relationship sets. If the relationship is many to many, the line is
undirected. If the relationship is one to many (or) many to one, the
line is directed towards the "one" side. If the relationship is one to
one, the lines on both sides are directed.
5) Double ellipse : Represents a multivalued attribute.
6) Dashed ellipse : Denotes a derived attribute.
7) Double line : Indicates total participation of an entity set in a
relationship set.
8) Double rectangle : Represents a weak entity set.

E-R diagram for entity sets employee and department (many to one
relationship); phone_no is a multivalued attribute, address is a
composite attribute, age is a derived attribute and emp_id is the
primary key.

Figure: employee (emp_id, emp_name, dob, age, phone_no, address with
components street, locality, city, pincode) -- works for -- department
(dept_id, dept_name, location).

E-R diagram of entity sets customer and account. The relationship is
many to many, since an account may be held jointly by many customers and
one customer can have more than one account. access_date is a
relationship attribute.

Figure: customer (customer_name, customer_street, customer_city) --
depositor (access_date) -- account (acc_no, balance, branch_name).

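Role indicators: when an entity set is related to itself, the roles are
marked on the connecting lines. Here an employee is a worker under some
manager, whose information is again held in the same entity set.

Figure: employee (emp_id, emp_name, dob, address) connected twice to the
relationship set works for, once in the role manager and once in the
role worker.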

E-R diagram of a ternary relationship among employee, job and
department.

Figure: employee (emp_id, emp_name, dob, address), job (title, level)
and department (dept_id, dept_name, location), all connected to the
relationship set works on.

E-R diagram showing total participation. A customer may or may not have
an account, but every account must be associated with a customer.

Figure: customer (customer_name, customer_street, customer_city) --
depositor -- account (acc_no, balance, branch_name), with total
participation of account in depositor.

E-R diagram with cardinality limits: the minimum and maximum number of
times an entity can participate in a relationship may be indicated above
the line connecting the entity set to the relationship set. In this
example the limit 0..* on customer means that a customer may have no
account or any number of accounts (* indicates any number). Similarly,
the limit 1..3 on account means that an account must have at least one
customer and may be given to a maximum of 3 persons as a joint account.

Figure: customer (customer_name, customer_street, customer_city) --
0..* -- depositor -- 1..3 -- account (acc_no, balance, branch_name).

E-R design issues:


1) Use of entity sets versus attributes:
When an entity has a multivalued attribute like phone no., it is a
good idea to treat it as a separate entity set. This also helps to keep
extra information about each phone no., such as home, office, mobile
etc.

Figure: Left, entity set with multivalued attribute: employee (emp_id,
emp_name, dob, address, phone_no). Right, alternative for the
multivalued attribute: employee (emp_id, emp_name, dob, address) --
emp_phone -- telephone (Contact_no, location).

Note (common mistakes):
i) Using the primary key of an entity set as an attribute of another
entity set instead of using a relationship.
Ex: Using customer id as an attribute of loan is incorrect, even if
each loan has only one customer. Using the relationship borrower is the
better design.
ii) Designating the primary key attributes of the related entity sets
as attributes of the relationship set.
Ex: loan no. or customer id should not appear as attributes of the
relationship borrower.
2) Use of entity sets versus relationship sets:
Sometimes an object may be modeled either as an entity set or as a
relationship set.
Ex: A bank loan can be modeled as an entity with attributes loan
number and amount. It can also be modeled as a relationship between
customer and branch with loan number and amount as descriptive
attributes.
Representing loan as a relationship is convenient when every loan
is held by a single customer and is associated with a single branch.
When customers hold a loan jointly, we must define a separate
relationship for each customer of the loan and replicate the values of
the descriptive attributes loan number and amount, and it must be
ensured that the values of these attributes are the same for all
customers of the loan. This leads to data replication:
1. Data is stored multiple times, wasting storage space.
2. Updates potentially leave the data in an inconsistent state.
The solution for this replication is normalization.


3) Binary v/s n-ary relationship:
Relationships in databases are often binary. Some relationships that
appear to be non-binary could actually be better represented by
several binary relationships.
Ex: Employee, Job, Branch.
Consider the abstract ternary relationship set R relating entity
sets A, B and C. We replace the relationship set R by an entity set E
and create three relationship sets:
1. RA relating E and A,
2. RB relating E and B,
3. RC relating E and C.
If the relationship set R had any attributes, these are assigned to
the entity set E, and a special identifying attribute is created for E.
For each relationship (ai, bi, ci) in the relationship set R, we create
a new entity ei in the entity set E.
[Figure: a ternary relationship R among entity sets A, B and C, and the same
relationship divided into three binary relationships RA, RB and RC through
entity set E.]

An identifying attribute may have to be created for the entity
set E that represents the relationship set. This attribute, along with the
extra relationship sets required, increases the complexity of the design
and the overall storage requirements.
An n-ary relationship set shows clearly that several entity sets
participate in a single relationship. Sometimes there may not be a way
to translate constraints on the ternary relationship into constraints
on the binary relationships.
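The replacement of a ternary relationship set by an entity set E and three binary relationship sets can be sketched in Python. This is only an illustration: the sample tuples and the generated identifiers e1, e2, ... are assumptions, not data from these notes.

```python
# Hypothetical sample data: each tuple of the ternary relationship R
# relates one entity from A, one from B and one from C.
R = [("a1", "b1", "c1"), ("a1", "b2", "c1"), ("a2", "b1", "c2")]

E, RA, RB, RC = [], [], [], []
for i, (a, b, c) in enumerate(R, start=1):
    e = f"e{i}"          # special identifying attribute created for E
    E.append(e)
    RA.append((e, a))    # RA relates E and A
    RB.append((e, b))    # RB relates E and B
    RC.append((e, c))    # RC relates E and C

# The original ternary tuples can be rebuilt by looking up each e
# in the three binary relationship sets.
rebuilt = [(dict(RA)[e], dict(RB)[e], dict(RC)[e]) for e in E]
print(rebuilt == R)
```

The extra list E and the three pair lists show where the additional storage mentioned above comes from.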
4) Placement of relationship attributes:
The cardinality ratio of a relationship set can affect the
placement of relationship attributes. Attributes of one-to-one or
one-to-many relationship sets can be associated with one of the participating
entity sets rather than with the relationship set.
Ex: The relationship between customer and account is one-to-many when
each account belongs to one customer. The last access date of the account
may then be stored as an attribute of the many side, i.e. the account entity set.
This choice of placing the attribute in an entity set is not available
for a many-to-many relationship, where the attribute must stay with the
relationship set.
Ex: Customer and account, where one account can be held by more than
one customer.
Weak Entity Sets
An entity set that does not have sufficient attributes to form a
primary key is called a weak entity set.
An entity set that has a primary key is called a strong entity set.
Consider the entity set payment with attributes payment number, payment
date and payment amount. Payment numbers are sequential numbers generated
separately for each loan, so payments on different loans may share the same payment
number. So this entity set does not have a primary key.

A weak entity set must be associated with another entity set to be
meaningful. This entity set is called the identifying or owner entity set.
The weak entity set is said to be existence dependent on the identifying
entity set. The identifying entity set owns the weak entity set. The
relationship between the weak entity set and the identifying entity set is called the
identifying relationship.
The discriminator or partial key of a weak entity set is a set of
attributes that allows us to distinguish among the entities of the weak entity
set that depend on one particular strong entity.
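A small Python check can illustrate why the discriminator alone is not a key, while the primary key of the owner entity set combined with the discriminator is. The payment rows and the helper `is_unique` are hypothetical and only serve the illustration.

```python
# Hypothetical payments: payment_no restarts at 1 for every loan,
# so payment_no alone cannot identify a payment.
payments = [
    {"loan_no": "L-11", "payment_no": 1, "payment_amount": 500},
    {"loan_no": "L-11", "payment_no": 2, "payment_amount": 500},
    {"loan_no": "L-12", "payment_no": 1, "payment_amount": 750},
]

def is_unique(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(keys) == len(set(keys))

# Discriminator alone: two payments share payment_no 1.
discriminator_only = is_unique(payments, ["payment_no"])

# Owner's primary key plus discriminator identifies every payment.
owner_key_plus_discriminator = is_unique(payments, ["loan_no", "payment_no"])
```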
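[Figure: E-R diagram with the weak entity set payment (payment_no,
payment_date, payment_amount) connected by the identifying relationship
loan_payment to the owner entity set with attributes loan_no and amount.]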
Extended E-R features:
Specialization:
An entity set person with attributes (person_id, name, street and
city) may be classified further depending on the database being designed.
In the banking example we may classify the entities of person as
employee and customer, i.e. a person could be an employee, a customer,
both or neither.
The process of designating subgroups within an entity set is called
specialization.
In an E-R diagram, specialization is drawn as a triangle labelled ISA.
ISA stands for 'is a'. The ISA relationship is also referred to as a
superclass-subclass relationship.
We can apply specialization repeatedly to refine a design scheme, i.e.
person is classified as employee and customer. Employee may again be
classified as officer, teller or secretary. A customer may be classified as
borrower or depositor.
[Figure: specialization hierarchy. Person (person_id, name, street, city) ISA
Employee (salary) and Customer; Employee ISA Officer, Teller and Secretary;
Customer ISA Depositor and Borrower.]

Account may be specialized in the same way: a bank may offer a specific
interest rate on savings accounts and allow an overdraft on checking
accounts.
[Figure: Account (acc_no, branch_name, balance) ISA Savings Account
(interest rate) and Checking Account (overdraft amount).]

Generalization:
Specialization represents a top-down approach to the design process,
refining an initial entity set into successive levels of entity subgroups.
The design process by which multiple entity sets are synthesized into a
higher level entity set on the basis of common features is a bottom-up
approach called generalization.
Generalization is thus the reverse of specialization.

Attribute inheritance:
Specialization and generalization classify entity sets into higher and
lower levels. The attributes of the higher level entity
sets are inherited by the lower level entity sets; this is called
attribute inheritance.
Example: The entity sets employee and customer inherit the person_id,
name, street and city attributes of the person entity set.
If a higher level entity set participates in a relationship set, the
lower level entity sets also inherit participation in that relationship set.
Ex: [Figure: Account is connected to Customer through the relationship set
Made by; Account ISA SB and CA.]
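Attribute inheritance behaves much like class inheritance in a programming language. The following sketch uses Python dataclasses as an analogy only; the attribute `office_number` and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Person:                 # higher level entity set
    person_id: str
    name: str
    street: str
    city: str

@dataclass
class Employee(Person):       # lower level entity set (Employee ISA Person)
    salary: float = 0.0

@dataclass
class Officer(Employee):      # second level of specialization
    office_number: int = 0    # hypothetical attribute for illustration

o = Officer("P-1", "Raju", "MG Road", "Dvg", salary=30000.0, office_number=12)

# An officer carries every attribute of person and employee.
inherited = [f for f in ("person_id", "name", "street", "city", "salary")
             if hasattr(o, f)]
```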
Whether the lower and higher level entity sets were arrived at by
generalization or by specialization, the outcome is the same. The attributes and
relationships of a higher level entity set apply to all of its lower level
entity sets.
The distinctive features of a lower level entity set apply only within that
particular lower level entity set. The E-R diagram of generalization /
specialization depicts a hierarchy of entity sets. If a lower level
entity set is involved in only one ISA relationship with a higher level entity
set, this is referred to as single inheritance.

Ex: [Figure: Person ISA Employee and Customer.]

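If a lower level entity set is involved in more than one ISA relationship
with higher level entity sets, this is referred to as multiple inheritance.
[Figure: higher level entity sets Person and Employee, each with an ISA
triangle, above the lower level entity sets Guest faculty, Teacher and Staff.]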

CONSTRAINTS ON GENERALIZATION
Constraints are required to model an enterprise more accurately. They
apply in different situations.
I. Constraints that determine which entities can be members of a lower level
entity set:
1) Condition defined: Membership of lower level entity sets is evaluated on
the basis of whether or not an entity satisfies an explicit condition or
predicate.
Ex: If the higher level entity set account has the attribute account_type,
all entities that satisfy the condition account_type = "savings account" are
allowed to belong to the lower level entity set savings account.
Since all the lower level entities are evaluated on the basis of the
same attribute, this type of generalization is said to be attribute defined.
2) User defined: This constraint does not have any membership condition. The
database user assigns entities to a given entity set.
Ex: Assigning an employee to a committee or team.
II. Constraints on whether or not entities may belong to more than
one lower level entity set within a single generalization. The lower level
entity sets may be disjoint or overlapping.
1) Disjoint: A disjoint constraint requires that an entity belong to no more
than one lower level entity set.
Ex: An account entity can be either a savings account or a checking
account but cannot be both. In the E-R diagram this is shown by writing
"disjoint" next to the ISA triangle.
2) Overlapping: The same entity may belong to more than one lower level
entity set within a single generalization.

Ex: An employee of a bank may also be a customer, so an entity in the
person entity set belongs to both the customer and employee entity sets.
III. Completeness constraint on specialization or generalization:
Specifies whether or not an entity in the higher level entity set must
belong to at least one of the lower level entity sets within the
generalization/specialization.
1. Total generalization/specialization: Each higher level entity must
belong to a lower level entity set.
Ex: Every account must be either a savings account or a checking account.
[Figure: Account ISA Savings Account and Checking Account.]
2. Partial generalization/specialization:
Some higher level entities may not belong to any lower level entity
set.
Ex: An employee may belong to any of the committees formed, such as the
admission committee or the cultural committee, but it is also possible that
an employee does not belong to any of the committees.
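The condition-defined, disjointness and completeness constraints can be checked mechanically. The sketch below assumes a small hypothetical set of account entities with an account_type attribute and uses Python sets for the lower level entity sets.

```python
# Hypothetical account entities with an account_type attribute.
accounts = [
    {"account_no": "A-1", "account_type": "savings"},
    {"account_no": "A-2", "account_type": "checking"},
    {"account_no": "A-3", "account_type": "savings"},
]

# Condition-defined (attribute-defined) membership of the lower level sets.
savings = {a["account_no"] for a in accounts if a["account_type"] == "savings"}
checking = {a["account_no"] for a in accounts if a["account_type"] == "checking"}
all_accounts = {a["account_no"] for a in accounts}

is_disjoint = savings.isdisjoint(checking)        # no entity is in both sets
is_total = (savings | checking) == all_accounts   # every entity is in some set
```

An overlapping or partial generalization would simply make `is_disjoint` or `is_total` false for some data.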

Aggregation:
Aggregation is an abstraction through which relationships are treated as
higher level entities.
The basic E-R model cannot express relationships among relationships.
Ex: Consider a ternary relationship works_on between employee, job and
branch, and suppose we want to record managers for jobs performed by an
employee at a branch.
A binary relationship between employee and manager
cannot express this comprehensively.
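[Figure: (a) E-R diagram with redundant relationships: Works_on among
Employee, Job and Branch, plus Manages connecting Manager to the same entity
sets. (b) E-R diagram with aggregation: Manages connects Manager to the
aggregate formed by Works_on with Employee, Job and Branch.]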


We can represent it as a quaternary relationship from manager to
employee, branch and job, but this leads to redundant information. The best way
to model this situation is through aggregation.
We treat the relationship set works_on, together with the entity sets employee, job
and branch, as a higher level entity set. We can then create a binary relationship
manages between the manager entity set and works_on to represent who manages which
task.
ALTERNATIVE E-R NOTATIONS:
Previously used symbols in E-R notations
1) Entity set
2) Attributes
3) Weak entity set
4) Multi valued attribute
5) Relationship set
6) Derived attribute
7) Identifying relationship set for weak entity set
8) Total participation of entity set in relationship
9) Primary key
10) Discriminator
11) Many to many
12) One to many
13) One to one relationships
14) Cardinality limits
15) Role indicator
16) Generalization/Specialization
17) Total generalization
18) Disjoint generalization
Alternative E-R notations
[Figure: entity set E with attributes A1, A2, A3 and primary key, drawn as a
box labelled E that lists A1, A2 and A3.]
[Figure: cardinality labels on relationship R, each with an alternative
line-end drawing: many to many (* R *), one to one (1 R 1) and one to many
(1 R *).]
There is no single standard for E-R notation. The notation used here is called
Chen's notation.
The US National Institute of Standards and Technology defined a
standard called IDEF1X in 1993, which uses crow's-foot notation.

3. RELATIONAL MODEL
Structure of relational database.
A relational database consists of a collection of tables, each of which
is assigned a unique name. A row in a table represents a relationship among a
set of values.
Basic structure:
D1        D2          D3
Emp_id    Emp_name    Emp_address
101       Raju        Dvg
102       Rama        Cta
:         :           :

The column headers of a table are called attributes.
Ex: emp_id, emp_name, emp_address in the employee table.
The set of permitted values of an attribute is called the domain of that
attribute.
Ex: 101, 102, ... for emp_id.
Let D1 denote the set of all employee ids, D2 the set of all emp_names and
D3 the set of all emp_addresses.
Any row of the employee table is a 3-tuple (v1, v2, v3) where v1 is an
emp_id in domain D1, v2 is an emp_name in domain D2 and v3 is an emp_address
in domain D3. The employee table will
contain only a subset of the set of all possible rows. Therefore
employee is a subset of D1 × D2 × D3.
In general a table of n attributes must be a subset of D1 × D2 × … × Dn.
Mathematically, a relation is a subset of the Cartesian product of a list
of domains. This definition is the same as the definition of a table, so we can
use the terms relation and tuple in place of table and row.
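The statement that a relation is a subset of the Cartesian product of its domains can be tried out on very small domains. The domains below are deliberately tiny and hypothetical; real domains are usually far too large to enumerate.

```python
from itertools import product

# Tiny hypothetical domains for the three attributes of employee.
D1 = {101, 102}                 # emp_id
D2 = {"Raju", "Rama"}           # emp_name
D3 = {"Dvg", "Cta"}             # emp_address

all_possible_rows = set(product(D1, D2, D3))   # D1 x D2 x D3

# The employee relation holds only some of the possible rows.
employee = {(101, "Raju", "Dvg"), (102, "Rama", "Cta")}

is_relation = employee <= all_possible_rows    # subset test
```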
A tuple variable is a variable whose domain is the set of all tuples.
In the employee relation let t be a tuple variable; then t[emp_id] has the
set of all emp_ids as its domain.
t[emp_name] has the set of all emp_names as its domain.
A domain is atomic if elements of the domain are considered to be
indivisible units. The set of integers is an atomic domain, because we
consider integers to have no subparts.
Ex: emp_name has a non-atomic domain if a name may be subdivided into
first_name, middle_name, last_name.
It is also possible for several attributes to have the same domain.
Ex: Consider the customer relation with attributes
(customer_name, customer_address, customer_city)
and the employee relation with attributes
(emp_name, emp_designation, emp_branch).
Here emp_name and customer_name share the same domain.
One domain value that is a member of every possible domain is null,
which represents a value that is unknown or does not exist.
DATABASE SCHEMA:
The database schema is the logical design of the database, and the
database instance is a snapshot of the data in the database at a given
instant in time. In the relational model, a relation corresponds to a variable and
a relation schema corresponds to the type definition of the variable in a
programming language.
By convention we use lowercase names for relations [ex: account] and
names beginning with uppercase letters for relation schemas [ex:
Account_schema].
Account_schema = (account_no, branch_name, balance)

"account is a relation on Account_schema" is written as
account(Account_schema)
The relation instance corresponds to the value of a variable. The
contents of a relation instance may change with time as the
relation is updated.
Branch_schema = (branch_name, branch_city, assets)

NOTE: Using common attributes in relation schemas is one way to relate
tuples of distinct relations.
branch_name is common to both Account_schema and Branch_schema and
relates the tuples in those relations.
The two schemas above could be combined into a single schema,
Account_branch_schema = (account_no, balance, branch_name, branch_city,
assets).
By doing so, the branch name, city and assets are repeated for each
account in the branch, leading to data duplication.
If any branch information has to be changed, we have to change all
tuples in the relation for that particular branch.
If a branch has no account, the branch information can be retained only by
storing NULL values in the account_no and balance attributes. That is the
reason we have created two relations.

KEYS:
A super key is a set of one or more attributes that allows us to
uniquely identify a tuple in the relation.
Ex: {customer_id}, {customer_id, customer_name}
A candidate key is a minimal super key, i.e. one of which no proper subset is
again a super key.
Ex: {customer_id}, {customer_name, customer_street}
Primary key: It denotes a candidate key chosen by the database designer
as the principal means of identifying tuples within a relation.
Primary key, candidate key and super key are properties of the entire
relation, not of individual tuples.
A candidate key must be chosen such that duplicate values never occur in
its attributes, and the primary key should be chosen such that its value
never or rarely changes.
Let R be a relation schema. A subset K ⊆ R is a super key for R if no legal
relation r(R) has two distinct tuples with the same values on all attributes
in K,
i.e. if t1 and t2 are in r
and t1 ≠ t2 then
t1[K] ≠ t2[K].
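The super key condition can be tested on a given relation instance. Note the limitation: a key is a property of every legal instance of the schema, so a test on one instance can only show that a set of attributes is not a key; it cannot prove that it is one. The sample customer tuples and function names below are hypothetical.

```python
from itertools import combinations

# Hypothetical customer tuples.
customer = [
    {"customer_id": 1, "customer_name": "RNC", "customer_street": "Main"},
    {"customer_id": 2, "customer_name": "CKM", "customer_street": "Main"},
    {"customer_id": 3, "customer_name": "RNC", "customer_street": "Fort"},
]

def is_superkey(rows, K):
    """No two distinct tuples agree on all attributes in K."""
    seen = {tuple(t[a] for a in K) for t in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, K):
    """A super key none of whose proper subsets is a super key."""
    if not is_superkey(rows, K):
        return False
    return not any(is_superkey(rows, sub)
                   for n in range(1, len(K))
                   for sub in combinations(K, n))
```

On this instance, {customer_id, customer_name} is a super key but not a candidate key, because {customer_id} alone already identifies every tuple.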
Foreign key:
(r1) account = (account_no, branch_name, balance)
(r2) branch = (branch_name, branch_city, assets)
A relation schema r1 may include among its attributes the primary key of
another relation schema r2. This attribute is called a foreign key from r1
referencing r2.
Any insertion or update on relation r1 is checked for the existence of
the foreign key value in relation r2. If that value is not
present in relation r2, inserting it into r1 or updating r1 to it is not
allowed.
The relation r1 is called the referencing relation of the foreign key
dependency and relation r2 is called the referenced relation of the foreign key.
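The foreign key check can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite is used only as a convenient engine, its column types differ from the domain types in these notes, and it enforces foreign keys only after `PRAGMA foreign_keys = ON`. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# branch is the referenced relation, account the referencing relation.
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY, "
             "branch_city TEXT, assets NUMERIC)")
conn.execute("""CREATE TABLE account (
    account_no  TEXT PRIMARY KEY,
    branch_name TEXT REFERENCES branch(branch_name),
    balance     NUMERIC)""")

conn.execute("INSERT INTO branch VALUES ('cta', 'Chitradurga', 500000)")
conn.execute("INSERT INTO account VALUES ('A-111', 'cta', 1500)")  # branch exists

try:
    # 'dvg' is not present in branch, so this insertion is refused.
    conn.execute("INSERT INTO account VALUES ('A-112', 'dvg', 900)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```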
Schema diagrams pictorially represent a database schema along with its
primary key and foreign key dependencies.
[Schema diagram, listed here as relation schemas:]
branch = (branch_name, branch_city, assets)
account = (account_no, branch_name, balance)
depositor = (cust_name, account_no)
customer = (cust_name, cust_street, cust_city)
loan = (loan_no, branch_name, amount)
borrower = (cust_name, loan_no)
Query languages:
A query language is a language in which a user requests information
from the database. Query languages are categorized into two kinds:
1. Procedural and 2. Non-procedural.
In a procedural language the user gives a sequence of operations on the
database to compute the desired result.
In a non-procedural language the user describes the desired information
without giving a specific procedure.
The relational algebra is procedural, whereas the tuple relational
calculus and domain relational calculus are non-procedural.
Fundamental Relational Algebra
Relational algebra consists of a set of operations that take one or
two relations as input and produce a new relation as their result.
The six fundamental operations of relational algebra are:
Unary operations - select, project and rename.
Binary operations - union, set difference and Cartesian product.
Apart from these fundamental operations there are other operations
such as set intersection, natural join, division and assignment. All these
operations are defined in terms of the fundamental operations.
The result of a relational operation is itself a relation.
Select operation:
The select operation selects tuples that satisfy a given predicate.
The lowercase Greek letter sigma (σ) denotes the selection operation, and
the predicate appears as a subscript to σ.
Relational algebra queries:
1. To select tuples from the loan relation whose branch name is "cta":
σbranch_name="cta"(loan)
2. To select tuples whose amount is more than 1200:
σamount>1200(loan)
We can use the comparison operators =, ≠, <, ≤, >, ≥ in the selection
predicate. To build more complex predicates we combine
them with the logical connectives AND (∧), OR (∨) and NOT (¬).
Ex: To apply both conditions of the above examples:
σbranch_name="cta" ∧ amount>1200(loan)
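The select operation corresponds to filtering the tuples of a relation with a predicate. A minimal Python sketch, assuming a relation is held as a list of dictionaries and using hypothetical loan tuples:

```python
# Hypothetical loan relation; each dictionary is one tuple.
loan = [
    {"loan_number": "l-01", "branch_name": "cta", "amount": 25000},
    {"loan_number": "l-02", "branch_name": "dvg", "amount": 15000},
    {"loan_number": "l-03", "branch_name": "cta", "amount": 900},
]

def select(relation, predicate):
    """sigma_predicate(relation): keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

# sigma branch_name = "cta" (loan)
cta_loans = select(loan, lambda t: t["branch_name"] == "cta")

# sigma branch_name = "cta" AND amount > 1200 (loan)
big_cta_loans = select(
    loan, lambda t: t["branch_name"] == "cta" and t["amount"] > 1200)
```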

Project operation:
The project operation is a unary operation that returns its argument
relation with only the specified attributes, the other attributes being left out.
Since a relation is a set, duplicate rows in the result are eliminated.
Projection is denoted by the uppercase Greek letter pi (Π). The
attributes we wish to see in the result are listed in the subscript.
Πloan_number,amount(loan)

Composition of relational operations:
The input of a relational operation is a relation, and its output is
again a relation. Since the result of a relational
algebra operation is of the same type as its inputs, relational algebra
operations can be composed together into a relational algebra expression.
Πcustomer_name(σcustomer_city="dvg"(customer))
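Because each operation returns a relation, the output of one function can be passed directly to the next. The sketch below composes a projection with a selection; the customer tuples and helper names are hypothetical.

```python
# Hypothetical customer relation.
customer = [
    {"customer_name": "RNC", "customer_city": "dvg"},
    {"customer_name": "CKM", "customer_city": "cta"},
    {"customer_name": "DGR", "customer_city": "dvg"},
]

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """pi_attrs(relation): keep only attrs and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

# pi customer_name (sigma customer_city = "dvg" (customer))
names = project(select(customer, lambda t: t["customer_city"] == "dvg"),
                ["customer_name"])

# Projection removes duplicates: two customers share the city "dvg".
cities = project(customer, ["customer_city"])
```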

Union operation:
Using the union operation we can combine the outputs of two relational
operations into one.
Consider two tables:
1) Depositor (account_no, customer_name, balance)
Account_no   Customer_name   Balance
A-51         RNC             25000
A-15         CKM             35260
2) Borrower (loan_no, customer_name, amount)
Loan_no      Customer_name   Amount
L-21         DGR             20000
L-42         NPN             15000
The union operator can be used to combine the customer names from both
tables:
Πcustomer_name(borrower) ∪ Πcustomer_name(depositor)

For a union of two relations r ∪ s to be valid, two conditions must hold:
1) The relations r and s must be of the same arity, i.e. they must have the
same number of attributes.
2) For all i, the domain of the ith attribute of r and the ith attribute of s
must be the same.
r and s can be either database relations or temporary relations that
are the result of relational algebra expressions.

Set difference operation:
The set difference operator (−) allows us to find tuples that are in one
relation but not in another. The expression r − s produces a relation containing
the tuples that are in r but not in s.
Πcustomer_name(depositor) − Πcustomer_name(borrower)
As with union, set difference is taken between compatible
relations: r and s must have the same arity, and the domains of their
corresponding attributes must be the same.
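Union and set difference map directly onto Python set operations when each tuple is stored as a Python tuple. In this sketch the first two rows of each table follow the depositor and borrower tables of the union example; the loan L-50 for RNC is a hypothetical addition so that the difference is not trivial.

```python
depositor = {("A-51", "RNC", 25000), ("A-15", "CKM", 35260)}
borrower = {("L-21", "DGR", 20000), ("L-42", "NPN", 15000),
            ("L-50", "RNC", 5000)}          # last tuple is hypothetical

# pi customer_name of each relation, kept as sets of 1-tuples
dep_names = {(name,) for _, name, _ in depositor}
bor_names = {(name,) for _, name, _ in borrower}

all_customers = bor_names | dep_names       # union
only_depositors = dep_names - bor_names     # set difference
```

Both operands have arity 1 and the same domain, so the two validity conditions hold.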

Cartesian product operation:
The Cartesian product (×) allows us to combine information from any two
relations. The Cartesian product of relations r1 and r2 is written r1 × r2.
If attributes in relations r1 and r2 have the same name, they are
prefixed with their relation name.
Ex: borrower = (customer_name, loan_number)
loan = (loan_number, branch_name, amount)
Then the relation schema for r = borrower × loan is
(borrower.customer_name, borrower.loan_number, loan.loan_number,
loan.branch_name, loan.amount)
Only loan_number appears in both relations, so dropping the
relation name from the other attributes does not cause any ambiguity. So the
relation schema for r is:
(customer_name, borrower.loan_number, loan.loan_number, branch_name, amount)

Borrower
Customer_name   Loan_number
C K M           l-01
R N C           l-02
D G R           l-03

Loan
Loan_number   Branch_name   Amount
l-01          C T A         25000
l-02          D V G         15000
l-03          C T A         25000

Relation r = borrower × loan:
Customer_name   borrower.Loan_number   loan.Loan_number   Branch_name   Amount
C K M           l-01                   l-01               CTA           25000
C K M           l-01                   l-02               DVG           15000
C K M           l-01                   l-03               CTA           25000
R N C           l-02                   l-01               CTA           25000
R N C           l-02                   l-02               DVG           15000
R N C           l-02                   l-03               CTA           25000
D G R           l-03                   l-01               CTA           25000
D G R           l-03                   l-02               DVG           15000
D G R           l-03                   l-03               CTA           25000

If there are n1 tuples in borrower and n2 tuples in loan, then there
are n1 × n2 ways of choosing a pair of tuples, so relation r contains n1 × n2
tuples.
For some tuples t in r it may be that
t[borrower.loan_number] ≠ t[loan.loan_number]
In general, if we have relations r1(R1) and r2(R2), then r1 × r2 is a
relation whose schema is the concatenation of R1 and R2.
It contains all tuples t for which there is a tuple t1 in r1 and
a tuple t2 in r2 such that
t[R1] = t1[R1] and t[R2] = t2[R2]
Suppose we want to list the names of customers along with their branch
and amount. Since the Cartesian product associates every tuple of loan with
every tuple of borrower, we have to use a predicate to eliminate the unwanted
tuples.
We match loan.loan_number = borrower.loan_number:
σborrower.loan_number=loan.loan_number(borrower × loan)
If we want customer_name along with the loan amount we can write
Πcustomer_name,amount(σborrower.loan_number=loan.loan_number(borrower × loan))

Output of this query:
Customer_name   Amount
CKM             25000
RNC             15000
DGR             25000
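The product, selection and projection steps of this query can be followed one at a time in Python. The sketch stores each tuple as a Python tuple and uses the borrower and loan rows of the example; referring to attributes by position is a simplification made for this illustration.

```python
from itertools import product

borrower = [("CKM", "l-01"), ("RNC", "l-02"), ("DGR", "l-03")]
loan = [("l-01", "CTA", 25000), ("l-02", "DVG", 15000), ("l-03", "CTA", 25000)]

# borrower x loan: every borrower tuple paired with every loan tuple.
# Positions: 0 customer_name, 1 borrower.loan_number,
#            2 loan.loan_number, 3 branch_name, 4 amount
r = [b + l for b, l in product(borrower, loan)]

# sigma borrower.loan_number = loan.loan_number
matched = [t for t in r if t[1] == t[2]]

# pi customer_name, amount
result = [(t[0], t[4]) for t in matched]
```

With three tuples in each relation the product has nine tuples, of which only three survive the selection.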

Rename operation:
The rename operation, denoted by the lowercase Greek letter rho (ρ), lets us
give names to the results of relational algebra expressions.
ρx(E) returns the result of expression E under the name x.
If relational algebra expression E has arity n, then the expression
ρx(A1,A2,…,An)(E)
returns the result of expression E under the name x and with the
attributes renamed to A1, A2, …, An.
Ex: To find each employee's name along with the manager's name, where both
are stored in the same table:
emp_id   emp_name         manager_id
101      Vivekananda      101
102      Sannamma         101
103      Shirahatti       102
104      Madhu            102
105      Shobha Dalawai   101
106      Nagaraj          105
To find the manager of every employee, we first create a
temporary relation m that is an exact copy of employee.
Then we take the tuples in which manager_id of employee
is equal to emp_id of the temporary relation m:
Πemployee.emp_name,m.emp_name(σemployee.manager_id=m.emp_id(employee × ρm(employee)))
This gives the result:
employee.emp_name   m.emp_name
Vivekananda         Vivekananda
Sannamma            Vivekananda
Shirahatti          Sannamma
Madhu               Sannamma
Shobha Dalawai      Vivekananda
Nagaraj             Shobha Dalawai
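The same self-join can be sketched in Python, where giving the relation a second name plays the role of ρ. The tuples are those of the employee table above, stored as (emp_id, emp_name, manager_id).

```python
employee = [
    (101, "Vivekananda", 101), (102, "Sannamma", 101),
    (103, "Shirahatti", 102), (104, "Madhu", 102),
    (105, "Shobha Dalawai", 101), (106, "Nagaraj", 105),
]
m = employee   # rho_m(employee): the same tuples under a second name

# sigma employee.manager_id = m.emp_id (employee x m),
# then pi employee.emp_name, m.emp_name
pairs = [(e[1], mgr[1]) for e in employee for mgr in m if e[2] == mgr[0]]
```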

Formal definition of the relational algebra
A basic expression in the relational algebra consists of one of
the following: 1. A relation in the database 2. A constant relation.
A constant relation is written by listing its tuples within { }.
Ex: {(101, Maheshwarappa, 101), (102, Sannamma, 101), . . . }
A general expression in the relational algebra is constructed from
smaller subexpressions. Let E1 and E2 be relational algebra expressions.
Then
1. E1 ∪ E2
2. E1 − E2
3. E1 × E2
4. σp(E1), where p is a predicate on attributes in E1,
5. Πs(E1), where s is a list consisting of some of the attributes in E1,
6. ρx(E1), where x is the new name for the result of E1,
are all relational algebra expressions.

Other Relational Algebra Operations
1. Set intersection operation: The set intersection r ∩ s contains the tuples
that are in both r and s. It can be implemented using
a pair of set difference operations:
r ∩ s = r − (r − s)
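The identity can be checked quickly with Python sets. The two sets below are arbitrary hypothetical values standing in for compatible relations.

```python
r = {1, 2, 3, 4}
s = {3, 4, 5}

# r - s removes from r everything that is also in s;
# removing that remainder from r leaves exactly the common elements.
via_difference = r - (r - s)
```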

2. Natural join:
Consider two relations r(R) and s(S). The natural join of r and s,
denoted by r ⋈ s, is a relation on schema R ∪ S:
r ⋈ s = ΠR∪S(σr.A1=s.A1 ∧ r.A2=s.A2 ∧ … ∧ r.An=s.An(r × s))

where R ∩ S = {A1, A2, …, An}.
Ex: A query on the natural join of the borrower and loan tables,
Πcustomer_name,loan_number,amount(borrower ⋈ loan)
can be implemented by using
Πcustomer_name,borrower.loan_number,amount(σborrower.loan_number=loan.loan_number(borrower × loan))
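A natural join can be sketched in Python by pairing tuples that agree on every attribute name the two relations share. The function name `natural_join` and the list-of-dictionaries representation are choices made for this illustration.

```python
def natural_join(r, s):
    """Join dict-tuples of r and s that agree on all common attributes."""
    result = []
    for t1 in r:
        for t2 in s:
            common = set(t1) & set(t2)            # R intersect S
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})       # tuple on schema R union S
    return result

borrower = [
    {"customer_name": "CKM", "loan_number": "l-01"},
    {"customer_name": "RNC", "loan_number": "l-02"},
    {"customer_name": "DGR", "loan_number": "l-03"},
]
loan = [
    {"loan_number": "l-01", "branch_name": "CTA", "amount": 25000},
    {"loan_number": "l-02", "branch_name": "DVG", "amount": 15000},
    {"loan_number": "l-03", "branch_name": "CTA", "amount": 25000},
]

joined = natural_join(borrower, loan)
```

The common attribute loan_number appears only once in each result tuple, which is the effect of the projection onto R ∪ S.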

3. Division operator:
The division operator (÷) is suited to queries that include the phrase "for all".
We can find the branch names where the amount of a loan is more than 20000 by
r1 = Πbranch_name(σamount>20000(loan))
We can get customer names along with branch names by using
r2 = Πcustomer_name,branch_name(borrower ⋈ loan)
If we want to find the customers who appear in r2 with every branch name in r1,
we can use the division operator:
Πcustomer_name,branch_name(borrower ⋈ loan) ÷ Πbranch_name(σamount>20000(loan))
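Division keeps a value only if it is paired with every tuple of the divisor. The sketch below assumes r2 holds (customer_name, branch_name) pairs and r1 holds branch names; the sample values and the function name `divide` are hypothetical.

```python
def divide(r, s):
    """r(A, B) divided by s(B): values a with (a, b) in r for every b in s."""
    candidates = {a for a, _ in r}
    return {a for a in candidates if all((a, b) in r for b in s)}

# Hypothetical (customer_name, branch_name) pairs and branch names.
r2 = {("CKM", "CTA"), ("CKM", "DVG"), ("RNC", "DVG"), ("DGR", "CTA")}
r1 = {"CTA", "DVG"}

result = divide(r2, r1)   # customers paired with all branches in r1
```

Only CKM is paired with both CTA and DVG, so the other customers are dropped even though each of them matches one branch.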

4. Assignment operator:
Works like the assignment operator in a programming language. We may
assign the output of a relational algebra expression to a temporary relation
variable.
Ex: temp1 ← ΠR−S(r)

5. Outer join operations:
Some tuples in either or both of the relations being joined may be
lost because there is no matching tuple in the other
relation. Outer joins preserve those tuples that would be lost in a
join by creating tuples with null values in the result.
Left outer join (⟕) takes all tuples in the left relation that did
not match any tuple in the right relation, pads them with
null values for all attributes from the right relation and adds
them to the result of the natural join.
The left outer join r ⟕ s can be expressed in basic relational
algebra operations as:
(r ⋈ s) ∪ ((r − ΠR(r ⋈ s)) × {(null, …, null)})
where the constant relation {(null, …, null)} is on the schema S − R.
Right outer join (⟖) takes all tuples in the right relation that did
not match any tuple in the left relation, pads them with
null values for all attributes from the left relation and adds
them to the result of the natural join.
Full outer join (⟗) does both the left and right outer join
operations, padding tuples from the left relation that did not match
any from the right relation, as well as tuples from the right relation
that did not match any from the left relation, and adding them to the
result of the natural join.
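A left outer join can be sketched in Python by following the algebraic expression step by step: compute the natural join, find the left tuples that did not take part in it, and pad them with None in place of null. The function names and the sample tuples (including the unmatched loan l-09) are hypothetical.

```python
def natural_join(r, s):
    """Pair dict-tuples that agree on all attributes common to r and s."""
    return [{**t1, **t2} for t1 in r for t2 in s
            if all(t1[a] == t2[a] for a in set(t1) & set(t2))]

def left_outer_join(r, s, s_schema):
    """Natural join plus unmatched tuples of r padded with None (null)."""
    joined = natural_join(r, s)
    r_schema = list(r[0]) if r else []
    matched = [{a: t[a] for a in r_schema} for t in joined]   # pi_R(r join s)
    padding = {a: None for a in s_schema if a not in r_schema}
    unmatched = [t for t in r if t not in matched]            # r - pi_R(r join s)
    return joined + [{**t, **padding} for t in unmatched]

borrower = [{"customer_name": "CKM", "loan_number": "l-01"},
            {"customer_name": "NPN", "loan_number": "l-09"}]  # l-09 has no loan tuple
loan = [{"loan_number": "l-01", "branch_name": "CTA", "amount": 25000}]

result = left_outer_join(borrower, loan,
                         ["loan_number", "branch_name", "amount"])
```

The schema of s is passed explicitly so that the padding still works when s is empty.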

Extended Relational Algebra operations:
Extended relational algebra operations provide the ability to write queries
that cannot be expressed using the basic relational operations.
1. Generalized projection: Generalized projection extends the projection
operation by allowing operations such as arithmetic and string
functions to be used in the projection list.
General form:
ΠF1,F2,…,Fn(E)
where E is any relational algebra expression, and each of F1, F2, …, Fn is
an arithmetic expression involving constants and attributes in the schema of
E. The expression may be simply an attribute or a constant, or it may be an
arithmetic expression that applies +, −, * and ÷ to numeric attributes,
numeric constants or expressions that generate numeric values.
Ex: Πempid,fname,salary*12(emp)
displays the employee id, first name and annual salary from the emp table.
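In Python, generalized projection amounts to computing an expression for each tuple while projecting. The emp tuples below are hypothetical.

```python
# Hypothetical emp relation with monthly salaries.
emp = [
    {"empid": 1, "fname": "Raju", "salary": 20000},
    {"empid": 2, "fname": "Rama", "salary": 25000},
]

# pi empid, fname, salary*12 (emp)
annual = [(t["empid"], t["fname"], t["salary"] * 12) for t in emp]
```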
2. Aggregation:
The aggregate operation permits the use of aggregate functions, such as min
or average, on sets of values.
Aggregate functions take a collection of values and return a single value as
a result. Ex: The sum function takes a collection of values as input, adds
them and gives the total as output. Similarly we can compute the
average of a set of values, count the values, or find the minimum or maximum
among them.
General form: G1,G2,…,Gn G F1(A1),F2(A2),…,Fm(Am) (E)
where G is the aggregation operator, E is any relational algebra expression,
G1, G2, …, Gn is the list of attributes on which to group, and each Fi is an
aggregate function on attribute Ai.
The tuples in the result of expression E are partitioned into groups in such
a way that:
1. All tuples in a group have the same values for G1, G2, …, Gn.
2. Tuples in different groups have different values for G1, G2, …, Gn.
As a special case of the aggregate operation, the list of attributes
G1, G2, …, Gn can be empty. This corresponds to aggregation without grouping
(simple aggregation).
Ex: To find the sum of salaries of all employees in the emp table we can write:
G sum(salary)(emp)

Multisets: A collection of values in which a value may occur
more than once, and in which the order of the values does not matter, is
called a multiset.
To eliminate duplicate values from the multiset before it is passed to the
aggregate function, the hyphenated string "-distinct" is appended to
the function name.
Ex: To find the number of departments in which employees are working:
G count-distinct(deptid)(emp)
An aggregate may have to be applied to each of several groups of tuples instead
of to a single set of tuples. For example, to find the sum of salaries of
employees in each department, the query may be written as:
deptid G sum(salary)(emp)
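The three aggregation examples (simple aggregation, count-distinct and grouping) can be sketched in Python. The emp tuples are hypothetical.

```python
from collections import defaultdict

# Hypothetical emp relation.
emp = [
    {"empid": 1, "deptid": "D1", "salary": 20000},
    {"empid": 2, "deptid": "D1", "salary": 25000},
    {"empid": 3, "deptid": "D2", "salary": 30000},
]

# G sum(salary)(emp): simple aggregation without grouping
total_salary = sum(t["salary"] for t in emp)

# G count-distinct(deptid)(emp): duplicates removed before counting
dept_count = len({t["deptid"] for t in emp})

# deptid G sum(salary)(emp): one sum per group of equal deptid
salary_by_dept = defaultdict(int)
for t in emp:
    salary_by_dept[t["deptid"]] += t["salary"]
```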

4. SQL
Background of SQL
IBM developed the original version of SQL, originally called Sequel,
as part of the System R project in the early 1970s. The Sequel language has
evolved since then and is now known as SQL (Structured Query Language).
Many database products now support SQL.
In 1986, the American National Standards Institute (ANSI) and the International
Organization for Standardization (ISO) published an SQL standard, called
SQL-86. Later versions of the standard include SQL-89, SQL-92, SQL:1999
and SQL:2003.
SQL uses a combination of relational algebra and relational calculus
constructs.
The SQL language has several parts:
1) Data Definition Language (DDL).
The SQL DDL provides commands for defining relation schemas, deleting
relations and modifying relation schemas.
2) Interactive Data Manipulation Language (DML).
The SQL DML includes a query language based on both the relational
algebra and the tuple relational calculus. It
includes commands to insert, delete and modify tuples in the database.
3) Integrity.
The SQL DDL includes commands for specifying integrity constraints that
the data stored in the database must satisfy. Updates that violate integrity
constraints are not allowed.
4) View definition.
The SQL DDL includes commands for defining views.
5) Transaction control.
SQL includes commands for specifying the beginning and end of transactions.
6) Embedded SQL and dynamic SQL.
Embedded and dynamic SQL define how SQL statements can be embedded
within general purpose programming languages such as C, C++, Java, PL/I,
Cobol, Pascal and Fortran.
7) Authorization.
The SQL DDL includes commands for specifying access rights to relations and views.
Data Definition:
The set of relations in the database is specified using the DDL.
The DDL defines the following:
o Schema for each relation.
o Domain of values associated with each attribute.
o Integrity constraints.
o The set of indices to be maintained for each relation.
o The security and authorization information for each relation.
o Physical storage structure for each relation on disk.
Basic domain types:
1. char(n): A fixed length character string of user specified length
n. Full form: character.
2. varchar(n): A variable length character string with a maximum
length of n. Full form: character varying.
3. int: A finite subset of the integers that is machine dependent. Full
form: integer.
4. smallint: A machine dependent subset of the integer domain.

5. numeric(p,d): A fixed point number with user specified precision. The
number consists of p digits, of which d are to the right of the decimal
point (d ≤ p).
6. real, double precision: Floating point and double precision floating
point numbers with machine dependent precision.
7. float(n): A floating point number with at least n digits of
precision.
8. date: A calendar date with year (four digits), month and day.
9. time: The time of day in hours, minutes and seconds. time(p) specifies
the number of fractional digits for seconds.
10. timestamp: A combination of date and time. timestamp(p) specifies
the number of fractional digits for seconds.

Basic schema definition in SQL:


create table r
(A1 D1, A2 D2, ..., An Dn,
<integrity constraint 1>,
...
<integrity constraint k>);
Where r is the name of the relation, each Ai is the name of an attribute and
Di is the domain type of that attribute. The integrity constraints may be
primary key, foreign key, not null, unique, check etc.
1. Primary key: primary key (Aj1, Aj2, ..., Ajm)
The primary key attributes are required to be non-null and unique.
Ex: 1. create table depositor
(customer_name char(20),
account_number char(10),
primary key (customer_name, account_number));
2. create table account
(account_number char(10),
branch_name char(15),
balance numeric(12,2),
primary key (account_number));
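The effect of a primary key declaration can be tried out with a small script. The following is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in database engine, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for the DBMS; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table account (
        account_number char(10),
        branch_name    char(15),
        balance        numeric(12,2),
        primary key (account_number)
    )
""")
conn.execute("insert into account values ('A-111', 'Chitradurga', 1500)")

# A second tuple with the same primary key value should be rejected.
try:
    conn.execute("insert into account values ('A-111', 'Shimoga', 900)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("select count(*) from account").fetchone()[0]
print(duplicate_rejected, row_count)
```

The second insert fails with an integrity error, so the relation still holds only the first tuple.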

Commands over a relation.


Insert:
insert into r values
(value-1, value-2, ..., value-n);
Ex: insert into account values
('A-111', 'chitradurga', 1500);
In this form of the insert statement, values are specified in the order in
which the attributes are listed in the relation schema. Otherwise the
attribute names must also be listed, after the relation name:
insert into account (branch_name, balance, account_number) values
('chitradurga', 1500, 'A-111');
Delete:
Deleting all tuples
delete from r;
Deleting a table
drop table r;
Altering a table:
Alter table r add A D;
Alter table r drop A;
Where r is an existing relation, A is attribute name and D is Domain type.


Basic structure of SQL Queries:


The basic structure of an SQL expression consists of three clauses:
select, from, and where.
1. The select clause corresponds to the projection operation of the
relational algebra. It is used to list the attributes desired in
the result of a query.
2. The from clause corresponds to the Cartesian product operation of the
relational algebra. It lists the relations to be scanned in the
evaluation of the expression.
3. The where clause corresponds to the selection predicate of the
relational algebra. It consists of a predicate involving attributes of
relations that appear in the from clause.
A typical SQL query is of the form:
select A1, A2, ..., An
from r1, r2, ..., rm
where P;
Each Ai represents an attribute and each ri a relation. P is a predicate. This
query is equivalent to the relational algebra expression
ΠA1,A2,…,An(σP(r1 × r2 × … × rm))
If the where clause is omitted, the predicate P is true. Unlike the
result of a relational algebra expression, the result of an SQL query may
contain multiple copies of some tuples.

The select clause:


To list all branch_names in the loan relation
Select branch_name
from loan;
To eliminate duplicates in branch_name
Select distinct branch_name
from loan;
To explicitly specify that duplicates are retained, use the keyword ‘all’:
select all branch_name
from loan;

The asterisk symbol ‘*’ is used to specify ‘all attributes’.
“select * from …;” selects all attributes of all relations
appearing in the from clause.
The select clause may also contain arithmetic expressions involving the
operators +, -, * and /.
Ex: select loan_number, branch_name, amount*100 from loan;
Where clause:
The where clause can use the comparison operators <, <=, >, >=, = and <>.
SQL allows comparison of strings and arithmetic expressions, as well as
special types such as date.
Several comparisons can be connected via the logical operators
and (∧), or (∨) and not (¬) in the where clause.
SQL also includes the comparison operator ‘between’ to select a particular
range of values.
Examples:
1. Find all loans made at the Bangalore branch with an amount greater than 1200.
select loan_number
from loan
where branch_name='Bangalore' and amount>1200;


2. Find loans whose amount is at least 50000 and at most 100000.
select loan_number
from loan
where amount between 50000 and 100000;
• The bounds of between are inclusive. We can also use not between to
exclude the range.
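The where-clause examples above can be checked on a small table. This is a minimal sketch that assumes Python's sqlite3 module as the engine; the loan rows and the helper name loan_numbers are invented for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite database with invented loan rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table loan (loan_number char(10), branch_name char(15), amount numeric(12,2))"
)
conn.executemany(
    "insert into loan values (?, ?, ?)",
    [
        ("L-1", "Bangalore", 1000),
        ("L-2", "Bangalore", 50000),
        ("L-3", "Mysore", 75000),
        ("L-4", "Bangalore", 100000),
        ("L-5", "Mysore", 120000),
    ],
)

def loan_numbers(where_clause):
    """Return the sorted loan numbers selected by the given where clause."""
    sql = "select loan_number from loan where " + where_clause
    return sorted(row[0] for row in conn.execute(sql))

big_bangalore = loan_numbers("branch_name = 'Bangalore' and amount > 1200")
in_range = loan_numbers("amount between 50000 and 100000")
out_of_range = loan_numbers("amount not between 50000 and 100000")
print(big_bangalore, in_range, out_of_range)
```

Loans L-2 and L-4 sit exactly on the bounds and are still selected by between, which shows that both bounds are included.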

From clause:
The from clause defines a Cartesian product of the relations listed in the
clause. A natural join is defined in terms of a Cartesian product, a
selection and a projection.
To select customer_name, loan_no and amount from the borrower and loan
tables:
Relational algebra expression: Πcustomer_name,loan_no,amount(borrower ⋈ loan)
SQL statement: select customer_name, borrower.loan_no, amount
from borrower, loan
where borrower.loan_no=loan.loan_no;
If attributes in two relations have the same name, we avoid ambiguity
by prefixing the attribute with its relation name using the dot operator.

The Rename operation:


SQL provides a mechanism for renaming both relations and attributes.
It uses the “as” clause: old_name as new_name.
The “as” clause can appear in both the select and from clauses.
select customer_name, borrower.loan_number as loan_id, amount
from borrower, loan
where borrower.loan_number=loan.loan_number;
This is helpful to change the name of an attribute in the result.

Tuple variables:
The “as” clause is particularly useful in defining the notion of tuple
variables. A tuple variable in SQL must be associated with a particular
relation. Tuple variables are defined in the “from” clause by way of the
“as” clause.
Ex: select customer_name, T.loan_number, S.amount
from borrower as T, loan as S
where T.loan_number=S.loan_number;
Tuple variables are most useful for comparing two tuples in the same
relation. Ex: To find the names of branches whose assets are greater than
those of at least one branch located in Bangalore:
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and S.branch_city='Bangalore';
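The self-comparison with tuple variables T and S can be tried as follows. This is a hedged sketch using Python's sqlite3 module; the branch names, cities and asset figures are invented sample data.

```python
import sqlite3

# Assumption: in-memory SQLite database; the branch rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table branch (branch_name char(15), branch_city char(30), assets int)"
)
conn.executemany(
    "insert into branch values (?, ?, ?)",
    [
        ("Majestic", "Bangalore", 500),
        ("Jayanagar", "Bangalore", 900),
        ("Fort", "Chitradurga", 700),
        ("Market", "Shimoga", 400),
    ],
)

# T and S range over the same relation, so each branch T is compared
# with every Bangalore branch S.
rows = conn.execute("""
    select distinct T.branch_name
    from branch as T, branch as S
    where T.assets > S.assets and S.branch_city = 'Bangalore'
""")
richer_than_some_bangalore_branch = sorted(row[0] for row in rows)
print(richer_than_some_bangalore_branch)
```

Only branches with assets above the smallest Bangalore branch (500) qualify, so Majestic itself and Market are left out.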
String Operations:
Strings are specified by enclosing them in single quotes. If we want to
use a single quote within a string, we have to write
two single quotes.
Ex: the string “It’s right” is written as 'It''s right'.
Patterns are matched using the “like” operator, which recognises two special
characters.
1. Percent (%): The % character matches any substring.
2. Underscore (_): The _ character matches any one character.
Example: list the customers who have ‘A’ in their names.
select customer_name
from customer
where customer_name like '%A%';


String matching examples:


1. Name starting with ‘Ba’             'Ba%'
2. Name has ‘h’ in the 3rd position    '__h%'
3. Name has exactly 3 characters       '___'
4. Name has at least 3 characters      '___%'

Escape character: % and _ match substrings and single characters respectively.
If we want to match these characters themselves, we have to put an escape
character such as ‘\’ before them in the pattern and declare it with the
escape keyword.
Ex: To match “a%b” the pattern is: like 'a\%b' escape '\'
To match “a_b” the pattern is: like 'a\_b' escape '\'
To match “a\b” the pattern is: like 'a\\b' escape '\'
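The pattern rules above can be checked with a few strings. This is a minimal sketch assuming Python's sqlite3 module; the table t, its contents and the helper matches are invented for illustration. Note that SQLite's like is case-insensitive for ASCII letters by default, which other systems may not share.

```python
import sqlite3

# Assumption: in-memory SQLite database with a one-column table of sample strings.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (s varchar(20))")
conn.executemany(
    "insert into t values (?)",
    [("Banu",), ("Mohan",), ("Ravi",), ("a%b",), ("axb",)],
)

def matches(pattern, use_backslash_escape=False):
    """Return the sorted strings in t that match the given like pattern."""
    sql = "select s from t where s like ?"
    if use_backslash_escape:
        sql += " escape '\\'"  # the SQL text becomes: ... escape '\'
    return sorted(row[0] for row in conn.execute(sql, (pattern,)))

starts_with_ba = matches("Ba%")
h_in_third = matches("__h%")
unescaped = matches("a%b")  # % is still a wildcard here
escaped = matches("a\\%b", use_backslash_escape=True)  # % is a literal character
print(starts_with_ba, h_in_third, unescaped, escaped)
```

Without the escape clause the pattern 'a%b' also matches 'axb'; with it, only the string containing a real percent sign is returned.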

String functions
Upper() Converts the given string to upper case.
Lower() Converts the given string to lower case.
‘||’ is a concatenation operator to join strings.

Apart from these, SQL offers a number of other string functions.

Note: In the SQL:1999 standard we can use the “similar to” operation instead
of “like”.

Ordering the display of tuples


To sort the tuples in the result of a query we use the “order by” clause.
select distinct customer_name
from depositor
order by customer_name;
To specify the sort order we use asc (ascending, the default) and desc
(descending). Sorting can be performed on multiple attributes. The order by
clause is written after the where clause.
Ex: select customer_name, balance
from depositor, account
where depositor.account_no=account.account_no
order by customer_name asc, balance desc;
SQL allows duplicates in query results, which is useful in some
situations. How many copies of a tuple appear in a result is defined using
multiset versions of the relational operators. Given multiset
relations r1 and r2:
1. If there are c1 copies of tuple t1 in r1 and t1 satisfies
selection σθ, then there are c1 copies of t1 in σθ(r1).
2. For each copy of tuple t1 in r1, there is a copy of tuple ΠA(t1)
in ΠA(r1), where ΠA(t1) denotes the projection of the single tuple t1.
3. If there are c1 copies of tuple t1 in r1 and c2 copies of tuple t2 in r2,
there are c1×c2 copies of the tuple t1.t2 in r1×r2.

Set Operations:
The SQL operations union, intersect and except correspond to the
relational algebra operations union ∪, intersection ∩ and set difference -.
The set operators automatically eliminate duplicates. If we want to retain
duplicates, we have to use union all, intersect all and except all.
Consider two queries:
• select customer_name from depositor;
and
• select customer_name from borrower;


The union operator:


To find all bank customers having a loan, an account, or both at the bank.
Ex: (Select customer_name from depositor)
Union
(Select customer_name from borrower);
To allow duplicates in the result:
(Select customer_name from depositor)
Union all
(Select customer_name from borrower);
The intersect operation:
To find customers who have both a loan and an account at the bank.
(select distinct customer_name from depositor)
intersect
(select distinct customer_name from borrower);

The except operation:
To find all customers who have an account but no loan at the bank.
(select distinct customer_name from depositor)
except
(select distinct customer_name from borrower);
To retain duplicates we write:
(select customer_name from depositor)
except all
(select customer_name from borrower);
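The set operations can be compared side by side on two small relations. This is only a sketch, assuming Python's sqlite3 module; the customers are invented. SQLite does not accept parentheses around the operands of a set operation and has no intersect all or except all, so the queries below are written without them.

```python
import sqlite3

# Assumption: in-memory SQLite database; customer rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table depositor (customer_name char(20), account_number char(10))")
conn.execute("create table borrower (customer_name char(20), loan_number char(10))")
conn.executemany(
    "insert into depositor values (?, ?)",
    [("Raju", "A-1"), ("Raju", "A-2"), ("Ravi", "A-3")],
)
conn.executemany(
    "insert into borrower values (?, ?)",
    [("Raju", "L-1"), ("Ramesh", "L-2")],
)

def names(operator):
    """Combine the two customer lists with the given set operator."""
    sql = (
        "select customer_name from depositor "
        + operator
        + " select customer_name from borrower"
    )
    return sorted(row[0] for row in conn.execute(sql))

union_names = names("union")          # duplicates eliminated
union_all_names = names("union all")  # duplicates retained
both = names("intersect")             # account and loan
account_only = names("except")        # account but no loan
print(union_names, union_all_names, both, account_only)
```

Raju has two accounts and one loan, so the name appears three times under union all but only once under union.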
Aggregate functions:
Aggregate functions take a collection of values as input and return a
single value. SQL offers 5 built-in aggregate functions.
• Average : avg()
• Minimum : min()
• Maximum : max()
• Total : sum()
• Count : count()
sum and avg operate only on numbers, but min, max and count can operate on
numeric as well as non-numeric data.
Example
Consider employee table with (emp_name, dept_id, salary)
1. To find total no of employees
Select count(*) from employee;
Or
Select count(emp_name) from employee;
2. To find no of employees in each dept
Select dept_id,count(dept_id)
from employee
group by dept_id;
3. To find average and total salary given to all employees
Select avg(salary),sum(salary)
From employee;
4. To find minimum & maximum salary given to all employees
Select min(salary),max(salary)
From employee;
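The group by clause collects tuples having the same value on the grouping
attributes into one group, and the result has one tuple per group.
The where clause is used to test attributes of individual tuples, before
groups are formed.
The having clause is used to test aggregate function results, after groups
are formed.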
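The difference between where (applied to tuples) and having (applied to groups) is easiest to see by running the queries. A minimal sketch, assuming Python's sqlite3 module and three invented employee rows:

```python
import sqlite3

# Assumption: in-memory SQLite database; employee rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table employee (emp_name char(20), dept_id int, salary int)")
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [("Anil", 1, 1000), ("Bina", 1, 3000), ("Chetan", 2, 5000)],
)

total_employees = conn.execute("select count(*) from employee").fetchone()[0]

per_dept = conn.execute(
    "select dept_id, count(dept_id) from employee group by dept_id order by dept_id"
).fetchall()

avg_salary, sum_salary = conn.execute(
    "select avg(salary), sum(salary) from employee"
).fetchone()

# where filters tuples before grouping ...
well_paid_per_dept = conn.execute(
    "select dept_id, count(*) from employee where salary >= 3000 "
    "group by dept_id order by dept_id"
).fetchall()

# ... while having filters whole groups after the aggregates are computed.
high_avg_depts = conn.execute(
    "select dept_id from employee group by dept_id having avg(salary) > 2500"
).fetchall()
print(total_employees, per_dept, avg_salary, sum_salary, well_paid_per_dept, high_avg_depts)
```

Department 1 has an employee earning 3000, so it survives the where filter, but its average salary of 2000 fails the having test.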


Consider a student table with the following schema:
student(reg_no, name, category, combination, sem, sex, marks)

5. The following query displays the sum of marks for each category of PMCs
students, keeping only the categories whose sum is more than the average
marks of all students.
select category, sum(marks)
from student
where combination = 'PMCs'
group by category
having sum(marks) > (select avg(marks) from student);

Null values:
A null value indicates that the value is unknown or does not exist. We can
use the keyword null in a predicate to test for null values.
Ex: select loan_no
from loan
where amount is null;
We can also use “is not null” to test for the absence of a null value.
The result of an arithmetic expression is null if any of its operands is
null. Any comparison (>, >=, <, <=, =, <>) involving a null value evaluates
to unknown rather than true or false. Since predicates in the where clause
can combine comparisons with and, or and not, these logical operators are
extended to handle unknown:
AND
true and unknown = unknown
false and unknown = false
unknown and unknown = unknown
OR
true or unknown = true
false or unknown = unknown
unknown or unknown = unknown
NOT: not unknown = unknown.
A tuple is added to the result only if the where predicate evaluates to
true, not to false or unknown.
As per SQL:1999, Boolean type data can take the values true, false and
unknown (null).
All aggregate functions except count(*) ignore null values.
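The rules for null can be observed directly. This is a hedged sketch that assumes Python's sqlite3 module; SQLite has no separate Boolean type, so true and false appear as 1 and 0, and unknown appears as null (None in Python). The loan rows are invented.

```python
import sqlite3

# Assumption: in-memory SQLite database; 1/0/None stand for true/false/unknown.
conn = sqlite3.connect(":memory:")

def value(expression):
    """Evaluate a single SQL expression and return its value."""
    return conn.execute("select " + expression).fetchone()[0]

arithmetic = value("5 + null")   # null operand gives null
comparison = value("1 > null")   # comparison with null is unknown
and_false = value("null and 0")  # false and unknown = false
and_true = value("null and 1")   # true and unknown = unknown
or_true = value("null or 1")     # true or unknown = true
or_false = value("null or 0")    # false or unknown = unknown
negation = value("not null")     # not unknown = unknown

conn.execute("create table loan (loan_no char(10), amount int)")
conn.executemany(
    "insert into loan values (?, ?)",
    [("L-1", 1000), ("L-2", None), ("L-3", 3000)],
)
count_all, count_amount, total = conn.execute(
    "select count(*), count(amount), sum(amount) from loan"
).fetchone()
is_null_rows = conn.execute("select loan_no from loan where amount is null").fetchall()
equals_null_rows = conn.execute("select loan_no from loan where amount = null").fetchall()
print(count_all, count_amount, total, is_null_rows, equals_null_rows)
```

The predicate amount = null selects nothing because it is never true, which is why “is null” has to be used instead.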
Nested Subqueries
A subquery is a select-from-where expression nested within another
query. Subqueries are commonly used to perform tests for set membership,
make set comparisons and determine set cardinality.
1) Set membership: the ‘in’ connective tests for set membership, where set
is a collection of values produced by a select clause. The ‘not in’
connective tests for the absence of set membership.
Ex: 1) To find all customers who are borrower of the bank and also having
accounts.
Select distinct customer_name
From borrower
Where customer_name in (select customer_name from depositor);


Ex: 2) To find customers who have a loan at the bank but do not have an
account.
select distinct customer_name
from borrower
where customer_name not in (select customer_name from depositor);
2) Set comparison:
To compare sets we use two keywords, “some” and “all”. SQL allows the
comparisons >some, >=some, =some, <>some, <some and <=some.
Ex: >some means “greater than at least one”.
The keyword “any” is synonymous with “some” in SQL.
=some is identical to “in”, but <>some is not the same as “not in”.
To find the names of all departments that pay some employee a salary higher
than the salary of at least one employee of the ‘physics’ department,
consider the relation emp_sal(emp_name, dept_name, salary):

select distinct dept_name
from emp_sal
where salary >some (select salary
from emp_sal
where dept_name='physics');

Similar to some we have the keyword “all”: <all, <=all, >=all, >all, =all
and <>all. <>all is identical to “not in”.
Ex: Find the department that has the highest average salary.
Aggregate functions in SQL cannot be composed, so max(avg(salary)) is not
allowed. Instead we write a subquery that computes avg(salary) for each
department and select the department whose average is greater than or equal
to all of them.

select dept_name
from emp_sal
group by dept_name
having avg(salary) >=all (select avg(salary)
from emp_sal
group by dept_name);
3) Test for empty relations:
Using SQL we can test whether a sub query has any tuples in its
result. “exists” construct returns true if the argument sub query is non
empty.
Find all the customers who have both an account and a loan at the bank.
Select customer_name
From borrower
Where exists (select *
from depositor
where depositor.Customer_name=borrower.Customer_name);

We can also test for the non-existence of tuples in a subquery by using the
“not exists” construct.
Find all customers who have an account at all branches located in CTA:
select distinct S.customer_name
from depositor as S
where not exists ((select branch_name
from branch
where branch_city='CTA')
except
(select A.branch_name
from depositor as D, account as A
where D.account_no=A.account_no
and S.customer_name=D.customer_name));
The first subquery finds all branches in CTA and the second finds the
branches at which customer S has an account. The customer qualifies when
the difference between the two is empty.
4) Test for the absence of duplicate tuples.
The unique construct returns the value true if the argument subquery
contains no duplicate tuples.
To find all customers who have at most one account at the ‘CTA’ branch.
select T.customer_name
from depositor as T
where unique (select R.customer_name
from account, depositor as R
where T.customer_name=R.customer_name and
R.account_no=account.account_no and
account.branch_name='CTA');
If we use “not unique” in the above query instead of “unique” we can
find all customers who have at least two accounts at the ‘CTA’ branch. The
unique test can be true even if there are multiple copies of a tuple, as
long as at least one of the attributes of that tuple is null.

Complex Queries:
Simple queries in SQL consist of a single select-from-where statement,
possibly with group by and having clauses. We can compose complex queries
using 1. derived relations and 2. the with clause.
1. Derived relations:
A subquery expression can be used in the from clause of a select statement.
If we use such an expression, the result of the subquery must be given
a name, and we can also rename its attributes using the ‘as’ clause.
Consider a subquery that selects branch names with their average balance:
(select branch_name, avg(balance)
from account
group by branch_name)
as branch_avg(branch_name, avg_balance)
To select the branch_name and average balance of branches whose average
balance is more than 1200:
select branch_name, avg_balance
from (select branch_name, avg(balance)
from account
group by branch_name)
as branch_avg(branch_name, avg_balance)
where avg_balance > 1200;

To find the maximum total balance among all branches:

select max(tot_bal)
from (select branch_name, sum(balance)
from account
group by branch_name)
as branch_total(branch_name, tot_bal);

2. The with clause:


The with clause provides a way of defining a temporary view which is
available only to the query in which the with clause occurs. The with
clause, introduced in SQL:1999, is supported only by some databases.

Ex: (1) with max_balance(value) as
(select max(balance)
from account)
select account_number
from account, max_balance
where account.balance=max_balance.value;

This query displays the account_number of each account whose balance is
equal to the maximum value selected in the temporary view created by the
‘with’ clause.
(2) To find all branches where the total account deposit is greater than
the average of the total account deposits at all the branches.
with branch_total(branch_name, value) as
(select branch_name, sum(balance)
from account
group by branch_name),
branch_total_avg(value) as
(select avg(value)
from branch_total)
select branch_name
from branch_total, branch_total_avg
where branch_total.value > branch_total_avg.value;

The above queries can be written without “with” clause, but it would
be more complicated and harder to understand. Using with clause makes the
query logic clearer. It also permits a view definition to be used in
multiple places within a query.
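Both with-clause queries can be run on a small account table. This is a minimal sketch assuming Python's sqlite3 module (SQLite is one of the systems that supports the with clause); the account rows are invented.

```python
import sqlite3

# Assumption: in-memory SQLite database; account rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account (account_number char(10), branch_name char(15), balance int)"
)
conn.executemany(
    "insert into account values (?, ?, ?)",
    [("A-1", "CTA", 1000), ("A-2", "CTA", 3000), ("A-3", "SMG", 1500), ("A-4", "DVG", 500)],
)

# (1) Accounts holding the maximum balance.
richest_accounts = conn.execute("""
    with max_balance(value) as
        (select max(balance) from account)
    select account_number
    from account, max_balance
    where account.balance = max_balance.value
""").fetchall()

# (2) Branches whose total deposit exceeds the average branch total.
# Branch totals are CTA 4000, SMG 1500 and DVG 500, so the average is 2000.
above_average_branches = conn.execute("""
    with branch_total(branch_name, value) as
        (select branch_name, sum(balance)
         from account
         group by branch_name),
    branch_total_avg(value) as
        (select avg(value) from branch_total)
    select branch_name
    from branch_total, branch_total_avg
    where branch_total.value > branch_total_avg.value
""").fetchall()
print(richest_accounts, above_average_branches)
```

The second temporary view is defined in terms of the first, which is what makes the query easier to read than the equivalent nested form.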

Views:
Views allow us to create a personalized collection of relations that
is better matched to a certain user’s intuition. Using views we can also
make some attributes accessible to certain users only and restrict access to
other data. Any relation that is not part of the logical model, but is
made visible to a user as a virtual relation, is called a “view”.

Definition of view: create view v as <query expression>

Here v is the view name and <query expression> is any legal query
expression. A view can be used like a table in queries. Update operations on
views are restricted: a view is updatable only if it satisfies certain rules.
When we define a view, the database system stores the definition of the
view itself, rather than the result of evaluating the relational algebra
expression that defines the view. The view relation is stored as a query
expression, so each time we use the view, the view relation is recomputed.

Ex: (1) create view branch_total(branch_name, total)
as select branch_name, sum(amount)
from loan
group by branch_name;
This view can be used to define other views also:
(2) create view branch_total_max(max)
as select max(total) from branch_total;
A view may be defined in terms of other views. Replacing each view name in
a query by its definition is known as view expansion. A view cannot be
defined in terms of itself: view definitions are not recursive.
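That a view is stored as a definition and recomputed on every use can be seen by changing the base relation between two queries. A minimal sketch, assuming Python's sqlite3 module; the loan rows are invented and the column name max_total is a hypothetical rename chosen to avoid clashing with the max function.

```python
import sqlite3

# Assumption: in-memory SQLite database; loan rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_no char(10), branch_name char(15), amount int)")
conn.executemany(
    "insert into loan values (?, ?, ?)",
    [("L-170", "Dvg", 3000), ("L-230", "Cta", 4000)],
)

conn.execute("""
    create view branch_total(branch_name, total) as
    select branch_name, sum(amount)
    from loan
    group by branch_name
""")
# A view defined in terms of another view.
conn.execute("""
    create view branch_total_max(max_total) as
    select max(total) from branch_total
""")

before = conn.execute("select max_total from branch_total_max").fetchone()[0]

# Changing the base relation changes what the views return,
# because only their definitions are stored.
conn.execute("insert into loan values ('L-260', 'Dvg', 2530)")
after = conn.execute("select max_total from branch_total_max").fetchone()[0]
print(before, after)
```

After the new loan is inserted, the Dvg total becomes 5530 and the second view reflects it without being redefined.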


Modification of database
Deletion, updates and insertion are the modification operations on a
database.
1. Deletion:
Using a delete statement we can delete only whole tuples; we cannot delete
the values of particular attributes.
Syntax:
delete from r where P;
Where P is a predicate and r is a relation. The delete
statement first finds all tuples t in r for which P(t) is true and then
deletes them. If the where clause is omitted, all tuples in r are deleted.
Delete from loan;
This command deletes all tuples from the loan relation. The delete
command operates on only one relation. If we want to delete tuples from
several relations we must use one delete command for each relation.
The predicate in the where clause may be as complex as the predicate
in the where clause of a select statement, i.e., comparison operators,
logical operators, set operators and subqueries can all be used.
• To delete the employee whose emp_id is 108
delete from employee
where emp_id=108;
• To delete the employees whose first name starts with ‘p’
delete from employee
where first_name like 'p%';
• To delete the employees whose salary is less than the average salary
delete from employee
where salary < (select avg(salary) from employee);
All tuples are tested before any of them is deleted, so the average is
computed on the original relation.
Although we can delete tuples from only one relation at a time, we can
reference any number of relations in the where clause using a nested
subquery (select-from-where).
2. Insertion:
The simple insert statement is a request to insert one tuple.
Ex: insert into account values (1031, 'shimoga', 1500);
This inserts one tuple into the account table with account_no=1031,
branch_name='shimoga' and balance=1500.
If we want to give the values in a different order from the
attributes of the schema, we list the attribute names:
insert into account(branch_name, balance, account_no)
values('shimoga', 1500, 1031);
We can also insert tuples into a relation using the result of a query.
Consider a table emp_dept with employee id, salary and department:
Ex: emp_dept(emp_id, salary, dept_id)
insert into emp_dept
select emp_id, salary, dept_id
from employee
where salary>25000;
The select statement must be evaluated fully before any tuple is inserted.
Ex: insert into account
select * from account;
If tuples were inserted while the select statement was still being
evaluated, this request might insert an infinite number of tuples, because
the select would keep finding the newly inserted rows and never reach the
end of the table.
An insert statement may also insert null values.
Ex: If we do not know the branch of account 108, then branch_name is null.
insert into account values('108', null, 1200);
3. Updates:
The update statement changes the values of attributes in the tuples for
which the predicate in the where clause is true.
Syntax:
update r
set A=V
where P;
Where P is a predicate, r is a relation and A is the name of the attribute
whose value is changed to V.
Ex: 1. To give a 10% increment to the salary of all employees we can write:
update employee
set salary = salary*1.1;
2. To give the increment to employees of a certain department:
update employee
set salary = salary*1.1
where dept_id = 4;
As with insert and delete, an update statement updates only one relation
at a time and may contain a complex predicate in its where clause. An update
statement can also use a case expression, which applies exactly one of the
branches to each tuple:
Ex: update employee
set salary = case
when salary>=50000 then salary*1.10
when salary>=25000 then salary*1.05
else salary*1.03
end;
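The case-based update can be verified on three employees, one in each salary band. This is a minimal sketch assuming Python's sqlite3 module; the rows are invented, and the results are rounded because the raises are computed in floating point.

```python
import sqlite3

# Assumption: in-memory SQLite database; one invented employee per salary band.
conn = sqlite3.connect(":memory:")
conn.execute("create table employee (emp_id int, salary real)")
conn.executemany(
    "insert into employee values (?, ?)",
    [(1, 60000), (2, 30000), (3, 10000)],
)

# Each tuple takes exactly one branch of the case expression, so nobody is
# raised twice, which could happen with two separate update statements.
conn.execute("""
    update employee
    set salary = case
        when salary >= 50000 then salary * 1.10
        when salary >= 25000 then salary * 1.05
        else salary * 1.03
    end
""")

salaries = [
    round(row[0], 2)
    for row in conn.execute("select salary from employee order by emp_id")
]
print(salaries)
```

If the 5% raise were applied in a separate statement before the 10% one, an employee just under 50000 could cross the threshold and receive both raises; the single case expression avoids that ordering problem.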

Update of a view:
A modification expressed on a view has to be translated into a modification
of the actual relations in the logical model of the database before it can
be executed. This can cause serious problems.
1. Consider the relation loan(loan_no, branch_name, amount). A view is
defined to select loan_no and branch_name.
create view loan_branch as
select loan_no, branch_name
from loan;
Suppose we try to insert into the loan_branch view:
insert into loan_branch values ('101', 'CTA');

This insert has to be carried out on the table ‘loan’, which has 3
attributes, but the statement supplies no value for the ‘amount’
attribute. There are two possible ways to handle it:
• The DBMS may reject the operation.
OR
• The DBMS may insert the tuple ('101', 'CTA', null) into loan, with a null
value for ‘amount’.
2. Consider one more relation, borrower(cust_name, loan_no). A view selects
data from both tables.
create view loan_info as
select cust_name, amount
from borrower, loan
where borrower.loan_no=loan.loan_no;
Suppose we try to insert into this view:
insert into loan_info values('Raju', 1500);
These values would have to be inserted into the two tables borrower and
loan as
('Raju', null) into borrower and
(null, null, 1500) into loan.
This update must not be allowed: the loan_no that links the two tuples is
missing (and loan_no may be the primary key of loan, which cannot be null),
so the view would still not contain the desired tuple ('Raju', 1500).
3. Consider a view with a where condition in its definition.
create view loan_info as
select loan_no, branch_name, amount
from loan
where branch_name='shimoga';
Suppose we try to insert a tuple of another branch:
insert into loan_info values ('l-02', 'chitradurga', 2500);
SQL may allow this tuple to be inserted into loan even though it would not
appear in the view, which is not desirable. To prevent this, the view must
be defined with “with check option” at the end of the view definition.
In general, an SQL view is updatable if it satisfies the following
conditions:
• The from clause has only one database relation.
• The select clause contains only attribute names of the relation and
does not have any expressions, aggregates or distinct specification.
• Any attribute not listed in the select clause can be set to null.
• The query does not have a group by or having clause.
Under these conditions the update, insert and delete operations are
allowed on the view.

Transactions:
A transaction consists of a sequence of query and/or update statements. A
transaction begins implicitly when an SQL statement is executed. A
transaction must end with one of the following statements: commit or
rollback.
1. commit work: commits the current transaction, i.e., it makes the updates
performed by the transaction permanent in the database. After the
transaction is committed, a new transaction is automatically started.
2. rollback work: causes the current transaction to be rolled back. It
undoes all the updates performed by the SQL statements in the transaction.
The keyword ‘work’ is optional in both statements.
Some tasks must be atomic, i.e., two or more statements need to be executed
as a single transaction, such as a transfer of funds from one account to
another. The SQL:1999 standard offers an alternative, supported only in some
SQL implementations: multiple SQL statements can be enclosed between the
keywords
begin atomic … end
All the statements between the keywords then form a single
transaction.
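The fund transfer mentioned above can be used to see commit and rollback at work. This is only a sketch: it assumes Python's sqlite3 module in its default mode, where a transaction is opened implicitly before an update statement; the accounts and the helper names transfer and balances are invented.

```python
import sqlite3

# Assumption: in-memory SQLite database with two invented accounts.
conn = sqlite3.connect(":memory:")
conn.execute("create table account (account_no char(10) primary key, balance int)")
conn.executemany("insert into account values (?, ?)", [("A-1", 500), ("A-2", 200)])
conn.commit()

def balances():
    """Return all (account_no, balance) pairs in account order."""
    return conn.execute(
        "select account_no, balance from account order by account_no"
    ).fetchall()

def transfer(amount):
    """Move money from A-1 to A-2; both updates belong to one transaction."""
    conn.execute(
        "update account set balance = balance - ? where account_no = 'A-1'", (amount,)
    )
    conn.execute(
        "update account set balance = balance + ? where account_no = 'A-2'", (amount,)
    )

transfer(100)
conn.rollback()            # undo both updates together
after_rollback = balances()

transfer(100)
conn.commit()              # make both updates permanent
after_commit = balances()
print(after_rollback, after_commit)
```

Either both updates take effect or neither does, so the total of the two balances is 700 at every point where the data is visible.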


Joined Relations:
Apart from joining tuples by Cartesian product, SQL provides other
mechanisms to join relations, such as condition join, natural join and
various forms of outer join.
These join operations are typically used as subquery expressions in
the from clause.

Inner join:
Consider two relations
loan(loan_no, branch_name, amount)
borrower(customer_name, loan_no)

To use an inner join we can write the following in the from clause of the
select statement:
select ...
from loan inner join borrower on
loan.loan_no = borrower.loan_no;
The expression computes the join of the loan and borrower relations. The
attributes of the result consist of the attributes of the left hand side
relation followed by the attributes of the right hand side relation.
Using the ‘as’ clause we can rename the result relation and its attributes:
Ex: loan inner join borrower
on loan.loan_no=borrower.loan_no
as Lb (loan_no, branch, amount, cust, cust_loan_no);
An inner join keeps only those tuples of each relation that have a
matching tuple in the other relation.

Natural join:
A natural join joins two relations without an explicit join condition,
but it requires the attributes to be joined to have the same name in both
relations.
Ex: loan_no in loan and loan_no in borrower.
select ...
from loan natural inner join borrower;
This expression computes the natural join of the two relations using
the only common attribute name, loan_no.
A natural join is similar to an inner join, but its result contains the
loan_no attribute only once.
The attributes will be (loan_no, branch_name, amount,
customer_name),
whereas an inner join with a join condition will produce loan_no
twice: (loan_no, branch_name, amount, customer_name, loan_no).
loan
Loan_no   Branch_name   Amount
L-170     Dvg           3000
L-230     Cta           4000
L-260     Smg           2530

borrower
Customer_name   Loan_no
Ramesh          L-170
Raju            L-230
Ravi            L-155

Inner join output

Loan_no   Branch_name   Amount   Customer_name   Loan_no
L-170     Dvg           3000     Ramesh          L-170
L-230     Cta           4000     Raju            L-230


Natural join output


Loan_no   Branch_name   Amount   Customer_name
L-170     Dvg           3000     Ramesh
L-230     Cta           4000     Raju

Left outer join


Tuples in the left hand side relation that do not match any tuple in the
right hand side relation are padded with nulls for the right hand side
attributes and added to the result, along with the result of the inner join.
Ex: select ...
from loan left outer join borrower on
loan.loan_no=borrower.loan_no;
The output of this query will be:
Loan_no   Branch_name   Amount   Customer_name   Loan_no
L-170     Dvg           3000     Ramesh          L-170
L-230     Cta           4000     Raju            L-230
L-260     Smg           2530     null            null

Right Outer join


Right outer join is symmetric to the left outer join. Tuples from the right
hand side relation that do not match any tuple in the left hand side
relation are padded with nulls and added to the result.
Ex: select ...
from loan right outer join borrower
on loan.loan_no=borrower.loan_no;
Loan_no   Branch_name   Amount   Customer_name   Loan_no
L-170     Dvg           3000     Ramesh          L-170
L-230     Cta           4000     Raju            L-230
null      null          null     Ravi            L-155
We can also specify natural join for both left and right outer joins:
1. loan natural left outer join borrower
2. loan natural right outer join borrower
As with the natural inner join, the result of a natural left or right outer
join has only one attribute corresponding to “loan_no”, which is the join
attribute.

Full Outer join:


The full outer join is a combination of the left and right outer join types.
After the operation computes the result of the inner join, it extends with
nulls the tuples from the left hand side relation that did not match any
tuple from the right hand side, and adds them to the result.
Similarly, it extends with nulls the tuples from the right hand side
relation that did not match any tuple from the left hand side
relation and adds them to the result.
Ex: select ...
from loan full outer join borrower using (loan_no);
Result will be:
Loan_no   Branch_name   Amount   Customer_name
L-170     Dvg           3000     Ramesh
L-230     Cta           4000     Raju
L-260     Smg           2530     null
L-155     null          null     Ravi
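The join outputs shown in the tables above can be reproduced with the same loan and borrower tuples. This is a minimal sketch assuming Python's sqlite3 module. Only the inner, natural and left outer joins are run, because right and full outer joins are available only in newer SQLite releases; the helper name run is invented.

```python
import sqlite3

# Assumption: in-memory SQLite database loaded with the tuples from the notes.
conn = sqlite3.connect(":memory:")
conn.execute("create table loan (loan_no char(10), branch_name char(15), amount int)")
conn.execute("create table borrower (customer_name char(20), loan_no char(10))")
conn.executemany(
    "insert into loan values (?, ?, ?)",
    [("L-170", "Dvg", 3000), ("L-230", "Cta", 4000), ("L-260", "Smg", 2530)],
)
conn.executemany(
    "insert into borrower values (?, ?)",
    [("Ramesh", "L-170"), ("Raju", "L-230"), ("Ravi", "L-155")],
)

def run(sql):
    """Return (number of result attributes, rows sorted by loan_no)."""
    cursor = conn.execute(sql)
    return len(cursor.description), sorted(cursor.fetchall())

inner_width, inner_rows = run(
    "select * from loan inner join borrower on loan.loan_no = borrower.loan_no"
)
natural_width, natural_rows = run("select * from loan natural join borrower")
left_width, left_rows = run(
    "select * from loan left outer join borrower on loan.loan_no = borrower.loan_no"
)
print(inner_rows, natural_rows, left_rows, sep="\n")
```

The natural join result has four attributes because loan_no appears once, while the condition joins have five; the unmatched loan L-260 shows up only in the left outer join, padded with nulls.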


Integrity Constraints:
Integrity constraints ensure that changes made to the database by
authorized users do not result in a loss of data consistency. Integrity
constraints guard against accidental damage to the database.
Examples of integrity constraints:
1. An account balance cannot be null.
2. No two accounts can have the same account_no.
3. Every account_no in the depositor relation must have a matching
account_no in the account relation.
4. The hourly wage of a bank employee must be at least 6.00.
Constraints on a single relation:
1) not null, 2) unique, 3) check(predicate)
Referential integrity is a constraint that makes a value in one
relation depend on a value in another relation.

1. Not null constraints:
The null value is a member of all domains and is by default a legal value
for any attribute of a relation. But for some attributes it is
inappropriate to allow a null value.
Ex: 1) The account_no of an account cannot be null.
2) The balance of an account cannot be null.
Not null constraints can be enforced in the create table statement
as follows.
create table account
(account_no char(10) not null,
balance numeric(12,2) not null);
The not null specification prohibits the insertion of a null value for the
attribute. The primary key attributes of a relation cannot contain a null
value.

2. Unique constraints:
unique (Aj1, Aj2, ..., Ajm)
The unique specification says that the attributes Aj1, ..., Ajm form a
candidate key: no two tuples in the relation can be equal on all the listed
attributes. Candidate key attributes can have null values unless they are
declared not null.

3. Check clause:
The check clause in SQL can be applied to relation declarations as well
as to domain declarations. In a relation declaration the clause check(p)
specifies a predicate ‘p’ that must be satisfied by every tuple in the
relation. A common use of the check clause is to ensure that attribute
values satisfy specified conditions.
Ex: 1. check(salary>=1000)
Implies that the salary of an employee must be at least 1000.
2. To simulate an enumerated type
Create table student
(name char(15) not null,
combination char(5) check(combination in('pcm','pmcs','cmcs','cbz')));
Check clause applied to a domain declaration:
Create domain hourly_wage number(5)
constraint wage_test check(value>=6);
Any attribute declared with the domain hourly_wage must have a value of at
least 6.


A domain can be restricted to contain a specified set of values:
Create domain account_type char(10)
constraint acc_test check (value in ('SB','CA','RD','FD'));
This restricts any attribute of domain account_type to one of the four
values SB, CA, RD or FD.
A subquery can also be used in a check constraint:
check(branch_name in (select branch_name from branch));
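A minimal sketch of the enumerated-type check from the student example, again run through Python's sqlite3 module. SQLite is only a stand-in here: it has no create domain statement and does not accept subqueries inside check, so only the column-level form is shown. The student names are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """create table student (
           name        char(15) not null,
           combination char(5)
               check (combination in ('pcm', 'pmcs', 'cmcs', 'cbz'))
       )"""
)

def accepted(name, combination):
    """Return True if the tuple satisfies the constraints on student."""
    try:
        conn.execute("insert into student values (?, ?)", (name, combination))
        return True
    except sqlite3.IntegrityError:
        return False

checks = {
    "pcm": accepted("ravi", "pcm"),      # listed value
    "xyz": accepted("raju", "xyz"),      # value outside the list
    "null": accepted("ramesh", None),    # null combination
}
print(checks)
```

The null combination is accepted because a check constraint is violated only when its predicate evaluates to false; with a null the predicate is unknown. Adding not null to the column would close that gap.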

4. Referential integrity:
Referential integrity ensures that a value that appears in one relation
for a given set of attributes also appears for a certain set of attributes
in another relation.
A foreign key is one form of referential integrity constraint.
The branch relation contains branch information. Every tuple of the
account relation contains a branch_name, which should be listed in the
branch relation. This can be enforced using a foreign key.
Ex: the definition of account table
Create table account
(…………..
Foreign key(branch_name) references branch);
In general, let r1(R1) & r2(R2) be relations with primary keys K1 & K2
respectively. We say that a subset α of R2 is a foreign key referencing K1
in relation r1 if it is required that for every tuple t2 in r2 there must
be a tuple t1 in r1 such that t1[K1]=t2[α].
Requirements of this form are called referential integrity
constraints, or subset dependencies. By default in SQL a foreign key
references the primary key attributes of the referenced table.
Short form of defining foreign key:
Create table ………
(…………………….
Branch_name char (15) references branch,
……………………);
When a referential integrity constraint is violated, the normal
procedure is to reject the operation that caused the violation. However, a
foreign key clause can specify that if a delete or update operation on the
referenced relation violates the constraint, then instead of rejecting the
operation the system changes the tuples in the referencing relation to
restore the constraint.
Ex: create table account
(……………
Foreign key (branch_name) references branch
on delete cascade
on update cascade,
……………);
If a tuple in the referenced branch relation is deleted, the
corresponding rows in the account relation are deleted automatically.
Similarly, if the branch_name of a tuple in the referenced branch relation
is updated, the branch_name in the referencing account tuples is updated
as well.
We can also use ‘set null’ instead of cascade, which sets the
referencing attribute to null, or ‘set default’, which sets it to the
default value of its domain.
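The cascade behaviour can be illustrated with a short sketch using Python's sqlite3 module. The account and branch values are sample data, and the pragma line is SQLite-specific (foreign key enforcement is off by default there); it is not part of the SQL shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific switch
conn.execute("create table branch (branch_name char(15) primary key)")
conn.execute(
    """create table account (
           account_no  char(10) primary key,
           branch_name char(15) references branch
               on delete cascade
               on update cascade
       )"""
)
conn.executemany("insert into branch values (?)", [("CTA",), ("DVG",)])
conn.executemany("insert into account values (?, ?)",
                 [("A-101", "CTA"), ("A-102", "DVG")])

# No branch named 'SMG' exists, so this insert violates the foreign key.
try:
    conn.execute("insert into account values ('A-103', 'SMG')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Changes to the referenced relation are propagated to account.
conn.execute("update branch set branch_name = 'Chitradurga' "
             "where branch_name = 'CTA'")
conn.execute("delete from branch where branch_name = 'DVG'")
accounts = conn.execute(
    "select account_no, branch_name from account order by account_no"
).fetchall()
print(rejected, accounts)
```

After the update and delete, account A-101 carries the new branch name and account A-102 has been removed together with its branch.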


Functional dependency
In a relation R, attribute ‘α’ is functionally dependent on attribute
β of R if and only if each β value in R has associated with it precisely one
‘α’ value in R (at any one time). We write this as β->α.
A functional dependency is a special form of integrity constraint.
Ex: consider loan(loan_no, branch_name, amount)
The relation loan satisfies: loan.loan_no->loan.amount.
By this we mean that every legal instance of the relation satisfies the
constraint. Recognizing functional dependencies is an essential part of
understanding the meaning or semantics of the data.
Partial dependency:
A functional dependency α->β is called a partial dependency if there
is a proper subset γ of α such that γ->β. We then say that β is partially
dependent on α.

5. RELATIONAL DATABASE DESIGN:


Features of good relational designs.
Design alternatives: larger schemas.
Suppose we combine (natural join) the two relations
borrower(customer_id, loan_no) and loan(loan_no, branch_name, amount).
Consider the joined table
bor_loan=(customer_id, loan_no, branch_name, amount)
customer_id   loan_no   branch_name   amount
101           L-100     Chitradurga   10000
102           L-100     Chitradurga   10000
103           L-100     Chitradurga   10000
104           L-101     Davanagere    25000
105           L-102     Chitradurga   30000
The relationship from borrower to loan is many-to-many, i.e., a customer can
have more than one loan & a loan can be given to several customers. Hence
neither customer_id nor loan_no alone can be the primary key for relation
bor_loan. The primary key for bor_loan is the combination of customer_id &
loan_no.
Loan L-100 is given to 3 customers, so branch_name Chitradurga and
amount 10000 are repeated in each of the three tuples. An update that
changes only some of these copies leaves the database inconsistent, i.e.,
repetition of information arises when we use larger schemas.

Design alternatives: smaller schemas.
In the above example the loan_no determines the branch_name and amount,
i.e., they are functionally dependent on loan_no. In order to avoid data
redundancy we can decompose this table into two:
1) loan (loan_no, branch_name, amount).
2) borrower (customer_id, loan_no).
In this case it is very easy to decompose into 2 relations using the
functional dependency loan_no->(branch_name, amount).
But decomposing a schema is hard when it has a large number of attributes.
Some decompositions are not helpful. Consider the following example,


Employee_id   Name   Address   Phone_no
101           Raju   CTA       100012
102           Raju   DVG       200002
…..           ……     …..       …..
Suppose we decompose it into two relations, employee and emp_address:
employee                  emp_address
Employee_id   Name        Name   Address   Phone_no
101           Raju        Raju   CTA       100012
102           Raju        Raju   DVG       200002
….            ….          ….     ….        ….
These relations contain 2 employees with the same name. If we try to
regenerate the original tuples by joining employee and emp_address on Name,
the output will be,
Emp_id   Name   Address   Phone_no
101      Raju   CTA       100012
101      Raju   DVG       200002
102      Raju   CTA       100012
102      Raju   DVG       200002
….       ….     ….        ….
This decomposition is not useful: the join contains more tuples than
the original relation, and we can no longer tell which address and phone
number belong to which employee, so information is lost & the original data
cannot be reproduced. This type of decomposition is called a “lossy
decomposition”.
Decomposition using the functional dependency in the previous example
of borrower & loan does not lead to loss of information & hence it is a
“lossless decomposition”.
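The difference between the two kinds of decomposition can be reproduced with plain Python sets. This is only an illustrative sketch: relations are modelled as sets of tuples, and the second decomposition (on employee_id) is an assumed alternative that does not appear in the notes.

```python
# Original relation (employee_id, name, address, phone_no)
original = {
    (101, "Raju", "CTA", 100012),
    (102, "Raju", "DVG", 200002),
}

# Decomposition from the notes: the only shared attribute is name.
employee = {(emp_id, name) for emp_id, name, _, _ in original}
emp_address = {(name, addr, phone) for _, name, addr, phone in original}

# Natural join on name
joined = {
    (emp_id, name, addr, phone)
    for emp_id, name in employee
    for name2, addr, phone in emp_address
    if name == name2
}

# Assumed alternative: share employee_id, which identifies each employee.
address_by_id = {(emp_id, addr, phone) for emp_id, _, addr, phone in original}
rejoined = {
    (emp_id, name, addr, phone)
    for emp_id, name in employee
    for emp_id2, addr, phone in address_by_id
    if emp_id == emp_id2
}

print(len(joined), joined == original)      # spurious tuples appear
print(len(rejoined), rejoined == original)  # original is recovered
```

The first join returns four tuples, a strict superset of the original two, which is exactly the loss of information described above. The second join returns the original relation.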

First Normal form:


We say a relation R is in first normal form (1NF) if the domains of
all the attributes of R are atomic. A domain is atomic if its elements are
indivisible units. If a domain contains a multivalued attribute or a
composite attribute, the domain is not atomic. To achieve atomicity we
create a new relation & create one tuple for each item in a multivalued set.
For composite attributes we let each component be an attribute in its own
right.

Second Normal form:


A relation R is in second normal form (2NF) if & only if it is in 1NF &
every non-key attribute is fully functionally dependent on the primary key
(that is, not partially dependent on it).
Decomposing using Functional Dependencies.
Keys and functional dependencies:
Legal relations are the ones that satisfy all the specified constraints,
such as keys & functional dependencies.
Super Key:
Let R be a relation schema. A subset K of R is a super key of R if, in any
legal relation r(R),
for all pairs t1 and t2 of tuples in r
such that t1≠t2,
t1[K] ≠ t2[K].
i.e., no two tuples in any legal relation r(R) may have the same value
on attribute set ‘K’.


Functional dependency:
Consider a relation schema R and let α⊆R and β⊆R. The functional
dependency α->β holds on schema R if, in any legal relation r(R), for all
pairs of tuples t1 and t2 in r such that t1[α]=t2[α],
it is also the case that t1[β]=t2[β].
Equivalently, given a relation R, attribute Y of R is functionally
dependent on attribute X of R if and only if each X value in R has
associated with it precisely one Y value in R at any instance.
A key is a set of attributes that uniquely identifies an entire tuple.
A functional dependency allows us to express constraints that uniquely
identify the values of certain attributes.
Functional dependencies allow us to express constraints that we cannot
express with super keys.
Ex: bor_loan(customer_id, loan_no, branch_name, amount)
The candidate key is the combination of customer_id and loan_no, but the
functional dependency loan_no -> amount, branch_name also holds, even
though loan_no is not a super key.
Uses of functional dependencies:
1) To test relations to see whether they are legal under a given set of
functional dependencies.
If a relation r is legal under a set F of functional dependencies we
say that r satisfies ‘F’.
2) To specify constraints on the set of legal relations. We shall thus
concern ourselves with only those relations that satisfy a given set
of functional dependencies.
If we wish to constrain ourselves to relations on schema R that satisfy
a set F of functional dependencies we say that F holds on R.
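The first use, testing whether a given relation satisfies a functional dependency, follows directly from the definition. The sketch below is one possible way to write that test in Python; the function name satisfies and the representation of tuples as dictionaries are choices made for this illustration. The data is the bor_loan table shown earlier.

```python
def satisfies(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}
    for t in rows:
        key = tuple(t[a] for a in lhs)
        value = tuple(t[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

attrs = ("customer_id", "loan_no", "branch_name", "amount")
bor_loan = [dict(zip(attrs, t)) for t in [
    (101, "L-100", "Chitradurga", 10000),
    (102, "L-100", "Chitradurga", 10000),
    (103, "L-100", "Chitradurga", 10000),
    (104, "L-101", "Davanagere", 25000),
    (105, "L-102", "Chitradurga", 30000),
]]

loan_fd = satisfies(bor_loan, ["loan_no"], ["amount", "branch_name"])
branch_fd = satisfies(bor_loan, ["branch_name"], ["amount"])
customer_fd = satisfies(bor_loan, ["customer_id"], ["loan_no"])
print(loan_fd, branch_fd, customer_fd)
```

Note that customer_id -> loan_no happens to be satisfied by this particular instance even though it does not hold on the schema (a customer may have several loans). A test on one instance can refute a dependency but cannot establish that it holds on R.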
Some functional dependencies are said to be ‘trivial’ because they are
satisfied by all relations.
Ex: A->A
A functional dependency α->β is ‘trivial’ if β⊆α.
Closure of F (F+):
The closure of F, written F+, is the set of all functional dependencies
that can be inferred from the set F. F+ is a superset of F.

Boyce–codd normal form: (BCNF)


BCNF eliminates all redundancy that can be discovered based on
functional dependencies.
A relation schema R is in BCNF with respect to a set F of functional
dependencies if, for all functional dependencies in F+ of the form α->β
where α⊆R and β⊆R, at least one of the following holds.
• α->β is a trivial functional dependency (β⊆α).
• α is a super key for schema R.
Decomposition using BCNF general rules:
Let R be a schema that is not in BCNF. Then there is at least one nontrivial
functional dependency α->β such that α is not a super key for R.
We replace R in our design with two schemas.
1. (α∪β) 2. (R-(β-α))
Ex: case of bor_loan relation
α=loan_no, β=amount, branch_name & then bor_loan is replaced by
1. (α∪β)=(loan_no, amount, branch_name)
2. (R-(β-α))=(customer_id, loan_no)


BCNF & dependency preservation
There are several ways of expressing database consistency constraints:
primary keys, functional dependencies, check constraints, assertions,
triggers etc. Testing these constraints each time the database is updated
can be costly, so it is useful to design the database in a way that allows
constraints to be tested efficiently. In particular, a functional dependency
is cheap to test if it can be checked by considering just one relation.
Decomposition into BCNF can prevent efficient testing of certain functional
dependencies.
Ex: consider an E-R diagram for personal bankers, in which the relationship
from customer to employee is many to many, but a customer may have only one
personal banker at a given branch.

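[Figure: E-R diagram with entity sets customer, employee and branch,
relationship set cust_bank (with attribute type) between customer and
employee, and relationship set works_in between employee and branch.
Attributes shown: id, name, city, address, emp_id.]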
To ensure our requirement we change it as follows.
[Figure: revised E-R diagram in which customer, employee and branch are
connected by a single ternary relationship set cust_banker_branch with
attribute type.]
This design allows a customer to have more than one personal banker, but
at most one at a given branch. The corresponding schema
(customer_id, employee_id, branch_name, type) is not in BCNF, because an
employee works in only one branch, so employee_id->branch_name holds but
employee_id is not a super key.
If instead we use two relations,
(customer_id, employee_id, type)
(employee_id, branch_name)
we achieve BCNF, but this is exactly the design of the first E-R diagram
using the works_in relationship. The constraint that a customer
may have at most one personal banker at a given branch is expressed by the
functional dependency: customer_id, branch_name->employee_id
In our BCNF design there is no schema that includes all the attributes
appearing in this functional dependency, so it cannot be tested on a single
relation. Our design is not dependency preserving.


Third normal form


A relation schema R is in third normal form (3NF) with respect to a set F
of functional dependencies if, for all functional dependencies in F+ of the
form α->β where α⊆R and β⊆R, at least one of the following holds.
• α->β is a trivial functional dependency.
• α is a super key for R.
• Each attribute A in β-α is contained in a candidate key for R.
The third condition does not say that a single candidate key should
contain all the attributes in β-α. Each attribute A in β-α may be contained
in a different candidate key.
The third condition is a minimal relaxation of the BCNF conditions that
helps to ensure that every schema has a dependency-preserving decomposition.
Any schema that satisfies BCNF also satisfies 3NF.
Functional dependency theory.
1. Closure of a set of functional dependencies:
Given a relational schema R, a functional dependency f on R is
logically implied by a set of functional dependencies F on R if every
relation instance r(R) that satisfies F also satisfies f.
Let F be a set of functional dependencies. The closure of F, denoted by
F+, is the set of all functional dependencies logically implied by F.
Axioms or rules of inference provide a simpler technique for reasoning
about functional dependencies. We use Greek letters (α,β,γ…) for sets of
attributes, and upper case roman letters from the beginning of the alphabet
for individual attributes. We use αβ to denote α∪β.
The following collection of rules is called Armstrong’s axioms, in honor of
the person who first proposed them.
• Reflexivity rule: If α is a set of attributes and β⊆α then α->β holds.
• Augmentation rule: If α->β holds and γ is a set of attributes then
γα->γβ holds.
• Transitivity rule: If α->β holds and β->γ holds then α->γ holds.
Armstrong’s axioms are sound (they do not generate incorrect
functional dependencies) and complete (they generate all of F+), but it is
tiresome to use them directly, so we list additional rules that can be
derived from Armstrong’s axioms.
Union rule: If α->β holds and α->γ holds then α->βγ holds.
Decomposition rule: If α->βγ holds, then α->β holds and α->γ holds.
Pseudotransitivity rule: If α->β holds and γβ->δ holds then αγ->δ holds.
Ex:
Consider R=(A,B,C,G,H,I) and the set F of functional dependencies {A->B,
A->C, CG->H, CG->I, B->H}
We can list several members of F+:
• A->H: since A->B and B->H hold, the transitivity rule applies.
• CG->HI: since CG->H and CG->I hold, the union rule implies that CG->HI.
• AG->I: since A->C and CG->I hold, the pseudotransitivity rule implies
that AG->I holds.
2. Closure of attribute sets
We say that an attribute B is functionally determined by α if α->B.
To test whether a set α is a super key, we need an algorithm for computing
the set of attributes functionally determined by α. An efficient algorithm
for computing the set of attributes functionally determined by ‘α’ is useful


not only for testing whether α is a super key but also for several other
tasks.
Let α be a set of attributes. We call the set of all attributes
functionally determined by α under a set F of functional dependencies the
closure of α under F; we denote it by α+.
Algorithm for computing closure of α under F
result:=α;
while (changes to result) do
    for each functional dependency β->γ in F do
    begin
        if β⊆result then result:=result ∪ γ;
    end

Algorithm working:
• Compute (AG)+ with the functional dependencies {A->B, A->C, CG->H, CG->I,
B->H}. We start with result=AG.
• A->B causes us to include B in result. To see this fact we observe
that A->B is in F and A⊆result (which is AG), so result:=result ∪ B.
• A->C causes result to become ABCG.
• CG->H causes result to become ABCGH.
• CG->I causes result to become ABCGHI.
Uses of attribute closure algorithm
• To test if α is a super key, we compute α+ and check if α+ contains
all attributes of R.
• We can check if a functional dependency α->β holds (i.e., is in F+) by
checking whether β⊆α+. That is, we compute α+ by using attribute closure
and then check if it contains β.
• It gives us an alternative way to compute F+: for each γ⊆R, we find
the closure γ+, and for each S⊆γ+, we output a functional dependency
γ->S.
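The closure algorithm and its first two uses can be sketched in a few lines of Python. The function names closure, is_superkey and implies are chosen for this illustration, and attribute sets are written as strings of single-letter attribute names, which only works for examples like the one above.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:                      # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    """attrs is a super key if its closure contains every attribute of R."""
    return closure(attrs, fds) >= set(schema)

def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is contained in lhs+."""
    return set(rhs) <= closure(lhs, fds)

R = set("ABCGHI")
F = [(set("A"), set("B")), (set("A"), set("C")),
     (set("CG"), set("H")), (set("CG"), set("I")), (set("B"), set("H"))]

ag_plus = "".join(sorted(closure("AG", F)))
print(ag_plus)                      # closure of AG
print(is_superkey("AG", R, F))      # AG+ contains all of R
print(is_superkey("A", R, F))       # A+ misses G and I
print(implies(F, "A", "H"))         # A->H follows by transitivity
print(implies(F, "CG", "A"))        # CG->A does not follow from F
```

Each pass over F can only add attributes, so the loop stops after at most |R| passes.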

Lossless decomposition
Let R be a relation schema and F be a set of functional dependencies on R.
Let R1 and R2 form a decomposition of R.
Let r(R) be a relation with schema R.
We say that the decomposition is a lossless decomposition if for all legal
database instances
ΠR1(r)⋈ΠR2(r)=r.
In other words, if we project r onto R1 and R2 and compute the natural
join of the projection results, we get back exactly r.
R1 and R2 form a lossless decomposition of R if at least one of the
following functional dependencies is in F+:
• R1∩R2->R1
• R1∩R2->R2
That is, if R1∩R2 forms a super key of either R1 or R2, the decomposition
of R is a lossless decomposition. We can use attribute closure to test
efficiently for super keys.
A decomposition that is not a lossless decomposition is called a lossy
decomposition (lossy-join decomposition). Lossless decomposition is also
called lossless-join decomposition.
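The super key test for a two-way decomposition can be sketched with attribute closure. The function names are chosen for this illustration; the first example is the bor_loan decomposition and the second is the employee decomposition on name, with employee_id -> name, address, phone_no assumed as the only dependency on that schema.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """True if R1∩R2 -> R1 or R1∩R2 -> R2 is in F+."""
    common_plus = closure(r1 & r2, fds)
    return r1 <= common_plus or r2 <= common_plus

# bor_loan split into loan and borrower
loan_fds = [({"loan_no"}, {"amount", "branch_name"})]
loan_split = is_lossless({"loan_no", "amount", "branch_name"},
                         {"customer_id", "loan_no"}, loan_fds)

# employee split on name (assumed dependency: employee_id determines the rest)
emp_fds = [({"employee_id"}, {"name", "address", "phone_no"})]
name_split = is_lossless({"employee_id", "name"},
                         {"name", "address", "phone_no"}, emp_fds)

print(loan_split, name_split)
```

The condition is stated above as a sufficient one, so a False result means only that this test could not show the decomposition to be lossless; for the name split, the joined table earlier in the chapter confirms that it really is lossy.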


Dependency Preservation
Let F be a set of functional dependencies on a schema R, and let
R1,R2….Rn be a decomposition of R. The restriction of F to Ri is the set
Fi of all functional dependencies in F+ that include only attributes of Ri.
The set of restrictions F1,F2,…..Fn is the set of dependencies that can
be checked efficiently, since each one involves only one relation schema.
Let F’=F1∪F2∪……∪Fn
F’ is a set of functional dependencies on schema R, but in general F’≠F.
However even if F’≠F it may be that F’+=F+.
We say that a decomposition having the property F’+=F+ is a dependency
preserving decomposition.

Algorithm for testing dependency preservation:


The input is a set D={R1,R2,…..Rn} of decomposed relation schemas & a set F
of functional dependencies.
compute F+;
for each schema Ri in D do
begin
    Fi:=the restriction of F+ to Ri;
end
F’:=Ø;
for each restriction Fi do
begin
    F’:=F’∪Fi;
end
compute F’+;
if (F’+=F+) then return(true) else return(false);

This algorithm is expensive since it requires computation of F+.


Consider an alternative algorithm that avoids computing F+. To test
whether a dependency α->β is preserved, run:
result = α
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
The test applies this procedure to each α->β in F.
The attribute closure here is taken under the set of functional
dependencies F. If result contains all attributes in β then the functional
dependency α->β is preserved.
The decomposition is dependency preserving if & only if the procedure
shows that all the dependencies in F are preserved.
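The alternative test can be sketched as follows. The function names are chosen for this illustration. The example is the personal banker design discussed under BCNF and dependency preservation, with employee_id -> branch_name assumed as the second dependency on that schema.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_preserved(lhs, rhs, decomposition, fds):
    """Test one dependency lhs -> rhs without computing F+."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result

def preserves_dependencies(decomposition, fds):
    return all(is_preserved(l, r, decomposition, fds) for l, r in fds)

F = [({"employee_id"}, {"branch_name"}),
     ({"customer_id", "branch_name"}, {"employee_id"})]
bcnf_design = [{"customer_id", "employee_id", "type"},
               {"employee_id", "branch_name"}]

first = is_preserved(*F[0], bcnf_design, F)
second = is_preserved(*F[1], bcnf_design, F)
print(first, second, preserves_dependencies(bcnf_design, F))
```

The dependency employee_id -> branch_name can be checked inside the second schema, but customer_id, branch_name -> employee_id cannot be checked in either schema, so the design is not dependency preserving.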

Decomposition using Functional dependencies:


1. BCNF decomposition:
Testing for BCNF
• To check if a nontrivial dependency α->β causes a violation of BCNF,
compute α+ (the attribute closure of α) and verify that it includes all
attributes of R, i.e., that α is a super key of R.
• To check if a relation schema R is in BCNF, it is sufficient to check
only the dependencies in the given set F for violation of BCNF, rather
than checking all dependencies in F+.


The second test works for the original schema R, but it does not work
for a relation Ri obtained by decomposing R: there, dependencies that are in
F+ but not in F may cause a violation. An alternative BCNF test for Ri is
sometimes easier than computing every dependency in F+.
o For every subset α of attributes in Ri, check that α+ (the attribute
closure of α under F) either includes no attribute of Ri-α, or
includes all attributes of Ri.
If the condition is violated by some set of attributes α in Ri,
consider the following functional dependency, which can be shown to be
present in F+: α->(α+-α)∩Ri
This dependency shows that Ri violates BCNF.

BCNF decomposition algorithm


Using the BCNF decomposition algorithm we can decompose R into a
collection of BCNF schemas R1,R2,… Rn. The decomposition that the algorithm
generates is not only in BCNF, but also a lossless decomposition, because
when a schema Ri is replaced with (Ri-β) and (α,β), the dependency α->β
holds and (Ri-β)∩(α,β)=α.
If we did not require α∩β=Ø, then the attributes in α∩β would not appear
in the schema (Ri-β) and the dependency α->β would no longer hold.
Algorithm
result:={R};
done:=false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α->β be a nontrivial functional dependency that holds on
        Ri such that α->Ri is not in F+ and α∩β=Ø;
        result:=(result–Ri)∪(Ri-β)∪(α,β);
    end
    else done:=true;
Example for BCNF decomposition:
Consider relation:
lending=(branch_name, branch_city, assets, customer_name, loan_no, amount).
And the set of functional dependencies:
1. branch_name->assets, branch_city.
2. loan_no->amount, branch_name.
The candidate key for this schema is {loan_no, customer_name}.
We apply the algorithm using each FD in turn.
1. branch_name is not a super key of the lending relation, so lending is
not in BCNF. We replace lending by R1, R2:
branch=(branch_name, branch_city, assets)
loan_info=(branch_name, customer_name, loan_no, amount)
The relation branch is in BCNF.
2. Consider the FD loan_no->amount, branch_name. loan_no is not a super key
of loan_info, so we replace loan_info by R3, R4:
loanb=(loan_no, branch_name, amount)
borrower=(customer_name, loan_no)
loanb & borrower are both in BCNF.
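The decomposition of lending can be reproduced with a short sketch. It is one possible reading of the algorithm, not a definitive implementation: instead of computing F+ it uses the subset-based BCNF test described above, trying every proper subset α of Ri, which is exponential and only reasonable for small schemas. The function names are chosen for this illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(ri, fds):
    """Return (alpha, beta) with alpha -> beta violating BCNF on ri, or None."""
    attrs = sorted(ri)
    for size in range(1, len(attrs)):
        for alpha in map(set, combinations(attrs, size)):
            alpha_plus = closure(alpha, fds)
            beta = (alpha_plus - alpha) & ri   # disjoint from alpha
            if beta and not ri <= alpha_plus:  # nontrivial, not a super key
                return alpha, beta
    return None

def bcnf_decompose(schema, fds):
    result = [set(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                result.remove(ri)
                result += [ri - beta, alpha | beta]
                done = False
                break  # restart the scan on the changed list
    return result

lending = {"branch_name", "branch_city", "assets",
           "customer_name", "loan_no", "amount"}
F = [({"branch_name"}, {"assets", "branch_city"}),
     ({"loan_no"}, {"amount", "branch_name"})]

schemas = bcnf_decompose(lending, F)
for s in schemas:
    print(sorted(s))
```

For this input the sketch arrives at the same three schemas as the worked example: branch, loanb and borrower. In general the result of BCNF decomposition depends on the order in which violations are picked.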


Decomposition to 3NF:
The algorithm for finding a dependency-preserving lossless decomposition
into 3NF uses a set of dependencies Fc, a canonical cover of F.
Algorithm
let Fc be a canonical cover for F;
i:=0;
for each functional dependency α->β in Fc do
    if none of the schemas Rj, j=1,2,…,i contains αβ
    then begin
        i:=i+1;
        Ri:=αβ;
    end
if none of the schemas Rj, j=1,2,…,i contains a candidate key for R
then begin
    i:=i+1;
    Ri:=any candidate key for R;
end
return (R1,R2,…,Ri)
This algorithm is also called the 3NF synthesis algorithm since it
takes a set of dependencies and adds one schema at a time, instead of
decomposing the initial schema repeatedly.

Canonical cover: Fc is a set of dependencies such that F logically implies
all dependencies in Fc and Fc logically implies all dependencies in F.
Further, Fc must have the following properties:
1) No functional dependency in Fc contains an extraneous attribute.
(If we can remove an attribute of a functional dependency without
changing the closure of the set of functional dependencies, we call
that attribute extraneous.)
2) Each left side of a functional dependency in Fc is unique.
That is, there are no two dependencies α1->β1 & α2->β2 in Fc such that
α1=α2.
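The 3NF synthesis algorithm can be sketched as follows, under the assumption that the input is already a canonical cover (computing Fc is not shown). The helper candidate_key, which shrinks the full schema until no attribute can be dropped, is an addition made for this illustration; the key it returns depends on the order in which attributes are tried.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_key(schema, fds):
    """Drop attributes from R one at a time while it remains a super key."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= set(schema):
            key.discard(a)
    return key

def synthesize_3nf(schema, fc):
    """fc is assumed to be a canonical cover of the dependencies on schema."""
    result = []
    for lhs, rhs in fc:
        if not any(lhs | rhs <= rj for rj in result):
            result.append(lhs | rhs)
    # A schema contains a candidate key exactly when it is a super key of R.
    if not any(closure(rj, fc) >= set(schema) for rj in result):
        result.append(candidate_key(schema, fc))
    return result

bor_loan = {"customer_id", "loan_no", "branch_name", "amount"}
Fc = [({"loan_no"}, {"amount", "branch_name"})]

schemas = synthesize_3nf(bor_loan, Fc)
for s in schemas:
    print(sorted(s))
```

For bor_loan the dependency contributes the schema (loan_no, amount, branch_name), and because that schema contains no candidate key, the key (customer_id, loan_no) is added as a second schema.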

Comparison of BCNF & 3NF


It is always possible to obtain a 3NF design that is lossless and
dependency preserving, whereas a BCNF design may not be dependency
preserving.
The drawback of 3NF is that we may have to use null values to represent
some of the possible meaningful relationships among data items, and that
some repetition of information may remain.
The goals of database design with functional dependencies are:
1) BCNF.
2) Losslessness.
3) Dependency preservation.
It is not always possible to satisfy all three, so we may have to choose
between BCNF & 3NF. Also, SQL does not provide a way of specifying functional
dependencies except for the special case of declaring super keys using
primary key / unique constraints.

Decomposition using multivalued dependencies:


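Consider schema:
Cust_loan=(loan_no, cust_id, cust_name, cust_street, cust_city)
having the functional dependency
cust_id->cust_name.
We do not include cust_id->cust_street, cust_city because we want to allow
a customer to have more than one address.
Using BCNF decomposition we obtain two schemas.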


R1=(cust_id, cust_name)
R2=(loan_no, cust_id, cust_street, cust_city)
Both relations are in BCNF, but R2 still contains repeated
information: if a customer has more than one address, every address must be
repeated for each of the customer's loans. Functional dependencies cannot
express or remove this kind of redundancy. To deal with it we
must define a new form of constraint called a “multivalued dependency”.
Multivalued dependency:
Consider a functional dependency A->B: we cannot have two tuples
with the same A value but different B values. A multivalued dependency
A->>B, in contrast, allows two tuples with the same A value and different
B values, but requires certain other tuples to be present as well.
Functional dependencies are referred to as equality generating
dependencies, and multivalued dependencies are referred to as tuple
generating dependencies.
Let R be a relation schema & α⊆R and β⊆R. The multivalued dependency
α->>β holds on R if, in any legal relation r(R), for all pairs of tuples t1
& t2 in r such that t1[α]=t2[α], there exist tuples t3 & t4 in r such that
t1[α]=t2[α]=t3[α]=t4[α]
t3[β]=t1[β]
t3[R-β]=t2[R-β]
t4[β]=t2[β]
t4[R-β]=t1[R-β]

      α         β           R-α-β
t1    a1…ai     ai+1…aj     aj+1……an
t2    a1…ai     bi+1…bj     bj+1……bn
t3    a1…ai     ai+1…aj     bj+1……bn
t4    a1…ai     bi+1…bj     aj+1……an

The above table gives a tabular picture of t1, t2, t3 & t4. The
multivalued dependency α->>β says that the relationship between α & β is
independent of the relationship between α & R-β.
If the multivalued dependency α->>β is satisfied by all relations on
schema R then α->>β is a trivial multivalued dependency on schema R. Thus
α->>β is trivial if β⊆α or β∪α=R.

Example with multivalued dependency


Loan_no   Cust_id   Cust_street   Cust_city
L-23      101       Kelagote      Cta
L-23      101       Rajajinagar   B’LORE
L-93      102       P.B.Road      Dvg
The multivalued dependency in the table is written as
cust_id->>cust_street, cust_city.
Multivalued dependencies are used in two ways
1. To test relations to determine whether they are legal under a given
set of functional & multivalued dependencies.
2. To specify constraints on the set of legal relations. We shall thus
concern ourselves with only those relations that satisfy a given set
of functional & multivalued dependencies.
From the definition of multivalued dependency we can write
• If α->β then α->>β
In other words, every functional dependency is also a multivalued
dependency.
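The definition of a multivalued dependency can be turned into a direct test on a relation instance. The sketch below is an illustration only: the function name satisfies_mvd is made up, α and β are assumed to be disjoint, and loan L-27 is a hypothetical extra loan added to show a violation.

```python
def satisfies_mvd(rows, schema, alpha, beta):
    """Check alpha ->> beta on rows (tuples ordered like schema)."""
    idx = {a: i for i, a in enumerate(schema)}
    rest = [a for a in schema if a not in alpha and a not in beta]

    def proj(t, attrs):
        return tuple(t[idx[a]] for a in attrs)

    for t1 in rows:
        for t2 in rows:
            if proj(t1, alpha) != proj(t2, alpha):
                continue
            # t3 must take its beta values from t1 and the rest from t2.
            # t4 is covered when the loop visits the pair (t2, t1).
            if not any(proj(t3, alpha) == proj(t1, alpha)
                       and proj(t3, beta) == proj(t1, beta)
                       and proj(t3, rest) == proj(t2, rest)
                       for t3 in rows):
                return False
    return True

schema = ("loan_no", "cust_id", "cust_street", "cust_city")
rows = [
    ("L-23", 101, "Kelagote", "Cta"),
    ("L-23", 101, "Rajajinagar", "B'LORE"),
    ("L-93", 102, "P.B.Road", "Dvg"),
]
alpha, beta = ["cust_id"], ["cust_street", "cust_city"]

holds = satisfies_mvd(rows, schema, alpha, beta)
# Hypothetical new loan L-27 for customer 101, stored with one address only
one_address = rows + [("L-27", 101, "Kelagote", "Cta")]
both_addresses = one_address + [("L-27", 101, "Rajajinagar", "B'LORE")]
violated = satisfies_mvd(one_address, schema, alpha, beta)
restored = satisfies_mvd(both_addresses, schema, alpha, beta)
print(holds, violated, restored)
```

Recording the new loan with only one of the customer's two addresses violates the dependency; adding the tuple for the other address restores it. This is why multivalued dependencies are called tuple generating.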


Fourth normal form


A relation schema R is in fourth normal form (4NF) with respect to a
set D of functional & multivalued dependencies if, for all multivalued
dependencies in D+ of the form α->>β where α⊆R & β⊆R, at least one of the
following holds.
• α->>β is a trivial multivalued dependency.
• α is a super key for schema R.
This definition is the same as the BCNF definition except that it uses
multivalued dependencies instead of functional dependencies. Every 4NF
schema is in BCNF.
Let R be a relation schema & R1,R2…Rn be a decomposition of R. To check
that every relation schema Ri in the decomposition is in 4NF, we need to
find what multivalued dependencies hold on each Ri.
Consider a set D of both functional & multivalued dependencies. The
restriction of D to Ri is the set Di consisting of
1. All functional dependencies in D+ that include only attributes of Ri.
2. All multivalued dependencies of the form α->>β∩Ri, where α⊆Ri and
α->>β is in D+.
4NF Decomposition Algorithm
result:={R};
done:=false;
compute D+; given schema Ri, let Di denote the restriction of D+ to Ri
while (not done) do
    if (there is a schema Ri in result that is not in 4NF w.r.t. Di)
    then begin
        let α->>β be a nontrivial multivalued dependency
        that holds on Ri such that α->Ri is not in Di & α∩β=Ø;
        result:=(result-Ri)∪(Ri-β)∪(α,β);
    end
    else done:=true;

We can apply this algorithm to
loan_cust(loan_no, cust_id, cust_street, cust_city)
cust_id->>loan_no is a nontrivial multivalued dependency & cust_id
is not a super key for the schema. We can decompose it into two schemas
Loan_cust_id=(loan_no, cust_id)
Cust_residency=(cust_id, cust_street, cust_city)
This pair of schemas is in 4NF & also eliminates the redundancy. The
algorithm for decomposing to 4NF generates lossless decompositions.
Let R be a relation schema & let D be a set of functional and
multivalued dependencies on R. Let R1 & R2 form a decomposition of R. This
decomposition of R is lossless if & only if at least one of the following
multivalued dependencies is in D+:
R1∩R2->>R1
R1∩R2->>R2


More normal forms


4NF is by no means the ‘ultimate’ normal form.
There are types of constraints called join dependencies that
generalize multivalued dependencies. They lead to a normal form called
project-join normal form (PJNF), also called fifth normal form.
Domain-key normal form (DKNF) is another normal form, based on even more
general constraints.
DKNF & PJNF use generalized constraints that are hard to reason with, and
there is no sound and complete set of inference rules for these constraints;
hence DKNF & PJNF are used quite rarely.
2NF is an older normal form that has been superseded by the later ones. A
relation R is in 2NF if & only if it is in 1NF & every non-key attribute is
fully functionally dependent on the primary key.

5. Transaction Management.
Concept:
A collection of operations that form a single logical unit of work is
called a transaction.
A transaction is a unit of program execution that accesses and
possibly updates various data items.
A transaction is usually initiated by a user program written in a
high-level data manipulation language or programming language (such as SQL,
C++ or Java), in which the transaction is delimited by begin transaction
and end transaction statements.
Transactions access data using two operations:
Read(x): transfers data item x from the database to a local buffer belonging
to the transaction.
Write(x): transfers the data item x from the local buffer of the transaction
to the database.
ACID properties of transactions, which ensure integrity of data:
1. Atomicity: suppose a transaction Ti has to transfer an amount of $50
from account A, which has $1000, to account B, which has $2000. This is
a single operation from the customer's point of view, but from the
database's point of view it includes two operations: 1. Write(A) to debit
$50 from account A, and 2. Write(B) to credit $50 to account B. Suppose the
system fails soon after write(A) is executed but before write(B); then
account A will have $950 and B will have $2000. $50 has been lost, and the
database state is inconsistent.
Atomicity ensures that either all operations of the transaction are
executed or none of them is.
The transaction management component ensures atomicity: it keeps track
of the old values of any data on which a transaction performs a write, and
if the transaction does not complete its execution the database system
restores the old values.
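The transfer example can be tried out with Python's sqlite3 module. This is a sketch only: the failure is simulated by raising an exception between the two writes, which is not the same as a real system crash, and the table layout and the function name transfer are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (name text primary key, balance integer)")
conn.executemany("insert into account values (?, ?)",
                 [("A", 1000), ("B", 2000)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction."""
    try:
        conn.execute("update account set balance = balance - ? "
                     "where name = ?", (amount, src))
        if fail_midway:
            raise RuntimeError("simulated failure after write(A)")
        conn.execute("update account set balance = balance + ? "
                     "where name = ?", (amount, dst))
        conn.commit()        # both writes become permanent together
        return True
    except RuntimeError:
        conn.rollback()      # the old value of A is restored
        return False

def balances(conn):
    return dict(conn.execute("select name, balance from account"))

failed = transfer(conn, "A", "B", 50, fail_midway=True)
after_failure = balances(conn)
succeeded = transfer(conn, "A", "B", 50)
after_success = balances(conn)
print(failed, after_failure)
print(succeeded, after_success)
```

After the simulated failure neither account has changed; after the successful run A holds $950 and B holds $2050, so the total of $3000 is the same in both cases.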
2. Consistency: execution of a transaction in isolation (that is, with no
other transaction executing concurrently) preserves the consistency of
the database.
Ensuring consistency for an individual transaction is the
responsibility of the application programmer who codes the transaction.
3. Isolation: even though multiple transactions may execute
concurrently, the system guarantees that for every pair of transactions Ti &
Tj, it appears to Ti that either Tj finished execution before Ti started or
Tj started execution after Ti finished. Thus each transaction is unaware of
other transactions executing concurrently in the system.
The concurrency control component ensures isolation.


4. Durability: after a transaction completes successfully, the changes
it has made to the database persist even if there is a system failure.
The recovery management component ensures durability by either of the
following:
1. The updates carried out by the transaction are written to disk
before the transaction completes.
2. Enough information about the updates carried out by the transaction is
written to disk to allow the database system to reconstruct the updates
when it is restarted after a failure.
Transaction states:
A transaction starts in the active state. When it finishes its final
statement, it enters the partially committed state. Since the actual output
may still be in main memory, the transaction may yet fail; it becomes
committed only once enough information has been written to disk that its
updates can be re-created when the system restarts after a failure. If the
transaction enters the failed state it is rolled back and enters the
aborted state, after which the system has two options:
1. Restart the transaction as a new transaction, if it was aborted because
of a hardware or software error that was not caused by the internal logic
of the transaction.
2. Kill the transaction, usually because of an internal logical error that
can be corrected only by rewriting the application program.
Once a transaction has committed we cannot undo its effects by aborting it.
The only way to undo them is to execute a compensating transaction: if an
amount of $50 was credited by the transaction, it can be undone by debiting
$50 from the same account. The transaction is said to have terminated if it
has either committed or aborted.
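State diagram of a transaction:
[Figure: states Active, Partially committed, Committed, Failed and Aborted.
Transitions: Active -> Partially committed -> Committed; Active -> Failed;
Partially committed -> Failed; Failed -> Aborted.]
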
Active: the initial state; the transaction stays in this state while it is
executing.
Partially committed: after the final statement has been executed, while the
output may still be in main memory.
Failed: after it is discovered that normal execution can no longer proceed.
Aborted: after the transaction has been rolled back and the database
restored to its state prior to the start of the transaction.
Committed: after successful completion of the transaction.
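
The state descriptions above can be read as a small state machine. The
following Python sketch is only an illustration of those states and the
moves between them; the names State, TRANSITIONS and Transaction are
assumptions made for this example and are not part of any DBMS.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed moves between states, following the descriptions in the notes.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # terminal state
    State.ABORTED: set(),     # terminal state
}


class Transaction:
    """Hypothetical transaction object that only tracks its state."""

    def __init__(self):
        self.state = State.ACTIVE

    def move_to(self, new_state):
        # Reject any move that the state diagram does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}")
        self.state = new_state

    def terminated(self):
        # A transaction has terminated once it is committed or aborted.
        return self.state in (State.COMMITTED, State.ABORTED)
```

For example, a transaction that reaches Committed cannot afterwards be moved
to Aborted; in this sketch such an attempt raises ValueError, which mirrors
the rule that a committed transaction can only be undone by a compensating
transaction.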

Concurrent Executions
Allowing multiple transactions to update data concurrently causes several
complications with the consistency of the data. It is far easier to insist
that transactions run serially than to ensure consistency among concurrently
executing transactions. However, there are two good reasons for allowing
concurrency.
1. Improved throughput and resource utilization: A transaction consists
of many steps; some involve I/O (disk) activity and others CPU
activity. The CPU would sit idle while a transaction waits for I/O;
this time can be exploited by letting another transaction use the
CPU. While one transaction is reading or writing one disk, a second
can be using the CPU and a third can be using another disk. As more
transactions are executed in a given amount of time, the throughput
of the system increases. Correspondingly, processor and disk
utilization also increase.

2. Reduced waiting time: If transactions run serially, a short
transaction may have to wait for a preceding long transaction to
complete, which can lead to unpredictable delays. If the transactions
are operating on different parts of the database, it is better to let
them run concurrently, sharing the CPU cycles and disk accesses among
them. Concurrent execution thus reduces the average waiting time and
the average response time.
Concurrency-control schemes in the database system must control the
interaction among concurrent transactions to maintain the consistency of
the database.
Consider an example: let transaction T1 transfer $50 from account A
to account B, and let transaction T2 transfer 10% of the balance of account
A to account B. The transactions are written as:
T1: read(A);
    A := A - 50;
    write(A);
    read(B);
    B := B + 50;
    write(B);

T2: read(A);
    temp := A * 0.1;
    A := A - temp;
    write(A);
    read(B);
    B := B + temp;
    write(B);
Suppose A and B initially contain $1000 and $2000 respectively. The two
transactions can be executed in several different orders.
These execution sequences are called schedules; they represent the
chronological order in which instructions are executed in the system.

Schedule 1: serial schedule (T1 followed by T2)
T1                  T2
read(A);
A := A - 50;
write(A);
read(B);
B := B + 50;
write(B);
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
                    B := B + temp;
                    write(B);

Schedule 2: serial schedule (T2 followed by T1)
T1                  T2
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
                    B := B + temp;
                    write(B);
read(A);
A := A - 50;
write(A);
read(B);
B := B + 50;
write(B);

These schedules are serial: the instructions belonging to each
transaction appear together in the schedule. Thus, for a set of n
transactions there exist n! different valid serial schedules.
When transactions are executed concurrently, the operating system may
execute some instructions from one transaction, then perform a context
switch and execute the second transaction for some time, and so on, so
that CPU time is shared among all the transactions. Instructions from
both transactions may then be interleaved; one possible schedule is
shown below.
Schedule 3: concurrent schedule
T1                  T2
read(A);
A := A - 50;
write(A);
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
read(B);
B := B + 50;
write(B);
                    read(B);
                    B := B + temp;
                    write(B);
The above schedule preserves database consistency. However, not all
concurrent executions result in a correct state of the database. Consider
the following schedule:

Schedule 4: concurrent schedule resulting in database inconsistency
T1                  T2
read(A);
A := A - 50;
                    read(A);
                    temp := A * 0.1;
                    A := A - temp;
                    write(A);
                    read(B);
write(A);
read(B);
B := B + 50;
write(B);
                    B := B + temp;
                    write(B);
Before the above schedule is executed, account A has $1000 and account B
has $2000. After the transactions have executed, account A has $950 and
account B has $2100, so the sum A+B has changed from $3000 to $3050 and is
not preserved.
If control of concurrent execution is left entirely to the operating
system, many schedules are possible, including ones like the above that
leave the database in an inconsistent state. Hence the concurrency-control
component of the database system must ensure that concurrent transactions
do not leave the database in an inconsistent state.
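
The four schedules can be checked by simulating them. The Python sketch
below is an illustration only: the function run, the step format and the
helper names are assumptions made for this example, and the 10% is computed
with integer division so that the balances stay whole numbers.

```python
def run(schedule, db):
    """Execute a list of (transaction, action, argument) steps on a copy of db."""
    db = dict(db)          # leave the caller's database untouched
    local = {}             # private workspace (local variables) per transaction
    for txn, action, arg in schedule:
        ws = local.setdefault(txn, {})
        if action == "read":
            ws[arg] = db[arg]          # copy the item into the workspace
        elif action == "write":
            db[arg] = ws[arg]          # copy the local value back
        else:                          # "compute": arg is a function on ws
            arg(ws)
    return db


def sub50(ws): ws["A"] -= 50
def add50(ws): ws["B"] += 50
def take10(ws): ws["temp"] = ws["A"] // 10   # 10% of the balance read
def subtemp(ws): ws["A"] -= ws["temp"]
def addtemp(ws): ws["B"] += ws["temp"]


T1 = [("T1", "read", "A"), ("T1", "compute", sub50), ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "compute", add50), ("T1", "write", "B")]
T2 = [("T2", "read", "A"), ("T2", "compute", take10),
      ("T2", "compute", subtemp), ("T2", "write", "A"),
      ("T2", "read", "B"), ("T2", "compute", addtemp), ("T2", "write", "B")]

initial = {"A": 1000, "B": 2000}
schedule1 = T1 + T2                              # serial: T1 then T2
schedule2 = T2 + T1                              # serial: T2 then T1
schedule3 = T1[:3] + T2[:4] + T1[3:] + T2[4:]    # interleaved, consistent
schedule4 = T1[:2] + T2[:5] + T1[2:] + T2[5:]    # interleaved, inconsistent

for name, s in [("1", schedule1), ("2", schedule2),
                ("3", schedule3), ("4", schedule4)]:
    result = run(s, initial)
    print("Schedule", name, result, "A+B =", result["A"] + result["B"])
```

Under these assumptions Schedules 1 and 3 both end with A = 855 and
B = 2145, Schedule 2 ends with A = 850 and B = 2150, and Schedule 4 ends
with A = 950 and B = 2100, so only Schedule 4 fails to preserve A+B = 3000.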
Serializability:
Since a transaction is a program, it is computationally difficult to
determine exactly what operations a transaction performs and how the
operations of various transactions interact. So we consider only two
operations, read and write: read(Q) and write(Q) are instructions on data
item Q.

A schedule is serializable if its outcome is equal to the outcome of
some serial execution of the same transactions.
Conflict Serializability:
A schedule is conflict serializable if it has the same chronologically
ordered pairs of conflicting operations as some serial schedule of the same
transactions.
Let us consider a schedule S in which Ii and Ij are consecutive
instructions of Ti and Tj respectively (i≠j). If Ii and Ij refer to
different data items, then we can swap Ii and Ij without affecting the
results of any instruction in the schedule. However, if Ii and Ij refer to
the same data item Q, then we must examine the operations they perform.
1. Ii=read(Q), Ij=read(Q): the order of Ii and Ij does not matter, as
they only read the data.
2. Ii=read(Q), Ij=write(Q): if Ii comes before Ij, then Ti does not read
the value of Q that is written by Tj in instruction Ij. If Ij comes
before Ii, then Ti reads the value of Q that is written by Tj. Thus
the order of Ii and Ij matters.
3. Ii=write(Q), Ij=read(Q): the order of Ii and Ij matters for reasons
similar to those of the previous case.
4. Ii=write(Q), Ij=write(Q): since both instructions are write
operations, their order does not affect either Ti or Tj. However, the
value obtained by the next read(Q) instruction of S is affected,
since only the result of the latter of the two write instructions is
preserved in the database. If there is no other write(Q) instruction
after Ii and Ij in S, then the order of Ii and Ij directly affects
the final value of Q in the database state that results from
schedule S.
Instructions Ii and Ij conflict if they are operations by different
transactions on the same data item, and at least one of them is a write
operation.
Let Ii and Ij be consecutive instructions of a schedule S. If Ii
and Ij are instructions of different transactions and do not
conflict, then we can swap the order of Ii and Ij to produce a new
schedule S’. We expect S to be equivalent to S’, since all instructions
appear in the same order in both schedules except for Ii and Ij, whose
order does not matter.
If a schedule S can be transformed into a schedule S’ by a
series of swaps of non-conflicting instructions, we say that S and S’
are conflict equivalent. A schedule is conflict serializable if it is
conflict equivalent to a serial schedule.
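
The conflict rule and the swap of adjacent non-conflicting instructions can
be sketched in Python as follows. The record type Op and the functions
conflicts and swap_if_allowed are names assumed for this illustration.

```python
from collections import namedtuple

# One step of a schedule: the transaction, "read" or "write", and the data item.
Op = namedtuple("Op", "txn action item")


def conflicts(a, b):
    """True if a and b are by different transactions, touch the same
    data item, and at least one of them is a write."""
    return (a.txn != b.txn
            and a.item == b.item
            and "write" in (a.action, b.action))


def swap_if_allowed(schedule, i):
    """Return a copy of schedule with steps i and i+1 swapped, or None if
    the two steps belong to the same transaction or conflict."""
    a, b = schedule[i], schedule[i + 1]
    if a.txn == b.txn or conflicts(a, b):
        return None
    swapped = list(schedule)
    swapped[i], swapped[i + 1] = b, a
    return swapped
```

Two reads of the same item, or operations on different items, can be
swapped; a read and a write (or two writes) of the same item by different
transactions cannot.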
View serializability: a concurrent schedule is view serializable if it is
view equivalent to a serial schedule of the same transactions, that is,
corresponding transactions in the two schedules read and write the same
data values.
Testing for Serializability:
To test a schedule S for conflict serializability, we construct its
precedence graph. This graph is a pair G=(V,E), where V is a set of
vertices and E is a set of edges. The set of vertices consists of all the
transactions participating in the schedule. The set of edges consists of
all edges Ti->Tj for which one of three conditions holds:
1. Ti executes write(Q) before Tj executes read(Q).
2. Ti executes read(Q) before Tj executes write(Q).
3. Ti executes write(Q) before Tj executes write(Q).
If an edge Ti->Tj exists in the precedence graph, then in any serial
schedule S’ equivalent to S, Ti must appear before Tj.

[Figure: precedence graphs for Schedule 1 (single edge T1->T2) and
Schedule 2 (single edge T2->T1)]

In the precedence graph of Schedule 1, the single edge T1->T2 reflects that
all the instructions of T1 are executed before the first instruction of T2.
Similarly, in the precedence graph of Schedule 2, the edge T2->T1 reflects
that all the instructions of T2 are executed before the first instruction
of T1.

[Figure: precedence graph for Schedule 4, with edges T1->T2 and T2->T1]

The precedence graph for Schedule 4 contains the edge T1->T2 because T1
executes read(A) before T2 executes write(A). It also contains the edge
T2->T1 because T2 executes read(B) before T1 executes write(B).
If the precedence graph for S has a cycle, S is not conflict
serializable. If the graph contains no cycles, then the schedule S is
conflict serializable.
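
A minimal Python sketch of this test is given below. It builds the
precedence graph from the read and write steps of a schedule and then looks
for a cycle with a depth-first search. The function names and the tuple
format (transaction, action, item) are assumptions made for this example;
the two lists are the read/write steps of Schedule 3 and Schedule 4.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph.

    An edge is added whenever a step of Ti precedes a step of Tj on the
    same data item and at least one of the two steps is a write.
    """
    edges = set()
    for i, (ti, act_i, item_i) in enumerate(schedule):
        for tj, act_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "write" in (act_i, act_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(node):
        visiting.add(node)
        for nxt in graph[node]:
            # A neighbour that is still being visited closes a cycle.
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(graph) if n not in done)


schedule3 = [("T1", "read", "A"), ("T1", "write", "A"),
             ("T2", "read", "A"), ("T2", "write", "A"),
             ("T1", "read", "B"), ("T1", "write", "B"),
             ("T2", "read", "B"), ("T2", "write", "B")]
schedule4 = [("T1", "read", "A"), ("T2", "read", "A"),
             ("T2", "write", "A"), ("T2", "read", "B"),
             ("T1", "write", "A"), ("T1", "read", "B"),
             ("T1", "write", "B"), ("T2", "write", "B")]

print(precedence_graph(schedule3), has_cycle(precedence_graph(schedule3)))
print(precedence_graph(schedule4), has_cycle(precedence_graph(schedule4)))
```

For Schedule 3 the graph has the single edge T1->T2 and no cycle, so the
schedule is conflict serializable. For Schedule 4 the graph has both T1->T2
and T2->T1, which form a cycle.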

A serializability order of the transactions can be obtained through
topological sorting, which determines a linear order consistent with the
partial order of the precedence graph.

[Figure: a precedence graph on transactions Ti, Tj, Tk and Tm, and a
linear order Ti, Tk, Tj, Tm obtained from it by topological sorting]

To test for conflict serializability, we therefore construct the
precedence graph and invoke a cycle-detection algorithm.
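
Topological sorting is available in the Python standard library (module
graphlib, Python 3.9 or later). The sketch below uses it to obtain one
serializability order from a set of precedence edges. The function name
serial_order and the example edges are assumptions made for illustration;
the edges are not taken from the figure.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(edges, transactions):
    """Return one linear order consistent with the precedence edges,
    or None if the graph has a cycle (no equivalent serial schedule)."""
    # TopologicalSorter expects, for every node, the nodes that must
    # come before it.
    predecessors = {t: set() for t in transactions}
    for before, after in edges:
        predecessors[after].add(before)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# Hypothetical precedence edges on four transactions.
example_edges = {("Ti", "Tj"), ("Ti", "Tk"), ("Tj", "Tm"), ("Tk", "Tm")}
order = serial_order(example_edges, ["Ti", "Tj", "Tk", "Tm"])
print(order)
```

With these example edges Ti always comes first and Tm last, while Tj and Tk
may appear in either order; both results are valid serializability orders.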

5. Security and integrity


Security & integrity threats:
Security & integrity violations may be caused by giving access to
unauthorized users, by giving users more access than they require for their
normal operation, or through disclosed passwords, threats, bribery or
blackmail. The intention of the DBMS is to ensure the security of data & to
prevent all the aforementioned problems.
To enforce security & integrity, the DBMS needs the operating system and
its environment to support these additional requirements:
• The operating system must ensure that files belonging to the DBMS are not
used directly without proper authorization (for example, by protecting
the files with a password).
• The operating system must also ensure that illegal users using public
communication facilities are not allowed access to the system.

• Users must be required to provide adequate identification and a
password. Access to the computing facility & the storage medium must be
restricted to authorized persons only.
• There must be adequate physical protection, as for any valuable
asset.
• Old storage devices & the data on them must be completely destroyed
before disposal.
• In a telecommunication environment, data may be accessed by
eavesdroppers, wire tappers, & other illegal users. To prevent this
type of threat, data transmitted over public communication channels
should be in ciphered form.

Security & integrity violations may be classified as accidental &
intentional violations.
Accidental security & integrity violations:
• A user can get access to a portion of the database not normally
accessible to that user, due to a system error or an error on the part of
another user. Ex: an application programmer accidentally omits
appropriate verification routines.
• Failures of various forms during normal operation.
Ex: transaction processing failures or storage media loss.
• Concurrent usage anomalies: proper synchronization mechanisms must be
used to avoid data inconsistencies due to concurrent usage.
• System error: a dial-in user may be assigned the identity of another
dial-in user who was disconnected accidentally or who hung up without
going through a log-off procedure.
• Improper authorization: the authorizer can accidentally give improper
authorization to a user, which could lead to database security and/or
integrity violations.
• Hardware failures: memory-protection hardware that fails could lead to
software errors and culminate in database security and/or integrity
violations.

Malicious or intentional security & integrity threats:

• A computer system operator or system programmer can intentionally
bypass the normal security and integrity mechanisms, alter or destroy the
data in the database, or make unauthorized copies of sensitive data.
• An unauthorized user can get access to a secure terminal or to the
password of an authorized user and compromise the database. Such users
could also destroy the data files.
• Authorized users could pass on sensitive information under duress or for
personal gain.
• System & application programmers could bypass normal security in their
programs by directly accessing database files and making changes &
copies for illegal use.
• An unauthorized person could get access to the computer system physically
or by using a communications channel, & compromise the database.

Defence Mechanisms
Human factors: inadvertent assignment of authorization to the wrong class
of users can result in security violations. The authorizer is
responsible for granting proper database access authorization to the user
community.
Physical security: physical security mechanisms include appropriate
locks and keys, and entry logs for the computing facility and terminals.
Administrative controls: administrative controls are the security
and access control policies that determine what information will be
accessible to what class of users and the type of access that will be
allowed to that class.
DBMS & OS security mechanisms:
• Proper mechanisms for the identification & verification of users: each
user is assigned an account number & a password. The OS & DBMS
allow access only if the correct account number & password are provided.
• Protection of data and programs, in both primary and secondary memory,
by the OS.

Authorization:
Authorization is the culmination of the administrative policies of the
organization, expressed as a set of rules that can be used to determine
which user has what type of access to which portion of the database.
The person who is in charge of specifying the authorization is usually
called the authorizer. The authorizer can be distinct from the DBA.
Authorization is usually maintained in the form of a table called an
access matrix, whose rows correspond to subjects and whose columns
correspond to objects.
Object: an object is something that needs protection. A data field, a
record, or a file can be considered an object; views are also considered
objects.
Granularity: the granularity of protection can be chosen to be a file, a
record or a data item. The smaller the protected object, the finer the
degree of protection that can be specified; but the finer the granularity,
the larger the authorization matrix & the overhead of enforcing database
security.
Subject: a subject is a user who is given some rights to access a
data object.
Privileges or access types:
1) read: allows only reading the object.
2) insert: allows inserting new occurrences of the object type.
3) delete: allows deleting existing occurrences of the object type.
4) update: allows the subject to change the value of an occurrence of the
object.
5) add: allows the subject to add new object types (ex: new relations).
6) drop: allows the subject to drop/delete existing object types from the
database.
7) alter: allows the subject to add/delete attributes to/from an existing
record type or relation.
8) propagate access control: this right determines whether the subject is
allowed to propagate its rights over the object to other subjects.
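
An access matrix can be pictured as a nested table: one row per subject,
one column per object, and a set of access types in each cell. The Python
sketch below is an illustration only; the subjects, objects and rights
shown are made-up examples, and is_allowed is a name assumed here.

```python
# Rows are subjects, columns are objects, cells hold the access types
# granted. All entries below are hypothetical examples.
access_matrix = {
    "clerk": {
        "account": {"read"},
        "loan": {"read", "insert"},
    },
    "manager": {
        "account": {"read", "update"},
        "loan": {"read", "insert", "update", "delete"},
    },
}


def is_allowed(subject, obj, access_type):
    """Look up the cell (subject, obj); a missing row or cell means no access."""
    return access_type in access_matrix.get(subject, {}).get(obj, set())


print(is_allowed("clerk", "loan", "insert"))     # True
print(is_allowed("clerk", "account", "update"))  # False
```

Choosing a finer granularity (for example, one column per attribute instead
of one per relation) gives more precise control but makes this table, and
the cost of checking it, larger.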

Granting of privileges:
A user who has been granted some form of authorization may be allowed
to pass on this authorization to other users. The passing of authorization
from one user to another can be represented by an ‘authorization graph’.
The nodes of this graph are users, and an edge from user i to user j
indicates that user i has granted the privilege to user j.

Consider an example of granting update authorization on the loan relation.
Initially the database administrator grants update authorization on loan to
users u1, u2 & u3, who may in turn pass this authorization to other users.

[Figure: authorization graph with edges DBA->U1, DBA->U2, DBA->U3, U1->U4,
U1->U5 and U2->U5]

U5 is granted authorization by both u1 & u2. U4 is granted
authorization by u1.
A user has authorization if and only if there is a path from the root
of the authorization graph (the DBA) down to the node representing the
user.
Suppose the DBA revokes authorization from u1. Since u4 has its
authorization only from u1, that authorization is revoked automatically.
But u5 retains authorization, because it also has authorization from u2. If
u2 later revokes its grant to u5, or u2's own authorization is revoked, u5
loses authorization.
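A pair of devious users may attempt to defeat the rules for
revocation of authorization by granting authorization to each other.

[Figure (a): authorization graph with edges DBA->U1, DBA->U2, DBA->U3,
U1->U2 and U2->U1]

If the DBA revokes its grant to u2, u2 still retains authorization,
since it has been granted access through u1.

[Figure (b): the same graph after the edge DBA->U2 is removed]

If the DBA also revokes authorization from u1, the edges from u1 to u2 and
from u2 to u1 are no longer part of any path starting from the DBA. We
therefore require every edge of the authorization graph to be part of some
path originating at the DBA, so the edges between u1 and u2 are deleted.

[Figure (c): the resulting graph, containing only the edge DBA->U3]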
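
The path rule can be sketched in Python as a reachability test on the
authorization graph. The functions authorized and revoke, and the
representation of grants as (grantor, grantee) pairs, are assumptions made
for this illustration.

```python
def authorized(grants, root="DBA"):
    """Return the users reachable from root through the grant edges."""
    reached, frontier = {root}, [root]
    while frontier:
        user = frontier.pop()
        for grantor, grantee in grants:
            if grantor == user and grantee not in reached:
                reached.add(grantee)
                frontier.append(grantee)
    return reached - {root}


def revoke(grants, grantor, grantee, root="DBA"):
    """Remove one grant, then delete every edge whose grantor is no
    longer reachable from root (cascading revocation)."""
    remaining = set(grants) - {(grantor, grantee)}
    still_ok = authorized(remaining, root) | {root}
    return {(a, b) for a, b in remaining if a in still_ok}


# First example: u5 keeps its authorization through u2, u4 loses it.
grants = {("DBA", "U1"), ("DBA", "U2"), ("DBA", "U3"),
          ("U1", "U4"), ("U1", "U5"), ("U2", "U5")}
after = revoke(grants, "DBA", "U1")
print(sorted(authorized(after)))

# Second example: two users grant authorization to each other.
devious = {("DBA", "U1"), ("DBA", "U2"), ("DBA", "U3"),
           ("U1", "U2"), ("U2", "U1")}
step1 = revoke(devious, "DBA", "U2")   # U2 is still reachable through U1
step2 = revoke(step1, "DBA", "U1")     # now neither U1 nor U2 is reachable
print(sorted(authorized(step1)), sorted(authorized(step2)))
```

In the second example the mutual grants between U1 and U2 do not help once
both direct grants from the DBA are gone, because neither edge lies on a
path that starts at the DBA.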
Granting privileges in SQL
The grant statement is used to confer authorization. Its basic form is:
Grant <privilege list> on <relation or view name> to <user/role list>
Example: Grant update (amount) on loan to u1, u2, u3;
This statement grants update authorization on the amount attribute of the
loan relation to users u1, u2 and u3.
We can similarly grant the select, insert and delete privileges to users.
The references privilege is granted on specific attributes; the following
statement allows u2 to create relations that reference branch_name of
relation branch as a foreign key:
Grant references (branch_name) on branch to u2;
By default, privileges are not transferable. If a privilege is granted “with
grant option”, the user can pass the authorization on to other users:
Grant select on branch to u1 with grant option;
Roles:

A role holds a set of authorizations. When a role is granted to a user,
the user gets all the authorizations in that role. A role can be
granted to many users, and many roles can be granted to a single user.
Any authorization that can be given to a user can be given to a role.
Creating roles:
Create role teller;
Granting privileges to roles is the same as granting them to users:
Grant select on account to teller;
Roles can be granted to users as well as to other roles:
Grant teller to john;
Create role manager;
Grant teller to manager;
Grant manager to mary;
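
The effect of granting roles to users and to other roles can be sketched in
Python. The dictionaries and the function effective_privileges are
assumptions made for this illustration; they mirror the grant statements
above but do not show how any particular DBMS stores roles.

```python
# Privileges granted directly to a role or user: name -> {(privilege, relation)}
privileges = {"teller": {("select", "account")}}

# Roles granted to a user or to another role: grantee -> {roles}
role_grants = {
    "john": {"teller"},
    "manager": {"teller"},
    "mary": {"manager"},
}


def effective_privileges(name, seen=None):
    """Collect the privileges of name plus those of every role granted
    to it, directly or through other roles."""
    seen = set() if seen is None else seen
    if name in seen:            # guard against circular role grants
        return set()
    seen.add(name)
    result = set(privileges.get(name, set()))
    for role in role_grants.get(name, set()):
        result |= effective_privileges(role, seen)
    return result


print(effective_privileges("mary"))   # mary -> manager -> teller
```

Here mary receives select on account because she holds the manager role and
manager in turn holds the teller role.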
Revoking privileges:
To revoke an authorization, we use the revoke statement; its form is almost
identical to that of grant:
Revoke <privilege list> on <relation name or view name>
from <user/role list> [restrict | cascade]

Example:
Revoke update (amount) on loan from u1, u2, u3;
Revoke references (branch_name) on branch from u1;

By default, when a privilege is revoked from a user, it is also revoked
from the users to whom that user had granted it. This is called
“cascading of the revoke”. To prevent cascading we can specify “restrict”;
the system then returns an error, and does not carry out the revoke, if it
would require any cascading revokes.
Ex: Revoke select on branch from u1, u2, u3 restrict;

Authorization on views
A view provides a user with a personalized model of the
database. A view can hide data that a user does not need to see.
Views simplify system usage because they restrict the user’s attention
to the data of interest.
Although a user may be denied direct access to a relation, that user
may be allowed to access part of that relation through a view. Thus a
combination of relation-level security and view-level security limits a
user’s access to precisely the data that the user needs.
Ex: a clerk in the bank needs to know the names of all customers who have
a loan at each branch, and nothing more. The clerk is denied direct access
to the loan and borrower relations but can be given access to the view
cust_loan:
Create view cust_loan as
(select branch_name, customer_name
from borrower, loan
where borrower.loan_no = loan.loan_no);
The creator of the view must have read authorization on both the
borrower and loan relations.
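
The view can be tried out with the sqlite3 module from the Python standard
library. This is only a sketch: the column types and the sample rows are
made up for the example, and SQLite has no grant statement, so the code
shows only what the view exposes, not the authorization itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary in-memory database
conn.executescript("""
    CREATE TABLE loan (loan_no TEXT, branch_name TEXT, amount INTEGER);
    CREATE TABLE borrower (customer_name TEXT, loan_no TEXT);

    -- Hypothetical sample data.
    INSERT INTO loan VALUES ('L-11', 'Chitradurga', 5000);
    INSERT INTO borrower VALUES ('Ravi', 'L-11');

    CREATE VIEW cust_loan AS
        SELECT branch_name, customer_name
        FROM borrower, loan
        WHERE borrower.loan_no = loan.loan_no;
""")

cursor = conn.execute("SELECT * FROM cust_loan")
columns = [d[0] for d in cursor.description]   # column names of the view
rows = cursor.fetchall()
print(columns, rows)
```

The view returns only branch_name and customer_name; the loan number and
the amount stay hidden from anyone who is allowed to query cust_loan but
not the underlying relations.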
Security of data: data must be protected while they are being
transmitted. Data may also need to be protected from intruders who are able
to bypass operating system security. Encryption is one of the techniques
used to enforce security.
Encryption: encryption transforms the data so that they are stored or
transmitted in a form that cannot be read directly.
Ex: consider substituting each character with the next character of the
alphabet.

Original data: chitradurga
Encrypted data: dijusbevshb

This is a very simple and weak encryption technique; an
unauthorized person can easily recover the original data from the
encrypted data.
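
The substitution above, and the reason it is weak, can be sketched in
Python. The functions shift and crack are names assumed for this
illustration; the code handles lowercase letters only.

```python
import string


def shift(text, key):
    """Replace each lowercase letter by the letter `key` places later in
    the alphabet, wrapping around from z to a. Other characters are kept."""
    letters = string.ascii_lowercase
    return "".join(
        letters[(letters.index(c) + key) % 26] if c in letters else c
        for c in text
    )


def crack(ciphertext):
    """Try every possible key; there are only 25 candidates to inspect."""
    return [shift(ciphertext, -k) for k in range(1, 26)]


encrypted = shift("chitradurga", 1)
print(encrypted)                    # dijusbevshb
print(shift(encrypted, -1))         # chitradurga
print(crack(encrypted)[0])          # chitradurga
```

Because the key can take only 25 useful values, an intruder can simply try
them all, which is why such a scheme offers almost no protection.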
A good encryption technique has the following properties:
• It is relatively simple for authorized users to encrypt and
decrypt data.
• It depends not on the secrecy of the algorithm but rather on a
parameter of the algorithm called the encryption key.
• Its encryption key is extremely difficult for an intruder to determine.
The “Data Encryption Standard” (DES), issued in 1977, does both
substitution of characters and rearrangement of their order on the basis
of an encryption key. Authorized users must be provided with the key
through a secure mechanism, which is a weakness of the scheme. DES was
nevertheless reaffirmed as a standard in 1983, 1987 and 1993. In 1993
its weaknesses were recognized as serious enough that a new standard,
the Advanced Encryption Standard (AES), was needed, and in 2000 the
Rijndael algorithm (named after its inventors V. Rijmen and J. Daemen)
was selected for AES.
The Rijndael algorithm is a shared-key or symmetric-key algorithm, in
which authorized users share a key.
Public-key encryption is an alternative scheme that avoids some of the
problems that we face with DES. It is based on two keys: a public key
and a private key.
