DBMS Module1

Uploaded by

valo rant

DBMS Dept of CSE

MODULE 1: Chapter-1
Introduction to Database Management System
Overview:
 Data are known facts that can be recorded and that have implicit meaning.
 A database is a collection of related data.
o For example, consider the names, telephone numbers, and addresses of the people
we know. We may have recorded this data in an indexed address book, or we may
have stored it on a hard drive, using a personal computer and software such as
Microsoft Access or Excel. This collection of related data with an implicit meaning
is a database.
 A database management system (DBMS) is a collection of programs that enables users to
create and maintain a database. The DBMS is a general-purpose software system that
facilitates the processes of defining, constructing, manipulating, and sharing databases
among various users and applications.
o Defining a database involves specifying the data types, structures, and constraints
of the data to be stored in the database. The database definition or descriptive
information is also stored by the DBMS in the form of a database catalog or
dictionary; this descriptive information is called meta-data.
o Constructing the database is the process of storing the data on some storage
medium that is controlled by the DBMS.
o Manipulating a database includes functions such as querying the database to
retrieve specific data, updating the database to reflect changes in the mini-world,
and generating reports from the data.
o Sharing a database allows multiple users and programs to access the database
simultaneously.
 An application program accesses the database by sending queries or requests for data to
the DBMS. A query typically causes some data to be retrieved. A transaction may cause
some data to be read and some data to be written into the database.
 Other important functions provided by the DBMS include protecting the database and
maintaining it over a long period of time.

The figure below illustrates a database system. The database and the DBMS software together
are called a database system.


An example database
 Consider a UNIVERSITY database for maintaining information concerning students,
courses, and grades in a university environment.
 The database is organized as five files, each of which stores data records of the same type.
 To define this database, we must specify the structure of the records of each file by
specifying the different types of data elements to be stored in each record.
o For example: each STUDENT record includes data to represent the student’s Name,
Studentnumber, Class (such as freshman or ‘1’, sophomore or ‘2’, and so forth), and
Major (such as CS). The database has several types of records, with many relationships among them.
 We must also specify a data type for each data element within a record.
o For example, we can specify that Name of STUDENT is a string of alphabetic
characters, Studentnumber of STUDENT is an integer, and Grade of
GRADE_REPORT is a single character from the set {‘A’, ‘B’, ‘C’, ‘D’, ‘F’, ‘I’}.


 To construct the UNIVERSITY database, we store data to represent each student, course,
section, grade report, and prerequisite as a record in the appropriate file.
 The records in the various files may be related. For example, the record for Smith in the
STUDENT file is related to two records in the GRADE_REPORT file that specify Smith’s
grades in two sections.

Figure: A database that stores student and course information


 Database manipulation involves querying and updating.
o Examples of queries are as follows:
1. A list of all courses and grades of ‘Smith’ may be retrieved.
2. List the prerequisites of the ‘Database’ course.
o Examples of updates include the following:
1. Change the class of ‘Smith’ to sophomore.
2. Create a new section for the ‘Database’ course for this semester.
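
The define, construct, and manipulate steps above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not the layout shown in the figure: the in-memory database, the Section_id column, and the sample rows are assumptions made for the example, and only the STUDENT and GRADE_REPORT files are modeled.

```python
import sqlite3

# In-memory database; nothing is written to disk (assumption for the sketch).
conn = sqlite3.connect(":memory:")

# Define: record structure, data types, and a constraint on Grade.
conn.executescript("""
CREATE TABLE STUDENT (
    Studentnumber INTEGER PRIMARY KEY,
    Name          TEXT NOT NULL,
    Class         INTEGER,
    Major         TEXT
);
CREATE TABLE GRADE_REPORT (
    Studentnumber INTEGER REFERENCES STUDENT(Studentnumber),
    Section_id    INTEGER,
    Grade         TEXT CHECK (Grade IN ('A', 'B', 'C', 'D', 'F', 'I'))
);
""")

# Construct: store records (hypothetical sample data).
conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith', 1, 'CS')")
conn.executemany(
    "INSERT INTO GRADE_REPORT VALUES (?, ?, ?)",
    [(17, 112, "B"), (17, 119, "C")],
)

# Manipulate (query): list the grades recorded for 'Smith'.
smith_grades = conn.execute(
    """SELECT G.Section_id, G.Grade
       FROM STUDENT S JOIN GRADE_REPORT G
         ON S.Studentnumber = G.Studentnumber
       WHERE S.Name = 'Smith'
       ORDER BY G.Section_id"""
).fetchall()

# Manipulate (update): change the class of 'Smith' to sophomore (2).
conn.execute("UPDATE STUDENT SET Class = 2 WHERE Name = 'Smith'")
smith_class = conn.execute(
    "SELECT Class FROM STUDENT WHERE Name = 'Smith'"
).fetchone()[0]
```

The two GRADE_REPORT rows play the role of the two records related to Smith's STUDENT record; the join in the query follows that relationship.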


File system v/s DBMS System:


File system: Data files are created and processed by programs written by programmers or
by the users of the files. Files created by different users of the organization often contain
redundant data. The application programs depend on the structural properties of the files
and interact directly with the operating system.

File system v/s DBMS


 Data redundancy and inconsistency – Redundancy is the repetition of data,
i.e. a data item may have more than one copy. The file system cannot control
redundancy because each user defines and maintains the files needed for a specific
application. Two users may therefore maintain the same data in separate files
for different applications. Changes made by one user are then not reflected in the files
used by the other user, which leads to inconsistent data. A DBMS, in contrast, controls
redundancy by maintaining a single repository of data that is defined once and
accessed by many users. With little or no redundancy, the data remains consistent.
 Limited Data Sharing – Data is scattered across various files. Different files may also have
different formats and may be stored in different folders belonging to different
departments. Because of this data isolation, it is difficult to share data among
applications. In a DBMS, data can be shared easily because it is managed centrally.
 Concurrent Access Anomalies – Concurrent access means that more than one user
is accessing the same data at the same time. Anomalies occur when changes made by
one user are lost because of changes made by another user. A file system does not provide
any mechanism to prevent such anomalies, whereas a DBMS provides a locking mechanism
to prevent them.
 Difficulty in Accessing Data – In a file system, a different application program has to be
written for every new kind of search. A DBMS provides built-in search
operations, so the user only has to write a short query to retrieve data from the database.
 Data integrity problem – Some constraints may need to be checked
before data is inserted into the database. The file system does not provide any
mechanism to check these constraints automatically, whereas a DBMS maintains data
integrity by enforcing user-defined constraints on the data itself.
 Atomicity Problems - Atomicity means a transaction must be all-or-nothing, i.e. the
transaction must either happen fully or not happen at all. It must not complete partially.
Example: suppose A wants to transfer Rs. 5000 to B's account. A's account should
be debited and B's account should be credited with the same amount. Suppose A's
account is debited with Rs. 5000 and the transaction then fails due to some problem. The
transaction is now incomplete because B's account has not been credited. This type of problem
occurs in a file system because there is no mechanism to prevent such anomalies.
Transaction atomicity is a feature of a DBMS: either a transaction
completes fully or none of its actions is performed.
 Security Problems - A file system offers little or no security; the general security
it provides is limited to mechanisms such as locks and guards. A DBMS, in contrast, allows
different users to have different levels of access to the database, so security can be
provided at different levels.
 Data Isolation – In a file system there is no standard data format; data is
scattered across various formats and files, which makes data retrieval difficult. In a DBMS,
because the data is managed centrally, data of a similar type is kept in the same format.

     File Systems                                   DBMS

1.   Unstructured data                              Structured data
2.   File location required for accessing files     Location is not required to access data
3.   Size of file                                   Size of file
4.   Access to data is not very flexible and        Flexible access to data is possible using queries
     causes redundant data
5.   Redundancy is more                             Redundancy can be removed
6.   Inconsistent data                              Consistent data, using keys
7.   Concurrent access causes problems              Concurrent access can be done
8.   Integrity constraints cannot be predefined     Integrity constraints can be applied when data is
                                                    created
9.   Security is less; backup and restore           Security is more; backup and restore can be done
     are problems



Advantages of DBMS:

 Minimize Data Redundancy - In a file processing system, duplicate data is created in
many places because each program has its own files. This creates data
redundancy, which in turn wastes labor and space. In a database management system, all
the files are integrated into a single database. Each data item is ideally stored only once, in a
single place, so there is little chance of duplicate data.
For example: student records kept separately for the library and for examinations can contain duplicate values, but
when they are converted into a single database, the duplicate values are removed.
Complete redundancy cannot be removed, because some duplicate values are needed to relate
tables with each other. Still, a DBMS controls data redundancy, which saves a lot of labor and
time.

 Sharing of Data - In a DBMS, data can be shared among the authorized users of the
database. Each user has the right to access the database up to a certain level.
The database administrator has complete access to the database and can authorize users to access
it. Authorized users can access the database and share
data among themselves, and many users may have the same level of access.

 Data Consistency - A DBMS controls data redundancy, which in turn helps keep data
consistent. Data consistency means that when a data item is updated, the same update does not have to be
repeated in several files. Because a DBMS stores the data in a single database,
the data is more consistent than in a file processing system. Updated
values are also available to all users immediately.

 Data Integrity - Here, data integrity refers to data integration: the unification of many files into a single database. In a
DBMS, data is stored in different tables, and a database contains different tables that are
linked to each other. Many users enter data into these tables, so it is important to
maintain the data items and the associations between them. Data integration
makes it easier to reduce data duplication; it reduces redundancy as
well as data inconsistency.

 Search Capability - Users of a database need to fetch data from it, and
there are many kinds of queries they may ask about the data. The database
must search quickly, so that a user who executes a query
gets results promptly. Maintaining a
flexible search capability is one objective of a database.

 Security - Data security means protecting data from unauthorized access.
Data in a database should be kept safe from unauthorized modification. Only
authorized users should be granted access to the database. Each user who accesses the database has a username
and a password, so that no one else can access
the information. A DBMS helps keep the database tamper-resistant and secure.

 Privacy - Privacy concerns the extent to which a user can access the data. The DBA
determines in advance who may access the data and up to what level.
For example, when you create a Facebook page, you can
decide which other users get the promoter, editor, or admin role.

 Simplicity - Simplicity means representing the overall logical view of the data in a simple
and clear manner. A DBMS is simple for its users: operations such as
insert, delete, create, and update are easy to carry out.

 Backup and Recovery - Data loss is a serious problem for every organization. In a
traditional file processing system, a user has to back up the data at regular
intervals, which takes a lot of time and resources. If the volume of data is large,
this process may take a very long time.
A DBMS reduces the need to take backups manually again and again because it supports
automatic backup and recovery of the database. For example, if the system fails in the
middle of an operation, the DBMS restores the database to the state it was in
before the operation began.

 Integrity Constraints - Constraints help keep stored data accurate even though
many users enter data into the database. Data stored in a database should always be correct
and accurate, and a DBMS provides the capability to enforce such constraints on the database.
For example, the marks obtained by a student can never be more than a maximum of
100. Similarly, a bank may require that an account balance not fall below a minimum such as 2500, and
penalize the account holder otherwise.

 Data Atomicity - A complete transaction in a database is an atomic unit. It is the
duty of the DBMS to store only complete transactions in the database. If a transaction is only partially
completed, the DBMS rolls it back.
For example, in a railway reservation system, if the user completes the ticket
reservation process, the record is stored and the fare is deducted from
the user's account; otherwise no amount is deducted, and any amount already deducted is refunded.

 Concurrency Control - If two users access data simultaneously and both
want to update the same record, their updates may conflict. A DBMS
controls concurrency so that no updates are lost.

 Lower Maintenance Cost - DBMS products can be costly to purchase, but their
maintenance cost afterwards is low. A DBMS can be maintained by a few programmers, which is not
costly for an enterprise.

 Very Low Chance of Data Loss - Because many security constraints are applied to the
database, the chance of data loss is small. Data can be stored for
many years in a DBMS without loss of information.
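
The integrity-constraint advantage listed above (marks can never exceed 100) can be sketched with a CHECK constraint. This is a minimal illustration using Python's built-in sqlite3 module; the Marks table, its columns, and the sample values are hypothetical, and other DBMS products may report the violation differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: marks must lie between 0 and 100.
conn.execute("""
CREATE TABLE Marks (
    sid   TEXT,
    marks INTEGER CHECK (marks BETWEEN 0 AND 100)
)
""")

conn.execute("INSERT INTO Marks VALUES ('S1', 95)")  # satisfies the constraint

try:
    conn.execute("INSERT INTO Marks VALUES ('S2', 120)")  # violates the constraint
    rejected = False
except sqlite3.IntegrityError:
    # The DBMS itself refuses the row; no application-side check was written.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Marks").fetchone()[0]
```

The constraint is declared once, with the table, and is then enforced for every user and program that inserts data.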


Storage Data:
The user of a DBMS is ultimately concerned with some real-world enterprise, and the data to
be stored describes various aspects of this enterprise. For example, there are students, faculty,
and courses in a university, and the data in a university database describes these entities and
their relationships.
 Data Model
A data model is a collection of high-level data description constructs that hide many
low-level storage details. A DBMS allows a user to define the data to be stored in terms
of a data model. Most database management systems today are based on the relational
data model. While the data model of the DBMS hides many details, it is nonetheless
closer to how the DBMS stores data than to how a user thinks about the underlying
enterprise.
 A semantic data model is a more abstract, high-level data model that makes it easier
for a user to come up with a good initial description of the data in an enterprise. These
models contain a wide variety of constructs that help describe a real application
scenario. A widely used semantic data model called the entity-relationship (ER) model
allows us to pictorially denote entities and the relationships among them.
Relational Model
 The central data description construct in this model is a relation, which can be thought
of as a set of records. A description of data in terms of a data model is called a schema.
 In the relational model, the schema for a relation specifies its name, the name of each
field (or attribute or column), and the type of each field.
 For example, student information in a university database may be stored in a relation with
the following schema:
Students (sid: string, name: string, login: string, age: integer, gpa: real)
 The preceding schema says that each record in the Students relation has five fields, with
field names and types as indicated.
 An example instance of the Students relation appears in Figure 1.1.

Figure 1.1 An Instance of the Students Relation
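
The difference between a schema and an instance can be sketched with Python's built-in sqlite3 module. This is an illustration only: the rows below are hypothetical and are not the rows of Figure 1.1, and sid is declared as text, as in the conceptual schema of the university database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the relation name, the field names, and the type of each field.
conn.execute("""
CREATE TABLE Students (
    sid   TEXT,
    name  TEXT,
    login TEXT,
    age   INTEGER,
    gpa   REAL
)
""")

# Instance: a set of records (hypothetical values).
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("1001", "Asha", "asha@cs", 19, 3.4),
        ("1002", "Ravi", "ravi@ee", 18, 3.2),
    ],
)

# The schema is itself stored by the DBMS and can be read back:
# PRAGMA table_info returns one row per field (position, name, type, ...).
fields = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Students)")]
instance = conn.execute("SELECT * FROM Students ORDER BY sid").fetchall()
```

Here `fields` describes the schema and stays fixed, while `instance` changes whenever records are inserted, deleted, or modified.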

 Other Data Models - In addition to the relational data model, other data models include the hierarchical model
(e.g., used in IBM’s IMS DBMS), the network model (e.g., used in IDS and IDMS), the
object-oriented model (e.g., used in ObjectStore and Versant), and the
object-relational model (e.g., used in DBMS products from IBM, Informix, ObjectStore,
Oracle, Versant, and others). Although many databases use the hierarchical
and network models, and systems based on the object-oriented and object-relational
models are gaining acceptance in the marketplace, the dominant model today is the
relational model.

Levels of Abstraction in a DBMS (Three-Schema Architecture)


The data in a DBMS is described at three levels of abstraction, as illustrated in Figure 1.2. The
database description consists of a schema at each of these three levels of abstraction: the
conceptual, physical, and external schemas.

Figure 1.2 Levels of Abstraction in a DBMS


Conceptual Schema
 The conceptual schema (sometimes called the logical schema) describes the stored data
in terms of the data model of the DBMS. In a relational DBMS, the conceptual schema
describes all relations that are stored in the database.
 The conceptual schema hides the details of physical storage structures and concentrates
on describing entities, data types, relationships, user operations, and constraints.
 In our sample university database, these relations contain information about entities,
such as students and faculty, and about relationships, such as students’ enrollment in
courses. All student entities can be described using records in a Students relation. Each
collection of entities and each collection of relationships can be described as a relation,
leading to the following conceptual schema:
Students (sid: string, name: string, login: string, age: integer, gpa: real)
Faculty (fid: string, fname: string, sal: real)
Courses (cid: string, cname: string, credits: integer)
Rooms (rno: integer, address: string, capacity: integer)
Enrolled (sid: string, cid: string, grade: string)
Teaches (fid: string, cid: string)
Meets_In (cid: string, rno: integer, time: string)


Physical Schema
 The physical schema describes the physical storage structure of the database. It uses a physical data
model and describes the complete details of data storage and access paths for the
database.
 The physical schema specifies additional storage details. Essentially, the physical
schema summarizes how the relations described in the conceptual schema are actually
stored on secondary storage devices such as disks and tapes. We must decide what file
organizations to use to store the relations, and create auxiliary data structures called
indexes to speed up data retrieval operations.
 A sample physical schema for the university database follows:
o Store all relations as unsorted files of records. (A file in a DBMS is either a
collection of records or a collection of pages, rather than a string of characters as in
an operating system.)
o Create indexes on the first column of the Students, Faculty, and Courses relations,
the sal column of Faculty, and the capacity column of Rooms.
 Decisions about the physical schema are based on an understanding of how the data is
typically accessed. The process of arriving at a good physical schema is called physical
database design.
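
The index part of the sample physical schema above can be sketched as follows, using Python's built-in sqlite3 module. The index names are hypothetical. SQLite chooses its own file organization for tables, so the decision to store relations as unsorted files is not expressed here; only the index decisions are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Faculty  (fid TEXT, fname TEXT, sal REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Rooms    (rno INTEGER, address TEXT, capacity INTEGER);

-- Indexes on the first column of Students, Faculty, and Courses,
-- on the sal column of Faculty, and on the capacity column of Rooms.
CREATE INDEX idx_students_sid   ON Students(sid);
CREATE INDEX idx_faculty_fid    ON Faculty(fid);
CREATE INDEX idx_courses_cid    ON Courses(cid);
CREATE INDEX idx_faculty_sal    ON Faculty(sal);
CREATE INDEX idx_rooms_capacity ON Rooms(capacity);
""")

# The catalog lists the auxiliary structures that now exist.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
```

None of the CREATE INDEX statements changes what the relations contain; they only add access paths for retrieval.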

External Schema
 External schemas allow data access to be customized (and authorized) at the level of
individual users or groups of users.
 Any given database has exactly one conceptual schema and one physical schema
because it has just one set of stored relations, but it may have several external schemas,
each tailored to a particular group of users.
 Each external schema consists of a collection of one or more views and relations from
the conceptual schema.
 A view is conceptually a relation, but the records in a view are not stored in the DBMS.
Rather, they are computed using a definition for the view, in terms of relations stored in
the DBMS. The external schema design is guided by end user requirements.
 For example, we might want to allow students to find out the names of faculty members
teaching courses, as well as course enrollments. This can be done by defining the
following view:
o Courseinfo (cid: string, fname: string, enrollment: integer)
 A user can treat a view just like a relation and ask questions about the records in the
view. Even though the records in the view are not stored explicitly, they are computed
as needed.
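
One possible definition of the Courseinfo view is sketched below with Python's built-in sqlite3 module. The join and grouping shown are an assumption about how the view could be computed from Teaches, Faculty, and Enrolled, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty  (fid TEXT, fname TEXT, sal REAL);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);

-- Only the definition of the view is stored; its records are computed on demand.
CREATE VIEW Courseinfo AS
SELECT T.cid AS cid, F.fname AS fname, COUNT(E.sid) AS enrollment
FROM Teaches T
JOIN Faculty F ON F.fid = T.fid
LEFT JOIN Enrolled E ON E.cid = T.cid
GROUP BY T.cid, F.fname;

-- Hypothetical sample rows.
INSERT INTO Faculty  VALUES ('F1', 'Rao', 90000.0);
INSERT INTO Teaches  VALUES ('F1', 'CS564');
INSERT INTO Enrolled VALUES ('S1', 'CS564', 'A'), ('S2', 'CS564', 'B');
""")

# The view is queried exactly like a stored relation.
before = conn.execute("SELECT * FROM Courseinfo").fetchall()

# A new enrollment shows up in the view without touching the view itself.
conn.execute("INSERT INTO Enrolled VALUES ('S3', 'CS564', 'B')")
after = conn.execute("SELECT * FROM Courseinfo").fetchall()
```

Students who are given access only to Courseinfo can see faculty names and enrollment counts, but not, for example, the sal column of Faculty.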

Data Independence


 Data independence is the capacity to change the schema at one level of a database
system without having to change the schema at the next higher level.
 The following are the types of data independence:
o Logical data independence
o Physical data independence

 Logical data independence


 It is the capacity to change the conceptual schema without having to change the
external schemas or application programs.
 Conceptual schema may be changed to expand the database, to change constraints,
or to reduce the database by removing a record type or data item.
 Only the view definition and the mappings need to be changed in a DBMS that
supports logical data independence. After the conceptual schema undergoes a
logical reorganization, application programs that reference the external schema
constructs must work as before. Changes to constraints can be applied to the
conceptual schema without affecting the external schemas or application programs.
 Logical data independence is harder to achieve than physical data independence because it must allow structural and
constraint changes without affecting application programs.

 Physical data independence


 It is the capacity to change the internal schema without having to change the
conceptual schema. Hence, the external schemas need not be changed as well.
 Changes to the internal schema may be needed because some physical files may be
reorganized, for example, by creating additional access structures to improve the
performance of retrieval or update.
 If the same data as before remains in the database, there is no need to change the
conceptual schema.
 For example, providing an access path to improve the retrieval speed of section records
in the university database by semester and year should not require a query such as 'list
all sections offered in fall 2008' to be changed, although the query would be
executed more efficiently by the DBMS by utilizing the new access path.
 Physical data independence exists in most databases and file environments where
physical details such as the exact location of data on disk, and hardware details of
storage encoding, placement, compression, splitting, merging of records, and so on
are hidden from the user. Applications remain unaware of these details.
 Data independence occurs because when the schema is changed at some level, the
schema at the next higher level remains unchanged, only the mapping between the
two levels is changed. Hence, application programs referring to the higher-level
schema need not be changed.
 The three-schema architecture can make it easier to achieve true data independence,
both physical and logical.
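
Physical data independence, as in the section-records example above, can be sketched with Python's built-in sqlite3 module. The Section relation, its columns, the index name, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Section relation with a few sample rows.
conn.execute(
    "CREATE TABLE Section (section_id INTEGER, cid TEXT, semester TEXT, year INTEGER)"
)
conn.executemany(
    "INSERT INTO Section VALUES (?, ?, ?, ?)",
    [
        (1, "CS101", "Fall", 2008),
        (2, "CS564", "Fall", 2008),
        (3, "CS101", "Spring", 2009),
    ],
)

# "List all sections offered in fall 2008", written once.
QUERY = (
    "SELECT section_id FROM Section "
    "WHERE semester = 'Fall' AND year = 2008 ORDER BY section_id"
)

result_before = conn.execute(QUERY).fetchall()

# Change only the physical level: add an access path on (semester, year).
conn.execute("CREATE INDEX idx_section_sem_year ON Section(semester, year)")

# The query text is unchanged and returns the same answer.
result_after = conn.execute(QUERY).fetchall()
```

The application holding `QUERY` never learns that an index was added; only the way the DBMS evaluates the query may change.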


Queries in DBMS:
 A query is a request for data or information from a database table or combination of
tables. Put another way, a query is a question, often expressed in a formal way.
 A database query can be either a select query or an action query. A select query is
a data retrieval query, while an action query asks for additional operations on the data,
such as insertion, updating or deletion.
 Examples:
1. What is the name of the student with student id 123456?
2. What is the average salary of professors who teach the course with cid CS564?
3. How many students are enrolled in course CS564?
4. What fraction of students in course CS564 received a grade better than B?
5. Is any student with a GPA less than 3.0 enrolled in course CS564?
 Such questions involving the data stored in a DBMS are called queries.
 A DBMS provides a specialized language, called the query language, in which queries
can be posed.
 Relational calculus is a formal query language based on mathematical logic, and
queries in this language have an intuitive, precise meaning.
 Relational algebra is another formal query language, based on a collection of operators
for manipulating relations, which is equivalent in power to the calculus.
 A DBMS enables users to create, modify, and query data through a data manipulation
language (DML). Thus, the query language is only one part of the DML, which also
provides constructs to insert, delete, and modify data.
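
The five example questions above can be posed in SQL, as sketched below with Python's built-in sqlite3 module. The sample rows are hypothetical, and the grade comparison assumes plain letter grades, so 'A' is the only grade better than 'B'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Faculty  (fid TEXT, fname TEXT, sal REAL);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);

-- Hypothetical sample rows.
INSERT INTO Students VALUES
    ('123456', 'Meena', 'meena@cs', 19, 3.6),
    ('234567', 'Arun',  'arun@cs',  20, 2.8),
    ('345678', 'Divya', 'divya@ee', 21, 3.1);
INSERT INTO Faculty VALUES ('F1', 'Rao', 80000.0), ('F2', 'Iyer', 100000.0);
INSERT INTO Teaches VALUES ('F1', 'CS564'), ('F2', 'CS564');
INSERT INTO Enrolled VALUES
    ('123456', 'CS564', 'A'), ('234567', 'CS564', 'B'), ('345678', 'CS564', 'C');
""")


def one(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# 1. Name of the student with student id 123456.
q1 = one("SELECT name FROM Students WHERE sid = '123456'")

# 2. Average salary of professors who teach the course with cid CS564.
q2 = one("""SELECT AVG(F.sal) FROM Faculty F
            JOIN Teaches T ON T.fid = F.fid WHERE T.cid = 'CS564'""")

# 3. Number of students enrolled in course CS564.
q3 = one("SELECT COUNT(*) FROM Enrolled WHERE cid = 'CS564'")

# 4. Fraction of students in CS564 with a grade better than B (only 'A' here).
q4 = one("""SELECT AVG(CASE WHEN grade = 'A' THEN 1.0 ELSE 0.0 END)
            FROM Enrolled WHERE cid = 'CS564'""")

# 5. Is any student with a GPA below 3.0 enrolled in CS564? (1 = yes, 0 = no)
q5 = one("""SELECT EXISTS (SELECT 1 FROM Students S
                           JOIN Enrolled E ON E.sid = S.sid
                           WHERE E.cid = 'CS564' AND S.gpa < 3.0)""")
```

All five are select queries; an action query would use INSERT, UPDATE, or DELETE instead.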

Transaction Management:
 A transaction is any one execution of a user program in a DBMS. (Executing the same
program several times will generate several transactions.) Partial transactions are not
allowed, and the effect of a group of transactions is equivalent to some serial execution
of all transactions.
 In SQL terms, a transaction is one or more SQL statements that make up
a unit of work performed against the database; either all the statements in
a transaction are committed as a unit or all the statements are rolled back as a unit.
 Transaction management deals with two main issues:
1. Failures of various kinds, such as hardware failures and system crashes.
2. Concurrent execution of multiple transactions.

 Concurrent Execution of Transactions


 An important task of a DBMS is to schedule concurrent accesses to data so that
each user can safely ignore the fact that others are accessing the data
concurrently.

 The importance of this task cannot be overstated because a database is
typically shared by a large number of users, who submit their requests to the
DBMS independently, and simply cannot be expected to deal with arbitrary
changes being made concurrently by other users.
 A DBMS allows users to think of their programs as if they were executing in
isolation, one after the other in some order chosen by the DBMS.
 For example: Consider a database that holds information about airline
reservations. At any given instant, it is possible that several travel agents are
looking up information about available seats on various flights and making new
seat reservations. When several users access a database concurrently, the DBMS
must order their requests carefully to avoid conflicts. For example, when one
travel agent looks up Flight 100 on some given day and finds an empty seat,
another travel agent may simultaneously be making a reservation for that seat,
thereby making the information seen by the first agent obsolete.
 Another example of concurrent use is a bank’s database. While one user’s
application program is computing the total deposits, another application may
transfer money from an account that the first application has just ‘seen’ to an
account that has not yet been seen, thereby causing the total to appear larger than
it should be. Clearly, such anomalies should not be allowed to occur.
 A locking protocol is a set of rules to be followed by each transaction. A lock is
a mechanism used to control access to database objects. Two kinds of locks are
commonly supported by a DBMS: shared lock and exclusive lock.
 A shared lock on an object can be held by two different transactions at the same
time, but an exclusive lock on an object ensures that no other transactions hold
any lock on this object.
 Suppose that the following locking protocol is followed: Every transaction
begins by obtaining a shared lock on each data object that it needs to read and an
exclusive lock on each data object that it needs to modify, then releases all its
locks after completing all actions.
For example: Consider two transactions T1 and T2 such that T1 wants to modify
a data object and T2 wants to read the same object. Intuitively, if T1's request for
an exclusive lock on the object is granted first, T2 cannot proceed until T1
releases this lock, because T2's request for a shared lock will not be granted by
the DBMS until then. Thus, all of T1's actions will be completed before any of
T2's actions are initiated.
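
The shared/exclusive rules and the T1/T2 example above can be sketched as a toy lock table in Python. This is not how any particular DBMS implements locking: the class name, method names, and the choice to return False instead of queueing the waiting transaction are all simplifications made for the illustration.

```python
from collections import defaultdict

SHARED, EXCLUSIVE = "S", "X"


class LockManager:
    """Toy lock table: grants or refuses locks, with no waiting queue."""

    def __init__(self):
        # object name -> {transaction id: lock mode}
        self.held = defaultdict(dict)

    def request(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        others = {t: m for t, m in self.held[obj].items() if t != txn}
        if mode == SHARED:
            # A shared lock is compatible only with other shared locks.
            compatible = all(m == SHARED for m in others.values())
        else:
            # An exclusive lock conflicts with any lock held by another transaction.
            compatible = not others
        if compatible and self.held[obj].get(txn) != EXCLUSIVE:
            # Record the lock (never downgrade an exclusive lock already held).
            self.held[obj][txn] = mode
        return compatible

    def release_all(self, txn):
        """Release every lock held by txn, done after all its actions complete."""
        for locks in self.held.values():
            locks.pop(txn, None)


lm = LockManager()
t1_gets_x = lm.request("T1", "obj", EXCLUSIVE)    # T1's request is granted first
t2_blocked = not lm.request("T2", "obj", SHARED)  # T2 must wait for T1
lm.release_all("T1")                              # T1 completes and releases
t2_gets_s = lm.request("T2", "obj", SHARED)       # now T2's request is granted
t3_shares = lm.request("T3", "obj", SHARED)       # shared locks can coexist
```

Because T2 is refused until T1 releases its lock, all of T1's actions on the object finish before any of T2's begin, which is the serial order described in the example.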

 Incomplete Transactions and System Crashes


 Transactions can be interrupted before running to completion for a variety of
reasons, e.g., a system crash.


 A DBMS must ensure that the changes made by such incomplete transactions are
removed from the database.
 For example, if the DBMS is in the middle of transferring money from account A to
account B, and has debited the first account but not yet credited the second when the
crash occurs, the money debited from account A must be restored when the system
comes back up after the crash.
 A DBMS supports atomicity. Atomicity means a transaction must be all-or-nothing,
i.e. the transaction must either happen fully or not happen at all. It must not
complete partially.
 Example: suppose A wants to transfer Rs. 5000 to B's account. A's account
should be debited and B's account should be credited with the same amount.
Suppose A's account is debited with Rs. 5000 and the transaction then fails due to some
problem. The transaction is now incomplete because B's account has not been credited. This
type of problem occurs in a file system because there is no mechanism to prevent such
anomalies.
 To handle incomplete transactions, the DBMS maintains a log of all writes to the
database. A crucial property of the log is that each write action must be recorded in
the log (on disk) before the corresponding change is reflected in the database itself.
If the system crashes, the data can then be restored with the help of this log. Otherwise, if the
system crashed just after making a change in the database but before the change was
recorded in the log, the DBMS would be unable to detect and undo this change. This
property is called Write-Ahead Logging (WAL).
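
The write-ahead rule and the undo of an incomplete transfer can be sketched with a toy in-memory store in Python. Everything here is hypothetical and heavily simplified: the class and method names are invented, Python objects stand in for the database and the log on disk, and the account balances are sample values. A real recovery manager does considerably more.

```python
class ToyDB:
    """Toy store with a write-ahead log kept as a Python list (stands in for disk)."""

    def __init__(self, data):
        self.data = dict(data)  # the "database"
        self.log = []           # the "log on disk"

    def write(self, txn, key, new_value):
        # Write-ahead rule: record the old and new values in the log first...
        self.log.append(("write", txn, key, self.data[key], new_value))
        # ...and only then change the database itself.
        self.data[key] = new_value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def recover(self):
        """Undo, newest first, every write of a transaction that never committed."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in reversed(self.log):
            if rec[0] == "write" and rec[1] not in committed:
                _, _, key, old_value, _ = rec
                self.data[key] = old_value


db = ToyDB({"A": 10000, "B": 2000})

# Transfer 5000 from A to B, but "crash" after the debit and before the credit.
db.write("T1", "A", db.data["A"] - 5000)
balance_at_crash = dict(db.data)

# Restart: the log shows that T1 never committed, so its debit is undone.
db.recover()
balance_after_recovery = dict(db.data)
```

If `write` changed the data before appending to the log, a crash between the two steps would leave a debit that `recover` could not find, which is exactly the situation the write-ahead rule excludes.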

Structure of DBMS:
 The structure of a DBMS described here is based on the relational data model.
 The DBMS acts as an interface between the user and the database. The user requests the DBMS to
perform various operations such as insert, delete, update and retrieval on the database.
 The components of DBMS perform these requested operations on the database and provide
necessary data to the users.
The DBMS accepts SQL commands generated from a variety of user interfaces.
The Query Evaluation Engine produces a query evaluation plan, executes it against the database, and returns the answer.
1. When a user issues a query, the parser parses the input request into a machine-understandable
form, and the query optimizer uses information about how the
data is stored to produce an efficient plan for evaluating the query.
2. An execution plan is a blueprint for evaluating a query, represented as a tree of relational operators
(operators serve as building blocks for evaluating queries posed against the data, and they bring the
needed data into main memory).


1. The plan executor uses the operator evaluator to evaluate the plan.
2. The code that implements the relational operators sits on top of the file and access methods layer.
3. The file and access methods layer uses this information to access the data requested by the
user, which is stored in files.
4. The buffer manager is responsible for bringing data from disk into main memory for execution.
5. The disk space manager is responsible for providing space on the disk when data is
modified.

Structure of DBMS

6. Transaction manager ensures that transactions request and release locks according to a locking
protocol and schedules the execution transactions
7. Lock manager keeps track of request of locks and grants on database objects when they are available.
8. DBMS supports Concurrency control and Crash Recovery by carefully scheduling user requests and
maintaining a log of all changes to the database.
9. Recovery manager maintains a log and restores the system to a consistent state when a crash occurs.
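
The parsing and optimization steps can be observed in a small sketch. It uses SQLite through
Python's standard sqlite3 module, which is an assumption made here for illustration and is not the
DBMS described in these notes; the employee table and the index idx_emp_dept are hypothetical.
SQLite's EXPLAIN QUERY PLAN statement asks the optimizer to report the plan it chose instead of
running the query.

```python
import sqlite3

# In-memory database with a hypothetical table and index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.execute("CREATE INDEX idx_emp_dept ON employee(dept_id)")

# Ask the optimizer for its plan instead of executing the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM employee WHERE dept_id = ?", (10,)
).fetchall()

for row in plan:
    print(row)  # the last column is a text description of one plan step
```

The exact wording of the plan depends on the SQLite version, so no output is shown here.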


People who access DBMS:

 Users (end users, naïve users, sophisticated users)
 Data Implementers
 Application Programmers
 DBA
 Users: A primary goal of a database system is to store data and to retrieve particular data from
the database whenever needed.
1. End users: interact with the system without writing programs; they simply submit requests for the
data they want from the system.
2. Naïve users: interact with the system by invoking one of the application programs that have been
written previously. (Example: at an ATM, the user simply enters a PIN and then checks the account
status or withdraws money by invoking the ATM application program, without needing to know how the
application program works.)
3. Sophisticated users / data analysts: use SQL to generate answers to complex queries; they also use
data mining tools and online analytical processing (OLAP) tools.
4. Data Implementers: These are the people who actually develop DBMS software.
5. Application Programmers: These people develop packages that facilitate data access for end users
(applications for universities, railways, banks, etc.).
6. Database Administrator (DBA): takes responsibility for designing and maintaining the databases.

Responsibilities of DBA:
 DBA designs the conceptual schema (what relations to store) and physical schema (how to store
them)
 DBA ensures that unauthorized data access is not permitted.
 DBA ensures security by granting different users permission to access only certain views
and relations.
 DBA ensures crash recovery and takes the necessary steps to restore the data to a consistent state.
 DBA is responsible for database tuning, i.e., for modifying the database, in particular the
conceptual and physical schemas, as users' requirements change.


MODULE 1
Chapter-2 Data Model

Data Model
 A data model is the logical structure of a database. It describes the design of the database in
terms of entities, attributes, relationships among data, constraints, etc., which determine how
data can be stored and accessed.
 Data models define how the logical structure of a database is modeled. Data models are
the fundamental means of introducing abstraction in a DBMS.
 Data models define how data items are connected to each other and how they are processed and
stored inside the system.
 Data models facilitate interaction among the designer, the application programmer, and
the end user.
 Depending on the level of data they describe, data models are divided into 3 categories:
1. Object based,
2. Physical, and
3. Record based data models.

1. Object based Data Models


 Object based data models are based on real-world objects. They are designed using the entities in
the real world, the attributes of each entity, and the relationships among them. Every thing/object
in the real world that is involved in the requirement is represented.
 There are two types of object based data Models
a. Entity Relationship Model and
b. Object oriented data model.


Entity Relationship data model

 It is one of the most important data models and forms the basis for most designs in the
database world. It defines the mapping between the entities in the database.
 The ER model represents all the entities, their attributes and their relationships in the form of
a diagram, which helps the developer understand the system better.

Advantages

1. It makes the requirements simple and easily understandable by representing them as simple
diagrams.
2. One can convert ER diagrams into a record based data model easily.
3. ER diagrams are easy to understand.

Disadvantages

1. No single standard notation is available for ER diagrams. There is great flexibility in the
notation; it all depends on how the designer chooses to draw the diagram.
2. It is meant for high-level design. It cannot be used directly for low-level design such as coding.

Object oriented data model


 Along with the mapping between the entities, this model describes the state of each entity and the
tasks performed by it.
 It treats each thing in the real world as an object and isolates objects from one another. It groups
related functionality together and allows that functionality to be inherited by related
sub-groups.
 Example: a database has different types of employees: Engineer, Accountant, Manager and
Clerk. All of these employees belong to the Person group. Person can have different
attributes like name, address, age and phone. What do we do if we want to get a person's
address and phone number? We write two separate procedures, sp_getAddress and
sp_getPhone, once in Person, and every type of employee inherits them.
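
The Person example can be sketched with ordinary classes. This is only an illustration of the
inheritance idea, not an object-oriented database: the method names get_address and get_phone stand
in for sp_getAddress and sp_getPhone, and the specialty attribute and the sample values are
assumptions introduced here.

```python
class Person:
    """Common state and behaviour shared by every kind of employee."""

    def __init__(self, name, address, age, phone):
        self.name = name
        self.address = address
        self.age = age
        self.phone = phone

    def get_address(self):
        # Written once here, inherited by all sub-groups.
        return self.address

    def get_phone(self):
        return self.phone


class Engineer(Person):
    """Sub-group that adds its own attribute (hypothetical: specialty)."""

    def __init__(self, name, address, age, phone, specialty):
        super().__init__(name, address, age, phone)
        self.specialty = specialty


class Clerk(Person):
    """Sub-group that only reuses what Person provides."""


engineer = Engineer("Asha", "12 Lake Road", 30, "555-0101", "civil")
clerk = Clerk("Ravi", "4 Hill Street", 41, "555-0102")
print(engineer.get_address(), clerk.get_phone())
```

Neither Engineer nor Clerk defines get_address or get_phone; both reuse the versions in Person.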

Advantages
 Because of inheritance, attributes and functionality can be re-used. This
reduces the cost of maintaining the same data multiple times. Also, this information is
encapsulated, so there is no fear of it being misused by other objects.
 If we need a new feature, we can easily add a new class that inherits from the parent class and
adds the new feature. This reduces overhead and maintenance costs.
 Because of the above feature, the model is more flexible when changes are needed.
 Code is re-used because of inheritance.
 Since each class binds its attributes and its functionality together, it closely resembles the
real-world object it represents. We can see each object as a real entity. Hence the model is easier
to understand.

Disadvantages
 It is not yet developed widely or completely enough for general use in database systems. Hence it
is not widely accepted by users.
 It is an approach to solving the requirement, not a technology. Hence it is difficult to put it
into practice in database management systems.


2. Physical Data Models


 A physical data model describes how data are stored in computer
memory, how they are distributed and ordered in memory, and how they are
retrieved from memory.
 A physical data model represents the data at the data layer or internal layer. It represents each
table, its columns and specifications, and constraints such as primary keys, foreign keys, etc.
 It is represented as a UML diagram showing each table and its columns. The primary key is
shown at the top. The relationships between the tables are represented by
arrows from table to table.
 Tables and their specifications: table names and their columns. Columns are shown
along with their data types and sizes.
 In addition, the primary key of each table is shown at the top of the column list.
 Foreign keys are used to represent the relationships between the tables. The mapping between
the tables is represented using arrows between them.

3. Record based Data Models


 These data models are based on the application and user levels of data. They are modeled
considering the logical structure of the objects in the database. These data models define the
actual relationships between the data in the entities.
 There are 3 types of record based data models
a. Hierarchical data model
b. Network data model and
c. Relational data models.
 Most widely used record based data model is relational data model.


a. Hierarchical Data Models


 In this data model, the entities are represented in a hierarchical fashion (a tree-like structure).
Here we identify a parent entity and its child entities. We then drill down to identify the next
level of child entities, and so on.
 For example, in a company database, Company is the parent and the rest of the entities are its
children. Department has Employee and Project as its children, and so on. This type of data
modeling is called the hierarchical data model.

 This model is best suited to 1:N types of relationships.

 Example: One company has multiple departments (1:N), one company has multiple suppliers
(1:N), one department has multiple employees (1:N), and each department has multiple
projects (1:N).
 If we have M:N relationships, then we have to duplicate the entities and show them in the
diagram.

Advantages

 This model groups the related data into tables and defines the relationship between the
tables, which is not addressed in flat files.


Disadvantages

 Redundancy: if a project is shared by more than one department, we have to store the same
project information under each of those departments. This is duplication of data and hence
redundancy. So, this model does not reduce the redundancy issue to a significant level.

 It fails to handle many-to-many relationships efficiently, which results in redundancy and
confusion. It can handle only parent-child relationships.

 To fetch any data in this model, we have to start from the root of the model and
traverse through its children until we get the result. To perform the traversal, we must either
know the layout of the model well in advance or be very good
programmers. Hence fetching data through this model is somewhat difficult.
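
The two disadvantages above, duplication and root-first traversal, can be seen in a small sketch.
The tree below is a hypothetical company hierarchy stored as nested dictionaries; the entity names
and the find_paths helper are assumptions made for illustration only.

```python
# Hypothetical hierarchy: each node has exactly one parent.
company = {
    "name": "Company",
    "children": [
        {"name": "Department A", "children": [
            {"name": "Employee 1", "children": []},
            {"name": "Project X", "children": []},
        ]},
        {"name": "Department B", "children": [
            # Project X is shared, so it must be stored a second time.
            {"name": "Project X", "children": []},
        ]},
    ],
}

def find_paths(node, target, path=()):
    """Search from the root downwards and return every path to target."""
    path = path + (node["name"],)
    found = [path] if node["name"] == target else []
    for child in node["children"]:
        found.extend(find_paths(child, target, path))
    return found

paths = find_paths(company, "Project X")
for p in paths:
    print(" -> ".join(p))
```

The search has to begin at Company and walk down through every department, and the shared project
is found twice because it had to be duplicated under each parent.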

b. Network Data Models


 It is designed to address the drawbacks of the hierarchical model. It helps to represent M:N
relationships.
 This data model is also drawn like a hierarchy, but it does not have the single-parent
restriction. Any child in the tree can have multiple parents here.

Advantages
 Accessing the records in the tables is easy since the model supports many-to-many relationships.
 One can easily navigate among the tables and get any data.
 It is designed based on database standards (ANSI/SPARC).


Disadvantages
 If the entities need to change, the entire database has to be changed.
 There is no independence between objects. Hence a change to any one object
requires changes to the whole model, which makes the model difficult to manage.
 It can be somewhat difficult to design the relationships between the entities, since all the
entities are related in some way.

c. Relational Data Models


 This model is designed to overcome the drawbacks of the hierarchical and network models.
 This model defines how records are organized into tables and how the tables are inter-related.
It is based purely on how the records in each table are related.
 It completely isolates the physical structure from the logical structure. The logical structure
defines how records are grouped and distributed.

 A relational data model revolves around 5 important rules.

1. The order of rows/records in the table is not important.
Example: displaying the records for Joseph is independent of displaying the records for
Rose or Mathew in the Employee table. The order does not change their meaning.
Each record in the table is independent of the others.
Similarly, the order of columns in the table is not important.
That means the value in each column of a record is independent of the others. For example,
placing DEPT_ID at the end or at the beginning of the Employee table does not have
any effect.
2. Each record in the table is unique; that is, no duplicate records exist in the table.
This is achieved by the use of a primary key or unique constraint.


3. Each column/attribute has a single value in a row. For example, in the Department table, the
DEPT_NAME column cannot have 'Accounting' and 'Quality' together in a single cell.
They have to be stored in two different rows.
4. All values of an attribute should come from the same domain. That means each column should hold
meaningful values. For example, an Age column cannot have dates in it. It should contain
only valid numbers that represent an individual's age. Similarly, name columns should have
valid names, and date columns should have proper dates.
5. Table names in the database should be unique. Within the database, the same schema cannot
contain two or more tables with the same name. Two tables with different names can have
the same column names, but the same column name cannot appear twice in the same table.
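
Several of these rules can be checked in a short sketch. It uses SQLite through Python's sqlite3
module as a stand-in relational DBMS, which is an assumption made here; the department table and
the attempt helper are hypothetical. Rule 4 is not shown, because SQLite by default does not
strictly enforce column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL UNIQUE)"
)
# Rule 3: 'Accounting' and 'Quality' go into two separate rows.
conn.execute("INSERT INTO department VALUES (10, 'Accounting')")
conn.execute("INSERT INTO department VALUES (20, 'Quality')")

errors = []

def attempt(sql):
    """Run a statement and record the kind of error it raises, if any."""
    try:
        conn.execute(sql)
    except sqlite3.Error as exc:
        errors.append(type(exc).__name__)

attempt("INSERT INTO department VALUES (10, 'Accounting')")  # rule 2: duplicate record
attempt("CREATE TABLE department (x INTEGER)")               # rule 5: duplicate table name
attempt("CREATE TABLE bad (a INTEGER, a INTEGER)")           # rule 5: duplicate column name

# Rule 1: compare rows as a set, because row order carries no meaning.
rows = set(conn.execute("SELECT dept_id, dept_name FROM department"))
print(errors, rows)
```

All three attempts are rejected, and the table still contains exactly the two original rows.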

Advantages
 Structural independence: changes to the database structure do not affect the
way we access the data.
 Simplicity: this model is designed based on the logical data. It does not consider how
data are stored physically in memory.
 Because of its simplicity and data independence, this kind of data model is easy to maintain
and access.
 This model supports structured query language – SQL.

Disadvantages
Compared to the advantages above, the disadvantages of this model are minor.
 High hardware cost.
 Sometimes the design is carried down to a very fine level of detail, which leads to complexity in
the database.
Importance of data model :
1. Higher quality. A data model helps define the problem, enabling us to consider different
approaches and choose the best one.

2. Reduced cost. You can build applications at lower cost via data models. Data modeling
typically consumes less than 10 percent of a project budget, and can reduce the 70 percent of
the budget that is typically devoted to programming. The models promote clarity of thought and
provide the basis for generating much of the needed database and programming code.


3. Quicker time to market. You can also build software faster by catching errors early. In
addition, a data model can automate some tasks – design tools can take a model as an input and
generate the initial database structure, as well as some data access code.

4. Clearer scope. A data model provides a focus for determining scope. It provides something
tangible to help business sponsors and developers agree over precisely what is included with the
software and what is omitted.

5. Faster performance. A sound model simplifies database tuning. A well-constructed database
typically runs fast, often quicker than expected.

6. Better documentation. Models document important concepts and jargon, providing a basis for
long-term maintenance.

7. Fewer application errors. A data model causes participants to crisply define concepts and
resolve confusion. As a result, application development starts with a clear vision. Developers can
still make detailed errors as they write application code, but they are less likely to make deep
errors that are difficult to resolve.

8. Managed risk. You can use a data model to estimate the complexity of software, and gain
insight into the level of development effort and project risk. We should consider the size of a
model, as well as the intensity of inter-table connections.

9. A good start for data mining. The documentation inherent in a model serves as a starting
point for analytical data mining.

Database Management System

MODULE 1 : CHAPTER 3
ENTITY-RELATIONSHIP MODEL
(ER-MODEL)
Entity-Relationship (ER) model:
 The ER model is a popular high-level conceptual data model. This model and its variations are
frequently used for the conceptual design of database applications, and many database design
tools employ its concepts.
 ER models are used in the design of conceptual schemas for database applications. The
diagrammatic notation associated with the ER model is known as ER diagrams.
ENTITY TYPES, ENTITY SETS, ATTRIBUTES, AND KEYS

 The ER model describes data as entities, relationships, and attributes.


 ENTITIES
 The basic object that the ER model represents is an entity, which is a thing in the real
world with an independent existence.
 An entity may be an object with a physical existence, such as an employee or it may
be an object with a conceptual existence such as a company.
 Examples: STUDENT, CAR, COMPANY, UNIVERSITY, etc.

 ATTRIBUTES
 Each entity has attributes, which are the particular properties that describe it.
 Example: EMPLOYEE entity may be described by the employee’s name, age,
address, salary, and job.
 A particular entity will have a value for each of its attributes.
 The figure below shows two entities and the values of their attributes. The
EMPLOYEE entity e1 has four attributes: Name, Address, Age, and Home_phone.
Their values are 'John Smith', '2311 Kirby, Houston, Texas 77001', '55', and
'713-749-2630', respectively.
 The COMPANY entity c1 has three attributes: Name, Headquarters, and President
and their values are ‘Sunco Oil’, ‘Houston’, and ‘John Smith’, respectively.


TYPES OF ATTRIBUTES
1. Simple attributes.
2. Composite attributes.
3. Single valued attributes.
4. Multi valued attributes.
5. Stored attributes.
6. Derived attributes

 Simple Attribute:
 An attribute that cannot be divided into smaller independent attributes is known as a simple
(or atomic) attribute.
 Example: assume Student is an entity and its attributes are Name, Age, Address and
Phone no. Here the age (attribute) of a student (entity) cannot be divided further. In this
example age is an atomic attribute.

 Composite attribute:
 An attribute that can be divided into smaller independent attributes is known as a composite
attribute.
 Example: assume Student is an entity and its attributes are Name, Age, Address and
Phone no. Here the address (attribute) of a student (entity) can be further divided into House
no, city and so on. In this example address is not an atomic attribute.
 The notation for this:


 Single valued attribute:


 An attribute that has only a single value for an entity is known as a single valued
attribute.
 Example: any manufactured product can have only one serial no. Note that a single
valued attribute is not necessarily a simple attribute, because its one value may still be
subdivided. In this example the serial no. can be subdivided on the basis of region,
part no. etc.

 Multi valued attribute:

 An attribute that can have multiple values for an entity is known as multi valued
attribute.
 Example: assume Student is an entity and its attributes are Name, Age, Address and
Phone no. Here the Phone no (attribute) of a student (entity) can have multiple values
because a student may have many phone numbers. Here, Phone no is a multi valued
attribute.
 The notation for this:

 Stored attribute
 An attribute that cannot be derived from another attribute is known as a stored attribute.
 Example: the birth date of a student cannot be derived from the student's age, so it is stored.

 Derived attribute
 An attribute that can be derived from another attribute is known as a derived attribute.
 Example: the age of a student can be derived from the student's birth date.
 The notation for this:

Some other types of attributes are


 Null valued attribute:


 An attribute that has no value for an entity is known as a null valued attribute.
 Example: assume Student is an entity and its attributes are Name, Age, Address and
Phone no. A student may have no phone no. In that case, phone
no is a null valued attribute for that student.
 Key attribute:
 An attribute that has a unique value for each entity is known as a key attribute.
 Example: every student has a unique roll no. Here roll no is the key attribute.

 Complex Attributes:
 Composite and multivalued attributes can be nested arbitrarily. Such attributes are
called complex attributes.
 Composite attributes are represented by grouping the components of a composite
attribute between parentheses () and separating the components with commas, and
multivalued attributes are displayed between braces { }.
 Example: if a person can have more than one residence and each residence can
have a single address and multiple phones, an attribute Address_phone for a
person can be specified as shown in Figure 7.5 below. Both Phone and Address
are themselves composite attributes.
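
The attribute types described above can be sketched with Python dataclasses. This is an
illustration under assumptions, not ER notation: the class and field names are chosen here to match
the Student examples, and the sample values are made up.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Address:                       # composite attribute: made of smaller parts
    house_no: str
    city: str


@dataclass
class Student:
    roll_no: int                     # key attribute: unique for each student
    name: str                        # simple, single valued attribute
    birth_date: date                 # stored attribute
    address: Optional[Address] = None                 # None models a null value
    phones: List[str] = field(default_factory=list)   # multi valued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        before_birthday = (today.month, today.day) < (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - int(before_birthday)


student = Student(
    roll_no=1,
    name="Ravi",
    birth_date=date(2000, 6, 15),
    address=Address("12", "Mysuru"),
    phones=["555-0101", "555-0102"],
)
print(student.age(date(2024, 6, 14)), student.age(date(2024, 6, 15)))
```

Because age is computed on demand, it can never disagree with the stored birth date.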

Generalization v/s Specialization


 Generalization: the process of creating a group from various entities is called generalization.
 It is a bottom-up approach.
 It is the process of taking the union of two or more lower-level entity sets to produce a
higher-level entity set.
 It starts from a number of entity sets and creates a higher-level entity set using their common
features.
 Specialization: the process of creating sub-groups within an entity is called specialization.
 It is a top-down approach.
 It is the process of taking a subset of a higher-level entity set to form a lower-level entity set.
 It starts from a single entity set and creates different lower-level entity sets using their
distinguishing features.

Data Model Basic Building Blocks


 The basic building blocks of all data models are entities, attributes, relationships and
constraints.
 An entity is a real-world thing, such as a person, place, thing, or event, about which data
are to be collected and stored.
Example: CUSTOMER, STUDENT, EMPLOYEE, DEPARTMENT
 An attribute is a characteristic of an entity.
Example: a CUSTOMER entity would be described by attributes such as customer last
name, customer first name, customer phone, customer address, and customer credit limit.
 A relationship describes an association among (two or more) entities.
Example: A relationship between customers and agents may be described as "an
agent can serve many customers and each customer may be served by one agent." Here
serve is a relationship.


 Data models use three types of relationships: one-to-many, many-to-many, and one-to-one.

The following examples illustrate the distinctions among the three.


1. One-to-many (1:M) relationship: This is where one occurrence in an entity relates to
many occurrences in another entity.
 A painter paints many different paintings, but each one of them is painted by only
one painter. Thus the painter (the "one") is related to the paintings (the "many").
Therefore, database designers label the relationship "PAINTER paints
PAINTING" as 1:M.
 Similarly, a customer (the "one") might generate many invoices, but each invoice (the
"many") is generated by only a single customer. The "CUSTOMER generates
INVOICE" relationship would also be labeled 1:M.
2. Many-to-many (M:N or M:M) relationship: This is where many occurrences in an
entity relate to many occurrences in another entity.
 An employee might learn many job skills, and each job skill might be learned by
many employees. Database designers label the relationship "EMPLOYEE learns
SKILL" as M:N.
 A student can take many classes, and each class can be taken by many students,
thus yielding the M:N relationship label for the relationship expressed by
“STUDENT takes CLASS.”
3. One-to-one (1:1) relationship: This is where one occurrence of an entity relates to
only one occurrence in another entity.
 A retail company’s management structure may require that each one of its stores
be managed by a single employee. In turn, each store manager—who is an
employee—only manages a single store.
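
One common way to realize the three relationship types in relational tables is sketched below,
using SQLite through Python's sqlite3 module. The table and column names are assumptions made for
illustration: a 1:M relationship becomes a foreign key on the "many" side, an M:N relationship
becomes a separate junction table, and a 1:1 relationship becomes a foreign key that is also UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE painter (painter_id INTEGER PRIMARY KEY, name TEXT);
-- 1:M  PAINTER paints PAINTING: each painting points to exactly one painter.
CREATE TABLE painting (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT,
    painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
);

CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE skill (skill_id INTEGER PRIMARY KEY, name TEXT);
-- M:N  EMPLOYEE learns SKILL: one row per (employee, skill) pair.
CREATE TABLE learns (
    emp_id   INTEGER REFERENCES employee(emp_id),
    skill_id INTEGER REFERENCES skill(skill_id),
    PRIMARY KEY (emp_id, skill_id)
);

-- 1:1  EMPLOYEE manages STORE: UNIQUE stops one employee managing two stores.
CREATE TABLE store (
    store_id   INTEGER PRIMARY KEY,
    manager_id INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id)
);
""")

conn.execute("INSERT INTO employee VALUES (1, 'Meera')")
conn.execute("INSERT INTO store VALUES (100, 1)")
try:
    conn.execute("INSERT INTO store VALUES (101, 1)")  # same manager, second store
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
print(second_store_allowed)
```

The second insert is rejected, which is exactly the 1:1 rule that each store manager manages only a
single store.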

 Constraint: a constraint is a restriction placed on the data. Constraints are important because
they help to ensure data integrity. A constraint is normally expressed in the form of a rule.
Example: An employee's salary must have a value between 6000 and 350000.
Example: Each class must have one and only one teacher.
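
The salary rule can be expressed as a CHECK constraint, sketched here with SQLite through Python's
sqlite3 module. The employee table and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    salary INTEGER NOT NULL CHECK (salary BETWEEN 6000 AND 350000)
)
""")

conn.execute("INSERT INTO employee VALUES (1, 25000)")     # within the allowed range
try:
    conn.execute("INSERT INTO employee VALUES (2, 5000)")  # below the minimum
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```

The DBMS itself refuses the invalid row, so the rule holds no matter which application inserts the
data.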

Entity Types and Entity Sets

 An entity type defines a collection (or set) of entities that have the same attributes. Each
entity type in the database is described by its name and attributes.
 Example: a company employing hundreds of employees may want to store
similar information concerning each of the employees. These employee entities
share the same attributes, but each entity has its own value(s) for each attribute.
 The collection of all entities of a particular entity type in the database is called an entity
set. The entity set is usually referred to using the same name as the entity type.
 Example: EMPLOYEE refers to both a type of entity and the current set of
all employee entities in the database.


 The collection of entities of a particular entity type is grouped into an entity set, which is
also called the extension of the entity type.
 Figure 7.6 shows two entity types: EMPLOYEE and COMPANY, and a list of some of
the attributes, along with the values of their attributes.


Key Attributes of an Entity Type
 Keys play an important role in a database: a key is used to identify each row (entity) of an
entity set uniquely. Keys also help to establish relationships between entity sets.
 An entity type usually has one or more attributes whose values are distinct for each
individual entity in the entity set. Such an attribute is called a key attribute, and its
values can be used to identify each entity uniquely.
 Example: the Name attribute is a key of the COMPANY entity type in Figure 7.6 above
because no two companies are allowed to have the same name. For the PERSON entity
type, a typical key attribute is SSN (Social Security number); for the STUDENT entity type, s_id
is a key attribute; for the CAR entity type, registration_no is a key attribute.
 In ER diagrammatic notation, each key attribute has its name underlined inside the oval.
 In ER notation, if two attributes are underlined separately, then each is a key on its own.
 Some entity types have more than one key attribute. For example, each of the Vehicle_id
and Registration attributes of the entity type CAR is a key in its own right.
 The Registration attribute is an example of a composite key formed from two simple component
attributes, State and Number, neither of which is a key on its own.


Entity-Relationship(ER) Diagram Notations :


[Figure: ER diagram symbols and their meanings]

Relationship types, relationship sets and constraints on relationship


Relationship Types, Sets and Instances
 A relationship is an association among two or more entity sets.
 Example: in a company database, the attribute Manager of DEPARTMENT refers
to an employee who manages the department, and the attribute
Controlling_department of PROJECT refers to the department that controls the
project, and so on. Here manages and controls are examples of relationships.


 A Relationship set is a set of relationships of the same type.


 A relationship type R among n entity types E1, E2, ..., En defines a set of associations
among entities from these entity types.
 The relationship set R is a set of relationship instances ri, where each ri associates n
individual entities (e1, e2, ..., en), and each entity ej in ri is a member of entity set Ej.
 Each of the entity types E1, E2, ..., En is said to participate in the relationship type R;
similarly, each of the individual entities e1, e2, ..., en is said to participate in the
relationship instance ri = (e1, e2, ..., en).
 Example: consider a relationship type WORKS_FOR between the two entity types
EMPLOYEE and DEPARTMENT, which associates each employee with the
department for which the employee works. Each
relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE
entity and one DEPARTMENT entity.
 Figure 7.9 shows the relationship instances ri of WORKS_FOR. Employees e1, e3, and e6 work for
department d1; employees e2 and e4 work for department d2; and employees e5 and e7
work for department d3.
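
The WORKS_FOR relationship set described above can be written down directly as a set of tuples,
one tuple per relationship instance ri = (e, d). The sketch below only restates the instances listed
in the text; the helper name employees_of is an assumption.

```python
# Each tuple is one relationship instance (employee, department).
WORKS_FOR = {
    ("e1", "d1"), ("e3", "d1"), ("e6", "d1"),
    ("e2", "d2"), ("e4", "d2"),
    ("e5", "d3"), ("e7", "d3"),
}

def employees_of(department):
    """Employees that participate in WORKS_FOR with the given department."""
    return sorted(e for e, d in WORKS_FOR if d == department)

# The number of entities in each instance is the degree of the relationship type.
degree = len(next(iter(WORKS_FOR)))

print(employees_of("d1"), degree)
```

Every instance associates exactly one EMPLOYEE entity with one DEPARTMENT entity, so each tuple has
two components.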


Relationship Degree
 The degree of a relationship type is the number of participating entity types.
 There are 3 different degrees of relationship:
 Unary Relationship
 Binary Relationship
 Ternary relationship
 Unary Relationship: if the number of participating entity types is one, then the degree is
one. A relationship type of degree one is called a unary relationship.
 Each entity type that participates in a relationship type plays a particular role in the
relationship. The role name signifies the role that a participating entity from the entity
type plays in each relationship instance
 For example, in the WORKS_FOR relationship type, EMPLOYEE plays the role of
employee or worker and DEPARTMENT plays the role of department or employer.
 In some cases the same entity type participates more than once in a relationship type
in different roles. In such cases the role name becomes essential for distinguishing the
meaning of the role that each participating entity plays. Such relationship types are
called recursive relationships.
 Example: The SUPERVISION relationship type relates an employee to a supervisor,
where both employee and supervisor entities are members of the same EMPLOYEE
entity set. Hence, the EMPLOYEE entity type participates twice in SUPERVISION:
once in the role of supervisor (or boss), and once in the role of supervisee (or
subordinate). In Figure 7.11, the lines marked ‘1’ represent the supervisor role, and
those marked ‘2’ represent the supervisee role.


 Binary Relationship: if the number of participating entity types is two, then the degree is
two. A relationship type of degree two is called a binary relationship.
 Example: the WORKS_FOR relationship type is of degree two since two entity types,
EMPLOYEE and DEPARTMENT, participate.

 Ternary relationship: if the number of participating entity types is three, then the degree is
three. A relationship type of degree three is called a ternary relationship.
 An example of a ternary relationship is SUPPLY, shown in Figure 7.10 below, where
each relationship instance ri associates three entities: a supplier s, a part p, and a
project j, whenever s supplies part p to project j.


Constraints on Binary Relationship Types


 Relationship types usually have certain constraints that limit the possible combinations of
entities that may participate in the corresponding relationship set.
 For example, if the company has a rule that each employee must work for exactly one
department, then we would like to describe this constraint in the schema.
 There are two main types of binary relationship constraints:
1. cardinality ratio and
2. participation Constraints
 The cardinality ratio for a binary relationship specifies the maximum number of relationship
instances that an entity can participate in.
 Example: in the WORKS_FOR binary relationship type, DEPARTMENT : EMPLOYEE is
of cardinality ratio 1:N, meaning that each department can be related to (that is, employs) any
number of employees, but an employee can be related to (work for) only one department.
 The possible cardinality ratios for binary relationship types are 1:1, 1:N, N:1, and M:N.
 Example: a 1:1 binary relationship is MANAGES, shown in Figure 7.12 below, which relates
a department entity to the employee who manages that department. This represents the
constraints that an employee can manage one department only and a department can have one
manager only.


 The relationship type WORKS_ON shown in Figure 7.13 below is of cardinality ratio
M:N, because the rule is that an employee can work on several projects and a project can
have several employees.

 The relationship type WORKS_FOR, read as EMPLOYEE : DEPARTMENT, is of cardinality ratio N:1,
because many employees can work for the same department, but each employee works for
only one department.


Participation Constraints and Existence Dependencies


 The participation constraint specifies whether the existence of an entity depends on its
being related to another entity via the relationship type. This constraint specifies the
minimum number of relationship instances that each entity can participate in, and is
sometimes called the minimum cardinality constraint.
 There are two types of participation constraints
1. Total participation constraints and
2. Partial participation constraints
 Example : If a company policy states that every employee must work for a department,
then an employee entity can exist only if it participates in at least one WORKS_FOR
relationship instance (Figure 7.9). Thus, the participation of EMPLOYEE in
WORKS_FOR is called total participation, meaning that every entity in the total set of
employee entities must be related to a department entity via WORKS_FOR. Total
participation is also called existence dependency.
 In Figure 7.12 we do not expect every employee to manage a department, so the
participation of EMPLOYEE in the MANAGES relationship type is partial, meaning
that some or part of the set of employee entities are related to some department entity via
MANAGES, but not necessarily all.
 In ER diagrams, total participation (or existence dependency) is displayed as a double
line connecting the participating entity type to the relationship, whereas partial
participation is represented by a single line.
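
When an ER schema is later turned into tables, one common way to approximate these constraints is with NULL and NOT NULL columns. The sketch below uses Python's built-in `sqlite3` module; the table and column names (`dno`, `manages_dno`, and so on) and the sample rows are assumptions for illustration, not a mapping prescribed by these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dnumber INTEGER PRIMARY KEY,
    dname   TEXT NOT NULL
);
CREATE TABLE employee (
    ssn         TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    -- WORKS_FOR, total participation: every employee must have a department
    dno         INTEGER NOT NULL REFERENCES department(dnumber),
    -- MANAGES, partial participation: NULL means "manages no department"
    manages_dno INTEGER UNIQUE REFERENCES department(dnumber)
);
""")
conn.execute("INSERT INTO department VALUES (5, 'Research')")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

insert = "INSERT INTO employee (ssn, name, dno, manages_dno) VALUES (?, ?, ?, ?)"
# Partial participation in MANAGES: an employee who manages nothing is accepted.
non_manager_ok = try_insert(insert, ("111", "John", 5, None))
# Total participation in WORKS_FOR: an employee without a department is rejected.
no_department_ok = try_insert(insert, ("222", "Alicia", None, None))
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Total participation on the "one" side of a relationship (for example, every department must have a manager) cannot be expressed this simply and usually needs additional checks.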

Weak Entity Types


 Entity types that do not have key attributes of their own are called weak entity types.
 In contrast, regular entity types that do have a key attribute are called strong entity
types.
 Entities belonging to a weak entity type are identified by being related to specific entities
from another entity type in combination with one of their attribute values.
 This other entity type is the identifying or owner entity type, and the relationship type
that relates a weak entity type to its owner is the identifying relationship of the weak
entity type.

 A weak entity type always has a total participation constraint (existence dependency)
with respect to its identifying relationship because a weak entity cannot be identified
without an owner entity.
 Example: the entity type DEPENDENT, related to EMPLOYEE, is used to
keep track of the dependents of each employee via a 1:N relationship (Figure 7.2).
DEPENDENT is the weak entity type, EMPLOYEE is the owner entity type, and
DEPENDENTS_OF is the identifying relationship.
 A weak entity type normally has a partial key, which is the attribute that can uniquely
identify weak entities that are related to the same owner entity.
 Example: if we assume that no two dependents of the same employee ever have
the same first name, the attribute Name of DEPENDENT is the partial key.
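
A weak entity type is commonly stored with a composite primary key made of the owner's key plus the partial key. The sketch below, using Python's `sqlite3` module, is one possible illustration; the column names (`essn`, `relationship`) and the sample rows are assumptions, and the `ON DELETE CASCADE` clause is a design choice that reflects the existence dependency, not something stated in the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to take effect
conn.executescript("""
CREATE TABLE employee (
    ssn  TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE dependent (
    -- owner's key: a dependent cannot exist without its employee
    essn         TEXT NOT NULL REFERENCES employee(ssn) ON DELETE CASCADE,
    name         TEXT NOT NULL,   -- partial key
    relationship TEXT,
    PRIMARY KEY (essn, name)      -- owner key + partial key identify a dependent
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("111", "Franklin"), ("222", "John")])
# The same dependent name is fine under two different owners.
conn.executemany("INSERT INTO dependent VALUES (?, ?, ?)",
                 [("111", "Alice", "Daughter"), ("222", "Alice", "Daughter")])

# The same name twice under one owner violates the partial key.
try:
    conn.execute("INSERT INTO dependent VALUES ('111', 'Alice', 'Spouse')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE ssn = '111'")
remaining = conn.execute("SELECT essn, name FROM dependent").fetchall()
```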

Attributes of Relationship Types


 Relationship types can also have attributes, similar to those of entity types. These
attributes are called descriptive attributes.
 Descriptive attributes are used to record information about the relationship, rather than
about any one of the participating entities.
 Example: Joseph works in the pharmacy department as of January 1991. This
information is captured by adding an attribute, since, to Works_In.
 A relationship must be uniquely identified by the participating entities, without reference
to the descriptive attributes.
 In the Works_In relationship set, for example, each Works_In relationship must be
uniquely identified by the combination of employee ssn and department did. Thus, for a
given employee-department pair, we cannot have more than one associated since value.

 As another example, to record the number of hours per week that an employee works on a
particular project, we can include an attribute Hours for the WORKS_ON relationship
type. Similarly, we can record the date on which a manager started managing a
department via an attribute Start_date for the MANAGES relationship type.
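
The rule that a relationship is identified by its participating entities alone, and not by its descriptive attributes, can be pictured with a dictionary keyed by the entity pair. This is only a sketch: the function `record_works_in` and the sample ssn, did, and since values are hypothetical.

```python
# Each Works_In relationship is keyed by (ssn, did); "since" is only a stored value.
works_in = {}

def record_works_in(ssn, did, since):
    """Add a Works_In relationship; refuse a second one for the same pair."""
    key = (ssn, did)
    if key in works_in:
        raise ValueError(f"{key} already has since={works_in[key]['since']}")
    works_in[key] = {"since": since}

record_works_in("123-22-3666", 51, "1991-01")
record_works_in("123-22-3666", 56, "1993-03")   # same employee, other department

try:
    record_works_in("123-22-3666", 51, "1995-06")  # same pair, second since value
    second_since_rejected = False
except ValueError:
    second_since_rejected = True
```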

Keys
 An attribute, or combination of attributes, that uniquely identifies each record in a
table is called a key.
 For example, in a STUDENT table, STUDENT_ID is a key, since it is unique for each
student.
 In a PERSON table, passport number, driving license number, phone number, SSN, and email
address are all keys, provided each is unique for each person.

Why do we need a key?

 In real-world applications, the number of tables required to store the data is large, and
the tables are related to each other as well.

 Tables also store a lot of data. A table often holds thousands of records, unsorted and
unorganized.

 To fetch a particular record from such a dataset, you have to apply some condition.
If duplicate data is present, the condition may match more than one record, and you cannot
tell which one is the record you want.

 To avoid this, keys are defined so that any row of data in a table can be identified
unambiguously.

PRIMARY KEY

 The primary key is the key chosen to uniquely identify each record in a table.
 It can be a single attribute or a combination of attributes. For an entity, there can be
multiple keys, as we saw in the PERSON table.
 The most suitable of these keys is chosen as the primary key.
 In the PERSON table above, we can select SSN as the primary key, since it is unique for
each person.
 We could instead select passport number or license number as the primary key, as they are
also unique for each person.
 The choice of primary key for each entity depends on the requirements and on the
designer.
 For a student, STUDENT_ID is a primary key, and for an employee, EMPLOYEE_ID is a
primary key.

CANDIDATE KEY
 A minimal set of attributes that can uniquely identify a tuple is known as a candidate key.
 For example, an employee is identified by his ID in his office. Apart from his ID, does he
have any other unique attributes by which he can be distinguished from others?
 Yes: passport number, PAN number, SSN (if applicable), driving license
number, email address, etc. Each of these also identifies a specific person uniquely.
 All of these are candidate keys, and we can choose any one of them as the primary key of
the table. The remaining candidate keys, which identify a record just as well as the primary
key, are called alternate (secondary) keys.
 In our example employee table, EMPLOYEE_ID is best suited to be the primary key, as it is
issued by his own employer. The remaining attributes, such as passport number, SSN, and
license number, remain candidate keys that were not chosen as the primary key.


SUPER KEY
 A super key is a set of attributes within a table that can uniquely identify each record
in the table. Every candidate key is a super key, and any superset of a candidate key is
also a super key.

 In the table, the super keys would include student_id, (student_id, name), phone, etc. The
student_id is unique for every row of data, hence it can be used to identify each row uniquely.
 Next, consider (student_id, name). The names of two students can be the same, but their
student_id cannot, hence this combination can also be a key.

 Similarly, assuming the phone number of every student is unique, phone can also be a
key. So they are all super keys.
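
The difference between super keys and candidate keys can be made concrete with a small check over sample rows. The following is a sketch only: the functions `is_superkey` and `candidate_keys` and the three student rows are invented for illustration. Uniqueness in one sample does not prove that an attribute set is a key, since a key is a rule about every legal state of the table; the code only shows how the definitions apply.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Return the minimal superkeys, trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is a super key but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical student rows; two students share a name.
students = [
    {"student_id": 1, "name": "Akon", "phone": "9876723452"},
    {"student_id": 2, "name": "Akon", "phone": "9991165674"},
    {"student_id": 3, "name": "Bkon", "phone": "7898756543"},
]

print(is_superkey(students, ("student_id", "name")))   # True
print(is_superkey(students, ("name",)))                 # False
print(candidate_keys(students, ("student_id", "name", "phone")))
# [('student_id',), ('phone',)]
```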

FOREIGN KEY
 In a company there are different departments: Accounting, Human Resources (HR),
Development, Quality, etc.
 An employee who works for the company works in a specific department. But we know
that employee and department are two different entities.
 So we do not store the department's details in the employee table. Instead, we
link the two tables by means of the primary key of one of the tables.
 In this case, we pick the primary key of the department table, DEPARTMENT_ID, and add it
as a new attribute/column in the Employee table.
 Now DEPARTMENT_ID is a foreign key in the Employee table, and the two tables are
related.
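
The employee-department link described above can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the column names other than DEPARTMENT_ID and EMPLOYEE_ID, the sample rows, and the department number 99 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- foreign key: must match an existing department.department_id
    department_id INTEGER REFERENCES department(department_id)
);
""")
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(10, "Accounting"), (20, "HR")])
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")

# Department 99 does not exist, so this row is rejected.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

# The foreign key lets us follow the link from employee to department.
rows = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e JOIN department AS d USING (department_id)
""").fetchall()
```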


 FEATURES OF ER-MODEL

1. KEY CONSTRAINTS
 Consider a relationship set called Manages between the Employees and Departments
entity sets such that each department has at most one manager, although a single employee
is allowed to manage more than one department.

 The restriction that each department has at most one manager is an example of a key
constraint, and it implies that each Departments entity appears in at most one Manages
relationship in any allowable instance of Manages.
 This restriction is indicated in the ER diagram of Figure below by using an arrow from
Departments to Manages. Intuitively, the arrow states that given a Departments entity, we
can uniquely determine the Manages relationship in which it appears.

Fig: Key constraints on Manages
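
One way to picture the key constraint on Manages is a table in which the department identifier alone is the key, so a department can appear in at most one row. The sketch below uses Python's `sqlite3` module; the column names (`ssn`, `did`, `since`) and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manages (
    ssn   TEXT NOT NULL,        -- the managing employee
    did   INTEGER PRIMARY KEY,  -- key constraint: one Manages row per department
    since TEXT
);
""")

def accepted(ssn, did, since):
    """Return True if the Manages row is accepted, False if it violates the key."""
    try:
        conn.execute("INSERT INTO manages VALUES (?, ?, ?)", (ssn, did, since))
        return True
    except sqlite3.IntegrityError:
        return False

first_manager = accepted("e1", 51, "2001-05")
same_employee_other_dept = accepted("e1", 56, "2003-09")  # allowed: one employee, two departments
second_manager_same_dept = accepted("e2", 51, "2004-01")  # rejected: department 51 already managed
```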

2. KEY CONSTRAINTS FOR TERNARY RELATIONSHIPS:


 If an entity set E has a key constraint in a relationship set R, each entity in an instance of E
appears in at most one relationship in (a corresponding instance of) R.
 To indicate a key constraint on entity set E in relationship set R, we draw an arrow from E
to R.
 In Figure, we show a ternary relationship with key constraints.
 Each employee works in at most one department and at a single location.

3. PARTICIPATION CONSTRAINTS:
 The key constraint on Manages tells us that a department has at most one manager. A natural
question to ask is whether every department has a Manager.
 Let us say that every department is required to have a manager. This requirement is an example
of a participation constraint.
 The participation of the entity set Departments in the relationship set Manages is said to be total.
A participation that is not total is said to be partial.
 As an example, the participation of the entity set Employees in Manages is partial, since not
every employee gets to manage a department.

4. WEAK ENTITIES:
 A weak entity can be identified uniquely only by considering some of its attributes in conjunction
with the primary key of another entity, which is called the identifying owner.
 The following restrictions must hold:
 The owner entity set and the weak entity set must participate in a one-to-many relationship
set (one owner entity is associated with one or more weak entities, but each weak entity has a
single owner). This relationship set is called the identifying relationship set of the weak
entity set.
 The weak entity set must have total participation in the identifying relationship set.

Fig: weak entity set


5. CLASS HIERARCHIES/ISA ('is a') Hierarchies:
 As in C++ and other programming languages, attributes are inherited.
 If we declare A ISA B, every A entity is also considered to be a B entity. (Query
answers should reflect this, unlike in C++.)
 The entity set Employees may also be classified using a different criterion.
 For example, we might identify a subset of employees as Senior_Emps.
 We can modify Figure to reflect this change by adding a second ISA node as a child of
Employees and making Senior_Emps a child of this node.
 Each of these entity sets might be classified further, creating a multilevel ISA
hierarchy.

 Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity?
(Allowed/disallowed)
 Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a
Contract_Emps entity? (Yes/no)
 Reasons for using ISA:

– To add descriptive attributes specific to a subclass.

– To identify entities that participate in a relationship.
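
Overlap and covering constraints can be read as simple set conditions on the subclasses. The following sketch treats each entity set as a Python set of employee names; the names and the helper functions `overlapping` and `uncovered` are hypothetical.

```python
# Hypothetical entity sets, each represented by the names of its members.
employees = {"joe", "ann", "raj", "lee"}
hourly_emps = {"joe", "ann"}
contract_emps = {"joe", "raj"}

def overlapping(*subclasses):
    """Entities that belong to more than one of the given subclasses."""
    seen, shared = set(), set()
    for members in subclasses:
        shared |= seen & members
        seen |= members
    return shared

def uncovered(superclass, *subclasses):
    """Superclass entities that belong to none of the given subclasses."""
    return superclass - set().union(*subclasses)

# ISA: every Hourly_Emps and Contract_Emps entity is also an Employees entity.
isa_holds = hourly_emps <= employees and contract_emps <= employees

print(overlapping(hourly_emps, contract_emps))           # {'joe'}
print(uncovered(employees, hourly_emps, contract_emps))  # {'lee'}
```

If overlap is disallowed, `overlapping(...)` must be empty; if the subclasses must cover Employees, `uncovered(...)` must be empty.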

6. AGGREGATION:

 Used when we have to model a relationship involving (entity sets and) a
relationship set.
 This is illustrated in the figure, with a dashed box around Sponsors (and its
participating entity sets) used to denote aggregation.
 This effectively allows us to treat Sponsors as an entity set for purposes of defining
the Monitors relationship set.
 Aggregation allows us to treat a relationship set as an entity set for purposes of
participation in (other) relationships.

 Monitors is mapped to a table like any other relationship set.

Aggregation vs. ternary relationship:

 Monitors is a distinct relationship, with a descriptive attribute.
 We can also specify that each sponsorship is monitored by at most one employee.
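
A possible table layout for Sponsors and Monitors is sketched below with Python's `sqlite3` module. The column names (`did`, `pid`, `since`, `until`, `ssn`) and the sample rows are assumptions, and the tables for Employees, Departments, and Projects are left out to keep the example short. Monitors refers to a whole Sponsors row, which is what treating the relationship set as an entity set amounts to here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE sponsors (
    did   INTEGER NOT NULL,   -- sponsoring department
    pid   INTEGER NOT NULL,   -- sponsored project
    since TEXT,
    PRIMARY KEY (did, pid)
);
CREATE TABLE monitors (
    ssn   TEXT NOT NULL,      -- monitoring employee
    did   INTEGER NOT NULL,
    pid   INTEGER NOT NULL,
    until TEXT,               -- descriptive attribute of Monitors
    PRIMARY KEY (did, pid),   -- at most one monitor per sponsorship
    FOREIGN KEY (did, pid) REFERENCES sponsors(did, pid)
);
""")
conn.execute("INSERT INTO sponsors VALUES (51, 7, '2020-01')")

def accepted(row):
    """Return True if the Monitors row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO monitors VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

first_monitor = accepted(("e1", 51, 7, "2021-12"))
second_monitor = accepted(("e2", 51, 7, "2022-06"))       # same sponsorship: rejected
unsponsored_project = accepted(("e1", 51, 8, "2021-12"))  # no such sponsorship: rejected
```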

CONCEPTUAL DESIGN OF THE COMPANY DATABASE


 Figure 7.8 shows the initial conceptual schema for the company database, with four entity
types: DEPARTMENT, PROJECT, EMPLOYEE, and DEPENDENT.
 An entity type DEPARTMENT with attributes Name, Number, Locations, Manager, and
Manager_start_date. Locations is the only multivalued attribute. We can specify that both
Name and Number are (separate) key attributes because each was specified to be unique.
 An entity type PROJECT with attributes Name, Number, Location, and
Controlling_department. Both Name and Number are (separate) key attributes.
 An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address, Salary, Birth_date,
Department, and Supervisor. Both Name and Address may be composite attributes.
 An entity type DEPENDENT with attributes Employee, Dependent_name, Sex,
Birth_date, and Relationship (to the employee).

Specify the following relationship types:


 MANAGES, a 1:1 relationship type between EMPLOYEE and DEPARTMENT.
EMPLOYEE participation is partial. DEPARTMENT participation is not clear from the
requirements. We question the users, who say that a department must have a manager at
all times, which implies total participation. The attribute Start_date is assigned to this
relationship type.
 WORKS_FOR, a 1:N relationship type between DEPARTMENT and EMPLOYEE. Both
participations are total.
 CONTROLS, a 1:N relationship type between DEPARTMENT and PROJECT. The
participation of PROJECT is total, whereas that of DEPARTMENT is determined to be
partial, after consultation with the users indicates that some departments may control no
projects.
 SUPERVISION, a 1:N relationship type between EMPLOYEE (in the supervisor role)
and EMPLOYEE (in the supervisee role). Both participations are determined to be
partial, after the users indicate that not every employee is a supervisor and not every
employee has a supervisor.
 WORKS_ON, determined to be an M:N relationship type with attribute Hours, after the
users indicate that a project can have several employees working on it. Both
participations are determined to be total.
 DEPENDENTS_OF, a 1:N relationship type between EMPLOYEE and DEPENDENT,
which is also the identifying relationship for the weak entity type DEPENDENT. The
participation of EMPLOYEE is partial, whereas that of DEPENDENT is total.

BUS RESERVATION ER DIAGRAM

LIBRARY ER DIAGRAM
