Unit I

It is about Database Management System.

Uploaded by

hajareomkar34
Copyright
© © All Rights Reserved
Database System

Unit 1–Introduction
Data:
• Data is facts and statistics, stored or flowing freely over a network. It is generally raw and unprocessed.
• Data becomes information when it is processed into something meaningful.
Example: Customer: 1. cname, 2. cno, 3. city.

Record: Collection of related data items


Roll Name Age
1 ABC 19

Table or Relation: Collection of related records.


Roll Name Age
1 ABC 19
2 DEF 22
3 GHI 28
The columns of this relation are called Fields or Attributes. The rows are called Tuples or Records.

Database:
A database is a collection of inter-related data, organized so that the data can be retrieved, inserted and deleted efficiently.
• It organizes the data in the form of tables, schemas, views, reports, etc.
• Using the database, you can easily retrieve, insert, and delete information.
• For example: a college database organizes the data about the admin, staff, students, faculty, etc.

Database System:
It is a computerized system whose overall purpose is to maintain information and to make that information available on demand.
1. Database Management System (DBMS) and its applications:
A database management system is a computerized record-keeping system. It is a repository or container for a collection of computerized data files. The overall purpose of a DBMS is to allow the users to define, store, retrieve and update the information contained in the database on demand. Information can be anything that is of significance to an individual or organization.
Databases touch all aspects of our lives. Some of the major areas of application are as follows:
1. Banking 2. Airlines 3. Universities 4. Manufacturing and selling 5. Human resources
• Enterprise Information
o Sales: For customer, product, and purchase information.
o Accounting: For payments, receipts, account balances, assets and other accounting information.
o Human resources: For information about employees, salaries, payroll taxes, and benefits, and for generation of paychecks.
o Manufacturing: For management of the supply chain and for tracking production of items in factories, inventories of items in warehouses and stores, and orders for items.
o Online retailers: For sales data noted above plus online order tracking, generation of recommendation lists, and maintenance of online product evaluations.
• Banking and Finance
o Banking: For customer information, accounts, loans, and banking transactions.
o Credit card transactions: For purchases on credit cards and generation of monthly statements.
o Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds; also for storing real-time market data to enable online trading by customers and automated trading by the firm.
• Universities: For student information, course registrations, and grades (in addition to standard enterprise information such as human resources and accounting).
• Airlines: For reservations and schedule information. Airlines were among the first to use databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling cards, and storing information about the communication networks.
2. Purpose of Database Systems
• One way to keep information on a computer is to store it in operating system files. To allow users to manipulate the information, the system has a number of application programs that manipulate the files, including programs to:
1. Add new students, instructors, and courses
2. Register students for courses and generate class rosters
3. Assign grades to students, compute grade point averages (GPA), and generate transcripts
In the early days, database applications were built directly on top of file systems, which had many disadvantages.
Disadvantages of file oriented approach:
1) Data redundancy and inconsistency: The same information may be written in several files. This redundancy leads to higher storage and access cost. It may also lead to data inconsistency, that is, the various copies of the same data may no longer agree. For example, a changed customer address may be reflected in a single file but not elsewhere in the system.
2) Difficulty in accessing data: Conventional file processing systems do not allow data to be retrieved in a convenient and efficient manner according to the user's choice.
3) Data isolation: Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult.
4) Integrity problems: The data values stored in the database must satisfy certain types of consistency constraints (rules). Developers enforce data validation in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them.
5) Atomicity: It is difficult to ensure atomicity in a file processing system when a transaction fails due to power failure, networking problems, etc. (Atomicity: either all operations of the transaction are reflected properly in the database or none are.)
Consider a program to transfer $500 from the account balance of department A to the account balance of department B. If a system failure occurs during the execution of the program, it is possible that the $500 was removed from the balance of department A but was not credited to the balance of department B, resulting in an inconsistent database state. Clearly, it is essential to database consistency that either both the credit and debit occur, or that neither occur. That is, the funds transfer must be atomic: it must happen in its entirety or not at all.
6) Concurrent access: A file processing system provides no control over several users or programs updating the same file at the same time, so concurrent updates may leave the data inconsistent.
7) Security problems: A file processing system makes it hard to restrict each user to only the data they are authorized to see, so the data are poorly protected from unauthorized access.
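
As a minimal sketch of the transfer example in item 5, the following uses Python's built-in sqlite3 module. The account table, the starting balances and the fail_midway flag are assumptions made for illustration; they are not part of the original example. The sketch only shows that a DBMS transaction undoes the debit when the credit never happens.

```python
import sqlite3


def make_accounts():
    """Create an in-memory database with two hypothetical department accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (dept TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 200)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one transaction: both updates or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE dept = ?",
                (amount, src),
            )
            if fail_midway:
                # Stand-in for a power or network failure after the debit.
                raise RuntimeError("simulated failure between debit and credit")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE dept = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a {dept: balance} dictionary."""
    return dict(conn.execute("SELECT dept, balance FROM account"))
```

Calling transfer(conn, "A", "B", 500, fail_midway=True) leaves both balances unchanged, while the same call without the failure moves the 500 from A to B.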

Functions of DBMS:
1. Defining the database schema: The DBMS must provide a facility for defining the database structure and for specifying the access rights of authorized users.
2. Manipulation of the database: The DBMS must provide functions for insertion of records into the database, updating of data, deletion of data, and retrieval of data.
3. Sharing of the database: The DBMS must share data items among multiple users while maintaining consistency of the data.
4. Protection of the database: It must protect the database against unauthorized users.
5. Database recovery: If the system fails for any reason, the DBMS must facilitate database recovery.

Advantages of DBMS:
1. Reduction of redundancies:
Data redundancy refers to the duplication of data (i.e. storing the same data multiple times). Centralized control of data by the DBA avoids unnecessary duplication of data and effectively reduces the total amount of data storage required. Avoiding duplication also eliminates the inconsistencies that tend to be present in redundant data files.
2. Sharing of data:
A database allows the sharing of data under its control by any number of application programs or users.
3. Data Integrity:
Data integrity means that the data contained in the database is both accurate and consistent. Therefore, data values being entered for storage can be checked to ensure that they fall within a specified range and are of the correct format. For example, in a customer database we can enforce an integrity constraint that customers are accepted only from the cities Noida and Meerut.
4. Data Security:
The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that proper access procedures are followed, including proper authentication schemes for access to the DBMS and additional checks before permitting access to sensitive data.
5. Data Consistency:
By eliminating data redundancy, we greatly reduce the opportunities for inconsistency. For example, if a customer address is stored only once, we cannot have disagreement on the stored values. Also, updating data values is greatly simplified when each value is stored in one place only. Finally, we avoid the wasted storage that results from redundant data storage.
6. Efficient Data Access:
In a database system, the data is managed by the DBMS and all access to the data goes through the DBMS, which is a key to effective data processing.
7. Concurrent Access and Crash Recovery:
A DBMS schedules concurrent accesses to the data in such a manner that users can think of the data as being accessed by only one user at a time. The DBMS protects users from the effects of system failures.
8. Reduced Application Development and Maintenance Time:
The DBMS supports many important functions that are common to many applications accessing data stored in the DBMS, which facilitates the quick development of applications.
9. Data Independence:
Data independence is usually considered from two points of view: physical data independence and logical data independence.
Physical data independence allows changes in the physical storage devices or the organization of the files to be made without requiring changes in the conceptual view or any of the external views, and hence in the application programs using the database.
Logical data independence indicates that the conceptual schema can be changed without affecting the existing external schemas or any application program.
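
The data integrity example in item 3 (customers only from Noida and Meerut) can be sketched with a CHECK constraint. This is only an illustration using Python's built-in sqlite3 module; the customer table layout and the try_insert helper are assumptions, not something the notes prescribe.

```python
import sqlite3


def make_customer_db():
    """Create an in-memory customer table whose city column is constrained."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE customer (
               cno   INTEGER PRIMARY KEY,
               cname TEXT NOT NULL,
               city  TEXT NOT NULL CHECK (city IN ('Noida', 'Meerut'))
           )"""
    )
    return conn


def try_insert(conn, cno, cname, city):
    """Insert one customer; return False if the DBMS rejects the row."""
    try:
        with conn:
            conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (cno, cname, city))
        return True
    except sqlite3.IntegrityError:
        # Raised by sqlite3 when a CHECK (or key) constraint is violated.
        return False
```

Because the rule lives in the table definition, every application that inserts customers is checked in the same way, without each program repeating the validation code.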

Disadvantages of DBMS
1) It is complex. Since it supports many functions, the underlying software has become complex. The designers and developers should have thorough knowledge of the software to get the most out of it.
2) Because of its complexity and functionality, it needs a large amount of memory to run efficiently.
3) A DBMS typically works as a centralized system, i.e., all users access the same database. Hence any failure of the DBMS will impact all the users.
4) A DBMS is generalized software, i.e., it is written to work for all kinds of systems rather than one specific system. Hence some applications will run slower than with purpose-built software.

3. View of Data
A database system is a collection of interrelated data and a set of programs that allow users to
access and modify these data. A major purpose of a database system is to provide users with an
abstract view of the data. That is, the system hides certain details of how the data are stored and
maintained.
Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers to use complex data structures to represent data in the database. Since many database-system users are not computer trained, developers hide the complexity from users through several levels of abstraction, to simplify users' interactions with the system:

• Physical level (or Internal View / Schema): The lowest level of abstraction describes how the data are actually stored. The physical level describes complex low-level data structures in detail. This level is also responsible for allocating space to the data.
• Logical level (or Conceptual View / Schema): The next-higher level of abstraction describes what data are stored in the database, and what relationships exist among those data. The logical level thus describes the entire database in terms of a small number of relatively simple structures. Although implementation of the simple structures at the logical level may involve complex physical-level structures, the user of the logical level does not need to be aware of this complexity. This is referred to as physical data independence. Database administrators, who must decide what information to keep in the database, use the logical level of abstraction.
• View level (or External View / Schema): The highest level of abstraction describes only part of the entire database. This level is called "view" because several users can each view the data they need at this level; the data is fetched internally from the database with the help of the conceptual and internal level mappings.

The user does not need to know database schema details such as data structures or table definitions. The user is concerned only with the data that is returned to the view level after it has been fetched from the database (present at the internal level). The view level of abstraction exists to simplify users' interaction with the system. The system may provide many views for the same database.

Example:
An analogy to the concept of data types in programming languages may clarify the distinction
among levels of abstraction. Many high-level programming languages support the notion of a
structured type. For example, we may describe a record as follows:
type instructor = record
    ID : char (5);
    name : char (20);
    dept_name : char (20);
    salary : numeric (8,2);
end;
This code defines a new record type called instructor with four fields. Each field has a name and
a type associated with it. A university organization may have several such record types,
including
• department, with fields dept_name, building, and budget
• course, with fields course_id, title, dept_name, and credits
• student, with fields ID, name, dept_name, and tot_cred
At the physical level, an instructor, department, or student record can be described as a block of
consecutive storage locations. The compiler hides this level of detail from programmers.
Similarly, the database system hides many of the lowest-level storage details from database
programmers. Database administrators, on the other hand, may be aware of certain details of the
physical organization of the data.
At the logical level, each such record is described by a type definition, as in the previous code
segment, and the interrelationship of these record types is defined as well. Programmers using a
programming language work at this level of abstraction. Similarly, database administrators
usually work at this level of abstraction.
At the view level, computer users see a set of application programs that hide details of the data
types. At the view level, several views of the database are defined, and a database user sees some
or all of these views.
In addition to hiding details of the logical level of the database, the views also provide a security
mechanism to prevent users from accessing certain parts of the database. For example, clerks in
the university registrar office can see only that part of the database that has information about
students; they cannot access information about salaries of instructors.
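
As a rough sketch of a view used to hide data, the following defines a view over the instructor record that leaves out the salary field. It uses Python's built-in sqlite3 module; the view name instructor_public and the sample row are hypothetical. SQLite has no user accounts, so the sketch shows only how a view narrows what is visible, not how access rights are granted.

```python
import sqlite3


def make_university_db():
    """Create the instructor table plus a view that omits the salary column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instructor ("
        "ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary NUMERIC)"
    )
    # Hypothetical sample row, for illustration only.
    conn.execute("INSERT INTO instructor VALUES ('10001', 'ABC', 'Physics', 50000)")
    conn.execute(
        "CREATE VIEW instructor_public AS "
        "SELECT ID, name, dept_name FROM instructor"
    )
    conn.commit()
    return conn


def visible_columns(conn, relation):
    """Return the column names a user sees when querying the given table or view."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [column[0] for column in cursor.description]
```

A user who is given only instructor_public can read the ID, name and department of each instructor, but the salary column is not part of what the view exposes.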

Instances and Schemas


Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database. The overall design of the database is called the database schema.
The concepts of schema and instance can be understood by analogy to a program written in a programming language: a database schema corresponds to the variable declarations (along with associated type definitions) in a program. Each variable has a particular value at a given instant. The values of the variables in a program at a point in time correspond to an instance of a database schema. Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, which describe different views of the database.

Example:
schema: instructor (ID, name, dept_name, salary)
Instance:

Data Models
Underlying the structure of a database is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. A data model provides a way to describe the design of a database at the physical, logical, and view levels.

The data models can be classified into different categories:


Relational Model: The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each column has a unique
name. Tables are also known as relations. The relational model is an example of a record-based
model.
Record-based models are so named because the database is structured in fixed-format records of
several types. Each table contains records of a particular type. Each record type defines a fixed
number of fields, or attributes. The columns of the table correspond to the attributes of the record
type. The relational data model is the most widely used data model, and a vast majority of
current database systems are based on the relational model.

Entity-Relationship Model: The entity-relationship (E-R) data model uses a collection of basic objects, called entities, and relationships among these objects. An entity is a "thing" or "object" in the real world that is distinguishable from other objects. The entity-relationship model is widely used in database design.

Object-Based Data Model: Object-oriented programming (especially in Java, C++, or C#) has
become the dominant software-development methodology. This led to the development of an
object-oriented data model that can be seen as extending the E-R model with notions of
encapsulation, methods (functions), and object identity. The object-relational data model
combines features of the object-oriented data model and relational data model.

Semi-structured Data Model: The semi-structured data model permits the specification of data
where individual data items of the same type may have different sets of attributes. This is in
contrast to the data models mentioned earlier, where every data item of a particular type must
have the same set of attributes. The Extensible Markup Language (XML) is widely used to
represent semi-structured data.

Hierarchical Data Model: In a hierarchical database model, data is organized in a tree-like structure. Data is stored hierarchically (top down or bottom up). Data is represented using a parent-child relationship. In a hierarchical DBMS a parent may have many children, but each child has only one parent.

Network Model
The network database model allows each child to have multiple parents. It helps you to address the need to model more complex relationships, such as the orders/parts many-to-many relationship. In this model, entities are organized in a graph which can be accessed through several paths.
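
The difference between the two models can be sketched with plain Python dictionaries. This is only an illustration of the parent-child rules described above; the record names (University, Physics, order_1, part_bolt and so on) are made up, and real hierarchical or network DBMSs store these links quite differently.

```python
# Hierarchical model: every child record points to exactly one parent.
hierarchical_parent = {
    "Physics": "University",
    "Chemistry": "University",
    "ABC": "Physics",
}

# Network model: a child record may have several parents (many-to-many),
# as in the orders/parts relationship.
network_parents = {
    "part_bolt": {"order_1", "order_2"},
    "part_nut": {"order_1"},
}


def path_to_root(node, parent_of):
    """Follow the single parent link upwards until the root is reached."""
    path = [node]
    while node in parent_of:
        node = parent_of[node]
        path.append(node)
    return path


def children_of(parent, parents_of):
    """List every child that has the given record among its parents."""
    return sorted(child for child, parents in parents_of.items() if parent in parents)
```

In the hierarchical dictionary there is exactly one path from any record to the root, while in the network dictionary the same part can be reached through more than one order.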

4. Database Languages
A database system provides a data-definition language to specify the database schema and a
data-manipulation language to express database queries and updates. In practice, the data
definition and data-manipulation languages are not two separate languages; instead they simply
form parts of a single database language, such as the widely used SQL language.

Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate
data as organized by the appropriate data model. The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data are needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than procedural DMLs. However, since a user does not have to specify how to get the data, the database system has to figure out an efficient means of accessing data. A query is a statement requesting the retrieval of information. The portion of a DML that involves information retrieval is called a query language.
In SQL, the DML statements are:
• To read records from table(s) – SELECT
• To insert record(s) into the table(s) – INSERT
• To update the data in table(s) – UPDATE
• To delete records from a table (all of them if no condition is given) – DELETE
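
A short sketch of the four DML statements, run through Python's built-in sqlite3 module. It reuses the Roll/Name/Age table from the start of these notes; the table name student and the particular update and delete conditions are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The table itself is created with DDL; the statements below are DML.
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add new records.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "ABC", 19), (2, "DEF", 22), (3, "GHI", 28)],
)

# UPDATE: modify stored data.
conn.execute("UPDATE student SET age = 20 WHERE roll = 1")

# DELETE: remove the records that match the condition.
conn.execute("DELETE FROM student WHERE roll = 3")

# SELECT: retrieve records (this part of the DML is the query language).
rows = conn.execute("SELECT roll, name, age FROM student ORDER BY roll").fetchall()
```

Each statement says what data is wanted, not how to find it, which is what makes SQL a declarative DML.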

Data-Definition Language (DDL)


We specify a database schema by a set of definitions expressed in a special language called a data-definition language (DDL). The DDL is also used to specify additional properties of the data. We specify the storage structure and access methods used by the database system by a set of statements in a special type of DDL called a data storage and definition language. These statements define the implementation details of the database schemas, which are usually hidden from the users.
The data values stored in the database must satisfy certain consistency constraints.
In SQL, the DDL statements are:
• To create database objects such as tables – CREATE
• To alter the structure of existing database objects – ALTER
• To remove all rows from a table while keeping its structure – TRUNCATE
• To rename database objects – RENAME
• To drop objects from the database, such as tables – DROP
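
A brief sketch of DDL statements using Python's built-in sqlite3 module. The customer table and its columns are assumptions for illustration. SQLite spells renaming as ALTER TABLE ... RENAME TO and has no TRUNCATE statement, so those two commands from the list are not shown in their own form here.

```python
import sqlite3


def table_names(conn):
    """Return the names of all tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def column_names(conn, table):
    """Return the column names of a table, read from the data dictionary."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute("CREATE TABLE customer (cno INTEGER PRIMARY KEY, cname TEXT)")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE customer ADD COLUMN city TEXT")

# Renaming (SQLite's form of RENAME).
conn.execute("ALTER TABLE customer RENAME TO client")
after_rename = table_names(conn)
client_columns = column_names(conn, "client")

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE client")
after_drop = table_names(conn)
```

None of these statements touches individual rows; they change the definitions that the DBMS records in its data dictionary.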

5. Relational Database

A relational database is a type of database. It uses a structure that allows us to identify and access
data in relation to another piece of data in the database. Often, data in a relational database is
organized into tables.

Tables: Rows and Columns


Tables can have hundreds, thousands, sometimes even millions of rows of data. These rows are often called records. Tables can also have many columns of data. Columns are labeled with a descriptive name (say, age for example) and have a specific data type. For example, a column called age may have a type of INTEGER (denoting the type of data it is meant to hold).

In the table above, there are three columns (name, age, and country).

The name and country columns store string data types, whereas age stores integer data types. The set of columns and data types make up the schema of this table. The table also has four rows, or records, in it (one each for Natalia, Ned, Zenas, and Laura).
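
The table described above can be sketched with Python's built-in sqlite3 module. The column names and the four first names come from the text; the table name person and every age and country value are placeholders invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema: three columns, each with a name and a data type.
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, country TEXT)")

# Four records; the ages and countries are placeholder values.
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [
        ("Natalia", 30, "Country A"),
        ("Ned", 25, "Country B"),
        ("Zenas", 41, "Country C"),
        ("Laura", 19, "Country D"),
    ],
)

# Read the schema back as (column name, declared type) pairs.
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(person)")]
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

The schema list holds the column names with their types, and row_count holds the number of records, which keeps the two ideas (schema and rows) visibly separate.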

What is a Relational Database Management System (RDBMS)?


A relational database management system (RDBMS) is a program that allows you to create,
update, and administer a relational database. Most relational database management systems use
the SQL language to access the database.

6. Database design

Database design can be generally defined as a collection of tasks or processes that support the design, development, implementation, and maintenance of an enterprise data management system. The main objective of database design is to produce physical and logical design models of the proposed database system.
The goals of database design are as below.
1. Satisfy the information content requirements of the specified users and applications.
2. Provide a natural and easy to understand structuring of the information.
3. Support processing requirements and any performance objectives such as response time, processing time, storage space, etc.
The database design process can be divided into six steps. The ER model is most relevant to the first three steps.
Requirements Collection and Analysis:
This is the first step in designing any database application. It is an informal process that involves discussions, studies, and analysis of the expectations of the users and the intended uses of the database. In this step, we have to understand the following.
What data is to be stored in the database? What applications must be built? What operations can be used?
Example: For a customer database, the data is cust-name, cust-city, and cust-no.
Conceptual database design:
The information gathered in the requirements analysis step is used to develop a higher-level description of the data.
The goal is to create a simple description of the data that closely matches how users and developers think of the data. The goal of conceptual database design is a complete understanding of the database structure, meaning (semantics), inter-relationships and constraints.
Example:
1. Cust_name : string;
2. Cust_no : integer;
3. Cust_city : string;
Logical Database Design:
In this step, we must choose a DBMS to implement our database design and convert the conceptual database design into a database schema.
Example: The customer database can be represented in the form of tables or diagrams.
Schema Refinement:
We have to analyze the collection of relations in our relational database schema to identify potential problems and to refine it.
Physical Database Design:
Physical database design is the process of choosing specific storage structures and access paths for the database files to achieve good performance for the various database applications. This step involves building indexes on some tables and clustering some tables.
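
As a small sketch of the index-building part of physical design, the following creates an index on the customer city column using Python's built-in sqlite3 module. The table layout follows the customer example above; the index name idx_customer_city and the generated rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT, cust_city TEXT)"
)
# Generated filler rows, only so that the table is not empty.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(i, f"name{i}", "Noida" if i % 2 else "Meerut") for i in range(1, 1001)],
)

# Physical design decision: add an access path for lookups by city.
conn.execute("CREATE INDEX idx_customer_city ON customer (cust_city)")

# Ask SQLite how it plans to run a query that filters on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT cust_name FROM customer WHERE cust_city = ?",
    ("Noida",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
```

The logical schema is unchanged by the index; only the access path differs. The text in plan_text reports the strategy SQLite chose, and its exact wording depends on the SQLite version.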
Application and Security Design:
In this step, we must identify different user groups and the different roles played by various users. For each role and user group, we must identify the parts of the database that they must be able to access and the parts that must not be accessible to them, and we must take steps to ensure that these access rules are enforced.
Two Types of Database Techniques
Normalization: The process of minimizing redundancy in a relation or set of relations. Redundancy in a relation may cause insertion, deletion, and update anomalies.
ER Modeling: Represented diagrammatically by an entity-relationship diagram.
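
An update anomaly caused by redundancy can be sketched as follows, using Python's built-in sqlite3 module. The tables order_flat, customer and orders, and all sample values, are hypothetical; the sketch only contrasts a design that repeats the customer city on every order with one that stores it once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Redundant design: the customer's city is repeated on every order row.
conn.execute(
    "CREATE TABLE order_flat (order_no INTEGER PRIMARY KEY, cust_no INTEGER, cust_city TEXT)"
)
conn.executemany(
    "INSERT INTO order_flat VALUES (?, ?, ?)", [(1, 10, "Noida"), (2, 10, "Noida")]
)
# Update anomaly: only one of the two copies gets changed.
conn.execute("UPDATE order_flat SET cust_city = 'Meerut' WHERE order_no = 1")
flat_cities = {
    row[0]
    for row in conn.execute("SELECT cust_city FROM order_flat WHERE cust_no = 10")
}

# Normalized design: the city is stored once, in the customer table.
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_city TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "order_no INTEGER PRIMARY KEY, cust_no INTEGER REFERENCES customer (cust_no))"
)
conn.execute("INSERT INTO customer VALUES (10, 'Noida')")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 10)])
conn.execute("UPDATE customer SET cust_city = 'Meerut' WHERE cust_no = 10")
normalized_cities = {
    row[0]
    for row in conn.execute(
        "SELECT c.cust_city FROM orders o JOIN customer c ON c.cust_no = o.cust_no"
    )
}
```

In the redundant table the same customer ends up with two different cities, whereas in the normalized tables a single update is seen by every order.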

Structure of DBMS

• A database system is partitioned into modules that deal with each of the responsibilities
of the overall system.

The functional components of a database system can be broadly divided into the storage manager, the query processor components, and disk storage.

Database users and administrators :


Naive users: Users who need not be aware of the presence of the database system or any other system supporting their usage are considered naive users. A user of an automatic teller machine falls into this category.
Application programmers: They interact with the system through DML calls. They are computer professionals who write application programs.
Sophisticated users: They form requests in a database query language or by using tools such as data analysis software.
Database Administrators (DBA):
The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use, and for acquiring software and hardware resources as needed. These are the people who design the database and maintain it day to day.
The DBA is responsible for the following issues.
a. Design of the conceptual and physical schemas: The DBA is responsible for interacting with the users of the system to understand what data is to be stored in the DBMS and how it is likely to be used. The DBA creates the original schema by writing a set of definitions, which is permanently stored in the 'Data Dictionary'.
b. Security and Authorization: The DBA is responsible for ensuring that unauthorized data access is not permitted. The granting of different types of authorization allows the DBA to regulate which parts of the database various users can access.
c. Storage structure and Access method definition: The DBA creates appropriate storage structures and access methods by writing a set of definitions, which are translated by the DDL compiler.
d. Data Availability and Recovery from Failures: The DBA must take steps to ensure that if the system fails, users can continue to access as much of the uncorrupted data as possible. The DBA also works to restore the data to a consistent state.
e. Database Tuning: The DBA is responsible for modifying the database to ensure adequate performance as requirements change.
f. Integrity Constraint Specification: The integrity constraints are kept in a special system structure that is consulted by the database system whenever an update takes place in the system.
g. Routine Maintenance: Taking backups.

Data Storage & Querying:

Query Processor: The query processor translates statements in a query language into low-level instructions the database manager understands. It simplifies and facilitates access to data. The query processor components include the following.
• The DDL interpreter interprets DDL statements and records the definitions in the data dictionary.
• The DML compiler translates DML statements in a query language into an evaluation plan consisting of low-level instructions that the query evaluation engine understands.
• The DML compiler also performs query optimization, that is, it picks the lowest-cost evaluation plan from among the alternatives.
• The query evaluation engine executes the low-level instructions generated by the DML compiler.
Storage Manager: The storage manager is responsible for storing, retrieving, and updating data in the database. The storage manager components include the following.
• The authorization and integrity manager tests for the satisfaction of integrity constraints and checks the authority of users to access data.
• The transaction manager ensures that the database remains in a consistent (correct) state and allows concurrent transactions to proceed without conflicting.
• The file manager manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
• The buffer manager is responsible for fetching data from disk storage into main memory and deciding what data to cache in main memory.
Data Storage:
• Data Files: store the database itself.
• Data Dictionary: stores information about the structure of the database.
• Indices: provide fast access to data items holding particular values.

7. Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application.
Example: the operations performed to withdraw cash from an ATM form one transaction. Each transaction is a unit of both atomicity and consistency. Thus, we require that transactions do not violate any database-consistency constraints. That is, if the database was consistent when a transaction started, the database must be consistent when the transaction successfully terminates.
• The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
• The concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
8. DBMS Architecture
There are three types of DBMS architecture:
1. Single tier architecture
2. Two tier architecture
3. Three tier architecture

1. Single tier architecture:


In this type of architecture, the database is readily available on the client machine, any request
made by client doesn‘t require a network connection to perform the action on the database.
For example, lets say you want to fetch the records of employee from the database and the
database is available on your computer system, so the request to fetch employee details will be
done by your computer and the records will be fetched from the database by your computer as
well. This type of system is generally referred as local database system.

13
2.Two-tier architecture:
In two-tier architecture, the database system is present on the server machine and the DBMS
application is present on the client machine. The two machines are connected to each other
through a reliable network.
Whenever the client machine makes a request to access the database on the server using a query
language such as SQL, the server performs the request on the database and returns the result to the
client. Application connection interfaces such as JDBC and ODBC are used for the interaction
between server and client.

3.Three-tier architecture:
In three-tier architecture, another layer is present between the client machine and the server machine.
In this architecture, the client application doesn't communicate directly with the database
system present at the server machine. Instead, the client application communicates with a server
application, and the server application internally communicates with the database system present
at the server.

9. Data modelling using Entity Relationship Model:


ER model concepts:
What is ER Modeling?
ER modeling is a graphical technique for understanding and organizing data independently of the
actual database implementation. We need to be familiar with the following terms to go further.
Entities
1. An entity type (entity set) is a collection of similar objects.
2. An entity is an object that is distinguishable from other objects by a set of attributes.
3. The entity is the basic object of the E-R model: a 'thing' in the real world with an independent
existence.
4. An entity may be an 'object' with a physical existence.

Entity instance: An entity instance is a particular member of the entity type. Example of an entity
instance: a particular employee.
Examples of entity types: Person, Customer.

Attributes: Properties/characteristics which describe entities are called attributes.

Domain of Attributes: The set of possible values that an attribute can take is called the domain
of the attribute. For example, the attribute day may take any value from the set {Monday,
Tuesday ... Friday}. Hence this set can be termed the domain of the attribute day.

Key attribute: The attribute (or combination of attributes) which is unique for every entity
instance is called the key attribute, e.g. the Roll no of a student.
If the key attribute consists of two or more attributes in combination, it is called a composite key.

Simple attribute: If an attribute cannot be divided into simpler components, it is a simple
attribute. Example of a simple attribute: employee_id of an employee.

Composite attribute: If an attribute can be split into components, it is called a composite
attribute. Example of a composite attribute: Address of the employee, which can be split into
Street, City, State and Country.

Single valued Attributes: If an attribute can take only a single value for each entity instance, it is
a single valued attribute. Example of a single valued attribute: age of a student. It can take only
one value for a particular student.

Multi-valued Attributes: If an attribute can take more than one value for each entity instance, it
is a multi-valued attribute. Example of a multi-valued attribute: telephone number of
an employee, since a particular employee may have multiple telephone numbers.

NULL valued attribute: An attribute whose value is unknown or not applicable for an entity
instance is said to hold a NULL value.

Derived Attribute:
An attribute which can be calculated or derived based on other attributes is a derived attribute.
Example of a derived attribute: age of an employee, which can be calculated from date of birth and
current date.

Relationships:
Associations between entities are called relationships. Example: A student is enrolled in a
course. Here "Enrolled in" is a relationship between the entities student and course.

A. Degree of a Relationship: The degree of a relationship is the number of entity types involved.
The n-ary relationship is the general form for degree n. Special cases are unary, binary, and
ternary, where the degree is 1, 2, and 3, respectively.
Example of a unary relationship: A person is married to another person.

Example of a binary relationship: Student is enrolled in Course.

Example of a ternary relationship: Customer purchases item from a shopkeeper.

Relationship Sets: A relationship set is a set of relationships of the same type. Formally, it is a
mathematical relation on n >= 2 entity sets. If E1, E2, …, En are entity sets, then a relationship
set R is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship.
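As a small illustration of the definition above, the sketch below builds every possible (student, course) pair and checks that a relationship set is a subset of it. The entity names s1, s2, c1, c2, c3 and the contents of enrolled are made-up sample values, not taken from these notes.

```python
from itertools import product

# Two entity sets (sample values, assumed for illustration only)
students = {"s1", "s2"}
courses = {"c1", "c2", "c3"}

# All tuples (e1, e2) with e1 in students and e2 in courses
all_pairs = set(product(students, courses))

# A relationship set "Enrolled in": each pair is one relationship
enrolled = {("s1", "c1"), ("s1", "c2"), ("s2", "c3")}

# By definition, the relationship set is a subset of all possible pairs
print(enrolled <= all_pairs)
```

Any pair that appears in all_pairs but not in enrolled simply means that this student is not enrolled in that course.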

Advantages and Disadvantages of ER Modeling


Advantages
1. ER modeling is simple and easily understandable. It is expressed in the language of business
users and can be understood by non-technical specialists.
2. It is intuitive and helps in physical database creation.
3. It can be generalized and specialized based on needs.
4. It can help in database design.
5. It gives a higher-level description of the system.
Disadvantages
1. A physical design derived from an E-R model may contain some ambiguities or
inconsistencies.
2. Sometimes diagrams may lead to misinterpretations.

10.Notations for ER Diagram

11.Constraints:
Constraints enforce limits on the data, or the type of data, that can be inserted into, updated in or
deleted from a table. The purpose of constraints is to maintain data integrity during an insert,
update or delete on a table.
Types of constraints
NOT NULL
UNIQUE
DEFAULT
CHECK
Key Constraints – PRIMARY KEY, FOREIGN KEY
Domain constraints
Mapping constraints
NOT NULL:
This constraint makes sure that a column does not hold a NULL value. When we don't provide a
value for a particular column while inserting a record into a table, it takes the NULL value by
default. By specifying the NOT NULL constraint, we can be sure that the particular column(s)
cannot have NULL values.
Example:
CREATE TABLE STUDENT(ROLL_NO INT NOT NULL);

UNIQUE:
UNIQUE Constraint enforces a column or set of columns to have unique values. If a column has
a unique constraint, it means that particular column cannot have duplicate values in a table.
Example:
CREATE TABLE STUDENT(STU_ADDRESS VARCHAR (35) UNIQUE);
DEFAULT:
The DEFAULT constraint provides a default value to a column when there is no value provided
while inserting a record into a table.
Example:
CREATE TABLE STUDENT(EXAM_FEE INT DEFAULT 10000);
CHECK:
This constraint is used for specifying a condition (typically a range of values) for a particular
column of a table. When this constraint is set on a column, it ensures that every value in the
column satisfies the specified condition.
Example:
CREATE TABLE STUDENT(ROLL_NO INT NOT NULL CHECK (ROLL_NO > 1000));
In the above example we have set the check constraint on the ROLL_NO column of the STUDENT
table. Now, the ROLL_NO field must have a value greater than 1000.
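The four constraints above can be tried together in a small runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database, which is an assumption made for illustration only; the notes do not prescribe a particular DBMS, and the helper accepted() and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INT NOT NULL CHECK (ROLL_NO > 1000),
        STU_ADDRESS VARCHAR(35) UNIQUE,
        EXAM_FEE    INT DEFAULT 10000
    )
""")

def accepted(roll_no, address):
    """Try to insert one row; return True if no constraint is violated."""
    try:
        conn.execute(
            "INSERT INTO STUDENT (ROLL_NO, STU_ADDRESS) VALUES (?, ?)",
            (roll_no, address),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid row": accepted(1001, "Pune"),
    "NULL roll number": accepted(None, "Mumbai"),   # violates NOT NULL
    "duplicate address": accepted(1002, "Pune"),    # violates UNIQUE
    "roll number too small": accepted(500, "Nashik"),  # violates CHECK
}

# EXAM_FEE was not supplied for the valid row, so DEFAULT fills it in
default_fee = conn.execute(
    "SELECT EXAM_FEE FROM STUDENT WHERE ROLL_NO = 1001"
).fetchone()[0]
print(results, default_fee)
```

Only the first row is stored; each of the other three attempts is rejected by a different constraint.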
Key constraints:
PRIMARY KEY: A primary key uniquely identifies each record in a table. It must have unique
values and cannot contain nulls. In the example below, the ROLL_NO field is marked as the primary
key, which means the ROLL_NO field cannot have duplicate or null values.
Example:
CREATE TABLE STUDENT(ROLL_NO INT NOT NULL, PRIMARY KEY(ROLL_NO));
FOREIGN KEY:
Foreign keys are the columns of a table that point to the primary key of another table. They act
as a cross-reference between tables.
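The notes give no example for FOREIGN KEY, so here is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module, a hypothetical DEPARTMENT table and a hypothetical DEPT_NO column on STUDENT. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Referenced (master) table; the table and column names are assumed
conn.execute("CREATE TABLE DEPARTMENT (DEPT_NO INT PRIMARY KEY)")

# Referencing table: DEPT_NO must match an existing department or be NULL
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO INT NOT NULL,
        DEPT_NO INT,
        PRIMARY KEY (ROLL_NO),
        FOREIGN KEY (DEPT_NO) REFERENCES DEPARTMENT (DEPT_NO)
    )
""")
conn.execute("INSERT INTO DEPARTMENT VALUES (10)")

def accepted(roll_no, dept_no):
    """Try to insert one student; return True if the row is accepted."""
    try:
        conn.execute("INSERT INTO STUDENT VALUES (?, ?)", (roll_no, dept_no))
        return True
    except sqlite3.IntegrityError:
        return False

existing_dept = accepted(1001, 10)    # department 10 exists
missing_dept = accepted(1002, 99)     # department 99 does not exist
no_dept = accepted(1003, None)        # a foreign key may be NULL
duplicate_roll = accepted(1001, 10)   # primary key must be unique
print(existing_dept, missing_dept, no_dept, duplicate_roll)
```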
Domain constraints:
Each table has a certain set of columns, and each column allows only one type of data, based on its
data type. The column does not accept values of any other data type. Domain constraints are
user-defined data types, and we can define them like this:
Domain Constraint = data type + Constraints (NOT NULL / UNIQUE / PRIMARY KEY /
FOREIGN KEY / CHECK / DEFAULT)

Mapping constraints:

12.Mapping Cardinality:
Cardinality describes how many instances of one entity can be associated with instances of
another entity through a relationship.
Relationship cardinalities specify how many of each entity type is allowed.
Relationships can have four possible connectivities, as given below.
1. One to one (1:1) relationship :
Employee is assigned a parking space.

One employee is assigned only one parking space and one parking space is assigned to only
one employee. Hence it is a 1:1 relationship and the cardinality is One-To-One (1:1). In the ER
model it is represented as

2.One to many (1:N) relationship :

One organization can have many employees , but one employee works in only one organization.
Hence it is a 1:N relationship and cardinality is One-To-Many (1:N)
In ER model it is represented as

3.Many to one (M:1) relationship : It is the reverse of the one-to-many relationship. Employee
works in organization.

One employee works in only one organization, but one organization can have many employees.
Hence it is an M:1 relationship and the cardinality is Many-to-One (M:1).

4.Many to many (M:N) relationship : One student can enroll for many courses and one course
can be enrolled by many students.

Hence it is a M:N relationship and cardinality is Many-to-Many (M:N).

The minimum and maximum values of this connectivity are called the cardinality of the
relationship.

Participation Constraint:
A participation constraint defines the least number of relationship instances in which an entity
must participate.
There are two types of participation constraints-
1.Total Participation- It specifies that each entity in the entity set must participate
in at least one relationship instance in that relationship set. That is why it is also called
mandatory participation.
Total participation is represented using a double line between the entity set and relationship set.
Every employee works for some department, so the Employee entity's participation is
total in that relationship.
2.Partial Participation- It specifies that each entity in the entity set may or may not participate
in a relationship instance in that relationship set. That is why it is also called optional
participation. Partial participation is represented using a single line between the entity set
and relationship set.

Consider the relationship - Employee is head of the department. Here all employees will not be
the head of the department. Only one employee will be the head of the department. In other
words, only few instances of employee entity participate in the above relationship. So employee
entity's participation is partial in the said relationship.

13.Keys:
A key is used to uniquely identify any record or row of data in a table. It is also used to establish
and identify relationships between tables. For example, ID is used as a key in the Student table
because it is unique for each student. In the PERSON table, passport_number, license_number and
SSN are keys, since each of them is unique for each person.

Types of keys:

1. Super Key: A super key is a set of attributes that can identify each tuple uniquely in the given
relation.A super key is not restricted to have any specific number of attributes.Thus, a super key
may consist of any number of attributes.
Example-
Consider the following Student schema-
Student ( roll , name , gender, age , address , class , section )

Given below are the examples of super keys since each set can uniquely identify each student in
the Student table-
( roll , name , gender, age , address , class , section )
( class , section , roll )
(class , section , roll , gender )
( name , address )
( roll , name , address )
2. Candidate Key-
A minimal super key is called as a candidate key. A set of minimal attribute(s) that can identify
each tuple uniquely in the given relation is called as a candidate key.
Example-
Consider the following Student schema-
Student ( roll , name , gender, age , address , class , section )
Given below are examples of candidate keys, since each set consists of the minimal attributes
required to identify each student uniquely in the Student table-
( class , section , roll )
( name , address )
All the attributes in a candidate key are sufficient as well as necessary to identify each tuple
uniquely. Note that ( roll , name , address ) is a super key but not a candidate key, because its
subset ( name , address ) already identifies each student.
3. Primary Key-
A primary key is a candidate key that the database designer selects while designing the database.
• The value of primary key can never be NULL.
• The value of primary key must always be unique.
• The value of a primary key should not be changed once assigned, i.e. updates to it are avoided.
• The value of primary key must be assigned when inserting a record.
• A relation is allowed to have only one primary key.

4. Alternate Key-
Candidate keys that are left unimplemented or unused after implementing the primary key are
called as alternate keys.
5. Foreign Key-
An attribute 'X' is called a foreign key to some other attribute 'Y' when its values are
dependent on the values of attribute 'Y'.
• The attribute 'X' can assume only those values which are assumed by the attribute 'Y'.
• Here, the relation in which attribute 'Y' is present is called the referenced relation.
• The relation in which attribute 'X' is present is called the referencing relation.
• The attribute 'Y' might be present in the same table or in some other table.
Consider the following two schemas-

Here, t_dept can take only those values which are present in dept_no in Department table since
only those departments actually exist.
• A foreign key references the primary key of the referenced table.
 Foreign key can take only those values which are present in the primary key of the referenced
relation.
 Foreign key may have a name other than that of a primary key.
 Foreign key can take the NULL value.
 There is no restriction on a foreign key to be unique.
 In fact, foreign key is not unique most of the time.
 Referenced relation may also be called as the master table or primary table.
 Referencing relation may also be called as the foreign table.
6. Composite Key-
A primary key comprising of multiple attributes and not just a single attribute is called as a
composite key.
7. Unique Key-
A unique key is a key with the following properties-
• It is unique for all the records of the table.
• Once assigned, its value cannot be changed, i.e. it is non-updatable.
• It may have a NULL value.
Example-
A good example of a unique key is the Aadhaar Card Number.
The Aadhaar Card Number is unique for all the citizens (tuples) of India (table).
If a card gets lost and a duplicate copy is issued, the duplicate copy always has the same
number as before. Thus, it is non-updatable. A few citizens may not have got their Aadhaar cards,
so for them its value is NULL.

14.ER DIAGRAMS:
ER diagram for hospital

ER Diagram for car insurance company

15.Concepts of Super Key:


A super key is a set of one or more attributes (columns), which can uniquely identify a row in
a table.
Candidate keys are selected from the set of super keys. The only thing we take care of while
selecting a candidate key is that it should not have any redundant attribute. That's the reason they
are also termed minimal super keys. Let's take an example to understand this:
Table: Employee

Emp_Number Emp_Name Emp_SSN
226 Steve 123456789
227 Ajeet 999999321
228 Chaitanya 888997212
229 Robert 777778888
Super keys: The above table has the following super keys. Each of the following sets is
able to uniquely identify a row of the Employee table.
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}

16. Candidate Key


A candidate key is a minimal super key with no redundant attributes. The following two
super keys are chosen from the above sets because they contain no redundant attributes.
{Emp_SSN}
{Emp_Number}
Only these two sets are candidate keys, as all other sets have redundant attributes that are not
necessary for unique identification.
Super key vs Candidate Key
All candidate keys are super keys, because the candidate keys are chosen out of the
super keys. How do we choose candidate keys from the set of super keys? We look for those keys
from which we cannot remove any field. In the above example, we have not chosen {Emp_SSN,
Emp_Name} as a candidate key because {Emp_SSN} alone can identify a unique row in the table
and Emp_Name is redundant.
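The selection described above can be sketched in a few lines of Python. The sketch assumes, as the notes state, that Emp_SSN alone and Emp_Number alone each identify an employee; the names IDENTIFYING and is_super_key are hypothetical helpers introduced here. Keys are a property of the schema, so the check is based on this assumption rather than on the four sample rows.

```python
from itertools import combinations

ATTRIBUTES = ("Emp_SSN", "Emp_Number", "Emp_Name")

# Assumption from the notes: each of these attributes alone identifies a row
IDENTIFYING = [{"Emp_SSN"}, {"Emp_Number"}]

def is_super_key(attrs):
    """A set of attributes is a super key if it contains an identifying set."""
    return any(ident <= set(attrs) for ident in IDENTIFYING)

# Every non-empty subset of the attributes
subsets = [
    frozenset(combo)
    for size in range(1, len(ATTRIBUTES) + 1)
    for combo in combinations(ATTRIBUTES, size)
]

super_keys = [s for s in subsets if is_super_key(s)]

# A candidate key is a super key with no proper subset that is also a super key
candidate_keys = [
    s for s in super_keys
    if not any(other < s for other in super_keys)
]

print(len(super_keys), sorted(sorted(k) for k in candidate_keys))
```

Under this assumption the sketch finds the same six super keys listed earlier and keeps only {Emp_SSN} and {Emp_Number} as candidate keys.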

17. Weak entity sets


An entity type should have a key attribute which uniquely identifies each entity in the entity set,
but there exist some entity types for which a key attribute can't be defined. These are called weak
entity types.
The entity sets which do not have sufficient attributes to form a primary key are known as weak
entity sets, and the entity sets which have a primary key are known as strong entity sets.
• As weak entities do not have any primary key, they cannot be identified on their own,
so they depend on some other entity (known as the owner entity).
• A weak entity has a total participation constraint (existence dependency) in its
identifying relationship with the owner entity. Weak entity types have partial keys.
• Partial keys are sets of attributes with the help of which the tuples of the weak entities can
be distinguished and identified.
• A weak entity depends on a strong entity for its existence. Unlike a strong
entity, a weak entity does not have any primary key.
• It has a partial key, also called a discriminator.
• A weak entity is represented by a double rectangle. The relationship between one strong and one
weak entity is represented by a double diamond.

Example-1:
In the ER diagram below, 'Payment' is the weak entity, 'Loan Payment' is the identifying
relationship and 'Payment Number' is the partial key. The primary key of Loan along with the
partial key is used to identify the records.
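One way the Loan and Payment example might look as tables is sketched below, using Python's built-in sqlite3 module. The column names LOAN_NUMBER, PAYMENT_NUMBER and AMOUNT, the sample rows and the ON DELETE CASCADE choice are assumptions made for illustration; the notes only name the entities and the partial key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce references

# Strong (owner) entity
conn.execute("CREATE TABLE LOAN (LOAN_NUMBER TEXT PRIMARY KEY)")

# Weak entity: identified by the owner's key plus its own partial key
conn.execute("""
    CREATE TABLE PAYMENT (
        LOAN_NUMBER    TEXT NOT NULL
                       REFERENCES LOAN (LOAN_NUMBER) ON DELETE CASCADE,
        PAYMENT_NUMBER INT  NOT NULL,
        AMOUNT         INT,
        PRIMARY KEY (LOAN_NUMBER, PAYMENT_NUMBER)
    )
""")

conn.executemany("INSERT INTO LOAN VALUES (?)", [("L-1",), ("L-2",)])
# Payment number 1 appears under both loans: alone it is only a partial key
conn.executemany(
    "INSERT INTO PAYMENT VALUES (?, ?, ?)",
    [("L-1", 1, 500), ("L-1", 2, 500), ("L-2", 1, 900)],
)

try:
    conn.execute("INSERT INTO PAYMENT VALUES ('L-1', 1, 700)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # same loan and same payment number

# Existence dependency: removing a loan removes its payments
conn.execute("DELETE FROM LOAN WHERE LOAN_NUMBER = 'L-1'")
remaining = conn.execute(
    "SELECT LOAN_NUMBER, PAYMENT_NUMBER FROM PAYMENT"
).fetchall()
print(duplicate_rejected, remaining)
```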

18. Codd’s rules


Codd's rules were proposed by Dr. Edgar F. Codd. After extensive research on the relational
model of database systems, he came up with a set of rules (commonly called Codd's twelve
rules, although they are numbered 0 to 12) which, according to him, a database must obey in
order to be regarded as a true relational database.
Rule 0: Foundation Rule
For any system that is advertised as, or claimed to be, a relational data base management system,
that system must be able to manage data bases entirely through its relational capabilities.
Rule 1: Information Rule
The data stored in a database, may it be user data or metadata, must be a value of some table cell.
Everything in a database must be stored in a table format.
Rule 2: Guaranteed Access Rule

Every single data element (value) is guaranteed to be accessible logically with a combination of
table-name, primary-key (row value), and attribute-name (column value). No other means, such as
pointers, can be used to access data.
Rule 3: Systematic Treatment of NULL Values
The NULL values in a database must be given a systematic and uniform treatment. This is a very
important rule because a NULL can be interpreted as one of the following: data is missing, data is
not known, or data is not applicable.
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online catalog, known as data
dictionary, which can be accessed by authorized users. Users can use the same query language to
access the catalog which they use to access the database itself.
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that supports data
definition, data manipulation, and transaction management operations. This language can be used
directly or by means of some application. If the database allows access to data without any help of
this language, then it is considered as a violation.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must also be updatable by the
system.
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, update, and deletion. This must not be limited to a
single row; that is, it must also support union, intersection and minus operations to yield sets of
data records.
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications that access the database. Any
change in the physical structure of a database must not have any impact on how the data is being
accessed by external applications.
Rule 9: Logical Data Independence
The logical data in a database must be independent of its user's view (application). Any change in
logical data must not affect the applications using it. For example, if two tables are merged or one
is split into two different tables, there should be no impact or change on the user application. This
is one of the most difficult rules to apply.
Rule 10: Integrity Independence
A database must be independent of the application that uses it. All its integrity constraints can be
independently modified without the need of any change in the application. This rule makes a
database independent of the front-end application and its interface.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed over various locations. Users
should always get the impression that the data is located at one site only. This rule has been
regarded as the foundation of distributed database systems.
Rule 12: Non-Subversion Rule
If a system has an interface that provides access to low-level records, then the interface must not
be able to subvert the system and bypass security and integrity constraints.

19. Extended ER model
Enhanced entity-relationship (EER) diagrams are advanced database diagrams, very similar to
regular ER diagrams, which represent the requirements and complexities of complex databases.
It is a diagrammatic technique for displaying Sub Class and Super Class; Specialization and
Generalization; Union or Category; Aggregation etc.
In addition to ER model concepts, the EER model includes:
• Subclasses and Super classes
• Specialization and Generalization
• Category or Union
• Aggregation

Subclasses and Super class


A super class is an entity type that can be divided into further subtypes.
For example, consider the Shape super class.

The super class Shape has the sub classes Triangle, Square and Circle.
Sub classes are groups of entities with some unique attributes. A sub class inherits the
properties and attributes of its super class.

Category or Union
A category (union) is a relationship of one sub class with more than one super class.

Owner is a subset of the union of two super classes: Vehicle and House.

20. DBMS Generalization

• Generalization is the process of generalizing two or more lower-level entity types into a
higher-level entity type.
• Entities are clubbed or grouped together to represent a more generalized view. In this
process, the common attributes of two or more entities combine to form a new entity
type.
• The new entity type formed is called a generalized entity.
• This generalized entity may combine further with other entity types
to form another higher-level entity type. It is a bottom-up approach. It is the reverse
process of Specialization.
• For example, Whales, Sharks, and Dolphins can be generalized as Marine Animals. Similarly,
Bicycle, Bike, and Car can be generalized as Vehicles.
Example: Suppose we have two entity types, Employee and Customer. The attributes of the
Employee entity type are Name, Phone, Salary, Employee_id, and Address. The attributes of the
Customer entity type are Name, Phone, Address, Credit, Customer_id, and Email.

The three attributes i.e. Name, Phone, and Address are common here. When we generalize these
two entities, we form a new higher-level entity type Person. The new entity type formed is
a generalized entity.

We can see in the below E-R diagram that after generalization the new generalized entity Person
contains only the common attributes i.e. Name, Phone, and Address. Employee entity contains
only the specialized attribute like Employee_id and Salary. Similarly, the Customer entity type
contains only specialized attributes like Customer_id, Credit, and Email.
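The attribute split described above is a set intersection followed by two set differences. A minimal sketch using the attribute names from the example (the variable names are chosen here for illustration):

```python
# Attributes of the two lower-level entity types, as listed in the example
employee = {"Name", "Phone", "Salary", "Employee_id", "Address"}
customer = {"Name", "Phone", "Address", "Credit", "Customer_id", "Email"}

# Generalization: the common attributes move up into the Person entity type
person = employee & customer

# Each lower-level entity type keeps only its specialized attributes
employee_only = employee - person
customer_only = customer - person

print(sorted(person))
print(sorted(employee_only))
print(sorted(customer_only))
```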

21. Specialization
Specialization, as the name suggests, is the process of specializing an entity type into a more
specific entity type. In this process, we specialize a higher-level entity type into lower-level
entity types by adding some additional attributes. It is a top-down approach in
which a higher-level entity is broken into smaller entities.
For example, Animals can be mammals or reptiles. Mammals can in turn be tigers,
elephants or humans, and reptiles can be snakes or lizards. So we are specializing as we
move towards a particular type of animal.
Example: Suppose we have a Person entity type with attributes such as Name, Phone_no and
Address, and we are making software for a shop, where a person can be of two types. The
Person entity can be divided further into two entity types, Employee and Customer. How do we
specialize it? We specialize the Person entity type to the Employee entity type by adding
attributes like Emp_no and Salary. Similarly, we specialize the Person entity type to the Customer
entity type by adding attributes like Customer_no, Email, and Credit. These lower-level
entity types inherit all the attributes of the higher-level entity type. The Customer entity type
will therefore also have the attributes Name, Phone_no, and Address, which were present in the
higher-level entity.

22. Aggregation

In aggregation, the relationship between two entities is treated as a single entity. That is, a
relationship together with its corresponding entities is aggregated into a higher-level entity.

In the real world, a manager not only manages the employees working under them but
also has to manage the project. In such a scenario, if the entity "Manager" has a "Manages"
relationship with either the "Employee" or the "Project" entity alone, it will not make sense,
because the manager has to manage both. In these cases the relationship of two entities acts as
one entity. In our example, the relationship "Works-On" between "Employee" and "Project" acts
as one entity that has a relationship "Manages" with the entity "Manager".

23. Reduction of ER diagram to Table


An ER diagram is converted into tables in the relational model.
This is because relational models can be easily implemented by an RDBMS such as MySQL or
Oracle. The following rules are used for converting an ER diagram into tables-
Rule-01: For Strong Entity Set With Only Simple Attributes-
A strong entity set with only simple attributes will require only one table in the relational model.
o Attributes of the table will be the attributes of the entity set.
o The primary key of the table will be the key attribute of the entity set.

Rule-02: For Strong Entity Set With Composite Attributes-
A strong entity set with any number of composite attributes will require only one table in the
relational model.
o During conversion, the simple attributes that make up each composite attribute are taken into
account, and not the composite attribute itself.

Rule-03: For Strong Entity Set With Multi Valued Attributes-


A strong entity set with any number of multi-valued attributes will require two tables in the
relational model.
o One table will contain all the simple attributes with the primary key.
o The other table will contain the primary key and all the multi-valued attributes.
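Rule-03 can be sketched with the telephone-number example used earlier for multi-valued attributes. The table and column names below (EMPLOYEE, EMPLOYEE_PHONE, EMP_ID, EMP_NAME, PHONE) and the sample values are assumptions made for illustration, and Python's built-in sqlite3 module stands in for the RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table 1: the simple attributes together with the primary key
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INT PRIMARY KEY, EMP_NAME TEXT)")

# Table 2: the primary key plus the multi-valued attribute, one value per row
conn.execute("""
    CREATE TABLE EMPLOYEE_PHONE (
        EMP_ID INT REFERENCES EMPLOYEE (EMP_ID),
        PHONE  TEXT,
        PRIMARY KEY (EMP_ID, PHONE)
    )
""")

conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'ABC')")
conn.executemany(
    "INSERT INTO EMPLOYEE_PHONE VALUES (?, ?)",
    [(1, "111-1111"), (1, "222-2222")],
)

# All telephone numbers of employee 1
phones = [
    row[0]
    for row in conn.execute(
        "SELECT PHONE FROM EMPLOYEE_PHONE WHERE EMP_ID = 1 ORDER BY PHONE"
    )
]
print(phones)
```

Storing one phone number per row keeps every column single-valued while still allowing any number of phones per employee.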

Employee table

Rule-04: Translating Relationship Set into a Table-


A relationship set will require one table in the relational model.
Attributes of the table are-

o Primary key attributes of the participating entity sets
o Its own descriptive attributes if any.

Rule-05: For Binary Relationships With Cardinality Ratios-


The following four cases are possible-
Case-01: Binary relationship with cardinality ratio m:n
Case-02: Binary relationship with cardinality ratio 1:n
Case-03: Binary relationship with cardinality ratio m:1
Case-04: Binary relationship with cardinality ratio 1:1
Case-01: For Binary Relationship With Cardinality Ratio m:n

Here, three tables will be required-


1. A ( a1 , a2 )
2. R ( a1 , b1 )
3. B ( b1 , b2 )
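The three-table layout for an m:n relationship can be tried with the Student and Course example from the cardinality section, where A is STUDENT, B is COURSE and R is ENROLLED. The column names and sample rows are hypothetical, and Python's built-in sqlite3 module is used only as a convenient way to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A ( a1 , a2 ) and B ( b1 , b2 ): one table per entity set
conn.execute("CREATE TABLE STUDENT (ROLL INT PRIMARY KEY, NAME TEXT)")
conn.execute("CREATE TABLE COURSE (COURSE_ID TEXT PRIMARY KEY, TITLE TEXT)")

# R ( a1 , b1 ): the relationship table holds both primary keys
conn.execute("""
    CREATE TABLE ENROLLED (
        ROLL      INT  REFERENCES STUDENT (ROLL),
        COURSE_ID TEXT REFERENCES COURSE (COURSE_ID),
        PRIMARY KEY (ROLL, COURSE_ID)
    )
""")

conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", [(1, "ABC"), (2, "DEF")])
conn.executemany("INSERT INTO COURSE VALUES (?, ?)", [("C1", "DBMS"), ("C2", "OS")])
# One student in many courses, and one course with many students
conn.executemany(
    "INSERT INTO ENROLLED VALUES (?, ?)", [(1, "C1"), (1, "C2"), (2, "C1")]
)

enrolments = conn.execute("""
    SELECT S.NAME, C.TITLE
    FROM ENROLLED E
    JOIN STUDENT S ON S.ROLL = E.ROLL
    JOIN COURSE  C ON C.COURSE_ID = E.COURSE_ID
    ORDER BY S.NAME, C.TITLE
""").fetchall()
print(enrolments)
```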
Case-02: For Binary Relationship With Cardinality Ratio 1:n

Here, two tables will be required-
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
Here, combined table will be drawn for the entity set B and relationship set R.

Case-03: For Binary Relationship With Cardinality Ratio m:1

Here, two tables will be required-


AR ( a1 , a2 , b1 )
B ( b1 , b2 )
Here, combined table will be drawn for the entity set A and relationship set R.

Case-04: For Binary Relationship With Cardinality Ratio 1:1

Here, two tables will be required. Combine 'R' with either 'A' or 'B'.

Way-01:

1. AR ( a1 , a2 , b1 )
2. B ( b1 , b2 )
Way-02:
1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )

Rule-06: For Binary Relationship With Both Cardinality Constraints and Participation
Constraints-

• Cardinality constraints will be implemented as discussed in Rule-05.
• Because of the total participation constraint, the foreign key acquires a NOT NULL constraint,
i.e. the foreign key cannot be null.

Case-01: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From One Side-

Because cardinality ratio = 1 : n , so we will combine the entity set B and relationship set R.
Then, two tables will be required-
A ( a1 , a2 )
BR ( a1 , b1 , b2 )
Because of total participation, the foreign key a1 acquires a NOT NULL constraint, so it can't be
null.
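A minimal sketch of Case-01, keeping the names A, BR, a1, a2, b1 and b2 from the schema above. The column types and sample values are assumptions, and Python's built-in sqlite3 module is used only to run the statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the reference

conn.execute("CREATE TABLE A (a1 INT PRIMARY KEY, a2 TEXT)")

# Entity set B combined with relationship set R; total participation of B
# means every BR row must point at some row of A, hence NOT NULL on a1
conn.execute("""
    CREATE TABLE BR (
        a1 INT NOT NULL REFERENCES A (a1),
        b1 INT PRIMARY KEY,
        b2 TEXT
    )
""")
conn.execute("INSERT INTO A VALUES (1, 'x')")

def accepted(a1, b1, b2):
    """Try to insert one BR row; return True if it is accepted."""
    try:
        conn.execute("INSERT INTO BR VALUES (?, ?, ?)", (a1, b1, b2))
        return True
    except sqlite3.IntegrityError:
        return False

with_owner = accepted(1, 10, "y")        # participates in R
without_owner = accepted(None, 11, "z")  # rejected: a1 cannot be NULL
print(with_owner, without_owner)
```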
Case-02: For Binary Relationship With Cardinality Constraint and Total Participation
Constraint From Both Sides-
If there is a key constraint from both the sides of an entity set with total participation, then that
binary relationship is represented using only single table.

Here, Only one table is required.


ARB ( a1 , a2 , b1 , b2 )

Rule-07: For Binary Relationship With Weak Entity Set-

Weak entity set always appears in association with identifying relationship with total
participation constraint.

Here, two tables will be required-

1. A ( a1 , a2 )
2. BR ( a1 , b1 , b2 )
