Unit I DBMS
Data: Known facts that can be recorded and that have implicit meaning.
E.g. student roll numbers, names, addresses, etc.
Database: Collection of inter-related data stored in a secondary storage device, organized
meaningfully for a specific purpose.
DBMS: A DBMS is a collection of interrelated data and a set of programs to access those data.
Database System: The database and the DBMS are collectively known as a database system.
Functions of a DBMS:
A DBMS makes it possible for users to create, edit and update data in database files. Once
created, the DBMS makes it possible to store and retrieve data from those database files. More
specifically, a DBMS provides the following functions:
Concurrency: concurrent access (meaning 'at the same time') to the same database by
multiple users
Security: security rules to determine access rights of users
Backup and recovery: processes to back-up the data regularly and recover data if a
problem occurs
Integrity: database structure and rules improve the integrity of the data
Data descriptions: a data dictionary provides a description of the data
Within an organization, the development of the database is typically controlled by
database administrators (DBAs) and other specialists. This ensures the database
structure is efficient and reliable.
Database systems arose in response to the shortcomings of early file-processing systems.
Applications of DBMS:
a) Banking
b) Airlines
c) Universities
d) Credit card transactions
e) Telecommunication
f) Finance
g) Sales
h) Manufacturing
i) Human resources
File processing system:
It is supported by a conventional operating system. It stores permanent records in various files, and it
needs different application programs to extract records from, and add records to, the appropriate files.
The file processing system has a number of disadvantages. They are:
1. Data Redundancy and Inconsistency
Redundancy:
Duplication of data in several files.
Leads to higher storage and access cost.
Inconsistency:
Changes made to one copy of the data are not reflected in all the other copies of the same data.
2. Difficulty in accessing data:
Does not allow needed data to be retrieved in a convenient and efficient manner.
3. Data Isolation:
Since data are scattered across files and the files are in different formats, writing new
application programs to retrieve the appropriate data is difficult.
4. Integrity Problem:
The data values stored in the database must satisfy certain types of consistency
constraints. The constraints are enforced by adding appropriate code to the programs.
When new constraints are added, it is difficult to change the programs to enforce them.
5. Atomicity Problem:
If a failure occurs, the data must be restored to the consistent state that existed prior
to the failure.
An operation must be atomic: it must happen entirely or not at all.
6. Concurrent Access Anomalies:
Multiple users may update the same data simultaneously, and such interaction may
result in inconsistent data.
7. Security Problems:
Not every user of the database system should be allowed to access all the data.
Enforcing such security constraints is difficult in a file processing system.
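The atomicity problem above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the account table, the account numbers and the transfer helper are assumptions made up for this illustration, not part of the notes.

```python
import sqlite3

# Hypothetical schema: one 'account' table holding a balance per account.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-215", 700)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT account_number, balance FROM account"))


ok_failed = transfer(conn, "A-101", "A-215", 100, fail_midway=True)
after_failure = balances(conn)   # the half-finished debit has been undone
ok_success = transfer(conn, "A-101", "A-215", 100)
after_success = balances(conn)
```

In a plain file processing system, the program itself would have to detect the failure and undo the first update; here the DBMS restores the prior consistent state.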
Advantages of DBMS:
1. Improved security: Database security is the protection of the database from unauthorized
access.
2. Improved data integrity: Database integrity refers to the validity and consistency of stored
data. Integrity is expressed in terms of constraints, which are consistency rules that the database
is not permitted to violate.
3. Data consistency: If a data item is stored more than once and the system is aware of this,
the system can ensure that all copies of the item are kept consistent.
4. Improved data accessibility and responsiveness: Many DBMSs provide a query language
that allows users to pose questions and obtain the required information at their terminals,
without the need for separate software.
5. Increased concurrency: Many DBMSs manage concurrent database access and ensure that the
data in the database remains consistent and valid.
6. Improved backup and recovery services: Modern DBMSs provide facilities to minimize the
amount of processing that is lost following a failure.
Disadvantages of DBMS:
1. Cost of DBMSs: The cost of DBMSs varies significantly, depending on the
environment and functionality provided.
2. Complexity and Size: The provision of this functionality makes a DBMS an extremely
complex piece of software. Failure to understand the system can lead to bad design
decisions.
3. Higher impact of a failure: The centralization of resources increases the vulnerability
of the system, since all users and applications rely on the availability of the DBMS.
4. Performance: A DBMS is written to be general, in order to support applications in all
domains. The effect is that some applications may not run as fast as they used to.
DATABASE TERMINOLOGIES
Forms display one record at a time
Reports give an organized way of presenting information.
Views of data
Levels of Abstraction
Figure 1.2 shows the three level architecture for database systems.
1. Physical level: describes how a record (e.g., customer) is stored.
2. Logical level: describes data stored in database, and the relationships among the data.
type customer = record
    name : string;
    street : string;
    city : string;
end;
3. View level: application programs hide details of data types. Views can also hide
information (e.g., salary) for security purposes.
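As a rough illustration of the view level, the sketch below defines a view that hides the salary attribute. It uses Python's sqlite3 module; the employee table, its columns and the sample row are assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full (hypothetical) employee relation, including salary.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, city TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 'Chennai', 52000)")

# View level: expose only the non-sensitive columns.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name, city FROM employee")

cur = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cur.description]  # column names visible through the view
view_rows = cur.fetchall()
```

A user who is given access only to employee_public can query names and cities but never sees the salary column.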
Two classes of languages
Procedural – user specifies what data is required and how to get those data
Nonprocedural – user specifies what data is required without specifying how
to get those data
SQL is the most widely used query language.
SQL (Structured Query Language)
SQL: a widely used non-procedural language
E.g. find the name of the customer with customer-id 192-83-7465
select customer.customer_name
from customer
where customer.customer_id = '192-83-7465'
E.g. find the balances of all accounts held by the customer with customer-id
192-83-7465
select account.balance
from depositor, account
where depositor.customer_id = '192-83-7465' and
      depositor.account_number = account.account_number
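The two queries can be tried out with a minimal sketch in Python's sqlite3 module. Column names use underscores in place of hyphens, because a hyphen is not valid in an unquoted SQL identifier. The sample rows (the name Johnson, the account numbers and the balances) are made-up data for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer  (customer_id TEXT PRIMARY KEY, customer_name TEXT);
    CREATE TABLE account   (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE depositor (customer_id TEXT, account_number TEXT);
    INSERT INTO customer  VALUES ('192-83-7465', 'Johnson');
    INSERT INTO account   VALUES ('A-101', 500), ('A-201', 900);
    INSERT INTO depositor VALUES ('192-83-7465', 'A-101'), ('192-83-7465', 'A-201');
    """
)

cid = "192-83-7465"

# Query 1: name of the customer with the given customer id.
names = conn.execute(
    "SELECT customer.customer_name FROM customer WHERE customer.customer_id = ?",
    (cid,),
).fetchall()

# Query 2: balances of all accounts held by that customer.
# ORDER BY is added only to make the result order predictable here.
balances = conn.execute(
    """
    SELECT account.balance
    FROM depositor, account
    WHERE depositor.customer_id = ?
      AND depositor.account_number = account.account_number
    ORDER BY account.balance
    """,
    (cid,),
).fetchall()
```

Both queries state only what data is wanted; how the rows are located is left to the DBMS, which is what makes SQL non-procedural.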
Application programs generally access databases through one of
Language extensions to allow embedded SQL
Application program interface (e.g. ODBC/JDBC) which allow SQL
queries to be sent to a database
Database Users
Users are differentiated by the way they expect to interact with the system
Application programmers – interact with system through DML calls
Sophisticated users – form requests in a database query language
Specialized users – write specialized database applications that do not fit into the
traditional data processing framework
Naïve users – invoke one of the permanent application programs that have been written
previously
E.g. people accessing database over the web, bank tellers, clerical staff
Database Administrator
Coordinates all the activities of the database system; the database administrator has a
good understanding of the enterprise’s information resources and needs.
Database administrator's duties include:
Schema definition
Storage structure and access method definition
Schema and physical organization modification
Granting user authority to access the database
Specifying integrity constraints
Acting as liaison with users
Monitoring performance and responding to changes in requirements
Data Independence
• Ability to modify a schema definition in one level without affecting a schema
definition in the other levels.
• The interfaces between the various levels and components should be well defined so
that changes in some parts do not seriously influence others.
• Two levels of data independence
– Physical data independence
It is the ability to modify the physical schema without causing
application programs to be rewritten. Modifications at the physical level are
occasionally necessary in order to improve performance.
– Logical data independence
It is the ability to modify the conceptual (logical) schema without
causing application programs to be rewritten. Modifications at the conceptual
level are necessary whenever the logical structure of the database is altered.
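Both kinds of data independence can be sketched with Python's sqlite3 module. The account table, the index name and the added branch_name column are assumptions chosen for this illustration. The "application program" is a fixed query string that is never edited.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-%03d" % i, i * 10) for i in range(1, 201)],
)

# The unchanged "application program".
QUERY = "SELECT account_number FROM account WHERE balance = 1500"

before = conn.execute(QUERY).fetchall()

# Physical-level change: add an index to speed up lookups by balance.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")
after_physical_change = conn.execute(QUERY).fetchall()

# Logical-level change: add a new attribute to the relation.
conn.execute("ALTER TABLE account ADD COLUMN branch_name TEXT")
after_logical_change = conn.execute(QUERY).fetchall()
```

The query returns the same answer after each change, so the program did not need to be rewritten. Not every logical change is this harmless: dropping or renaming a column the program uses would still break it.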
Instances and Schemas
Similar to types and variables in programming languages
Schema – the logical structure of the database
(e.g., the database consists of information about a set of customers and accounts
and the relationship between them)
Analogous to type information of a variable in a program
Physical schema: database design at the physical level
Logical schema: database design at the logical level
Instance – the actual content of the database at a particular point in time
Analogous to the value of a variable
System Catalog (Data Dictionary)
Data dictionary is the repository of information describing the data in the database,
that is the metadata or the “data about the data”.
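The difference between schema, instance and data dictionary can be seen in a small sketch with Python's sqlite3 module. SQLite keeps its catalog in the built-in sqlite_master table and exposes column metadata through PRAGMA table_info; the customer table and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT)")


def schema(conn):
    # Metadata: (column name, declared type) pairs read from the catalog.
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]


def instance(conn):
    # The actual content of the relation at this moment.
    return conn.execute("SELECT * FROM customer ORDER BY customer_id").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO customer VALUES ('192-83-7465', 'Johnson')")
schema_after, instance_after = schema(conn), instance(conn)

# The data dictionary itself can be queried like any other table.
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
```

Inserting a row changes the instance but leaves the schema untouched, just as assigning to a variable changes its value but not its type.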
DATA MODELS
A collection of tools for describing
data
data relationships
data semantics
data constraints
It provides a way to describe the design of a database at the physical, logical and view levels.
TYPES OF DATA MODELS
Explain the types of data models in detail.(16) (NOV/DEC 2014)
1. The Entity-Relationship Model
The entity-relationship (E-R) data model is based on a perception of the real world as
a collection of basic objects, called entities, and of relationships among these
objects.
Entity
An entity is a “thing” or “object” in the real world that is distinguishable from
other objects. For example, each person is an entity, and bank accounts can be
considered as entities.
Entities are described in a database by a set of attributes. For example, the attributes
account-number and balance may describe one particular account in a bank, and they
form attributes of the account entity set. Similarly, attributes customer-name,
customer-street and customer-city may describe a customer entity.
An extra attribute customer-id is used to uniquely identify customers (since it may be
possible to have two customers with the same name, street address, and city).
A unique customer identifier must be assigned to each customer. In the United States,
many enterprises use the social-security number of a person (a unique number the U.S.
government assigns to every person in the United States) as a customer identifier.
Relationship
A relationship is an association among several entities. For example, a depositor
relationship associates a customer with each account that she has. The set of all entities
of the same type and the set of all relationships of the same type are termed an entity
set and relationship set, respectively.
The overall logical structure (schema) of a database can be expressed graphically by an
E-R diagram.
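One way to picture entity sets and a relationship set is the following Python sketch. The class names mirror the customer and account entity sets described above; the concrete values (names, street, city, second customer id) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:  # entity set: customer
    customer_id: str
    customer_name: str
    customer_street: str
    customer_city: str


@dataclass(frozen=True)
class Account:  # entity set: account
    account_number: str
    balance: int


# Two customers with the same name, street and city are still distinct
# entities, because customer_id identifies each one uniquely.
c1 = Customer("192-83-7465", "Johnson", "Alma", "Palo Alto")
c2 = Customer("019-28-3746", "Johnson", "Alma", "Palo Alto")
a1 = Account("A-101", 500)

# Relationship set 'depositor': a set of (customer, account) associations.
depositor = {(c1, a1)}

accounts_of_c1 = [a for (c, a) in depositor if c == c1]
accounts_of_c2 = [a for (c, a) in depositor if c == c2]
```

The depositor set plays the role of the relationship diamond in an E-R diagram: each element links one customer entity to one account entity.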
2. Relational Model
The relational model uses a collection of tables to represent both data and the
relationships among those data.
Each table has multiple columns, and each column has a unique name.
The data is arranged in a relation which is visually represented in a two dimensional
table. The data is inserted into the table in the form of tuples (which are nothing but
rows).
A tuple is formed from one or more attributes, which are the basic building
blocks of the expressions used to derive meaningful information.
The relational model is implemented in a database as follows: a relation is represented by a
table, a tuple by a row, and an attribute by a column of the table. The attribute name is
the name of the column (such as ‘identifier’, ‘name’, ‘city’), and the attribute value is
the value of that column in a given row.
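The correspondence between relational terms and table terms can be seen directly in a short sketch with Python's sqlite3 module. The customer relation and its single tuple are assumed sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation -> table; attributes -> columns.
conn.execute("CREATE TABLE customer (identifier TEXT, name TEXT, city TEXT)")

# Tuple -> row.
conn.execute("INSERT INTO customer VALUES ('C-1', 'Hayes', 'Harrison')")

cur = conn.execute("SELECT * FROM customer")
attribute_names = [d[0] for d in cur.description]  # column names
first_tuple = cur.fetchone()                       # one row as a Python tuple

# Attribute name -> attribute value for this tuple.
as_mapping = dict(zip(attribute_names, first_tuple))
```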
3. Network Model
Account records are defined as
type account = record
    acc_number : string;
    balance : integer;
end;
In the network model, the two records are represented as
[Figure: network model representation of the two records]
4. Hierarchical Model
A hierarchical model consists of a collection of records that are connected to each other
through links. Records are organized as a collection of trees.
The hierarchical database model looks like an organizational chart or a family tree. It
has a single root segment (Employee) connected to lower level segments
(Compensation, Job Assignments, and Benefits). Each subordinate segment, in turn,
may connect to other subordinate segments. Here, Compensation connects to
Performance Ratings and Salary History. Benefits connects to Pension, Life Insurance,
and Health. Each subordinate segment is the child of the segment directly above it.
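The Employee example can be written down as a tree in a few lines of Python. The nested-dictionary representation and the parent_of helper are assumptions chosen for illustration; they are not how a hierarchical DBMS stores its segments.

```python
# Each segment maps to its child segments (segment names from the example above).
hierarchy = {
    "Employee": {
        "Compensation": {"Performance Ratings": {}, "Salary History": {}},
        "Job Assignments": {},
        "Benefits": {"Pension": {}, "Life Insurance": {}, "Health": {}},
    }
}


def parent_of(tree, target, parent=None):
    """Return the parent segment of `target`, or None for the root or if absent."""
    for name, children in tree.items():
        if name == target:
            return parent
        found = parent_of(children, target, name)
        if found is not None:
            return found
    return None


pension_parent = parent_of(hierarchy, "Pension")
compensation_parent = parent_of(hierarchy, "Compensation")
root_parent = parent_of(hierarchy, "Employee")
```

Every segment except the root has exactly one parent, which is the defining restriction of the hierarchical model.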
Other Data Models
The object-oriented data model is another data model that has seen increasing attention.
The object-oriented model can be seen as extending the E-R model with notions of
encapsulation, methods (functions), and object identity.
The object-relational data model combines features of the object-oriented data model and
relational data model.
Semistructured data models permit the specification of data where individual data items of
the same type may have different sets of attributes.
This is in contrast with the data models mentioned earlier, where every data item of a
particular type must have the same set of attributes.
The extensible markup language (XML) is widely used to represent semistructured data.
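A short sketch of semistructured data, using Python's standard xml.etree.ElementTree parser. The XML document, its element names and its values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Two items of the same type (customer) with different sets of attributes.
doc = """
<customers>
  <customer><name>Johnson</name><city>Palo Alto</city></customer>
  <customer><name>Hayes</name><phone>555-0100</phone><email>hayes@example.com</email></customer>
</customers>
""".strip()

root = ET.fromstring(doc)

# For each customer, collect the names of the child elements it actually has.
attribute_sets = [
    sorted(child.tag for child in customer) for customer in root.findall("customer")
]
```

In a relational table both customers would need the same columns, with NULLs filling the gaps; here each item simply carries the fields it has.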
COMPONENTS OF DBMS
Explain the components of database in detail.(16)
A database system is partitioned into modules that deal with each of the responsibilities
of the overall system.
The functional components of the database system are,
Storage Manager
Query Processor
1. Storage Manager:
It is the component of a database system that provides the interface between the
low-level data stored in the database and the application programs and queries
submitted to the system.
The storage manager translates the various DML statements into low-level file system
commands.
It is responsible for the interaction with the file manager.
Thus, the storage manager is responsible for storing, retrieving and updating data
in the database.
Components of storage manager are,
Authorization & Integrity Manager
File Manager
Buffer Manager
Transaction Manager
Transaction Manager
It ensures that database remains in a consistent state despite system failure and
the concurrent transaction executions proceed without conflicting.
File Manager
It manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
Buffer Manager
It is responsible for fetching data from disk storage into main memory and
deciding what data to cache in main memory.
The storage manager implements several data structures, such as:
Data files: Store the database itself.
Data dictionary: Stores the metadata about the structure of the database (i.e., the schema
of the database).
Indices: Provide fast access to data items. A database index provides pointers to those
data items that hold a particular value.
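The effect of an index can be observed in a sketch with Python's sqlite3 module. EXPLAIN QUERY PLAN is SQLite's own facility for showing the plan it has chosen; the exact wording of its output varies between SQLite versions, so the code only inspects it for the index name. The table, data and index name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-%03d" % i, i) for i in range(1000)],
)


def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


QUERY = "SELECT balance FROM account WHERE account_number = 'A-500'"

plan_without_index = plan(conn, QUERY)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_account_number ON account (account_number)")
plan_with_index = plan(conn, QUERY)     # expected to mention the new index
```

The query text is identical in both cases; only the access path chosen by the system changes once the index exists.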
2. Query processor:
It helps the database system simplify and facilitate access to data. The components
of the query processor are:
DDL Interpreter
Interprets DDL statements and records the definitions in data dictionary.
DML Compiler
Translates DML statements into an evaluation plan consisting of low-level
instructions that the query evaluation engine understands. It also performs query
optimization (i.e., it picks the lowest-cost evaluation plan from among the alternatives).
Query Evaluation Engine
Executes low-level instructions generated by the DML compiler.
Database systems can be centralized or client-server. Based on this, database applications are
partitioned into
Three-tier architecture
The client machine acts merely as a front end and does not contain any direct
database calls. The client end communicates with an application server through a
forms interface, and the application server in turn communicates with the database
system to access data. E.g. web-based applications, and applications built using
“middleware”.
D. Descriptive attributes:
An attribute attached to a relationship set is called a descriptive attribute.
Figure 14: ER diagram with descriptive attribute
NULL value:
The attribute takes a NULL value, when an entity does not have a value for it.
Constraints
o Mapping cardinalities
o Participation constraints
o Keys
Mapping cardinalities:
A mapping cardinality is a data constraint that specifies how many entities
an entity can be related to in a relationship set.
Types of mapping cardinalities are
- one to one
- one to many
- many to one
- many to many
Consider a binary relationship set R on entity sets A and B. There are four
possible mapping cardinalities in this case:
1. one-to-one (MAY/JUNE 2013)
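An entity in A is related to at most one entity in B, and an entity in B is related
to at most one entity in A.
EXAMPLE
2. one-to-many
An entity in A is related to any number of entities in B, but an entity in B is related to
at most one entity in A.
EXAMPLE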
3. many-to-one
An entity in A is related to at most one entity in B, but an entity in B is related to
any number of entities in A.
EXAMPLE
4. many-to-many
An entity in A is related to any number of entities in B, and an entity in B is
related to any number of entities in A.
EXAMPLE
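The four cases can be told apart mechanically. The sketch below is a hypothetical helper that takes one instance of a binary relationship set, given as (a, b) pairs, and reports the tightest mapping cardinality that instance is consistent with. A real cardinality constraint is declared on the schema and holds for all instances, so this is only an illustration of the definitions.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (a, b) pairs from A x B."""
    pairs = set(pairs)
    per_a = Counter(a for a, _ in pairs)  # number of B entities each a relates to
    per_b = Counter(b for _, b in pairs)  # number of A entities each b relates to
    a_at_most_one = all(n <= 1 for n in per_a.values())
    b_at_most_one = all(n <= 1 for n in per_b.values())
    if a_at_most_one and b_at_most_one:
        return "one-to-one"
    if b_at_most_one:
        return "one-to-many"   # an a may relate to many b; each b to at most one a
    if a_at_most_one:
        return "many-to-one"   # each a relates to at most one b
    return "many-to-many"


one_to_one = mapping_cardinality([("a1", "b1"), ("a2", "b2")])
one_to_many = mapping_cardinality([("a1", "b1"), ("a1", "b2")])
many_to_one = mapping_cardinality([("a1", "b1"), ("a2", "b1")])
many_to_many = mapping_cardinality([("a1", "b1"), ("a1", "b2"), ("a2", "b1")])
```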
Participation Constraints
1. Total Participation:
The participation of an entity set E in a relationship set R is said to be total if
every entity in E participates in at least one relationship in R.
2. Partial Participation:
If only some entities in E participate in relationships in R, then the participation
of entity set E in relationship set R is said to be partial.
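The two definitions translate into a simple set comparison. In this sketch the participation function, the loan and customer entity sets, and the borrower pairs are all assumed examples.

```python
def participation(entity_set, relationship_set, position=0):
    """Return 'total' if every entity appears in at least one relationship, else 'partial'.

    `position` says which component of each relationship tuple holds the entity.
    """
    participating = {rel[position] for rel in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


loans = {"L-17", "L-23"}
customers = {"Jones", "Smith", "Hayes"}
borrower = {("Jones", "L-17"), ("Smith", "L-23")}  # (customer, loan) pairs

loan_participation = participation(loans, borrower, position=1)
customer_participation = participation(customers, borrower, position=0)
```

Every loan appears in borrower, so loan participates totally; Hayes has no loan, so customer participates only partially.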
5. Referential integrity:
It requires that the values appearing in the specified attributes of any tuple in the
referencing relation also appear in the specified attributes of at least one tuple in the
referenced relation.
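Referential integrity is usually declared as a foreign key. The sketch below uses Python's sqlite3 module, where foreign-key enforcement has to be switched on with PRAGMA foreign_keys. The branch and account tables and their rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY)")  # referenced relation
conn.execute(
    """
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        branch_name    TEXT REFERENCES branch (branch_name)
    )
    """
)  # referencing relation

conn.execute("INSERT INTO branch VALUES ('Perryridge')")
conn.execute("INSERT INTO account VALUES ('A-101', 'Perryridge')")  # referenced value exists

try:
    conn.execute("INSERT INTO account VALUES ('A-102', 'Nowhere')")  # no such branch
    violation_rejected = False
except sqlite3.IntegrityError:
    violation_rejected = True

account_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

The second insert is refused because 'Nowhere' does not appear in any tuple of the referenced relation.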
The edge between loan and borrower has a cardinality constraint of 1..1, which means that each
loan must have exactly one associated customer. The limit 0..* indicates that a customer can have zero or
more loans.
Identifying Relationship
The relationship associating the weak entity set with the identifying entity set is
called the identifying relationship.
EXTENDED E-R FEATURES
The E-R model extended with additional semantic concepts is called the
extended entity-relationship model, or EER model. The EER model includes all the concepts of
the original E-R model together with the following additional concepts.
Specialization
Generalization
Attribute inheritance
Aggregation
1. Specialization
An entity set may include subgroupings of entities that are distinct in some way
from other entities in the set.
The process of designating subgroupings within an entity set is called
specialization.
2. Generalization
It is a containment relationship that exists between a higher-level entity set and one or
more lower-level entity sets.
Attributes that are conceptually the same may have different names in the two lower-level
entity sets. To create a generalization, such attributes must be given a common name.
Constraints on generalization:
Constraint 1:
Condition defined
User defined
These two determine which entities can be members of a given lower-level entity set.
Condition defined
In condition-defined lower-level entity sets, membership is evaluated on the basis of whether or
not an entity satisfies an explicit condition or predicate.
Ex: Suppose the higher-level entity set account has the attribute account-type. The
account entities are evaluated based on account-type. Entities that satisfy account-type
= “saving” belong to the lower-level entity set saving account. Entities that satisfy
account-type = “checking” belong to the lower-level entity set checking account. Since
all the lower-level entities are evaluated on the basis of the same attribute, this type of
generalization is called attribute defined.
User defined
User-defined lower-level entity sets are not constrained by a membership condition; the database
user assigns entities to a given entity set.
Constraint 2:
Disjoint
Overlapping
Disjoint:
A disjointness constraint requires that an entity belong to no more than one lower-level
entity set.
Ex: An account entity can satisfy only one condition: it is either a saving account or a checking
account, but cannot be both.
Overlapping:
The same entity may belong to more than one lower-level entity set within a
single generalization.
Ex: Generalization applied to customer and employee leads to the higher-level entity
set person; a person may be both a customer and an employee.
Constraint 3:
Total generalization or specialization:
Each higher-level entity must belong to a lower-level entity set.
Partial generalization or specialization:
Some higher-level entities may not belong to any lower-level entity set.
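The disjoint/overlapping and total/partial constraints can be checked on sample data with plain set operations. The check_generalization helper and the entity identifiers below are hypothetical, introduced only to illustrate the definitions.

```python
def check_generalization(higher, lower_sets):
    """Return (disjoint, total) for one generalization.

    higher: set of higher-level entities.
    lower_sets: dict mapping each lower-level entity set name to its members.
    """
    members = [e for s in lower_sets.values() for e in s]
    disjoint = len(members) == len(set(members))  # no entity sits in two lower-level sets
    total = set(higher) <= set(members)           # every higher-level entity is covered
    return disjoint, total


# account -> saving / checking: each account is exactly one of the two.
accounts = {"A-1", "A-2", "A-3"}
account_result = check_generalization(
    accounts, {"saving": {"A-1"}, "checking": {"A-2", "A-3"}}
)

# person -> customer / employee: p2 is both, p3 is neither.
persons = {"p1", "p2", "p3"}
person_result = check_generalization(
    persons, {"customer": {"p1", "p2"}, "employee": {"p2"}}
)
```

The account data is disjoint and total, while the person data is overlapping and partial.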
3. Attribute Inheritance:
The attributes of the higher-level entity set are said to be inherited by the lower-level entity
sets. Ex: Customer and employee inherit the attributes of person.
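Attribute inheritance maps naturally onto class inheritance. In this Python sketch the attribute names (name, street, city, credit_rating, salary) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:  # higher-level entity set
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):  # lower-level entity set: inherits name, street, city
    credit_rating: int


@dataclass
class Employee(Person):  # lower-level entity set: inherits name, street, city
    salary: int


customer_attributes = [f.name for f in fields(Customer)]
employee_attributes = [f.name for f in fields(Employee)]
```

Each lower-level entity set has all the attributes of person plus its own, without redeclaring the inherited ones.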
4. Aggregation:
A limitation of the E-R model is that it cannot express relationships among relationships.
One alternative for representing such a relationship is to create a quaternary relationship.
Aggregation is an abstraction through which relationships are treated as higher-level
entities.
EXAMPLE: ER diagram with redundant relationship