Lecture Note 420507181042070-Copy Export
1. Introduction
2. Disadvantages of file oriented approach
3. Database
4. Why Database
5. Database Management System(DBMS)
6. Function of DBMS
7. Advantages of DBMS and disadvantage of DBMS
8. Database Basics
9. Three level architecture of DBMS
10. Database users
11. Database language
12. Database structure
Introduction:
What is data:
Data are known facts or figures that have implicit meaning. Data can also be defined as
the representation of facts, concepts or instructions in a formal manner that is suitable
for understanding and processing. Data can be represented using letters (A-Z, a-z),
digits (0-9) and special characters (+, -, #, $, etc.).
e.g.: 25, “ajit”, etc.
Information:
Information is the processed data on which decisions and actions are based. Information
can be defined as the organized and classified data to provide meaningful values.
File:
A file is a collection of related data stored in secondary memory.
Prepared by: Dr. Subhendu Kumar Rath, BPUT.
3) Data isolation :
Because data are scattered across various files, and the files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.
4) Integrity Problems:
Developers enforce data validation in the system by adding appropriate code to
the various application programs. However, when new constraints are added, it is
difficult to change the programs to enforce them.
5) Atomicity:
Database:
A database is an organized collection of related data of an organization, stored in a
formatted way and shared by multiple users.
The main features of data in a database are:
Persistent:
Data in a database is persistent if it is removed only due to an explicit request from a user to remove it.
Integrated:
A database can be a collection of data from different files. When the redundancy
among those files is removed, the data in the database is said to be integrated.
Sharing Data:
The data stored in the database can be shared by multiple users simultaneously without
affecting the correctness of the data.
Why Database:
In order to overcome the limitations of the file system, a new approach was required.
Hence the database approach emerged. A database is a persistent collection of logically
related data. The initial attempts were to provide a centralized collection of data. A
database has a self-describing nature: it contains not only the data but also a description
of the data, and it allows the sharing and integration of an organization's data in a
single database.
A small database can be handled manually, but a large database with multiple users is
difficult to maintain by hand. In that case a computerized database is useful.
The advantages of a database system over traditional, paper-based methods of record
keeping are:
Compactness:
No need for large volumes of paper files.
Speed:
The machine can retrieve and modify data much faster than a human being.
Less drudgery: Much of the maintenance of files by hand is eliminated.
Accuracy: Accurate, up-to-date information is fetched as per the requirement of the
user at any time.
Function of DBMS:
1. Defining the database schema: The DBMS must provide a facility for defining the database
structure and for specifying access rights for authorized users.
2. Manipulation of the database: The DBMS must provide functions such as insertion of
records into the database, updating of data, deletion of data and retrieval of data.
3. Sharing of the database: The DBMS must share data items among multiple users while
maintaining consistency of the data.
4. Protection of the database: It must protect the database against unauthorized users.
5. Database recovery: If the system fails for any reason, the DBMS must facilitate
database recovery.
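The first two functions can be sketched with Python's built-in sqlite3 module. This is only an illustration: the emp table, its columns and the sample rows are assumed for the example, and SQLite has no user accounts, so the granting of access rights is not shown.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")

# 1. Defining the database schema (data definition).
conn.execute("CREATE TABLE emp (empcode TEXT PRIMARY KEY, ename TEXT, city TEXT)")

# 2. Manipulating the database: insertion, update, deletion, retrieval.
conn.execute("INSERT INTO emp VALUES ('e101', 'sumit', 'bbsr')")
conn.execute("INSERT INTO emp VALUES ('e102', 'ajit', 'cuttack')")
conn.execute("UPDATE emp SET city = 'puri' WHERE empcode = 'e101'")
conn.execute("DELETE FROM emp WHERE empcode = 'e102'")
rows = conn.execute("SELECT empcode, ename, city FROM emp").fetchall()
print(rows)
```

After the update and the deletion, only the modified e101 record remains in the table.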
Advantages of dbms:
Reduction of redundancies:
Centralized control of data by the DBA avoids unnecessary duplication of data and
effectively reduces the total amount of data storage required. Avoiding duplication also
eliminates the inconsistencies that tend to be present in redundant data files.
Sharing of data:
A database allows the sharing of data under its control by any number of application
programs or users.
Data Integrity:
Data integrity means that the data contained in the database is both accurate and
consistent. Therefore, data values being entered for storage can be checked to ensure
that they fall within a specified range and are of the correct format.
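A range check of this kind can be sketched with a CHECK constraint, here using Python's sqlite3 module. The person table and the age limits (0 to 150) are assumptions made for the example, not values taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK clause asks the DBMS itself to reject out-of-range ages.
conn.execute(
    "CREATE TABLE person ("
    " person_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age >= 0 AND age <= 150))"
)


def try_insert(name, age):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("sumit", 30)   # within the allowed range
rejected = try_insert("kunal", -5)   # violates the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Because the rule lives in the schema, every application that inserts into the table is subject to the same check.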
Data Security:
The DBA, who has the ultimate responsibility for the data in the DBMS, can ensure that
proper access procedures are followed, including proper authentication schemes for access
to the DBMS and additional checks before permitting access to sensitive data.
Conflict resolution:
The DBA resolves conflicting requirements of various users and applications. The DBA
chooses the best file structure and access method to get optimal performance for the
application.
Data Independence:
Data independence is usually considered from two points of view: physical data
independence and logical data independence.
Physical data independence allows changes in the physical storage devices or the
organization of the files to be made without requiring changes in the conceptual view or
any of the external views, and hence in the application programs using the database.
Logical data independence indicates that the conceptual schema can be changed without
affecting the existing external schemas or any application program.
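Logical data independence can be sketched with a view that stands between the application and the base table. In this illustration (sqlite3, with assumed table, view and column names) a new column is added to the base table while the application's query against the view stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empcode TEXT PRIMARY KEY, ename TEXT)")
conn.execute("INSERT INTO emp VALUES ('e101', 'sumit')")

# External view used by the application (hypothetical name emp_names).
conn.execute("CREATE VIEW emp_names AS SELECT empcode, ename FROM emp")


def application_query():
    """The application only ever reads through the view."""
    return conn.execute("SELECT empcode, ename FROM emp_names").fetchall()


before = application_query()

# Change at the conceptual level: a new field is added to the record.
conn.execute("ALTER TABLE emp ADD COLUMN city TEXT")

after = application_query()
```

The query text and its result are unchanged by the schema modification, which is the behaviour that logical data independence describes.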
Disadvantage of DBMS:
1. The cost of DBMS software and hardware (networking, installation) is high.
2. The DBMS adds processing overhead to implement security, integrity
and sharing of the data.
3. Centralized database control.
4. Setting up the database system requires more knowledge, money, skills and time.
5. The complexity of the database may result in poor performance.
Database Basics:
Data item:
The data item, also called a field in data processing, is the smallest unit of data
that has meaning to its users.
Eg: “e101”, “sumit”
An entity is a thing or object in the real world that is distinguishable from all other
objects.
Eg:
Bank, employee, student
Attributes are properties of an entity.
Eg:
Empcode, ename, rollno, name
Logical data are the data of a table created by the user in primary memory.
Physical data refers to the data stored in secondary memory.
A subschema is a schema derived from an existing schema as per the user's
requirements. There may be more than one subschema created for a single conceptual
schema.
[Figure: three-level architecture, showing the conceptual level (conceptual view), the internal level, and the mapping supplied by the DBMS]
A database management system that provides three levels of data abstraction is said to
follow the three-level architecture:
External level
Conceptual level
Internal level
External level :
The external level is the highest level of database abstraction. At this level, there will
be many views defined for different users' requirements. A view describes only a subset
of the database. Any number of user views may exist for a given global or conceptual schema.
For example, each student has a different view of the timetable: the view of a student of
B.Tech (CSE) is different from the view of a student of B.Tech (ECE). Thus this level of
abstraction is concerned with different categories of users.
Each external view is described by means of a schema called an external schema or
subschema.
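The timetable example can be sketched as two views over one stored table. The table layout, the view names and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One stored table holds the complete timetable.
conn.execute("CREATE TABLE timetable (branch TEXT, day TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO timetable VALUES (?, ?, ?)",
    [("CSE", "Mon", "DBMS"), ("CSE", "Tue", "Algorithms"), ("ECE", "Mon", "Signals")],
)

# Each category of user gets its own external view of the same data.
conn.execute(
    "CREATE VIEW cse_timetable AS "
    "SELECT day, subject FROM timetable WHERE branch = 'CSE'"
)
conn.execute(
    "CREATE VIEW ece_timetable AS "
    "SELECT day, subject FROM timetable WHERE branch = 'ECE'"
)

cse_rows = conn.execute(
    "SELECT day, subject FROM cse_timetable ORDER BY subject"
).fetchall()
ece_rows = conn.execute(
    "SELECT day, subject FROM ece_timetable ORDER BY subject"
).fetchall()
```

Each view exposes only the subset of the database that is relevant to one group of students.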
Conceptual level :
At this level of database abstraction, all the database entities and the
relationships among them are included. One conceptual view represents the entire
database. This conceptual view is defined by the conceptual schema.
The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations and constraints.
It describes all the records and relationships included in the conceptual view.
There is only one conceptual schema per database. It includes features that specify the
checks needed to retain data consistency and integrity.
Internal level :
It is the lowest level of abstraction closest to the physical storage method used .
It indicates how the data will be stored and describes the data structures and access
methods to be used by the database . The internal view is expressed by internal schema.
The following aspects are considered at this level:
1. Storage allocation, e.g. B-tree, hashing.
2. Access paths, e.g. specification of primary and secondary keys, indexes, etc.
3. Miscellaneous, e.g. data compression and encryption techniques, optimization of
the internal structures.
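An access path is an internal-level choice, so adding one should not change what a query returns. The sketch below assumes a small emp table and a hypothetical index name, idx_emp_city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empcode TEXT, ename TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("e101", "sumit", "bbsr"), ("e102", "ajit", "puri"), ("e103", "kunal", "bbsr")],
)

query = "SELECT empcode FROM emp WHERE city = 'bbsr' ORDER BY empcode"
before = conn.execute(query).fetchall()

# Internal-level change: add an access path on the city column.
conn.execute("CREATE INDEX idx_emp_city ON emp (city)")

after = conn.execute(query).fetchall()
```

Only the way the rows are located may change; the query and its answer stay the same.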
Database users :
Naive users :
Users who need not be aware of the presence of the database system or any other
system supporting their usage are considered naive users. A user of an automatic teller
machine falls into this category.
Online users :
These are users who may communicate with the database directly via an online
terminal or indirectly via a user interface and application program. These users are
aware of the database system and also know the data manipulation language of the system.
Application programmers :
Database Administration :
A person who has central control over the system is called the database administrator (DBA).
The functions of the DBA are:
1. Creation and modification of the conceptual schema definition.
2. Implementation of storage structures and access methods.
3. Schema and physical organization modifications.
4. Granting of authorization for data access.
5. Integrity constraints specification.
6. Executing immediate recovery procedures in case of failures.
7. Ensuring physical security of the database.
Database language :
Elements of DBMS:
DML pre-compiler:
DDL compiler:
The DDL compiler converts the data definition statements into a set of tables. These
tables contain information concerning the database and are in a form that can be used by
other components of the DBMS.
File manager:
File manager manages the allocation of space on disk storage and the data structure used
to represent information stored on disk.
Database manager:
A database manager is a program module which provides the interface between the low
level data stored in the database and the application programs and queries submitted to
the system.
The responsibilities of database manager are:
1. Interaction with the file manager: The data is stored on disk using the file
system provided by the operating system. The database manager translates
the different DML statements into low-level file system commands, so the
database manager is responsible for the actual storing, retrieving and updating
of data in the database.
2. Integrity enforcement: The data values stored in the database must satisfy
certain constraints (e.g. the age of a person cannot be less than zero). These
constraints are specified by the DBA. The database manager checks the constraints
and stores the data in the database only if they are satisfied.
3. Security enforcement: The database manager enforces the security measures that
protect the database from unauthorized users.
4. Backup and recovery: The database manager detects failures that occur due to
different causes (such as disk failure, power failure, deadlock or software errors) and
restores the database to its original state.
5. Concurrency control:When several users access the same database file
simultaneously, there may be possibilities of data inconsistency. It is
[Figure: DBMS structure, showing the database manager, file manager, data files and data dictionary]
Answer: Some main differences between a database management system and a file-
processing system are:
• Both systems contain a collection of data and a set of programs which access that
data. A database management system coordinates both the physical and the logical
access to the data, whereas a file-processing system coordinates only the physical
access.
• A database management system reduces the amount of data duplication by
ensuring that a physical piece of data is available to all programs authorized to
have access to it, whereas data written by one program in a file-processing system
may not be readable by another program.
• A database management system is designed to allow flexible access to data (i.e.,
queries), whereas a file-processing system is designed to allow predetermined
access to data (i.e., compiled programs).
• A database management system is designed to coordinate multiple users accessing
the same data at the same time. A file-processing system is usually designed to
allow one or more programs to access different data files at the same time. In a
file-processing system, a file can be accessed by two programs concurrently only
if both programs have read-only access to the file.
Answer:
• Physical data independence is the ability to modify the physical scheme without
making it necessary to rewrite application programs. Such modifications include
changing from unblocked to blocked record storage, or from sequential to random
access files.
• Logical data independence is the ability to modify the conceptual scheme without
making it necessary to rewrite application programs. Such a modification might
be adding a field to a record; an application program’s view hides this change
from the program.
Q. List six major steps that you would take in setting up a database for a particular
enterprise.
Answer: Six major steps in setting up a database for a particular enterprise are:
• Define the high level requirements of the enterprise (this step generates a
document known as the system requirements specification.)
• Define a model containing all appropriate types of data and data
relationships.
• Define the integrity constraints on the data.
• Define the physical level.
• For each known problem to be solved on a regular basis (e.g., tasks to be
carried out by clerks or Web users) define a user interface to carry out the
task, and write the necessary application programs to implement the user
interface.
• Create/initialize the database.
EXERCISES:
CHAPTER-2
ER-MODEL
Data model:
The data model describes the structure of a database. It is a collection of conceptual tools
for describing data, data relationships and consistency constraints. The various types of
data model are:
1. Object based logical model
2. Record based logical model
3. Physical model
The entity-relationship data model perceives the real world as consisting of basic objects,
called entities, and relationships among these objects. It was developed to facilitate
database design by allowing the specification of an enterprise schema, which represents the
overall logical structure of a database.
Main features of ER-MODEL:
• Entity relationship model is a high level conceptual model
• It allows us to describe the data involved in a real world enterprise in terms of
objects and their relationships.
• It is widely used to develop an initial design of a database
• It provides a set of useful concepts that make it convenient for a developer to
move from a basic set of information to a detailed description of information
that can be easily implemented in a database system
• It describes data as a collection of entities, relationships and attributes.
Basic concepts:
The E-R data model employs three basic notions : entity sets, relationship sets and
attributes.
Entity sets:
An entity is a “thing” or “object” in the real world that is distinguishable from all other
objects. For example, each person in an enterprise is an entity. An entity has a set of
properties, and the values of some set of properties may uniquely identify an entity.
For example, BOOK is an entity and its properties (called attributes) are bookcode, booktitle, price, etc.
An entity set is a set of entities of the same type that share the same properties, or
attributes. The set of all persons who are customers at a given bank, for example, can be
defined as the entity set customer.
Attributes:
Customer is an entity and its attributes are customerid, customername, custaddress, etc.
An attribute, as used in the E-R model, can be characterized by the following attribute
types.
a) Simple and composite attributes:
Simple attributes are attributes that cannot be divided into subparts.
eg: customerid, empno
Composite attributes are attributes that can be divided into subparts.
eg: name consisting of first name, middle name, last name
address consisting of city, pincode, state
b) Single-valued and multi-valued attributes:
An attribute that has a single value for a particular entity is a single-valued attribute.
eg: empno, customerid, regdno, etc.
An attribute that can have more than one value is a multi-valued attribute.
eg: phone-no, dependent name, vehicle
c) Derived attribute:
The value of this type of attribute can be derived from the values of existing
attributes.
eg: age, which can be derived from (currentdate - birthdate)
experience_in_year, which can be calculated as (currentdate - joindate)
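A derived attribute such as age need not be stored, because it can be computed on demand from stored attributes. The helper below is a hypothetical sketch; the function name and the sample dates are assumed.

```python
from datetime import date


def years_between(start, end):
    """Whole years elapsed from start to end (e.g. birthdate to current date)."""
    years = end.year - start.year
    # Subtract one year if the anniversary has not yet been reached in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


birthdate = date(2000, 6, 15)
age_before_birthday = years_between(birthdate, date(2024, 6, 14))
age_on_birthday = years_between(birthdate, date(2024, 6, 15))
```

The same function would serve for experience_in_year by passing joindate as the start date.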
Relationship sets:
A relationship is an association among several entities.
A relationship set is a set of relationships of the same type. Formally, it is a mathematical
relation on n >= 2 entity sets. If E1, E2, …, En are entity sets, then a relationship set R is a
subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship.
Consider the two entity sets customer and loan. We define the relationship set borrow to
denote the association between customers and the bank loans that the customers have.
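The formal definition can be checked directly: a relationship set is any subset of the Cartesian product of the participating entity sets. The entity identifiers below (c1, l1, and so on) are made up for the illustration.

```python
from itertools import product

# Two small entity sets, identified here by made-up keys.
customer = {"c1", "c2", "c3"}
loan = {"l1", "l2"}

# All possible (customer, loan) pairs: the Cartesian product customer x loan.
all_pairs = set(product(customer, loan))

# The relationship set borrow records which customers actually hold which loans.
borrow = {("c1", "l1"), ("c2", "l1"), ("c2", "l2")}

# borrow is a valid relationship set because it is a subset of the product.
is_valid = borrow <= all_pairs
```

A pair written in the wrong order, such as ("l1", "c1"), is not in the product and so could not belong to borrow.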
Mapping Cardinalities:
Mapping cardinalities or cardinality ratios, express the number of entities to which
another entity can be associated via a relationship set.
Mapping cardinalities are most useful in describing binary relationship sets, although they
can contribute to the description of relationship sets that involve more than two entity
sets.
For a binary relationship set R between entity sets A and B, the mapping cardinalities
must be one of the following:
one to one:
One to many:
Many to one:
An entity in A is associated with at most one entity in B. An entity in B is associated with
any number of entities in A.
[Figure: Course (1) - Teaches - Faculty (M)]
Many-to-many:
Entities in A and B are associated with any number of entities from each other.
[Figure: Customer (1) - Deposit - Account (M)]
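The four cases can be told apart by counting, for a given set of (a, b) pairs, how many partners each entity has. The function below is a hypothetical sketch that classifies one particular instance of a relationship set; a mapping cardinality is really a constraint on all possible instances, so this only shows what a given instance is consistent with.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs."""
    b_per_a = defaultdict(set)  # entities of B associated with each entity of A
    a_per_b = defaultdict(set)  # entities of A associated with each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"   # some entity in A has several partners in B
    if b_has_many:
        return "many-to-one"   # some entity in B has several partners in A
    return "one-to-one"
```

For example, mapping_cardinality({("a1", "b1"), ("a2", "b1")}) reports many-to-one, since b1 is associated with two entities of A while each entity of A has only one partner.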
• The weak entity set must have total participation in the identifying relationship.
Example:
Consider the entity type dependent, related to the employee entity, which is used to keep
track of the dependents of each employee. The attributes of dependent are: name,
birthdate, sex and relationship. Each employee entity is said to own the
dependent entities that are related to it. However, note that a ‘dependent’ entity does
not exist on its own; it is dependent on the employee entity. In other words, if an
employee leaves the organization, the dependents related to that employee cannot exist
without the ‘employee’ entity. Thus dependent is a weak entity.
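One common way to realize this dependency in a relational database is a foreign key with ON DELETE CASCADE, sketched below with sqlite3. The table layout and the sample names are assumptions, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE employee (empcode TEXT PRIMARY KEY, ename TEXT)")

# The dependent's key includes the owner's key; it cannot exist without an employee.
conn.execute(
    "CREATE TABLE dependent ("
    " empcode TEXT NOT NULL REFERENCES employee(empcode) ON DELETE CASCADE,"
    " name TEXT NOT NULL,"
    " relationship TEXT,"
    " PRIMARY KEY (empcode, name))"
)

conn.execute("INSERT INTO employee VALUES ('e101', 'sumit')")
conn.execute("INSERT INTO dependent VALUES ('e101', 'ria', 'daughter')")

# When the employee leaves, the dependents are removed along with it.
conn.execute("DELETE FROM employee WHERE empcode = 'e101'")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```

The composite primary key (empcode, name) reflects that a dependent is identified only in combination with its owning employee.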
Keys:
Super key:
A super key is a set of one or more attributes that, taken collectively, allow us to
uniquely identify an entity in the entity set.
For example: customer-id, (cname, customer-id), (cname, telno)
Candidate key:
In a relation R, a candidate key for R is a subset of the set of attributes of R which
has the following properties:
• Uniqueness: No two distinct tuples in R have the same values for
the candidate key.
• Irreducibility: No proper subset of the candidate key has the
uniqueness property.
Eg: (cname,telno)
Primary key:
The primary key is the candidate key that is chosen by the database designer as the
principal means of identifying entities within an entity set. The remaining candidate
keys, if any, are called alternate keys.
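The uniqueness and irreducibility properties can be tested mechanically on sample data. The functions and the customers list below are hypothetical; note that a key is a property of the schema, so passing on one sample only shows that the data does not contradict the claim.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of it is one."""
    if not is_super_key(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_super_key(rows, subset):
                return False
    return True


customers = [
    {"customer_id": 1, "cname": "ajit", "telno": "111"},
    {"customer_id": 2, "cname": "ajit", "telno": "222"},
    {"customer_id": 3, "cname": "sumit", "telno": "111"},
]
```

On this sample, (cname, customer_id) is a super key but not a candidate key, because customer_id alone is already unique, while (cname, telno) is a candidate key.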
ER-DIAGRAM:
The overall logical structure of a database designed using the ER model can be expressed
graphically with the help of an ER diagram.
Symbols used in ER diagrams:
[Figure: ER diagram symbols for entity, weak entity, attribute, composite attribute and relationship, with 1:m and 1:1 cardinality labels]
RELATIONAL MODEL
The relational model is a simple model in which a database is represented as a
collection of “relations”, where each relation is represented by a two-dimensional table.
The relational model was proposed by E. F. Codd of IBM in 1970. The basic concept in
the relational model is that of a relation.
Properties:
o It is column homogeneous. In other words, in any given column of a table, all
items are of the same kind.
o Each item is a simple number or a character string. That is, a table must be in first
normal form.
o All rows of a table are distinct.
o The ordering of rows within a table is immaterial.
o The columns of a table are assigned distinct names and the ordering of these
columns is immaterial.
Relational schema:
A relational schema specifies the relation's name, its attributes and the domain of each
attribute. If R is the name of a relation and A1, A2, …, An is a list of attributes
representing R, then R(A1, A2, …, An) is called a relational schema. Each attribute Ai in
this relational schema takes a value from some specific domain called domain(Ai).
Example:
PERSON(PERSON_ID: INTEGER, NAME: STRING, AGE: INTEGER, ADDRESS: STRING)
The total number of attributes in a relation denotes the degree of the relation. Since the
PERSON relation schema contains four attributes, this relation is of degree 4.
Relation Instance:
A relation instance, denoted r, is the collection of tuples for a given relational
schema at a specific point in time.
A relation state r of the relation schema R(A1, A2, …, An), also denoted r(R), is a set
of n-tuples
r = {t1, t2, …, tm}
where each n-tuple is an ordered list of n values
t = <v1, v2, …, vn>
where each vi belongs to domain(Ai) or is a null value.
The relation schema is also called the ‘intension’ and the relation state is also called
the ‘extension’.
Eg:
Relation instance:
Student:
Rollno  Name   City  Age
101     Sujit  Bam   23
102     kunal  bbsr  22
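The distinction between schema (intension) and instance (extension) can be sketched in a few lines. Python types stand in for the attribute domains here, which is a simplification assumed for the example; None plays the role of a null value.

```python
# Schema (intension): attribute names mapped to Python types standing in for domains.
STUDENT_SCHEMA = {"Rollno": int, "Name": str, "City": str, "Age": int}

# Instance (extension): the tuples present at one point in time.
student = [
    (101, "Sujit", "Bam", 23),
    (102, "kunal", "bbsr", 22),
]


def degree(schema):
    """The degree of a relation is its number of attributes."""
    return len(schema)


def conforms(schema, tuples):
    """True if every tuple has one value per attribute, each in its domain or null."""
    domains = list(schema.values())
    for t in tuples:
        if len(t) != len(domains):
            return False
        for value, domain in zip(t, domains):
            if value is not None and not isinstance(value, domain):
                return False
    return True
```

The schema stays fixed while the list of tuples changes as rows are inserted and deleted.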
Keys:
Super key:
A super key is an attribute or a set of attributes used to identify the records uniquely in
a relation.
Eg: (cname,telno)
Primary key:
The primary key is the candidate key that is chosen by the database designer as the
principal means of identifying entities within an entity set. The remaining candidate
keys, if any, are called alternate keys.
RELATIONAL CONSTRAINTS:
There are three types of constraints on a relational database:
o DOMAIN CONSTRAINTS
o KEY CONSTRAINTS
o INTEGRITY CONSTRAINTS
DOMAIN CONSTRAINTS:
It specifies that each attribute in a relation must have an atomic value from the corresponding
domains. The data types associated with commercial RDBMS domains include:
Key constraints:
This constraint states that the key attribute value in each tuple must be unique, i.e. no
two tuples contain the same value for the key attribute (null values can be allowed).
Integrity constraints:
Department(deptcode,dname)
Here the deptcode is the primary key.
Emp(empcode,name,city,deptcode).
Here the deptcode is foreign key.
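The Department and Emp schemas above can be tried out with sqlite3 to see the foreign key at work. The sample rows are assumptions, and SQLite needs the foreign_keys pragma enabled before it enforces the reference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE department (deptcode TEXT PRIMARY KEY, dname TEXT)")
conn.execute(
    "CREATE TABLE emp ("
    " empcode TEXT PRIMARY KEY, name TEXT, city TEXT,"
    " deptcode TEXT REFERENCES department(deptcode))"
)

conn.execute("INSERT INTO department VALUES ('d1', 'sales')")
conn.execute("INSERT INTO emp VALUES ('e101', 'sumit', 'bbsr', 'd1')")

# An employee that refers to a non-existent department is rejected.
try:
    conn.execute("INSERT INTO emp VALUES ('e102', 'ajit', 'bbsr', 'd9')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

Every deptcode stored in emp therefore matches some deptcode in department.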
CODD'S RULES
To access any data item, you specify the table and the column in which it exists; there is
no reading of characters 10 to 20 of a 255-byte string.
Rule 3 : Systematic treatment of null values.
"Null values (distinct from the empty character string or a string of blank characters and
distinct from zero or any other number) are supported in fully relational DBMS for
representing missing information and inapplicable information in a systematic way,
independent of data type."
If data does not exist or does not apply, a value of NULL is stored; the RDBMS
understands this as meaning missing or non-applicable data.
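The difference between NULL, an empty string and zero can be seen in a small sqlite3 sketch. The emp table with telno and bonus columns is assumed for the example; Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empcode TEXT, telno TEXT, bonus INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        ("e101", None, None),  # telephone number and bonus are unknown: NULL
        ("e102", "", 0),       # empty string and zero are ordinary values
    ],
)

null_rows = conn.execute("SELECT empcode FROM emp WHERE telno IS NULL").fetchall()
empty_rows = conn.execute("SELECT empcode FROM emp WHERE telno = ''").fetchall()
zero_rows = conn.execute("SELECT empcode FROM emp WHERE bonus = 0").fetchall()
```

Only e101 matches the IS NULL test, while the comparisons with '' and 0 match only e102, and the same treatment applies to the text column and the integer column alike.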
Rule 4 : Dynamic on-line catalog based on the relational model.
"The data base description is represented at the logical level in the same way as ordinary
data, so that authorized users can apply the same relational language to its interrogation as
they apply to the regular data."
The data dictionary is held within the RDBMS, so there is no need for off-line
volumes to tell you the structure of the database.
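As one concrete illustration, SQLite keeps its catalog in a table named sqlite_master, which can be queried with the same SELECT statement used for ordinary data. The two user tables below are assumed for the example; other systems expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (deptcode TEXT PRIMARY KEY, dname TEXT)")
conn.execute("CREATE TABLE emp (empcode TEXT PRIMARY KEY, name TEXT)")

# The catalog is itself a table, so ordinary SQL is used to interrogate it.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
print(tables)
```

The query returns the names of the two tables just created, without consulting anything outside the database.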
Rule 5 : Comprehensive data sub-language Rule.
"A relational system may support several languages and various modes of terminal use
(for example, the fill-in-the-blanks mode). However, there must be at least one language
whose statements are expressible, per some well-defined syntax, as character strings and
that is comprehensive in supporting all the following items
• Data Definition
• View Definition
• Integrity Constraints
• Authorization.