Unit1 CSE
Definition of Data: By data, we mean known facts that can be recorded and that have implicit
meaning. For example, consider the names, telephone numbers, and addresses of the people you
know.
The term database management system (DBMS) combines two ideas: a database and a
management system. Combining the meaning of both gives the definition of a DBMS.
A database management system (DBMS) is a collection of programs that enables users to create
and maintain a database. The DBMS is hence a general-purpose software system that facilitates
the processes of defining, constructing, manipulating, and sharing databases among various
users and applications. Defining a database involves specifying the data types, structures, and
constraints for the data to be stored in the database. Constructing the database is the process of
storing the data itself on some storage medium that is controlled by the DBMS. Manipulating a
database includes such functions as querying the database to retrieve specific data, updating the
database to reflect changes in the miniworld, and generating reports from the data. Sharing a
database allows multiple users and programs to access the database concurrently.
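The defining, constructing, and manipulating steps described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the contact table, its columns, and the sample rows are assumptions made for this example, and sharing among concurrent users is not shown.

```python
import sqlite3

# Hypothetical in-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")

# Defining: specify the data types, structures, and constraints.
conn.execute(
    "CREATE TABLE contact ("
    " name TEXT NOT NULL,"
    " phone TEXT,"
    " address TEXT)"
)

# Constructing: store the data itself under the control of the DBMS.
conn.executemany(
    "INSERT INTO contact (name, phone, address) VALUES (?, ?, ?)",
    [("Asha", "555-0101", "12 Lake Road"),
     ("Ravi", "555-0102", "7 Hill Street")],
)

# Manipulating: update the data to reflect a change in the miniworld, then query it.
conn.execute("UPDATE contact SET phone = ? WHERE name = ?", ("555-0199", "Asha"))
rows = conn.execute("SELECT name, phone FROM contact ORDER BY name").fetchall()
print(rows)
```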
1) Data redundancy and inconsistency: Since different programmers create the files and
application programs over a long period, the various files are likely to have different formats and
the programs may be written in several programming languages. Moreover, the same information
may be duplicated in several places (files). For example, the address and telephone number of a
particular customer may appear in a file that consists of savings-account records and in a file that
consists of checking-account records. This redundancy leads
to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the
various copies of the same data may no longer agree. For example, a changed customer address
may be reflected in savings-account records but not elsewhere in the system.
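A minimal sketch of how this inconsistency arises, assuming two hypothetical record files that each keep their own copy of the customer's address (all names and values here are made up for illustration):

```python
# Two hypothetical "files", each holding its own copy of the same address.
savings_records = [{"customer": "Asha", "address": "12 Lake Road", "balance": 500}]
checking_records = [{"customer": "Asha", "address": "12 Lake Road", "balance": 120}]


def change_address(records, customer, new_address):
    """Update the address in one file only, as a single application program would."""
    for record in records:
        if record["customer"] == customer:
            record["address"] = new_address


# Only the savings-account program is run; the checking-account file is forgotten.
change_address(savings_records, "Asha", "99 River Lane")

# The two copies of the same fact no longer agree.
addresses = {savings_records[0]["address"], checking_records[0]["address"]}
inconsistent = len(addresses) > 1
print(inconsistent)
```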
2) Difficulty in accessing data: Conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner. Suppose that one of the bank
officers needs to find out the names of all customers who live within a particular postal-code
area. The officer asks the data-processing department to generate such a list. Because the
designers of the original system did not anticipate this request, there is no application program on
hand to meet it. There is, however, an application program to generate the list of all customers.
The bank officer now has two choices: either obtain the list of all customers and extract
the needed information manually, or ask a system programmer to write the necessary application
program. Both alternatives are obviously unsatisfactory.
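By contrast, with a DBMS such an unanticipated request can usually be answered by a query rather than a new program. The sketch below uses Python's sqlite3 module; the customer table, its columns, and the postal codes are assumptions made for this example.

```python
import sqlite3

# Hypothetical customer data; names and postal codes are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Asha", "411001"), ("Ravi", "411002"), ("Meera", "411001")],
)

# The ad hoc request: all customers living in one postal-code area.
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE postal_code = ? ORDER BY name",
        ("411001",),
    )
]
print(names)
```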
3) Data isolation. Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult.
4) Integrity problems. The data values stored in the database must satisfy certain
types of consistency constraints. For example, the balance of a bank account
may never fall below a prescribed amount (say, $25).
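In a DBMS, a constraint like this can be declared once with the data instead of being coded into every program. A minimal sketch using sqlite3, assuming a hypothetical account table and the $25 minimum from the example above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The minimum-balance rule is declared as a CHECK constraint on the table.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 100)")

# An update that would violate the constraint is rejected by the database.
try:
    conn.execute("UPDATE account SET balance = 10 WHERE account_number = 'A-101'")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute("SELECT balance FROM account").fetchone()[0]
print(rejected, balance)
```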
5) Atomicity problems. A computer system, like any other mechanical or electrical device, is
subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored
to the consistent state that existed prior to the failure. Consider a program to transfer $50 from
account A to account B. If a system failure occurs during the execution of the program, it is
possible that the $50 was removed from account A but was not credited to account B, resulting in
an inconsistent database state. That is, the funds transfer must be atomic—it must happen in its
entirety or not at all. It is difficult to ensure atomicity in a conventional file-processing system.
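The funds-transfer example can be sketched with a database transaction. This is an illustration under stated assumptions: the account table, the starting balances, and the fail_midway flag (used to simulate a failure between the two updates) are all made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(conn, source, target, amount, fail_midway=False):
    """Move amount from source to target as one all-or-nothing unit."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        if fail_midway:
            # Simulated failure after the debit but before the credit.
            raise RuntimeError("simulated system failure")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        conn.commit()
    except Exception:
        # Undo the partial work so the database returns to its prior state.
        conn.rollback()
        raise


try:
    transfer(conn, "A", "B", 50, fail_midway=True)
except RuntimeError:
    pass

balances_after_failure = dict(conn.execute("SELECT name, balance FROM account"))

transfer(conn, "A", "B", 50)
balances_after_success = dict(conn.execute("SELECT name, balance FROM account"))
print(balances_after_failure, balances_after_success)
```

After the simulated failure the $50 has not left account A; only the complete transfer changes both balances.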
6) Concurrent-access anomalies. For the sake of overall performance of the system and faster
response, many systems allow multiple users to update the data simultaneously. In such an
Advantages of DBMS:
Functions of DBMS:
DBMS performs several important functions that guarantee the integrity and consistency of the
data in the database. The most important functions of Database Management System are
The DBMS creates and manages the complex structures required for data storage, thus relieving
you from the difficult task of defining and programming the physical data characteristics.
A modern DBMS system provides storage not only for the data, but also for related data entry
forms or screen definitions, report definitions, data validation rules, procedural code, structures
to handle video and picture formats, and so on. Data storage management is also important for
database performance tuning. Performance tuning relates to the activities that make the database
perform more efficiently in terms of storage and access speed. Data storage management
is therefore another important function of a database management system.
The DBMS transforms entered data into the required data structures. The DBMS relieves you of
the chore of making a distinction between the logical data format and the physical data format.
That is, the DBMS formats the physically retrieved data to make it conform to the user’s logical
expectations. For example, imagine an enterprise database used by a multinational company. An
end user in England would expect to enter data such as July 11, 2009, as “11/07/2009.” In
contrast, the same date would be entered in the United States as “07/11/2009.” Regardless of the
data presentation format, the DBMS must manage the date in the proper format for each
country.
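The date example can be sketched as one stored value with a per-country presentation format. The DISPLAY_FORMATS table and the present and parse helpers are hypothetical names introduced for this illustration, not part of any particular DBMS.

```python
from datetime import date, datetime

# One stored value, independent of how any user types or reads it.
stored = date(2009, 7, 11)

# Assumed presentation formats per country (illustrative only).
DISPLAY_FORMATS = {"UK": "%d/%m/%Y", "US": "%m/%d/%Y"}


def present(value, country):
    """Format the stored date the way users in the given country expect."""
    return value.strftime(DISPLAY_FORMATS[country])


def parse(text, country):
    """Convert a date typed in the local format back to the stored form."""
    return datetime.strptime(text, DISPLAY_FORMATS[country]).date()


print(present(stored, "UK"), present(stored, "US"))
```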
4. Security Management
Security Management is another important function of the Database Management System. The
DBMS creates a security system that enforces user security and data privacy. Security rules
determine which users can access the database, which data items each user can access, and which
data operations (read, add, delete, or modify) the user can perform. This is especially important
in multiuser database systems.
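The three kinds of security rules (which users, which data items, which operations) can be pictured as a lookup table. The sketch below is plain Python, not the mechanism of any real DBMS; the roles, data items, and the is_allowed helper are assumptions made for this example.

```python
# Hypothetical security rules: role -> data item -> permitted operations.
PERMISSIONS = {
    "teller": {"account": {"read", "modify"}},
    "auditor": {"account": {"read"}, "employee": {"read"}},
}


def is_allowed(role, data_item, operation):
    """Return True only if the role may perform the operation on the data item."""
    return operation in PERMISSIONS.get(role, {}).get(data_item, set())


print(is_allowed("teller", "account", "modify"))
print(is_allowed("teller", "employee", "read"))
```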
To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to
ensure that multiple users can access the database concurrently without compromising the
integrity of the database.
The DBMS provides backup and data recovery to ensure data safety and integrity. Current
DBMS systems provide special utilities that allow the DBA to perform routine and special
backup and restore procedures. Recovery management deals with the recovery of the database
after a failure, such as a bad sector in the disk or a power failure. Such capability is critical to
preserving the database’s integrity.
The DBMS promotes and enforces integrity rules, thus minimizing data redundancy and
maximizing data consistency. The data relationships stored in the data dictionary are used to
enforce data integrity. Ensuring data integrity is especially important in transaction-oriented
database systems.
The DBMS provides data access through a query language. A query language is a nonprocedural
language, one that lets the user specify what must be done without having to specify how it is to
be done. Structured Query Language (SQL) is the de facto query language and data access
standard supported by the majority of DBMS vendors.
- End users can generate answers to queries by filling in screen forms through their preferred
Web browser.
Data Abstraction:
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led
designers to use complex data structures to represent data in the database. Since many database-
systems users are not computer trained, developers hide the complexity from users through
several levels of abstraction, to simplify users’ interactions with the system:
Physical level- The lowest level of abstraction describes how the data are actually stored.
The physical level describes complex low-level data structures in detail.
Logical level- The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. Database administrators, who must
decide what information to keep in the database, use the logical level of abstraction.
Unit 1: Introduction to Database Systems
View level- The highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of the variety
of information stored in a large database. Many users of the database system do not need all this
information; instead, they need to access only a part of the database. The view level of
abstraction exists to simplify their interaction with the system. The system may provide many
views for the same database.
Figure 1.1 shows the relationship among the three levels of abstraction.
Example: An analogy to the concept of data types in programming languages may clarify the
distinction among levels of abstraction. Most high-level programming languages support the
notion of a record type. For example, in a Pascal-like language, we may declare a record as
follows:
type customer = record
    customer-id : string;
    customer-name : string;
    customer-street : string;
    customer-city : string;
end;
This code defines a new record type called customer with four fields. Each field has a name and
a type associated with it. A banking enterprise may have several such record types, including
• account, with fields account-number and balance
• employee, with fields employee-name and salary
*At the physical level, a customer, account, or employee record can be described as a block of
consecutive storage locations (for example, words or bytes). The language compiler hides this
level of detail from programmers.
*At the logical level, each such record is described by a type definition, as in the previous code
segment, and the interrelationship of these record types is defined as well. Programmers using a
programming language work at this level of abstraction. Similarly, database administrators
usually work at this level of abstraction.
*Finally, at the view level, computer users see a set of application programs that hide details of
the data types. Similarly, at the view level, several views of the database are defined, and
database users see these views. In addition to hiding details of the logical level of the database,
the views also provide a security mechanism to prevent users from accessing certain parts of the
database. For example, tellers in a bank see only that part of the database that has information on
customer accounts; they cannot access information about salaries of employees.
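A view that hides part of the logical level can be sketched with sqlite3. The employee table, the employee_public view, and the sample row are assumptions made for this example. Note that SQLite itself has no user accounts or access control, so the view here only shows how columns are hidden, not how access is enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full record type, including the salary.
conn.execute("CREATE TABLE employee (employee_name TEXT, branch TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Meera', 'Downtown', 52000)")

# View level: a hypothetical view that leaves the salary column out.
conn.execute(
    "CREATE VIEW employee_public AS SELECT employee_name, branch FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```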
Underlying the structure of a database is the data model: a collection of conceptual tools for
describing data, data relationships, data semantics, and consistency constraints.
A data model provides a way to describe the design of a database at the physical, logical, and view
levels.
1) Object based logical model
2) Record based logical model
3) Physical data model
b) Object-oriented model: This model is based on a collection of objects. An object contains
values stored in variables within the object, together with functions that operate on those
values.
2) Record-based logical models:
Record-based logical models are used to describe data at the conceptual and view levels. Like the
object-based logical models, they describe the logical structure of the data. They are classified into three groups:
i) Relational model
ii) Network model
iii) Hierarchical model
i) Relational model: The relational model represents data and the relationships among data by a collection
of tables. Each table contains a number of rows and columns, which are logically grouped into a
single unit.
ii) Network model: Data in the network model are represented by collections of records, and
relationships among records are represented by parent-child links. The records are organized as an
arbitrary graph. Hence, the nodes of the graph are logically connected to one another.
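The difference between the two models can be sketched in plain Python: the relational form relates rows through shared values in tables, while the network form links records to each other directly. The Record class, the sample customers, and the account numbers are assumptions made for this illustration.

```python
# Relational style: relationships are expressed by matching values across tables.
customer_table = [("C1", "Asha"), ("C2", "Ravi")]
account_table = [("A-101", 500), ("A-215", 700)]
depositor_table = [("C1", "A-101"), ("C2", "A-101"), ("C2", "A-215")]

relational_accounts = sorted(
    account for customer, account in depositor_table if customer == "C2"
)


# Network style: records are nodes of a graph, connected by explicit links.
class Record:
    def __init__(self, **fields):
        self.fields = fields
        self.links = []  # direct references to related records


asha = Record(name="Asha")
ravi = Record(name="Ravi")
a101 = Record(number="A-101", balance=500)
a215 = Record(number="A-215", balance=700)

asha.links.append(a101)
ravi.links.extend([a101, a215])  # one account may be linked from several customers

network_accounts = sorted(record.fields["number"] for record in ravi.links)
print(relational_accounts, network_accounts)
```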
Schema Diagrams: Most data models have certain conventions for displaying schemas as
diagrams. A displayed schema is called a schema diagram.
Instances / Database State: The data in the database at a particular moment in time is called a
database state or snapshot. It is also called the current set of occurrences or instances in the
database.
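The distinction between the schema and the database state can be sketched with sqlite3: inserting a row changes the state but leaves the schema as it was. The account table and the schema and state helper functions are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")


def schema(conn):
    """Column names and declared types: the description of the data."""
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]


def state(conn):
    """The rows present right now: the current snapshot."""
    return conn.execute(
        "SELECT account_number, balance FROM account ORDER BY account_number"
    ).fetchall()


schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
schema_after, state_after = schema(conn), state(conn)

print(schema_before == schema_after, state_before, state_after)
```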
The goal of the three-schema architecture, illustrated in Figure 2.2, is to separate the user
applications and the physical database. In this architecture, schemas can be defined at the
following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the
database. The internal schema uses a physical data model and describes the complete details of
data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations, and
constraints. Usually, a representational data model is used to describe the conceptual schema
when a database system is implemented.
3. The external or view level includes a number of external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in and hides
the rest of the database from that user group.
*MAPPING : The processes of transforming requests and results between levels are called
mappings.
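A rough sketch of the three levels and the mappings between them, written in plain Python. The byte layout chosen for the internal level, the function names, and the rule that the external view hides the balance are all assumptions made for this illustration; a real DBMS performs these mappings internally.

```python
import struct

# Internal level (assumed layout): a 10-byte account number followed by a
# 4-byte little-endian signed integer balance.
RECORD_FORMAT = "<10si"


def conceptual_to_internal(record):
    """Mapping from the conceptual record to its stored bytes."""
    return struct.pack(
        RECORD_FORMAT, record["account_number"].encode("ascii"), record["balance"]
    )


def internal_to_conceptual(raw):
    """Mapping from stored bytes back to the conceptual record."""
    number, balance = struct.unpack(RECORD_FORMAT, raw)
    return {
        "account_number": number.rstrip(b"\x00").decode("ascii"),
        "balance": balance,
    }


def conceptual_to_external(record):
    """External view for a hypothetical user group that may not see balances."""
    return {"account_number": record["account_number"]}


stored_bytes = conceptual_to_internal({"account_number": "A-101", "balance": 500})
conceptual = internal_to_conceptual(stored_bytes)
external = conceptual_to_external(conceptual)
print(len(stored_bytes), conceptual, external)
```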
Database Languages:
A database system provides a data definition language to specify the database schema and a data
manipulation language to express database queries and updates.
Data-Definition Language
-We specify a database schema by a set of definitions expressed by a special language called a
data-definition language (DDL).
For instance, the following statement in the SQL language defines the account table:
create table account
    (account_number char(10),  -- underscore used: a hyphen is not valid in an unquoted SQL name
     balance integer);
Execution of the above DDL statement creates the account table. In addition, it updates a special
set of tables called the data dictionary or data directory.
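The effect of a DDL statement on the data dictionary can be observed in SQLite, whose catalog is exposed as a table named sqlite_master. This is a sketch for one particular system; other DBMSs name and organize their data dictionaries differently. The column is written account_number here because a hyphen is not valid in an unquoted SQL name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Before the DDL statement, the catalog has no entry for the table.
catalog_before = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# The DDL statement creates the table and records it in the catalog.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

catalog_after = conn.execute("SELECT type, name FROM sqlite_master").fetchall()
print(catalog_before, catalog_after)
```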
Data-Manipulation Language
Data manipulation is:
• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database
-A data-manipulation language (DML) is a language that enables users to access or manipulate
data as organized by the appropriate data model.
There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
are needed without specifying how to get those data.
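The two styles can be compared on the same request. In the sketch below, the procedural version spells out how to find the data (scan, test, collect, sort), while the declarative version states only what is wanted and leaves the method to the DBMS. The account data and the balance threshold of 450 are made up for this example.

```python
import sqlite3

# Hypothetical account data used by both versions.
accounts = [("A-101", 500), ("A-215", 700), ("A-102", 400)]

# Procedural style: state what is needed AND how to get it.
procedural = []
for number, balance in accounts:
    if balance > 450:
        procedural.append(number)
procedural.sort()

# Declarative style: state only what is needed; the DBMS decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account WHERE balance > 450 "
        "ORDER BY account_number"
    )
]

print(procedural, declarative)
```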
i) Storage Manager :
A storage manager is a program module that provides the interface between the low level
data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the interaction with the file manager. The raw data are
stored on the disk using the file system, which is usually provided by a conventional operating
system. The storage manager translates the various DML statements into low-level file-system
commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in
the database.
The storage manager components include:
• Authorization and integrity manager- which tests for
the satisfaction of integrity constraints and checks the authority of users to access data.
• Transaction manager- which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
• File manager- which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.
• Buffer manager- which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database
DBMS environment:
We can identify five major components in the DBMS environment: hardware, software, data,
procedures, and people:
(1) Hardware: The computer system(s) that the DBMS and the application programs run on. This
can range from a single PC, to a single mainframe, to a network of computers.
(2) Software: The DBMS software and the application programs, together with the operating
system, including network software if the DBMS is being used over a network.
(3) Data: The data acts as a bridge between the hardware and software components and the
human components. As we’ve already said, the database contains both the operational data and
the meta-data (the ‘data about data’).
(4) Procedures: The instructions and rules that govern the design and use of the database. This
may include instructions on how to log on to the DBMS, make backup copies of the database,
and how to handle hardware or software failures.
(5) People: This includes the database designers, database
administrators (DBAs), application programmers, and the end-users.