Unit 1
A database typically has many users, each of whom may require a different perspective or view of the database. A
multiuser DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.
4. Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time. The DBMS
must include concurrency control software to ensure that several users trying to update the same data do so in a
controlled manner so that the result of the updates is correct. For example, when several reservation agents try to
assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one agent at a time
for assignment to a passenger. These types of applications are generally called online transaction processing (OLTP)
applications. A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate
correctly and efficiently.
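The seat-assignment example can be sketched in code. The following is a minimal illustration only, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the table seat and the function assign_seat are hypothetical names, not part of any particular reservation system. Two agents are simulated by two sequential calls on one connection. A production multiuser DBMS would rely on its own locking or versioning to get the same effect under true concurrency.

```python
import sqlite3


def assign_seat(conn, seat_no, passenger):
    """Assign the seat only if it is still free; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE seat SET passenger = ? "
            "WHERE seat_no = ? AND passenger IS NULL",
            (passenger, seat_no),
        )
    # Exactly one row changes when the seat was free, none otherwise.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (seat_no TEXT PRIMARY KEY, passenger TEXT)")
conn.execute("INSERT INTO seat (seat_no) VALUES ('12A')")
conn.commit()

first = assign_seat(conn, "12A", "Alice")   # first agent gets the seat
second = assign_seat(conn, "12A", "Bob")    # second agent is refused
holder = conn.execute(
    "SELECT passenger FROM seat WHERE seat_no = '12A'"
).fetchone()[0]
```

The check ("is the seat free?") and the update happen in a single statement, so there is no gap in which another agent could take the same seat.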
A transaction is an executing program or process that includes one or more database accesses, such as reading or
updating of database records. Each transaction is supposed to execute a logically correct database access if executed
in its entirety without interference from other transactions. The DBMS must enforce several transaction properties.
The isolation property ensures that each transaction appears to execute in isolation from other transactions, even
though hundreds of transactions may be executing concurrently. The atomicity property ensures that either all the
database operations in a transaction are executed or none are.
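Atomicity can be illustrated with a small sketch, again assuming SQLite via Python's sqlite3 module as a stand-in DBMS. The account table, the CHECK constraint, and the transfer function are hypothetical and chosen only for illustration. A transfer consists of two updates; if the second one fails, the first one is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was rolled back.
        return False


def balances(conn):
    """Return the current balances as {id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


ok = transfer(conn, 1, 2, 30)        # succeeds
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)   # overdraws account 1, so nothing changes
after_failed = balances(conn)
```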
Bank tellers check account balances and post withdrawals and deposits.
Reservation agents for airlines, hotels, and car rental companies check availability for a given request
and make reservations.
■ Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize
themselves with the facilities of the DBMS in order to implement their own applications to meet their complex
requirements.
■ Standalone users maintain personal databases by using ready-made program packages that provide easy-to-use
menu-based or graphics-based interfaces.
4. System Analysts and Application Programmers (Software Engineers)
System analysts determine the requirements of end users, especially naive and parametric end users, and develop
specifications for standard canned transactions that meet these requirements. Application programmers
implement these specifications as programs; then they test, debug, document, and maintain these canned
transactions. Such analysts and programmers—commonly referred to as software developers or software
engineers—should be familiar with the full range of capabilities provided by the DBMS to accomplish their tasks.
Representational (or implementation) data models provide concepts that may be easily understood by end
users but that are not too far removed from the way data is organized in computer storage.
The actual data in a database may change quite frequently. For example, the database changes every time we add a
new student or enter a new grade. The data in the database at a particular moment in time is called a database state
or snapshot or instance.
The distinction between database schema and database state is very important. When we define a new database,
we specify its database schema only to the DBMS. At this point, the corresponding database state is the empty state
with no data. We get the initial state of the database when the database is first populated or loaded with the initial
data. From then on, every time an update operation is applied to the database, we get another database state.
The DBMS stores the descriptions of the schema constructs and constraints—also called the meta-data—in the
DBMS catalog so that DBMS software can refer to the schema whenever it needs to. The schema is sometimes called
the intension, and a database state is called an extension of the schema.
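The difference between schema and state can be seen in a short sketch. It assumes SQLite via Python's sqlite3 module, where the built-in sqlite_master table plays the role of the catalog; the student table and its rows are hypothetical. The schema is read from the catalog, the state is read from the table itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Defining the database: only the schema is given to the DBMS.
conn.execute(
    "CREATE TABLE student (name TEXT NOT NULL, student_number INTEGER PRIMARY KEY)"
)


def schema(conn):
    """Read the stored schema description (meta-data) from the catalog."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT sql FROM sqlite_master "
            "WHERE type = 'table' AND name = 'student'"
        )
    ]


def state(conn):
    """Read the current database state (the data at this moment)."""
    return conn.execute(
        "SELECT name, student_number FROM student ORDER BY student_number"
    ).fetchall()


schema_before = schema(conn)
empty_state = state(conn)                      # empty state, no data yet
conn.execute("INSERT INTO student VALUES ('Smith', 17)")
initial_state = state(conn)                    # state after first load
conn.execute("INSERT INTO student VALUES ('Brown', 8)")
next_state = state(conn)                       # each update gives a new state
schema_after = schema(conn)                    # the schema has not changed
```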
1. The internal level has an internal schema, which describes the physical storage structure of the database. The
internal schema uses a physical data model and describes the complete details of data storage and access paths for
the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a
community of users. The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
3. The external or view level includes a number of external schemas or user views. Each external schema describes
the part of the database that a particular user group is interested in and hides the rest of the database from that user
group.
The three-schema architecture is a convenient tool with which the user can visualize the schema levels in a database
system. Notice that the three schemas are only descriptions of data; the stored data that actually exists is at the
physical level only. In a DBMS based on the three-schema architecture, each user group refers to its own external
schema.
Hence, the DBMS must transform a request specified on an external schema into a request against the conceptual
schema, and then into a request on the internal schema for processing over the stored database. If the request is a
database retrieval, the data extracted from the stored database must be reformatted to match the user’s external
view. The processes of transforming requests and results between levels are called mappings.
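The three levels and the mappings between them can be approximated in a relational system. The sketch below is an illustration under stated assumptions, not a full three-schema implementation: it uses SQLite via Python's sqlite3 module, treats a table as the conceptual schema, an index as one detail of the internal schema, and a view as an external schema. The names employee, employee_dept_idx, and directory are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: the structure of the whole database.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary INTEGER NOT NULL
    );
    -- Internal level: an access path chosen for storage and retrieval.
    CREATE INDEX employee_dept_idx ON employee (dept);
    -- External level: one user group's view; salary and emp_id are hidden.
    CREATE VIEW directory AS
        SELECT name AS employee_name, dept AS department FROM employee;
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Sales", 5000), (2, "Raj", "Research", 6000)],
)

# The request is written against the external schema only. The DBMS maps it
# to the employee table and then to the stored data, and returns the result
# in the shape of the view.
cur = conn.execute("SELECT * FROM directory WHERE department = 'Sales'")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
```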
Data Independence
Data independence can be defined as the capacity to change the schema at one level of a database system
without having to change the schema at the next higher level. We can define two types of data independence:
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the
database (by adding a record type or data item), to change constraints, or to reduce the database (by
removing a record type or data item). In the last case, external schemas that refer only to the remaining
data should not be affected.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. Hence, the external schemas need not be changed either. Changes to the internal schema
may be needed because some physical files were reorganized (for example, by creating additional access
structures) to improve the performance of retrieval or update. If the same data as before remains in the
database, we should not have to change the conceptual schema.
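Both kinds of data independence can be demonstrated in a small sketch, assuming SQLite via Python's sqlite3 module. The student table, the student_names view, and the app_query function are hypothetical. The application query is written against a view (an external schema) and is run unchanged after a physical change (a new index) and after a logical change (a new data item).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        major TEXT
    );
    CREATE VIEW student_names AS
        SELECT student_number, name FROM student;
    INSERT INTO student VALUES (17, 'Smith', 'CS'), (8, 'Brown', 'CS');
    """
)


def app_query(conn):
    """Application code written against the external schema (the view)."""
    return conn.execute(
        "SELECT name FROM student_names ORDER BY student_number"
    ).fetchall()


before = app_query(conn)

# Physical change: create an additional access structure.
conn.execute("CREATE INDEX student_name_idx ON student (name)")
after_physical = app_query(conn)

# Logical change: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical = app_query(conn)
```

In both cases app_query itself is not edited, which is the point of data independence. Removing a column that the view refers to would, as the text says, affect that external schema.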
Generally, physical data independence exists in most databases and file environments where physical details
such as the exact location of data on disk, and hardware details of storage encoding, placement, compression,
splitting, merging of records, and so on are hidden from the user. Applications remain unaware of these
details. On the other hand, logical data independence is harder to achieve because it allows structural and
constraint changes without affecting application programs—a much stricter requirement. The three-schema
architecture can make it easier to achieve true data independence, both physical and logical.
Database Languages
In many DBMSs where no strict separation of levels is maintained, one language, called the data definition
language (DDL), is used by the DBA and by database designers to define both schemas. The DBMS will have a
DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema
constructs and to store the schema description in the DBMS catalog.
In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used
to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to
specify the internal schema. The mappings between the two schemas may be specified in either one of these
languages.
For a true three-schema architecture, we would need a third language, the view definition language (VDL), to
specify user views and their mappings to the conceptual schema.
Once the database schemas are compiled and the database is populated with data, users must have some
means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and
modification of the data. The DBMS provides a set of operations or a language called the data manipulation
language (DML) for these purposes.
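In SQL-based systems the DDL and DML are parts of the same language. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical course table; it shows one DDL statement followed by the four typical DML manipulations named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema; the description is stored in the catalog.
conn.execute(
    "CREATE TABLE course ("
    "course_no TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# DML: insertion
conn.execute("INSERT INTO course VALUES ('CS101', 'Intro to CS', 4)")
conn.execute("INSERT INTO course VALUES ('CS202', 'Databases', 3)")
# DML: modification
conn.execute("UPDATE course SET credits = 4 WHERE course_no = 'CS202'")
# DML: deletion
conn.execute("DELETE FROM course WHERE course_no = 'CS101'")
# DML: retrieval
remaining = conn.execute(
    "SELECT course_no, title, credits FROM course"
).fetchall()
```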
Classification of DBMS
Based on the data model
Relational database – This is the most widely used data model in industry. Data is stored in tables and is
usually queried with SQL. Each table has a key field whose task is to identify each row. The tables are called
relations, the rows are called records or tuples, and the columns are referred to as attributes or fields. A few
examples are MySQL (Oracle, open source), Oracle Database (Oracle), Microsoft SQL Server (Microsoft), and
Db2 (IBM).
Object-oriented database – Information is stored in the form of objects, as used in object-oriented
programming. It adds database functionality to object programming languages. It requires less code,
uses more natural data models, and makes code bases easier to maintain. An example is ObjectDB (ObjectDB
Software).
Object-relational database – Relational DBMSs are evolving continuously and have been incorporating
many concepts developed in object databases, leading to a new class called extended relational databases or
object-relational databases.
Hierarchical database – Records are organized in parent-child relationships, which gives the data a
tree-like structure. The data is stored as a series of records, each with a set of values attached to it. These
databases are used in industry on mainframe platforms. Examples are IMS (IBM) and the Windows
Registry (Microsoft).
Network database – Mainly used on large digital computers. This model is efficient when the data has many
connections between records. It is similar to a hierarchical database, but the records form a cobweb-like,
interconnected network rather than a tree. Examples are CA-IDMS (Computer Associates) and IMAGE (HP).
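The structural difference between the hierarchical and network models can be sketched with plain in-memory objects. This is a conceptual illustration only, not how IMS or IDMS store data; the classes TreeRecord and NetRecord and the sample record names are hypothetical. In the hierarchical sketch a record has at most one parent, while in the network sketch a record may be linked under several owners.

```python
class TreeRecord:
    """Hierarchical model: each record has at most one parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)


class NetRecord:
    """Network model: a record may be a member under several owners."""

    def __init__(self, name):
        self.name = name
        self.owners = []
        self.members = []

    def connect(self, member):
        self.members.append(member)
        member.owners.append(self)


# Hierarchical: an employee record can hang under only one parent.
dept, project, emp = TreeRecord("Research"), TreeRecord("ProductX"), TreeRecord("Smith")
dept.add_child(emp)
try:
    project.add_child(emp)
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False

# Network: the same employee can be linked to a department and a project.
n_dept, n_project, n_emp = NetRecord("Research"), NetRecord("ProductX"), NetRecord("Smith")
n_dept.connect(n_emp)
n_project.connect(n_emp)
owner_names = [owner.name for owner in n_emp.owners]
```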