Unit 1

The document provides an overview of database architecture, emphasizing the characteristics and purposes of database systems, including their self-describing nature, data abstraction, and support for multiple user views. It discusses the roles of various stakeholders in database management, such as database administrators, designers, and end users, as well as the importance of data models, schemas, and instances. Additionally, it introduces the three-schema architecture and data independence, along with a classification of database management systems based on data models.

Uploaded by

iqacgfgch2024
Copyright
© © All Rights Reserved

Unit 1: Database Architecture


Introduction
A database is a collection of related data. By data, we mean known facts that can be recorded and that have implicit
meaning. For example, consider the names, telephone numbers, and addresses of the people you know.
A database has the following implicit properties:
■ A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse
(UoD). Changes to the miniworld are reflected in the database.
■ A database is a logically coherent collection of data with some inherent meaning. A random assortment of data
cannot correctly be referred to as a database.
■ A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and
some preconceived applications in which these users are interested.

Characteristics and Purpose of database approach


The main characteristics of the database approach versus the file-processing approach are the following:
■ Self-describing nature of a database system
■ Insulation between programs and data, and data abstraction
■ Support of multiple views of the data
■ Sharing of data and multiuser transaction processing
1. Self-Describing Nature of a Database System
A fundamental characteristic of the database approach is that the database system contains not only the database
itself but also a complete definition or description of the database structure and constraints. This definition is stored
in the DBMS catalog. The DBMS software is not written for one specific database; it consults the catalog to learn the
structure of whichever database it is managing. The catalog is also used by database users who need information
about the database structure.
In traditional file processing, data definition is typically part of the application programs themselves. Hence,
these programs are constrained to work with only one specific database, whose structure is declared in the application
programs.
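The catalog idea can be tried out with a small sketch. It assumes SQLite through Python's standard sqlite3 module; the student table and its columns are hypothetical. SQLite keeps its catalog in a table named sqlite_master, and PRAGMA table_info lists the columns of one table.

```python
import sqlite3

# In-memory database; the student table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " major TEXT)"
)

# SQLite's catalog: one row in sqlite_master per schema object.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog_rows)
print(columns)
```

The program never declares the record layout itself. It asks the DBMS for it, which is the opposite of the file-processing situation described above.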
2. Insulation between Programs and Data, and Data Abstraction
In traditional file processing, the structure of data files is embedded in the application programs, so any changes to
the structure of a file may require changing all programs that access that file. By contrast, DBMS access programs do
not require such changes in most cases. The structure of data files is stored in the DBMS catalog separately from the
access programs. We call this property program-data independence.
In object-oriented and object-relational systems, users can define operations on data as part of the database
definitions. An operation is specified in two parts. The interface of an operation includes the operation name and the
data types of its arguments. The implementation of the operation is specified separately and can be changed without
affecting the interface. User application programs can operate on the data by invoking these operations through their
names and arguments, regardless of how the operations are implemented. This is termed program-operation
independence.
The characteristic that allows program-data independence and program-operation independence is called data
abstraction.
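Program-data independence can be illustrated with a minimal sketch, again assuming SQLite and a hypothetical student table. The access program names only the column it needs, so adding a new data item to the stored records does not force a change to the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (17, 'Smith')")

def student_names(connection):
    """Access program: names its columns and leaves the storage layout to the DBMS."""
    rows = connection.execute("SELECT name FROM student ORDER BY name")
    return [row[0] for row in rows]

names_before = student_names(conn)

# Structural change: add a new data item to every student record.
conn.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")

# The same, unmodified access program still works.
names_after = student_names(conn)
print(names_before, names_after)
```

In a file-processing program that reads fixed-length records, the same change would alter the record length and typically require editing every program that reads the file.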
3. Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or view of the database. A
multiuser DBMS whose users have a variety of distinct applications must provide facilities for defining multiple views.
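One common facility for defining views is the SQL CREATE VIEW statement. The sketch below assumes SQLite; the tables, view names, and sample rows are hypothetical. Two user groups see two different perspectives of the same stored data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT);
CREATE TABLE grade_report (student_number INTEGER, course TEXT, grade TEXT);
INSERT INTO student VALUES (17, 'Smith', 'CS'), (8, 'Brown', 'CS');
INSERT INTO grade_report VALUES (17, 'MATH2410', 'B'), (8, 'CS1310', 'A');

-- View for a user who only needs a transcript listing.
CREATE VIEW transcript AS
    SELECT s.name, g.course, g.grade
    FROM student AS s JOIN grade_report AS g USING (student_number);

-- View for a user who only needs each student's major.
CREATE VIEW student_major AS
    SELECT name, major FROM student;
""")

transcript = conn.execute("SELECT * FROM transcript ORDER BY name").fetchall()
majors = conn.execute("SELECT * FROM student_major ORDER BY name").fetchall()
print(transcript)
print(majors)
```

Neither view stores data of its own. Each is a named query that the DBMS evaluates against the underlying tables.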
4. Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at the same time. The DBMS
must include concurrency control software to ensure that several users trying to update the same data do so in a
controlled manner so that the result of the updates is correct. For example, when several reservation agents try to
assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one agent at a time
for assignment to a passenger. These types of applications are generally called online transaction processing (OLTP)
applications. A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate
correctly and efficiently.
A transaction is an executing program or process that includes one or more database accesses, such as reading or
updating of database records. Each transaction is supposed to execute a logically correct database access if executed
in its entirety without interference from other transactions. The DBMS must enforce several transaction properties.
The isolation property ensures that each transaction appears to execute in isolation from other transactions, even
though hundreds of transactions may be executing concurrently. The atomicity property ensures that either all the
database operations in a transaction are executed or none are.
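The atomicity property can be observed in a small sketch, assuming SQLite and a hypothetical account table with a rule that balances may not go negative. The transfer function is illustrative only. Using the connection as a context manager commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(connection, source, target, amount):
    """Move amount between accounts as one transaction; return True on success."""
    try:
        # Commits if both updates succeed, rolls back if either one raises.
        with connection:
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # the debit violates the CHECK, so the credit is undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

The second transfer credits account 2 before the debit fails, yet the final balances show no trace of that credit: either all operations of a transaction take effect or none do.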

People associated with Database system


In large organizations, many people are involved in the design, use, and maintenance of a large database with
hundreds of users. The people whose jobs involve the day-to-day use of a large database are called the actors on the
scene. People who are called workers behind the scene are those who work to maintain the database system
environment but who are not actively interested in the database contents as part of their daily job.

Actors on the scene


1. Database Administrators
In a database environment, the primary resource is the database itself, and the secondary resource is the DBMS
and related software. Administering these resources is the responsibility of the database administrator (DBA). The
DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring
software and hardware resources as needed. The DBA is accountable for problems such as security breaches and
poor system response time. In large organizations, the DBA is assisted by a staff that carries out these functions.
2. Database Designers
Database designers are responsible for identifying the data to be stored in the database and for choosing
appropriate structures to represent and store this data. These tasks are mostly undertaken before the database is
actually implemented and populated with data. Database designers typically interact with each potential group of
users and develop views of the database that meet the data and processing requirements of these groups. Each
view is then analyzed and integrated with the views of other user groups. The final database design must be
capable of supporting the requirements of all user groups.
3. End Users
End users are the people whose jobs require access to the database for querying, updating, and generating reports;
the database primarily exists for their use. There are several categories of end users:
■ Casual end users occasionally access the database, but they may need different information each time. They
use a sophisticated database query language to specify their requests and are typically middle- or high-level
managers or other occasional browsers.
■ Naive or parametric end users constantly query and update the database, using standard types of queries and
updates, called canned transactions, that have been carefully programmed and tested. The tasks that such users
perform are varied:
  • Bank tellers check account balances and post withdrawals and deposits.
  • Reservation agents for airlines, hotels, and car rental companies check availability for a given request
  and make reservations.
■ Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize
themselves with the facilities of the DBMS in order to implement their own applications to meet their complex
requirements.
■ Standalone users maintain personal databases by using ready-made program packages that provide easy-to-use
menu-based or graphics-based interfaces.
4. System Analysts and Application Programmers (Software Engineers)
System analysts determine the requirements of end users, especially naive and parametric end users, and develop
specifications for standard canned transactions that meet these requirements. Application programmers
implement these specifications as programs; then they test, debug, document, and maintain these canned
transactions. Such analysts and programmers—commonly referred to as software developers or software
engineers—should be familiar with the full range of capabilities provided by the DBMS to accomplish their tasks.

Workers behind the scene


These persons are typically not interested in the database content itself. We call them the workers behind the
scene, and they include the following categories:
■ DBMS system designers and implementers design and implement the DBMS modules and interfaces as a
software package. A DBMS consists of many components, or modules, including modules for implementing the
catalog, query language processing, interface processing, accessing and buffering data, controlling concurrency,
and handling data recovery and security. The DBMS must interface with other system software such as the
operating system and compilers for various programming languages.
■ Tool developers design and implement tools—the software packages that facilitate database modeling and
design, database system design, and improved performance. Tools are optional packages that are often
purchased separately. They include packages for database design, performance monitoring, and so on.
■ Operators and maintenance personnel (system administration personnel) are responsible for the actual running
and maintenance of the hardware and software environment for the database system.

Data models, Schema and Instances


Data abstraction generally refers to the suppression of
details of data organization and storage, and the highlighting of the essential features for an improved understanding
of data. One of the main characteristics of the database approach is to support data abstraction so that different users
can perceive data at their preferred level of detail.
A data model, a collection of concepts that can be used to describe the structure of a database, provides the
necessary means to achieve this abstraction. (By structure of a database we mean the data types, relationships, and
constraints that apply to the data.)

Categories of Data Models


Many data models have been proposed, which we can categorize according to the
types of concepts they use to describe the database structure.
■ High-level or conceptual data models provide concepts that are close to the way many users perceive data.
■ Low-level or physical data models provide concepts that describe the details of how data is stored on the
computer storage media, typically magnetic disks. Concepts provided by low-level data models are generally
meant for computer specialists, not for end users.
■ Representational (or implementation) data models provide concepts that may be easily understood by end
users but that are not too far removed from the way data is organized in computer storage.

Schemas, Instances, and Database State


The description of a database is called the database schema, which is specified during database design and is not
expected to change frequently. Most data models have certain conventions for displaying schemas as diagrams.
A displayed schema is called a schema diagram. Figure 1 shows a schema diagram.

Figure 1: Schema diagram

The actual data in a database may change quite frequently. For example, the database changes every time we add a
new student or enter a new grade. The data in the database at a particular moment in time is called a database state
or snapshot or instance.

The distinction between database schema and database state is very important. When we define a new database,
we specify its database schema only to the DBMS. At this point, the corresponding database state is the empty state
with no data. We get the initial state of the database when the database is first populated or loaded with the initial
data. From then on, every time an update operation is applied to the database, we get another database state.

The DBMS stores the descriptions of the schema constructs and constraints—also called the meta-data—in the
DBMS catalog so that DBMS software can refer to the schema whenever it needs to. The schema is sometimes called
the intension, and a database state is called an extension of the schema.
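The difference between schema and state can be traced step by step in a minimal sketch, assuming SQLite and a hypothetical student table. The helper functions schema and state are illustrative names, not part of any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT)")

def schema(connection):
    """The intension: the stored definitions from SQLite's catalog."""
    return [row[0] for row in connection.execute("SELECT sql FROM sqlite_master")]

def state(connection):
    """The extension: the rows present at this moment."""
    return connection.execute(
        "SELECT * FROM student ORDER BY student_number"
    ).fetchall()

schema_at_definition = schema(conn)
empty_state = state(conn)        # empty state right after the schema is defined

conn.execute("INSERT INTO student VALUES (17, 'Smith')")
initial_state = state(conn)      # initial state after the first load

conn.execute("INSERT INTO student VALUES (8, 'Brown')")
next_state = state(conn)         # each update produces another state

schema_unchanged = schema(conn) == schema_at_definition
print(empty_state, initial_state, next_state, schema_unchanged)
```

The state changes with every insert while the schema read from the catalog stays the same throughout.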

Database architecture/Three-Schema Architecture


The goal of the three-schema architecture is to separate the user applications from the physical database. In this
architecture, schemas can be defined at the following three levels:

1. The internal level has an internal schema, which describes the physical storage structure of the database. The
internal schema uses a physical data model and describes the complete details of data storage and access paths for
the database.

2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a
community of users. The conceptual schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.

3. The external or view level includes a number of external schemas or user views. Each external schema describes
the part of the database that a particular user group is interested in and hides the rest of the database from that user
group.
Figure 2: The three-schema architecture.

The three-schema architecture is a convenient tool with which the user can visualize the schema levels in a database
system. Notice that the three schemas are only descriptions of data; the stored data that actually exists is at the
physical level only. In a DBMS based on the three-schema architecture, each user group refers to its own external
schema.

Hence, the DBMS must transform a request specified on an external schema into a request against the conceptual
schema, and then into a request on the internal schema for processing over the stored database. If the request is a
database retrieval, the data extracted from the stored database must be reformatted to match the user’s external
view. The processes of transforming requests and results between levels are called mappings.

Data Independence
Data independence can be defined as the capacity to change the schema at one level of a database system
without having to change the schema at the next higher level. We can define two types of data independence:

1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the
database (by adding a record type or data item), to change constraints, or to reduce the database (by
removing a record type or data item). In the last case, external schemas that refer only to the remaining
data should not be affected.
2. Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. Hence, the external schemas need not be changed either. Changes to the internal schema
may be needed because some physical files were reorganized (for example, by creating additional access
structures) to improve the performance of retrieval or update. If the same data as before remains in the
database, we should not have to change the conceptual schema.

Generally, physical data independence exists in most databases and file environments where physical details
such as the exact location of data on disk, and hardware details of storage encoding, placement, compression,
splitting, merging of records, and so on are hidden from the user. Applications remain unaware of these
details. On the other hand, logical data independence is harder to achieve because it allows structural and
constraint changes without affecting application programs—a much stricter requirement. The three-schema
architecture can make it easier to achieve true data independence, both physical and logical.
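Both kinds of data independence can be approximated in a short sketch. It assumes SQLite, which does not implement a strict three-schema architecture, so the mapping is loose: a view stands in for an external schema, the table definition for the conceptual schema, and an index for an internal-level access structure. All names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, major TEXT);
INSERT INTO student VALUES (17, 'Smith', 'CS'), (8, 'Brown', 'MATH');
CREATE VIEW cs_students AS SELECT name FROM student WHERE major = 'CS';
""")

def external_query(connection):
    """Application request phrased only against the external schema (the view)."""
    return connection.execute(
        "SELECT name FROM cs_students ORDER BY name"
    ).fetchall()

baseline = external_query(conn)

# Physical change: create an additional access structure.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
after_index = external_query(conn)

# Conceptual change: expand the database with a new data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_new_column = external_query(conn)

print(baseline, after_index, after_new_column)
```

The application code and the view definition stay untouched across both changes. Removing or renaming a column that the view uses would be a different matter, which reflects why logical data independence is the harder of the two to achieve.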

Database Languages
In many DBMSs where no strict separation of levels is maintained, one language, called the data definition
language (DDL), is used by the DBA and by database designers to define both the conceptual and internal
schemas. The DBMS will have a
DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema
constructs and to store the schema description in the DBMS catalog.

In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used
to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to
specify the internal schema. The mappings between the two schemas may be specified in either one of these
languages.

For a true three-schema architecture, we would need a third language, the view definition language (VDL), to
specify user views and their mappings to the conceptual schema.

Once the database schemas are compiled and the database is populated with data, users must have some
means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and
modification of the data. The DBMS provides a set of operations or a language called the data manipulation
language (DML) for these purposes.
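In relational systems, SQL plays the role of both DDL and DML. The sketch below assumes SQLite and a hypothetical course table. It issues one DDL statement followed by the four typical manipulations named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema; the DBMS records the definition in its catalog.
conn.execute(
    "CREATE TABLE course (course_number TEXT PRIMARY KEY, credit_hours INTEGER)"
)

# DML: insertion
conn.execute(
    "INSERT INTO course VALUES ('CS1310', 4), ('MATH2410', 3), ('CS3320', 4)"
)
# DML: modification
conn.execute("UPDATE course SET credit_hours = 3 WHERE course_number = 'CS3320'")
# DML: deletion
conn.execute("DELETE FROM course WHERE course_number = 'MATH2410'")
# DML: retrieval
remaining = conn.execute(
    "SELECT course_number, credit_hours FROM course ORDER BY course_number"
).fetchall()
print(remaining)
```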

Classification of DBMS
Based on the data model
Relational database – This is the most widely used data model in industry. Data is stored in tables, called
relations, and is usually queried with SQL. Each table has a key field whose task is to identify each row. A row
is also called a record, and columns are referred to as attributes or fields. A few examples are MySQL (Oracle,
open source), Oracle Database (Oracle), Microsoft SQL Server (Microsoft) and DB2 (IBM).

Object oriented database – Information is stored in the form of objects, as used in object oriented
programming. This adds database functionality to object programming languages. It can require less code,
use more natural data modeling, and make code bases easier to maintain. An example is ObjectDB (ObjectDB
Software).

Object relational database – Relational DBMSs are evolving continuously, and they have been incorporating
many concepts developed in object databases, leading to a new class called extended relational databases or
object relational databases.

Hierarchical database – Records are organized through parent and child relationships, giving a structure
similar to a tree. The data is stored as a series of records, each with a set of field values attached to it.
These databases are used in industry on mainframe platforms. Examples are IMS (IBM) and the Windows
registry (Microsoft).

Network database – Mainly used on large digital computers. These databases are similar to hierarchical
databases, but a record can be linked to more than one parent, so the records look like a cobweb or
interconnected network. This makes them efficient when the data has many connections. Examples are
CA-IDMS (Computer Associates) and IMAGE (HP).
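The structural difference between the hierarchical and network models can be pictured with plain Python dictionaries. This is only a sketch of the shapes involved, not of how IMS or CA-IDMS store data, and the record type names are hypothetical.

```python
# Hierarchical: every record type has at most one parent, so the types form a tree.
hierarchy_parent = {
    "Department": None,
    "Course": "Department",
    "Section": "Course",
}

# Network: a record type may have several owners, so the types form a general graph.
network_owners = {
    "Student": set(),
    "Course": set(),
    "Enrollment": {"Student", "Course"},
}

def path_to_root(record, parent_of):
    """Follow the single parent link upward until the root is reached."""
    path = [record]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

section_path = path_to_root("Section", hierarchy_parent)
multi_owner_records = sorted(
    name for name, owners in network_owners.items() if len(owners) > 1
)
print(section_path)
print(multi_owner_records)
```

In the tree there is exactly one path from any record type to the root. In the network, Enrollment is reachable from both Student and Course, which is the kind of connection a strict hierarchy cannot express directly.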
