Data Abstraction in DBMS

The document discusses data abstraction in database management systems (DBMS). It explains that data abstraction hides irrelevant details from users and provides an abstract view of the data. It also describes the three levels of abstraction in a DBMS - physical, logical, and view levels - and gives examples to illustrate each level.

Uploaded by viswanadh173

Data Abstraction in DBMS

Database systems are made up of complex data structures. To ease user interaction
with the database, developers hide irrelevant internal details from users. This process of
hiding irrelevant details from the user is called data abstraction. The term “irrelevant” is
used here with respect to the user; it does not mean that the hidden data is irrelevant to
the database as a whole. It just means that the user is not concerned with that data.

For example: when you book a train ticket, you are not concerned with how the data is
processed at the back end when you click “book ticket”, or with what processes run
while you make the online payment. You are only concerned with the message that
pops up when your ticket is successfully booked. This doesn’t mean that the processing
at the back end is not relevant; it just means that you, as a user, are not
concerned with what is happening in the database.

Three levels of abstraction

Physical level: This is the lowest level of data abstraction. It describes how data is
actually stored in the database. The details of the complex data structures are found at this level.

Logical level: This is the middle level of the three-level data abstraction architecture. It describes
what data is stored in the database (entities, relationships, data types, etc.).

View level: This is the highest level of data abstraction. It describes the user's interaction
with the database system, which involves only a part of the entire database.

Example: Let’s say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes, etc.)
in memory. These details are often hidden from programmers.

At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented.
Programmers generally work at this level because they are aware of such things about
database systems.

At the view level, users just interact with the system through a GUI and enter details on
the screen. They are not aware of how the data is stored or what data is stored; such
details are hidden from them.
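
The customer example can be sketched with Python's built-in sqlite3 module. This is only an illustrative sketch: the table name, column names, sample rows and the view name customer_public are all assumptions made for the example. The engine takes care of physical storage, the CREATE TABLE statement plays the role of the logical level, and the view stands in for the view level.

```python
import sqlite3

# Physical level: the engine decides how bytes are laid out; we never see it.
conn = sqlite3.connect(":memory:")

# Logical level: fields and their data types (hypothetical customer table).
conn.execute(
    "CREATE TABLE customer ("
    " cust_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " city TEXT,"
    " card_number TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Pune", "4111-0000"), (2, "Ravi", "Delhi", "4222-0000")],
)

# View level: end users see only a part of the database.
conn.execute(
    "CREATE VIEW customer_public AS SELECT cust_id, name, city FROM customer"
)

public_rows = conn.execute(
    "SELECT * FROM customer_public ORDER BY cust_id"
).fetchall()
print(public_rows)  # the card_number column is not visible through the view
```

A user who only has access to customer_public never learns that card_number exists, which is the kind of hiding the view level is meant to provide.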

View of data in DBMS


Abstraction is one of the main features of database systems. Hiding irrelevant details
from users and providing them with an abstract view of the data helps make user-database
interaction easy and efficient. Above, we discussed the three levels of DBMS
architecture. The top level of that architecture is the “view level”. The view level provides
the “view of data” to users and hides irrelevant details such as data relationships,
database schema, constraints, and security from them.

DBMS Schema
Definition of schema: The design of a database is called its schema. For example,
an employee table exists in the database with the following attributes:

EMP_NAME   EMP_ID   EMP_ADDRESS   EMP_CONTACT
--------   ------   -----------   -----------

This is the schema of the employee table. A schema defines the attributes of the tables in the
database.

In the following diagram, we have a schema that shows the relationship between three
tables: Course, Student and Section. The diagram only shows the design of the
database; it doesn’t show the data present in those tables. A schema is only a structural
view (design) of a database, as shown in the diagram below.

The design of a database at the physical level is called the physical schema. How the data is stored
in blocks of storage is described at this level.

The design of a database at the logical level is called the logical schema. Programmers and database
administrators work at this level. Here, data is described as certain types of
data records stored in data structures; however, internal details such as the
implementation of those data structures are hidden at this level (they are available at the physical level).

The design of a database at the view level is called the view schema. It generally describes end-user
interaction with database systems.

DBMS Instance
Definition of instance: The data stored in a database at a particular moment in time is
called an instance of the database. The database schema defines the attributes of the tables that belong
to a particular database. The values of these attributes at a moment in time are called the
instance of that database.

For example, we have seen the schema of the table “employee” above. Let’s see the table
with data now. At this moment the table contains two rows (records). This is the
current instance of the table “employee” because this is the data that is stored in this table
at this particular moment in time.

EMP_NAME    EMP_ID   EMP_ADDRESS   EMP_CONTACT
---------   ------   -----------   -----------
Chaitanya   101      Noida         95********
Ajeet       102      Delhi         99********
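
The difference between schema and instance can be checked with a small sqlite3 sketch. The column types and the helper names schema() and instance() are assumptions made for illustration; the attribute names and rows come from the employee table above. The schema stays the same while the instance changes as rows are inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the document only names the attributes.
conn.execute(
    "CREATE TABLE employee ("
    " EMP_NAME TEXT, EMP_ID INTEGER, EMP_ADDRESS TEXT, EMP_CONTACT TEXT)"
)

def schema(conn):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

def instance(conn):
    # The instance is whatever data the table holds right now.
    return conn.execute(
        "SELECT EMP_NAME, EMP_ID FROM employee ORDER BY EMP_ID"
    ).fetchall()

schema_before = schema(conn)
instance_empty = instance(conn)

conn.execute("INSERT INTO employee VALUES ('Chaitanya', 101, 'Noida', '95********')")
conn.execute("INSERT INTO employee VALUES ('Ajeet', 102, 'Delhi', '99********')")

schema_after = schema(conn)   # unchanged by the inserts
instance_now = instance(conn)  # now holds two rows
```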

Data models in DBMS


A data model is a logical structure of a database. It describes the design of the database to
reflect entities, attributes, relationships among data, constraints, etc.

Types of Data Models


There are several types of data models in DBMS.

Object-based logical models – Describe data at the conceptual and view levels.

o E-R Model
o Object-oriented Model

Record-based logical models – Like object-based models, they also describe data at the
conceptual and view levels. These models specify the logical structure of a database with
records, fields and attributes.

1. Relational Model
2. Hierarchical Model
3. Network Model – The network model is the same as the hierarchical model except that it has
a graph-like structure rather than a tree-based structure. Unlike the hierarchical model,
this model allows each record to have more than one parent record.
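
The difference between the hierarchical and network models comes down to how many parents a record may have. The sketch below uses made-up record types (College, Department, Club, Student) and a hypothetical helper has_single_parents() to show it; it is not a model of any real hierarchical or network DBMS.

```python
# Each record type maps to the list of its parent record types.
hierarchical = {
    "College": [],
    "Department": ["College"],
    "Student": ["Department"],
}

network = {
    "College": [],
    "Department": ["College"],
    "Club": ["College"],
    "Student": ["Department", "Club"],  # two parents: only a graph allows this
}

def has_single_parents(parents):
    """Return True if every record has at most one parent record."""
    return all(len(p) <= 1 for p in parents.values())

fits_hierarchy = has_single_parents(hierarchical)
network_fits_hierarchy = has_single_parents(network)
```

Here fits_hierarchy is True, while network_fits_hierarchy is False because Student has two parents.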

Physical data models – These models describe data at the lowest level of abstraction.

Data Independence

o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level
of the database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence is the ability to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we make any changes to the conceptual view of the data, the user view of the data
is not affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we change the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence
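
Both kinds of data independence can be illustrated with sqlite3. In this sketch the table, the view emp_names and the index idx_emp_id are assumed names. Adding a column stands in for a conceptual-schema change, and adding an index stands in for an internal-schema change. Real systems allow far larger changes than these two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, emp_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Chaitanya", 12000), (102, "Ajeet", 9000)],
)

# External schema: the view that users query.
conn.execute("CREATE VIEW emp_names AS SELECT emp_id, emp_name FROM employee")
query = "SELECT * FROM emp_names ORDER BY emp_id"
before = conn.execute(query).fetchall()

# Logical data independence: change the conceptual schema (new attribute);
# the external view and the query against it stay the same.
conn.execute("ALTER TABLE employee ADD COLUMN emp_address TEXT")
after_logical_change = conn.execute(query).fetchall()

# Physical data independence: change the internal schema (new index);
# neither the table definition nor the query has to change.
conn.execute("CREATE INDEX idx_emp_id ON employee (emp_id)")
after_physical_change = conn.execute(query).fetchall()
```

All three results are identical, which is the point: users of the view are unaffected by either change.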

Structure of Database Management System


The database system is divided into three components: Query Processor, Storage
Manager, and Disk Storage. These are explained below.

Architecture of DBMS

1. Query Processor: It interprets the requests (queries) received from the end user via an
application program into instructions. It also executes the user request that is received
from the DML compiler.
Query Processor contains the following components –
o DML Compiler: It translates DML statements into low-level instructions
so that they can be executed.
o DDL Interpreter: It processes DDL statements into a set of tables containing
metadata (data about data).
o Embedded DML Pre-compiler: It converts DML statements embedded in an
application program into procedure calls.
o Query Optimizer: It chooses an efficient way to evaluate the instructions generated by the DML Compiler.

2. Storage Manager: The Storage Manager is a program that provides an interface between
the data stored in the database and the queries received. It is also known as the Database
Control System. It maintains the consistency and integrity of the database by applying
the constraints and executing the DCL statements. It is responsible for updating, storing,
deleting, and retrieving data in the database.
It contains the following components –
o Authorization Manager: It ensures role-based access control, i.e., it checks whether
a particular person is privileged to perform the requested operation.
o Integrity Manager: It checks the integrity constraints when the database is
modified.
o Transaction Manager: It controls concurrent access by scheduling the operations of the
transactions it receives. Thus, it ensures that the database
remains in a consistent state before and after the execution of a transaction.
o File Manager: It manages the file space and the data structures used to represent
information in the database.
o Buffer Manager: It is responsible for cache memory and the transfer of data
between secondary storage and main memory.

3. Disk Storage: It contains the following components –
o Data Files: These store the data.
o Data Dictionary: It contains information about the structure of every database
object. It is the repository of information that governs the metadata.
o Indices: These provide faster retrieval of data items.
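
As a concrete illustration of a data dictionary, SQLite keeps its catalog in a built-in table called sqlite_master, which lists every table and index along with its definition. The sketch below uses an assumed employee table and an assumed index name idx_emp_name. Other database systems expose their dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, emp_name TEXT)")  # data file content
conn.execute("CREATE INDEX idx_emp_name ON employee (emp_name)")       # an index

# sqlite_master is SQLite's catalog: one row per database object.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

for obj_type, name, table in catalog:
    print(obj_type, name, "on", table)
```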

Query Processing in DBMS


Query processing is the activity of extracting data from the database. It takes several
steps to fetch the data from the database. The steps involved are:

1. Parsing and translation
2. Optimization
3. Evaluation

The query processing works in the following way:

Parsing and Translation


User queries are written in a high-level database language such as SQL. Before a query can be
processed, the system has to translate it into expressions that can be used at the physical level
of the file system. After this, a variety of query-optimizing transformations and the actual
evaluation of the query take place. SQL (Structured Query Language) is human-readable and
therefore the most suitable choice for humans, but it is not well suited to the internal
representation of a query within the system. Relational algebra is well suited for that purpose. The
translation process in query processing is similar to the work of a parser. When a user executes
a query, the parser in the system generates the internal form of the query: it checks the
syntax of the query and verifies that the relations and attributes named in the query exist in
the database. The parser creates a tree of the query, known as a 'parse tree', and then
translates it into relational algebra. In doing so, it also replaces every use of a
view in the query with the expression that defines the view.

The working of query processing can be seen in the diagram below:

Suppose a user executes a query. As we have learned, there are various methods of extracting
data from the database. Suppose that, in SQL, a user wants to fetch the salaries of the employees whose
salary is greater than 10000. For this, the following query is used:

select salary from Employee where salary>10000;

To make the system understand the user query, it needs to be translated into
relational algebra. We can write this query in relational algebra in either of two equivalent forms:

o σ_{salary>10000} (π_{salary} (Employee))
o π_{salary} (σ_{salary>10000} (Employee))

After translating the given query, we can execute each relational algebra operation using
different algorithms. This is how query processing begins.
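
The two relational algebra forms above can be mimicked in a few lines of Python to check that they give the same result. This is a minimal sketch, not how a DBMS implements these operators: the functions select() and project() and the sample rows are assumptions made for illustration.

```python
# A relation is modelled as a list of dicts; the rows are made-up sample data.
Employee = [
    {"emp_name": "Chaitanya", "salary": 12000},
    {"emp_name": "Ajeet", "salary": 9000},
    {"emp_name": "Meera", "salary": 15000},
]

def select(predicate, relation):
    """Selection (sigma): keep only the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(attrs, relation):
    """Projection (pi): keep only the listed attributes and drop duplicates."""
    seen, out = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

def high_salary(t):
    return t["salary"] > 10000

# Form 1: project first, then select.
form1 = select(high_salary, project(["salary"], Employee))
# Form 2: select first, then project.
form2 = project(["salary"], select(high_salary, Employee))
```

Both forms return the same salaries. They differ only in the amount of intermediate work, which is what the optimizer later compares.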

Evaluation

In addition to the relational algebra translation, the translated
relational algebra expression must be annotated with instructions that specify how to evaluate each
operation. After translating the user query, the system then executes a query evaluation plan.

Query Evaluation Plan


o In order to fully evaluate a query, the system needs to construct a query evaluation plan.
o The annotations in the evaluation plan may refer to the algorithm to be used for a specific
operation or the particular index to use.
o Relational algebra operations annotated in this way are referred to as evaluation primitives. The evaluation
primitives carry the instructions needed for the evaluation of the operation.
o Thus, a query evaluation plan defines a sequence of primitive operations used for evaluating a
query. The query evaluation plan is also referred to as the query execution plan.
o A query execution engine is responsible for generating the output of the given query. It takes the
query execution plan, executes it, and returns the output for the user query.
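
Many database systems let you inspect the plan they choose. As one example, SQLite provides EXPLAIN QUERY PLAN. The sketch below uses an assumed Employee table, an assumed index name idx_salary and a hypothetical helper plan(). It compares the plan for the same query before and after an index exists. The exact wording of the plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (emp_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Chaitanya", 12000), ("Ajeet", 9000)],
)

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

sql = "SELECT salary FROM Employee WHERE salary > 10000"

plan_without_index = plan(conn, sql)   # expected to be a scan of the table
conn.execute("CREATE INDEX idx_salary ON Employee (salary)")
plan_with_index = plan(conn, sql)      # expected to mention idx_salary

print(plan_without_index)
print(plan_with_index)
```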

Optimization
o The cost of query evaluation can vary for different types of queries. Because the system is
responsible for constructing the evaluation plan, the user does not need to write the query
efficiently.
o Usually, a database system generates an efficient query evaluation plan, one that minimizes cost.
This task is performed by the database system and is known as query optimization.
o To optimize a query, the query optimizer needs an estimated cost for each
operation, because the overall cost depends on the memory allocated to the
operations, their execution costs, and so on.

Finally, after selecting an evaluation plan, the system evaluates the query and produces the
output of the query.
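
The idea of picking the cheapest plan from estimated costs can be sketched with a toy cost model. Everything here is an assumption made for illustration: the cost unit (rows touched), the formula for an index lookup, and the function names candidate_plans() and choose_plan(). Real optimizers use much richer statistics.

```python
import math

def candidate_plans(table_rows, matching_rows, has_index):
    """Return hypothetical plans with a cost counted in rows touched."""
    plans = [{"name": "full table scan", "cost": table_rows}]
    if has_index:
        # Assumed cost: one tree descent (about log2 n steps) plus one visit per match.
        descent = math.ceil(math.log2(max(table_rows, 2)))
        plans.append({"name": "index range scan", "cost": descent + matching_rows})
    return plans

def choose_plan(plans):
    """Query optimization in miniature: pick the plan with the lowest estimated cost."""
    return min(plans, key=lambda p: p["cost"])

# Few matching rows in a big table: the index is much cheaper.
selective = choose_plan(candidate_plans(1_000_000, 50, has_index=True))
# Every row matches: going through the index only adds overhead.
unselective = choose_plan(candidate_plans(1_000, 1_000, has_index=True))
# No index available: the scan is the only candidate.
no_index = choose_plan(candidate_plans(1_000_000, 50, has_index=False))
```

Under this model the same query shape gets different plans depending on the estimates, which is why the optimizer needs a cost estimate for each operation.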

The differences between the DBMS approach and the file system approach are as follows:

Basis                    DBMS Approach                         File System Approach
-----------------------  ------------------------------------  ------------------------------------
Sharing of data          Due to the centralized approach,      Data is distributed in many files,
                         data sharing is easy.                 which may be in different formats,
                                                               so it isn't easy to share data.

Security and Protection  DBMS provides a good protection       It isn't easy to protect a file
                         mechanism.                            under the file system.

Recovery Mechanism       DBMS provides a crash recovery        The file system doesn't have a crash
                         mechanism, i.e., DBMS protects the    recovery mechanism, i.e., if the
                         user from system failure.             system crashes while entering some
                                                               data, the content of the file may be
                                                               lost.

Manipulation Techniques  DBMS contains a wide variety of       The file system can't efficiently
                         sophisticated techniques to store     store and retrieve the data.
                         and retrieve the data.

Concurrency Problems     DBMS takes care of concurrent         In the file system, concurrent
                         access to data using some form of     access causes many problems, such as
                         locking.                              one user reading a file while
                                                               another deletes or updates some
                                                               information in it.

Cost                     The database system is expensive to   The file system approach is cheaper
                         design.                               to design.

Data Redundancy and      Due to the centralization of the      The files and application programs
Inconsistency            database, the problems of data        are created by different
                         redundancy and inconsistency are      programmers, so there is a lot of
                         controlled.                           duplication of data, which may lead
                                                               to inconsistency.

Structure                The database structure is complex     The file system approach has a
                         to design.                            simple structure.

Data Independence        In this system, data independence     In the file system approach, there
                         exists.                               is no data independence.

Integrity Constraints    Integrity constraints are easy to     Integrity constraints are difficult
                         apply.                                to implement in a file system.

Data Models              In the database approach, three       In the file system approach, there
                         types of data models exist.           is no concept of data models.

Flexibility              Changes to the content of stored      The flexibility of the system is
                         data are often necessary, and they    less than with the DBMS approach.
                         are made more easily with a database
                         approach.

Examples                 Oracle, SQL Server, Sybase, etc.      File handling in Cobol, C++, etc.
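
Two rows of the comparison, integrity constraints and protection against half-finished updates, can be demonstrated with sqlite3. This is a hedged sketch: the account table, its CHECK constraint and the transfer() function are assumptions made for the example. It shows a failed transaction being rolled back and does not simulate a real system crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint: a balance may never become negative.
conn.execute(
    "CREATE TABLE account ("
    " acc_id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 200)         # allowed: 500 -> 300 and 100 -> 300
rejected = transfer(conn, 1, 2, 1000)  # would make account 1 negative

balances = conn.execute(
    "SELECT acc_id, balance FROM account ORDER BY acc_id"
).fetchall()
```

The rejected transfer leaves no trace: the credit to account 2 is undone together with the failed debit. With plain files the application would have to implement this behaviour itself.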
