Data Abstraction in DBMS
Database systems are built from complex data structures. To make interaction with the
database easier, developers hide internal details that are irrelevant to users. This process of
hiding irrelevant details from the user is called data abstraction. The term “irrelevant” is used
here with respect to the user; it does not mean that the hidden data is irrelevant to the
database as a whole. It just means that the user is not concerned with that data.
For example: When you are booking a train ticket, you are not concerned with how data is
processed at the back end when you click “book ticket”, or with what processes run
while you make an online payment. You are only concerned with the message that
pops up when your ticket is successfully booked. This doesn’t mean that the processing
at the back end is not relevant; it just means that you, as a user, are not
concerned with what is happening in the database.
Physical level: This is the lowest level of data abstraction. It describes how data is
actually stored in the database. The complex data structures are described in detail at this level.
Logical level: This is the middle level of the 3-level data abstraction architecture. It describes
what data is stored in the database (entities, relationships, data types, etc.).
View level: This is the highest level of data abstraction. It describes the user's interaction
with the database system and exposes only a part of the entire database.
Example: Let’s say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes, etc.)
in memory. These details are often hidden from programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented.
Programmers generally work at this level because they are aware of such details of
database systems.
At the view level, users just interact with the system through a GUI and enter details on
the screen. They are not aware of how the data is stored or what data is stored; such
details are hidden from them.
DBMS Schema
Definition of schema: The design of a database is called its schema. For example:
An employee table in database exists with the following attributes:
In the following diagram, we have a schema that shows the relationship between three
tables: Course, Student and Section. The diagram only shows the design of the
database; it doesn’t show the data present in those tables. A schema is only a structural
view (design) of a database, as shown in the diagram below.
The design of a database at the physical level is called the physical schema; it describes
how the data is stored in blocks of storage.
The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described as certain types of
data records stored in data structures; however, internal details such as the
implementation of those data structures are hidden at this level (they are available at the
physical level).
The design of a database at the view level is called the view schema. It generally describes
end-user interaction with the database system.
DBMS Instance
Definition of instance: The data stored in a database at a particular moment in time is
called an instance of the database. The database schema defines the attributes of the tables
that belong to a particular database. The values of these attributes at a given moment in time
form the instance of that database.
For example, we have seen the schema of the table “employee” above. Let’s see the table
with the data now. At this moment the table contains two rows (records). This is the
current instance of the table “employee” because this is the data stored in the table
at this particular moment in time.
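The difference between schema and instance can be sketched as follows. The attribute names (emp_id, emp_name, salary) and the sample rows are assumptions made for illustration, since the attributes of the original employee table are not listed in these notes. The sketch reads the schema with SQLite's PRAGMA table_info and shows that the schema stays fixed while the instance changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical employee schema; the original attribute list is not available.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)

def schema(conn):
    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

def instance(conn):
    # The instance is simply the data present right now.
    return conn.execute("SELECT * FROM employee ORDER BY emp_id").fetchall()

schema_before = schema(conn)

conn.execute("INSERT INTO employee VALUES (1, 'Meera', 12000)")
conn.execute("INSERT INTO employee VALUES (2, 'John', 9000)")
instance_t1 = instance(conn)   # instance at one moment: two rows

conn.execute("UPDATE employee SET salary = 9500 WHERE emp_id = 2")
instance_t2 = instance(conn)   # a later instance with different values

schema_after = schema(conn)

print(schema_before == schema_after)  # the design did not change
print(instance_t1 != instance_t2)     # the stored data did
```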
Object based logical Models – Describe data at the conceptual and view levels.
E-R Model
1. Relational Model
2. Hierarchical Model
3. Network Model – The network model is the same as the hierarchical model except that it
has a graph-like structure rather than a tree-based structure. Unlike the hierarchical
model, it allows each record to have more than one parent record.
Physical Data Models – These models describe data at the lowest level of abstraction.
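The difference between the hierarchical and network models described above comes down to how many parents a record may have. A minimal sketch, assuming hypothetical record types (Department, Instructor, Course, Section) that are not taken from the notes:

```python
# Hierarchical model: every record type has at most one parent (a tree).
hierarchical_parents = {
    "Department": [],
    "Course": ["Department"],
    "Section": ["Course"],
}

# Network model: a record type may have several parents (a graph).
network_parents = {
    "Department": [],
    "Instructor": ["Department"],
    "Course": ["Department"],
    "Section": ["Course", "Instructor"],  # two parents
}

def has_single_parents(parents):
    """Return True when no record type has more than one parent."""
    return all(len(p) <= 1 for p in parents.values())

print(has_single_parents(hierarchical_parents))
print(has_single_parents(network_parents))
```

The check only looks at the number of parents per record type; it does not validate anything else about the structure.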
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one level
of the database system without altering the schema at the next higher level.
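Data independence can be illustrated with a small sqlite3 sketch. The table, view, and index names below are assumptions made for this example. A query written against a view keeps returning the same result after the logical schema gains a column and after an index is added at the physical level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical names, used only for illustration.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Meera", 12000), (2, "John", 9000)],
)
conn.execute("CREATE VIEW employee_names AS SELECT emp_id, emp_name FROM employee")

user_query = "SELECT * FROM employee_names ORDER BY emp_id"
before = conn.execute(user_query).fetchall()

# Change at the logical level: add a column to the table.
# The view (the next higher level) does not have to change.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")
after_logical_change = conn.execute(user_query).fetchall()

# Change at the physical level: add an index.
# The logical schema and the view stay the same.
conn.execute("CREATE INDEX idx_employee_name ON employee (emp_name)")
after_physical_change = conn.execute(user_query).fetchall()

print(before == after_logical_change == after_physical_change)
```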
1. Query Processor: It interprets the requests (queries) received from the end user via an
application program into instructions. It also executes the user request received
from the DML compiler.
The query processor contains the following components –
DML Compiler: It translates DML statements into low-level instructions
(machine language) so that they can be executed.
DDL Interpreter: It processes DDL statements into a set of tables containing
metadata (data about data).
Embedded DML Pre-compiler: It converts DML statements embedded in an
application program into procedural calls.
Query Optimizer: It chooses an efficient evaluation plan for the instructions generated by
the DML compiler before they are executed.
2. Storage Manager: The storage manager is a program that provides an interface between
the data stored in the database and the queries received. It is also known as the Database
Control System. It maintains the consistency and integrity of the database by applying
the constraints and executing the DCL statements. It is responsible for updating, storing,
deleting, and retrieving data in the database.
It contains the following components –
Authorization Manager: It ensures role-based access control, i.e., it checks whether
a particular person is privileged to perform the requested operation.
File Manager: It manages the file space and the data structures used to represent
information in the database.
Buffer Manager: It is responsible for cache memory and the transfer of data
between secondary storage and main memory.
Data Dictionary: It contains information about the structure of every database
object. It is the repository that holds the database's metadata.
Indices: They provide faster retrieval of data items.
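As an illustration of the data dictionary, SQLite keeps its metadata in a catalog table named sqlite_master, which can be queried like any other table. The employee table and the index name below are hypothetical; the sketch only shows that DDL statements end up as rows of metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL statements: their effect is recorded in the catalog (data about data).
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")

# sqlite_master plays the role of the data dictionary in SQLite:
# one row per table, index, view, or trigger.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

for entry in catalog:
    print(entry)
```

Other database systems expose their catalog differently, so this is specific to SQLite.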
The working of query processing can be understood from the diagram described below:
Suppose a user executes a query. As we have learned, there are various methods of extracting
data from the database. Suppose that, in SQL, a user wants to fetch the records of the employees
whose salary is greater than or equal to 10000. For this, the following query is used:
To make the system understand the user query, it needs to be translated into
relational algebra. The query can be written in relational algebra form as:
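The original query and its relational algebra form are not reproduced in these notes. As a hedged sketch of the idea, one possible form is a selection followed by a projection, roughly π emp_name (σ salary ≥ 10000 (employee)). The relation contents and the attribute name emp_name are assumptions; the two operators can be written in a few lines of Python:

```python
# A relation modelled as a list of rows (hypothetical sample data).
employee = [
    {"emp_name": "Meera", "salary": 12000},
    {"emp_name": "John", "salary": 9000},
    {"emp_name": "Sara", "salary": 10000},
]

def select(relation, predicate):
    """Selection (sigma): keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection (pi): keep only the named attributes, dropping duplicates."""
    result = []
    for row in relation:
        projected = {attr: row[attr] for attr in attributes}
        if projected not in result:
            result.append(projected)
    return result

# pi emp_name ( sigma salary >= 10000 ( employee ) )
result = project(select(employee, lambda row: row["salary"] >= 10000), ["emp_name"])
print(result)
```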
Optimization
o The cost of query evaluation can vary for different types of queries. Because the system is
responsible for constructing the evaluation plan, the user does not need to write the query
efficiently.
o Usually, a database system generates an efficient query evaluation plan, one that minimizes the cost.
This task is performed by the database system and is known as Query Optimization.
o To optimize a query, the query optimizer needs an estimated cost for each
operation, because the overall cost depends on the memory allocated to the individual
operations, their execution costs, and so on.
Finally, after selecting an evaluation plan, the system evaluates the query and produces the
output of the query.
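The effect of plan selection can be observed in SQLite with EXPLAIN QUERY PLAN. This is a sketch under assumptions: the employee table, the generated rows, and the index name are invented for the example, and the exact plan text varies between SQLite versions. The same query is planned before and after an index on salary becomes available; the result stays the same while the chosen plan changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data for illustration.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (emp_name, salary) VALUES (?, ?)",
    [(f"emp{i}", 5000 + 100 * i) for i in range(200)],
)

query = "SELECT emp_name FROM employee WHERE salary >= 10000"

def plan(conn, sql):
    # The fourth column of each EXPLAIN QUERY PLAN row is a
    # human-readable description of one step of the plan.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_without_index = plan(conn, query)
rows_without_index = sorted(conn.execute(query).fetchall())

# Give the optimizer another access path to consider.
conn.execute("CREATE INDEX idx_employee_salary ON employee (salary)")
plan_with_index = plan(conn, query)
rows_with_index = sorted(conn.execute(query).fetchall())

print(plan_without_index)
print(plan_with_index)
print(rows_without_index == rows_with_index)
```

The user's SQL text is identical in both runs; only the evaluation plan picked by the system differs.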
The differences between the DBMS approach and the file system approach are as follows:
Basis | DBMS Approach | File System Approach
Sharing of data | Due to the centralized approach, data sharing is easy. | Data is distributed in many files, which may be in different formats, so it isn't easy to share data.
Security and Protection | DBMS provides a good protection mechanism. | It isn't easy to protect a file under the file system.
Recovery Mechanism | DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from system failure. | The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file will be lost.
Manipulation Techniques | DBMS contains a wide variety of sophisticated techniques to store and retrieve the data. | The file system can't efficiently store and retrieve the data.
Concurrency Problems | DBMS takes care of concurrent access to data using some form of locking. | In the file system, concurrent access causes many problems, such as one user reading the file while another is deleting or updating some information.
Cost | The database system is expensive to design. | The file system approach is cheaper to design.
Data Redundancy and Inconsistency | Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled. | The files and application programs are created by different programmers, so there is a lot of duplication of data, which may lead to inconsistency.
Structure | The database structure is complex to design. | The file system approach has a simple structure.
Data Independence | In this system, data independence exists. | In the file system approach, there is no data independence.
Integrity Constraints | Integrity constraints are easy to apply. | Integrity constraints are difficult to implement in a file system.
Data Models | In the database approach, 3 types of data models exist. | In the file system approach, there is no concept of data models.
Flexibility | Changes to the content of the data stored in any system are often necessary, and these changes are made more easily with a database approach. | The flexibility of the system is less than with the DBMS approach.
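Two rows of the comparison, integrity constraints and the recovery mechanism, can be sketched with sqlite3. The account table, the CHECK constraint, and the transfer function are hypothetical and serve only as an illustration. The constraint is declared once in the schema, and a transaction that violates it is rolled back as a whole, so no half-finished update remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint declared in the schema: a balance may never be negative.
conn.execute(
    "CREATE TABLE account ("
    " acc_id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; undo everything if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 200)         # valid transfer
rejected = transfer(conn, 1, 2, 1000)  # would make account 1 negative

balances = conn.execute(
    "SELECT acc_id, balance FROM account ORDER BY acc_id"
).fetchall()
print(ok, rejected, balances)
```

In the rejected transfer the credit to account 2 has already been applied when the debit fails, and the rollback removes it again. With plain files, the application would have to implement both the check and the undo logic itself.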