DBMS Chapter-1
DATA:
The term data can be defined as known or raw facts that can be recorded
or stored in a computer.
DATABASE:
A database is an organized collection of data. In other words, a database is a
systematic collection of information, typically stored in the form of tables.
TABLE:
A table is a collection of rows and columns.
ROW:
A row represents a set of related data. Example: (1, abc, 6586565656)
COLUMN:
A column is a set of data values of a particular type, one value for each row of
the table. Example: ID, CUST_NAME, PHONE, etc.
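The table, row and column definitions above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the table name customer is an assumption, while the column names and the sample row come from the examples above.

```python
import sqlite3

# In-memory database; "customer" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (ID INTEGER, CUST_NAME TEXT, PHONE TEXT)")

# One row: a set of related values, one value per column.
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (1, "abc", "6586565656"))

cursor = conn.execute("SELECT * FROM customer")
columns = [d[0] for d in cursor.description]  # the column names of the table
rows = cursor.fetchall()                      # the rows, each as a tuple

print(columns)
print(rows)
```

Here the list columns holds the column names and each tuple in rows is one row of the table.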
DBMS:
DBMS stands for Database Management System. It is software, or a set of
programs, that allows users to create, process, store, retrieve and manage data.
TECH-VISION INSTITUTE OF CS Mob No: 7030766381
Applications of DBMS:
ACID Properties –
A DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability
(normally shortened to ACID). These concepts are applied to transactions, which
manipulate data in a database. ACID properties help the database stay in a consistent
state in multi-transactional environments and in case of failure.
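Atomicity, the first of these properties, can be illustrated with a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as the DBMS; the account table and the transfer function are hypothetical names made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # second update violates the CHECK, so both are undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 has already been executed when the debit is rejected, yet the rollback removes it again: either all steps of the transaction take effect or none do.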
Multiuser and Concurrent Access –
A DBMS supports a multi-user environment and allows users to access and
manipulate data in parallel. There are restrictions on transactions when users
attempt to handle the same data item, but users are generally unaware of them.
Multiple views –
A DBMS offers multiple views for different users. A user in the Sales
department will have a different view of the database than a person working in the Production
department. This feature gives users a focused view of the database
according to their requirements.
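A rough sketch of the Sales and Production example, assuming SQLite as the DBMS. The product table, its columns and the two view names are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER, name TEXT, price REAL, units_in_production INTEGER)"
)
conn.execute("INSERT INTO product VALUES (1, 'widget', 9.5, 400)")

# Each department gets its own view over the same stored data.
conn.execute("CREATE VIEW sales_view AS SELECT id, name, price FROM product")
conn.execute(
    "CREATE VIEW production_view AS SELECT id, name, units_in_production FROM product"
)

sales = conn.execute("SELECT * FROM sales_view").fetchall()
production = conn.execute("SELECT * FROM production_view").fetchall()
print(sales)
print(production)
```

Both views read the same row of product, but each shows only the columns its department needs.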
Security –
Features like multiple views offer security to some extent, because users are
unable to access the data of other users and departments.
Backup and Recovery
A whole database can fail in many ways. If that happens and no copy exists, no one
can get the database back, and the company will suffer a big loss. The only solution
is to take backups of the database so that it can be restored whenever needed.
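The backup and restore cycle can be sketched as follows. This is only an illustration that assumes SQLite and Python 3.7 or later (for Connection.backup); both databases are in memory here, whereas a real backup would normally go to a separate file or machine. The orders table is a hypothetical name.

```python
import sqlite3

live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER, item TEXT)")
live.execute("INSERT INTO orders VALUES (1, 'pen')")
live.commit()

# Take a backup: copy the live database into a second database.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# Simulate a failure that destroys the data.
live.execute("DROP TABLE orders")
live.commit()

# Restore: copy the backup over the damaged database.
backup.backup(live)
restored = live.execute("SELECT * FROM orders").fetchall()
print(restored)
```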
Data Integrity
It protects the database against unauthorized changes and makes it more secure. It
allows only consistent and accurate data into the database.
Stores Any Kind of Data
A database management system should be able to store any kind of data. It
should not be restricted to employee names, salaries and addresses. Any kind of data that exists
in the real world can be stored in a DBMS, because we need to work with all the kinds of data
present around us.
Query Language
Queries are used to retrieve and manipulate data. A DBMS provides a strong
query language that makes this effective and efficient. Users can retrieve
any data they want from the database by applying different sets of queries.
Data Redundancy
Unlike traditional file-system storage, data redundancy in a DBMS is minimal or
absent. Data redundancy occurs when the same data is stored unnecessarily in
different places. It is reduced or eliminated in a DBMS because all data is
stored at a centralized location rather than being created separately by individual users for each
application.
Data Inconsistency
In traditional file-system storage, a change made by one user in one
application is not reflected in another application, even when both hold the same set
of details. This is not the case with a DBMS, because there is a single repository of
data that is defined once and accessed by many users, so the data stays consistent.
Data Backup and Recovery
This is another advantage of a DBMS: it provides a strong framework for data
backup. Users are not required to back up their data periodically and manually, because the
DBMS takes care of it automatically. Moreover, in case of a server crash, the DBMS restores
the database to its previous condition.
Data Security
Data security is a vital concept in a database. Only authorized users should be
allowed to access the database, and their identity should be authenticated using a username
and password. Unauthorized users should not be allowed to access the database under any
circumstances.
Sharing of Data
In a database, users can share the data among themselves.
There are various levels of authorization to access the data, so data can
be shared only when the correct authorization protocols are followed.
Data Integrity
Data integrity means that the data in the database is accurate and consistent.
It is very important because a DBMS can hold multiple databases, all of which
contain data that is visible to multiple users. So it is necessary to ensure that
the data is correct and consistent in all the databases and for all the users.
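One common way a DBMS keeps data accurate and consistent is through integrity constraints declared on a table. The sketch below assumes SQLite; the employee table and its constraints are hypothetical examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,         -- no two rows may share an id
        name   TEXT NOT NULL,               -- a name must always be given
        salary INTEGER CHECK (salary > 0)   -- salary must be positive
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'asha', 30000)")

rejected = []
bad_rows = [(1, "duplicate id", 100), (2, None, 100), (3, "negative", -5)]
for row in bad_rows:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(row[0])   # the DBMS refused the inconsistent row

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, count)
```

All three bad rows are refused, so the table still contains only the one valid row.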
Flexibility
The DBMS approach ensures that the database is adaptable to change.
Sometimes it may be necessary to change the structure of a database.
A modern DBMS allows certain types of change to the structure of the database without
affecting the existing stored data and existing application programs.
Increased Cost
Storing a huge amount of data requires a huge amount of space. Additionally,
running the DBMS requires more memory and fast processing power. So expensive
hardware and software that can provide all these facilities will be needed. As a result, an old
file-based system needs to be upgraded. This sophisticated hardware and software also
requires maintenance, which is very costly.
A DBMS requires a high initial investment in hardware and software. The size of the
investment depends on the size and functionality of the organization. The organization
also has to pay recurring annual maintenance costs.
Increased Vulnerability
A DBMS is capable of many things because it is centralized, but at the same
time centralization increases vulnerability. The failure of a single component can shut
down the whole system.
Complexity
A DBMS fulfils many requirements and solves many problems related to
databases, but all this functionality makes a DBMS an extremely complex piece of software.
Developers, designers, DBAs and end users of the database must have adequate DBMS skills
if they want to use it properly.
If they do not understand this complex software, it may cause loss of
data or database failure.
Technical staff required
An organization has many employees, and these employees can
perform many tasks outside their own domain, but they cannot be expected
to work on a DBMS without training.
A dedicated team of technical staff who can handle the DBMS is required, and as a
result the company has to pay their salaries.
Database failure
Data is key for any organization; if data is lost, the whole organization can
collapse. In a DBMS all the files are stored in a single database, so the chances
of a damaging database failure are higher.
Huge size
A DBMS is made to handle extremely large data volumes and queries, but due to its
complexity the DBMS software itself is large. As a result, it requires a lot of disk space and memory
to run efficiently.
System maintenance
The system should be maintained and updated with security measures frequently.
Privacy and Security
If information in the data centre gets corrupted, every user in the
organization will be in big trouble.
Database administrator(DBA)
A Database Administrator (DBA) is a person or team who defines the schema and also controls
the 3 levels of the database.
The DBA creates a new account ID and password for a user who needs to
access the database.
The DBA is also responsible for the security of the database and allows only
authorized users to access or modify it.
The DBA also monitors backup and recovery and provides technical support.
The DBA has a DBA account in the DBMS, which is called a system or superuser account.
The DBA repairs damage caused by hardware and/or software failures.
Naive/Parametric End Users:
Parametric end users are unsophisticated users who do not have any DBMS
knowledge but frequently use database applications in their daily life to get the
desired results.
For example, users of a railway ticket booking system are naive users. Clerks in a
bank are naive users because they do not have any DBMS knowledge but still use the
database to perform their given tasks.
System Analyst :
A System Analyst is a user who analyzes the requirements of parametric end
users and checks whether all the requirements of end users are satisfied.
Sophisticated Users :
Sophisticated users can be engineers, scientists or business analysts who are
familiar with the database. They can develop their own database applications according to
their requirements. They do not write program code; instead they interact with the database by
writing SQL queries directly through the query processor.
Data Base Designers :
Database Designers are the users who design the structure of the database,
which includes tables, indexes, views, constraints, triggers and stored procedures. They
control what data must be stored and how the data items are related.
Application Programmer :
Application Programmers are the back-end programmers who write the code for
the application programs. They are computer professionals. These programs could be
Query Processor :
It interprets the requests (queries) received from the end user via an
application program into instructions. It also executes the user request that is received
from the DML compiler.
Query Processor contains the following components –
1. DML Compiler –
It translates DML statements into low-level instructions (machine
language) so that they can be executed.
2. DDL Interpreter –
It processes DDL statements into a set of tables containing metadata
(data about data).
3. Query Optimizer –
It chooses an efficient way to execute the instructions generated by the DML Compiler.
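To get a feel for what a query processor decides, one can ask a real DBMS how it plans to run a query. The sketch below assumes SQLite and uses its EXPLAIN QUERY PLAN statement; the customer table, the index name and the helper function plan are hypothetical. The exact wording of the plan text differs between SQLite versions, so it is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(i, f"name{i}", "pune") for i in range(100)],
)

def plan(conn, sql):
    """Return the plan description SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(row[-1]) for row in rows)  # last column holds the description

query = "SELECT * FROM customer WHERE name = 'name42'"
before = plan(conn, query)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
after = plan(conn, query)    # with an index available, the plan can use it

print(before)
print(after)
```

The query text is identical in both cases; only the way the DBMS chooses to execute it changes once the index exists.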
Storage Manager :
The Storage Manager is a program that provides an interface between the
data stored in the database and the queries received. It is also known as the Database
Control System. It maintains the consistency and integrity of the database by applying
the constraints, and it executes the DCL statements. It is responsible for updating, storing,
deleting, and retrieving data in the database.
It contains the following components –
1. Authorization Manager –
It enforces role-based access control, i.e., it checks whether a particular
person is privileged to perform the requested operation.
2. Integrity Manager –
It checks the integrity constraints when the database is modified.
3. Transaction Manager –
It controls concurrent access by scheduling the operations of the
transactions it receives. Thus, it ensures that the database remains in a
consistent state before and after the execution of a transaction.
4. File Manager –
It manages the file space and the data structures used to represent
information in the database.
5. Buffer Manager –
It is responsible for cache memory and for the transfer of data between
secondary storage and main memory.
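The job of the buffer manager can be pictured with a toy model. This is a sketch only: the BufferManager class is hypothetical, a Python dict stands in for the disk, and least-recently-used (LRU) eviction is an assumption, since real systems use a variety of replacement policies.

```python
from collections import OrderedDict

class BufferManager:
    """Toy buffer pool that keeps at most `capacity` blocks in main memory."""

    def __init__(self, disk, capacity):
        self.disk = disk            # dict: block number -> contents (stand-in for disk)
        self.capacity = capacity
        self.pool = OrderedDict()   # blocks currently held in memory, oldest first
        self.disk_reads = 0

    def get_block(self, block_no):
        if block_no in self.pool:
            self.pool.move_to_end(block_no)   # hit: mark as most recently used
            return self.pool[block_no]
        self.disk_reads += 1                  # miss: fetch from secondary storage
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)     # evict the least recently used block
        self.pool[block_no] = self.disk[block_no]
        return self.pool[block_no]

disk = {0: "block-0", 1: "block-1", 2: "block-2"}
buf = BufferManager(disk, capacity=2)
for n in [0, 1, 0, 2, 1]:
    buf.get_block(n)

print(buf.disk_reads, list(buf.pool))
```

Of the five requests, the second request for block 0 is served from memory; the other four have to go to the disk.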
Disk Storage :
It contains the following components –
1. Data Files –
They store the data itself.
2. Data Dictionary –
It contains information about the structure of every database object.
It is the repository that holds the metadata.
3. Indices –
They provide faster retrieval of data items.
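Most DBMSs let you query the data dictionary like an ordinary table. As a sketch, SQLite keeps its catalog in a table called sqlite_master; the student table and the index created here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# sqlite_master is SQLite's catalog: one row of metadata per database object.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

for entry in catalog:
    print(entry)
```

The result describes the objects (a table and an index), not the student records themselves, which is what "data about data" means.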
Users:
There are 4 types of users who interact with the DBMS directly or indirectly:
1. Naïve user
2. Application programmer
3. Sophisticated user
4. Database administrator
Data abstraction
Data in a database is stored in very complex data structures, so a
user who wants to access data would not be able to do so if he or she had to go through
these data structures. To simplify the interaction between user and database, the DBMS hides
information that is not of interest to the user. This is called data abstraction: the developer hides
complexity from the user and presents an abstract view of the data.
Physical level:-
This is the lowest level of data abstraction, which describes how data is actually
stored in the database. This level describes the data structures and the access paths or indexing
used for accessing files.
Logical level:-
The next level of abstraction describes what data is stored in the
database and what relationships exist among that data.
View level:-
At this level the user only interacts with the database, and the complexity remains
hidden. The user sees the data, and there may be many views of the same data, such as a chart or a graph.
1. Schema:
The design of a database is called the schema. It is the skeleton
structure that represents the logical view of the entire database. It defines how data is
organized and how the relationships among data are associated. It formulates all the
constraints that are to be applied to the data.
A database system has various schemas:
1. Physical schema
2. Logical schema
3. View schema
The logical schema defines all the logical constraints that need to be applied to the stored data.
It defines tables, views, and integrity constraints.
It defines the relationships between tables and the keys applied.
3.View Schema:
2. Instances:
The data stored in a database at a particular moment in time is called an instance of the database.
1.8 Database Languages:
A DBMS must provide appropriate languages and interfaces for each
category of users to express database queries and updates. Database languages
are used to create and maintain a database on a computer.
Following are the database languages:
1. Data Definition Language (DDL)
2. Data Manipulation Language (DML)
3. Data Control Language (DCL)
4. Transaction Control Language (TCL)
DDL is a language that allows users to define data and its relationship to
other types of data. It is mainly used to create files, databases, data dictionaries and tables within
databases.
It is also used to specify the structure of each table, the set of values associated
with each attribute, integrity constraints, security and authorization information for each table,
and the physical storage structure of each table on disk.
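A few typical DDL statements (CREATE, ALTER and DROP) can be tried as follows. The sketch assumes SQLite's dialect of SQL; the student table and the helper function table_names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(conn):
    """List the tables that currently exist in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [r[0] for r in rows]

# CREATE defines a new table and its structure.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# ALTER changes the structure of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
after_create = table_names(conn)

# DROP removes the table definition together with its data.
conn.execute("DROP TABLE student")
after_drop = table_names(conn)

print(columns, after_create, after_drop)
```

None of these statements insert or read records; they only change the definition of what can be stored.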
The following table gives an overview about the usage of DDL statements:
DCL statements control access to data and the database using statements
such as GRANT and REVOKE. A privilege can be granted to a user with the GRANT
statement. The privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX,
etc. A privilege that has been granted can also be revoked (taken back) with the REVOKE
command.
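The effect of GRANT and REVOKE can be modelled in a few lines. This is a plain Python sketch rather than real DCL, because SQLite (the DBMS bundled with Python) does not implement GRANT and REVOKE; the AccessControl class, the user ravi and the table customer are all hypothetical.

```python
class AccessControl:
    """Toy privilege table: maps (user, table) to the set of granted privileges."""

    def __init__(self):
        self.privileges = {}

    def grant(self, user, table, *privs):
        # Comparable in spirit to: GRANT SELECT, INSERT ON customer TO ravi;
        self.privileges.setdefault((user, table), set()).update(privs)

    def revoke(self, user, table, *privs):
        # Comparable in spirit to: REVOKE INSERT ON customer FROM ravi;
        self.privileges.get((user, table), set()).difference_update(privs)

    def is_allowed(self, user, table, priv):
        return priv in self.privileges.get((user, table), set())

acl = AccessControl()
acl.grant("ravi", "customer", "SELECT", "INSERT")
can_select = acl.is_allowed("ravi", "customer", "SELECT")

acl.revoke("ravi", "customer", "INSERT")
can_insert = acl.is_allowed("ravi", "customer", "INSERT")

print(can_select, can_insert)
```

After the revoke, ravi can still read the customer table but can no longer insert into it.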
The following table gives an overview about the usage of DCL statements:
File –
A file is a named collection of related information that is recorded on secondary
storage such as magnetic disks, magnetic tapes and optical disks.
File Organization-
File organization refers to the logical relationships among the various records that
constitute the file, particularly with respect to the means of identification and access to any
specific record. In simple terms, storing the files in a certain order is called file organization. File
structure refers to the format of the label and data blocks and of any logical control record.
upon the programmer to decide the best suited file Organization method according to his
requirements.
Some types of File Organizations are :
1. Pile File Method – This method is quite simple: we store the
records in a sequence, i.e., one after another, in the order in which they are
inserted into the tables.
2. Sorted File Method – As the name suggests, in this method whenever a new
record has to be inserted, it is always inserted in sorted (ascending or descending)
order. Sorting of records may be based on the primary key or on any other key.
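The difference between the pile method and the sorted method shows up at insertion time. In this small sketch Python lists stand in for the files, and the record names and keys are made up for illustration.

```python
import bisect

# Pile file: a new record simply goes at the end, in arrival order.
pile_file = [(1, "R1"), (3, "R3"), (6, "R6")]
pile_file.append((2, "R2"))

# Sorted file: a new record is placed so that ascending key order is kept.
sorted_file = [(1, "R1"), (3, "R3"), (6, "R6")]
bisect.insort(sorted_file, (2, "R2"))

pile_keys = [key for key, _ in pile_file]
sorted_keys = [key for key, _ in sorted_file]
print(pile_keys)
print(sorted_keys)
```

In the pile file R2 ends up last, while in the sorted file it is placed between R1 and R3.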
--When a data block is full, the new record is stored in some other block. This new
data block need not be the very next data block; the DBMS can select any data block in
memory to store new records. The heap file is also known as an unordered file.
--Suppose we have five records R1, R3, R6, R4 and R5 in a heap and we want to
insert a new record R2. If data block 3 is full, R2 will be inserted into any
data block selected by the DBMS, let's say data block 1.
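The R2 example can be sketched as follows. The block capacity of two records and the rule "use the first block with free space" are assumptions made for this illustration; a real DBMS may choose the block differently.

```python
BLOCK_CAPACITY = 2   # records per data block (assumed for illustration)

def heap_insert(blocks, record):
    """Put the record in the first block with free space; return that block's index."""
    for i, block in enumerate(blocks):
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return i
    blocks.append([record])   # every block is full: allocate a new block
    return len(blocks) - 1

# Data blocks 1, 2 and 3 are stored at list positions 0, 1 and 2.
blocks = [["R1"], ["R3", "R6"], ["R4", "R5"]]
chosen = heap_insert(blocks, "R2")

print(chosen, blocks)
```

Data block 3 is full, so R2 goes into data block 1 (list position 0), with no regard to the order of the records.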
Pros-
It is a very good method of file organization for bulk insertion. If a large
amount of data needs to be loaded into the database at one time, this method
is best suited.
In case of a small database, fetching and retrieving records is faster than with
sequential organization.
Cons-
This method is inefficient for large databases because it takes time to search
for or modify a record.
In this method, there is no need to search or sort the entire file.
Each record is stored at a location determined by a hash function, so records appear to be placed randomly in memory.
--When a new record has to be inserted, its address is generated using the hash key
and the record is inserted directly at that address.
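A minimal sketch of this idea follows. The number of buckets and the hash function (key modulo the number of buckets) are assumptions chosen for the illustration, and the function names are hypothetical.

```python
NUM_BUCKETS = 4   # number of data blocks (assumed)

def bucket_address(key):
    """Hash function: maps a key to the address of a bucket (data block)."""
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    # The address is computed from the key; no searching or sorting is needed.
    buckets[bucket_address(key)].append((key, record))

def lookup(key):
    # Only the one bucket given by the hash function has to be examined.
    for k, record in buckets[bucket_address(key)]:
        if k == key:
            return record
    return None

for key, rec in [(10, "R10"), (7, "R7"), (14, "R14")]:
    insert(key, rec)

found = lookup(14)
missing = lookup(5)
print(found, missing, buckets)
```

Keys 10 and 14 both hash to bucket 2, so a lookup for 14 examines only that bucket and never touches the others.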
Pros-
Records need not be sorted after any transaction. Hence the effort of sorting is
reduced in this method.
Since the block address is given by the hash function, accessing any record is very fast.
Similarly, updating or deleting a record is also very quick.
Cons-
Since the records are scattered across memory rather than stored together,
memory is not used efficiently.
If the hash columns are frequently updated, the data block address also changes
accordingly. Each update generates a new address, which is also not acceptable.
1. Hierarchical Model:
In this model, a child node has only a single parent node. This model
efficiently describes many real-world relationships, such as the index of a book, recipes, etc.
2. Network Model:
This is an extension of the Hierarchical model. In this model data is
organized more like a graph, and a child node is allowed to have more than one parent node.
3. Entity-relationship Model
In this database model, relationships are created by dividing an object of
interest into an entity and its characteristics into attributes.
Different entities are related using relationships.
E-R models represent these relationships in pictorial
form to make them easier for different stakeholders to understand.
Relationship: A relationship tells how two entities are related. Example: Teacher works
for a department.
4. Relational Model
In this model, data is organized in two-dimensional tables, and
relationships are maintained by storing a common field.
This model was introduced by E. F. Codd in 1970, and since then it has
been the most widely used database model around the world.
The basic structure of data in the relational model is the table. All the
information related to a particular type is stored in the rows of that table.
Hence, tables are also known as relations in the relational model.
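How two tables are related through a common field can be sketched with the teacher and department example. SQLite is assumed as the DBMS, and the table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER)"
)
conn.execute("INSERT INTO department VALUES (10, 'Physics'), (20, 'Maths')")
conn.execute("INSERT INTO teacher VALUES (1, 'Meera', 10), (2, 'John', 20)")

# dept_id is the common field: it is stored in both tables and links them.
pairs = conn.execute("""
    SELECT teacher.name, department.dept_name
    FROM teacher JOIN department ON teacher.dept_id = department.dept_id
    ORDER BY teacher.teacher_id
""").fetchall()

print(pairs)
```

The join matches each teacher row with the department row that has the same dept_id value.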
Structure of index:
• The first column is the Search key that contains a copy of the primary key of the table.
These values are stored in sorted order so that the corresponding data can be accessed
quickly.
• The second column is the Data Reference or Pointer which contains a set of pointers
holding the address of the disk block where that particular key value can be found.
As primary keys are stored in sorted order, the performance of the searching operation is
quite efficient.
The primary index can be classified into two types:
1. Dense index
2. Sparse index.
1. Dense index:
The dense index contains an index record for every search key value in
the data file. It makes searching faster.
In this, the number of records in the index table is the same as the number
of records in the main table.
The index records hold the search key and a pointer to the actual record
on the disk.
2. Sparse index:
An index record appears only for some of the items in the data file. Each item
points to a block.
Instead of pointing to each record in the main table, the index
points to records in the main table at intervals.
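The two kinds of index can be compared with a small sketch. The block size of two records, the sample data and the function names are assumptions made for this illustration.

```python
import bisect

# Data file: blocks of records kept in key order (2 records per block, assumed).
blocks = [[(1, "a"), (2, "b")], [(3, "c"), (4, "d")], [(5, "e"), (6, "f")]]

# Dense index: one entry for every search key -> (block number, position in block).
dense = {
    key: (b, i)
    for b, block in enumerate(blocks)
    for i, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    b, i = dense[key]             # the pointer leads straight to the record
    return blocks[b][i][1]

def sparse_lookup(key):
    b = bisect.bisect_right(sparse_keys, key) - 1   # last block whose first key <= key
    if b < 0:
        return None
    for k, value in blocks[b]:    # then scan inside that one block
        if k == key:
            return value
    return None

print(len(dense), len(sparse_keys))
print(dense_lookup(4), sparse_lookup(4))
```

The dense index has six entries and the sparse index only three, at the price of a short scan inside the block for each sparse lookup.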
The previous scheme is a little confusing because one disk block is shared by records that
belong to different clusters. Using a separate disk block for each cluster is
considered a better technique.