DBMS Unit-1
Introduction to Databases:
A database is a collection of related data or information. Data means raw facts, such as text made up
of letters (A to Z) and digits (0-9), other characters, images, videos, and audio.
A database is an organized collection of structured information, or data, typically stored
electronically in a computer system.
A database may be generated and maintained manually or it may be computerized.
A database can be of any size and complexity. For example, a list of names and addresses may
consist of only a few hundred records, each with a simple structure.
An example of a large commercial database is Amazon.com. It contains data for over 20 million books,
CDs, videos, DVDs, games, electronics, apparel, and other items.
Database Management System (DBMS):
A database management system is a collection of programs that enables users to create and maintain a
database. The DBMS is a general-purpose software system that facilitates the processes of defining,
constructing, manipulating, and sharing databases among various users and applications.
Defining a database involves specifying the data types, structures, and constraints of the data to be
stored in the database.
Constructing the database is the process of storing the data on some storage medium that is controlled
by the DBMS.
Manipulating a database includes functions such as querying the database to retrieve specific data
and updating the database to reflect changes in the data.
Sharing a database allows multiple users and programs to access the database simultaneously.
An application program accesses the database by sending queries or requests for data to the DBMS.
Other important functions provided by the DBMS include protecting the database and maintaining it
over a long period of time.
Protection includes system protection against hardware or software malfunction (or crashes) and security
protection against unauthorized or malicious access.
A typical large database may have a life cycle of many years, so the DBMS must be able to maintain the
database system by allowing the system to evolve as requirements change over time.
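The defining, constructing, and manipulating functions described above can be sketched with Python's built-in sqlite3 module. This is only an illustration and not part of the original notes: the contact table, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for a full DBMS.

```python
import sqlite3


def demo_dbms_functions():
    # An in-memory SQLite database stands in for a full DBMS (assumption).
    conn = sqlite3.connect(":memory:")

    # Defining: specify the data types, structure, and constraints.
    conn.execute(
        "CREATE TABLE contact ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " address TEXT NOT NULL)"
    )

    # Constructing: store the data on the medium controlled by the DBMS.
    conn.executemany(
        "INSERT INTO contact (id, name, address) VALUES (?, ?, ?)",
        [(1, "Asha", "12 Temple Road"), (2, "Ravi", "4 Canal Street")],
    )

    # Manipulating: update the data to reflect a change, then query it.
    conn.execute("UPDATE contact SET address = ? WHERE id = ?", ("9 Hill View", 2))
    rows = conn.execute("SELECT name, address FROM contact ORDER BY id").fetchall()
    conn.close()
    return rows


print(demo_dbms_functions())
```

Sharing and protection are not shown here, because a single in-memory connection has only one user.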
**********************
An Example:
Consider the example of a university system: a UNIVERSITY database for maintaining information
concerning students, courses, and grades in a university environment. The STUDENT file stores data on each
student, the COURSE file stores data on each course, the SECTION file stores data on each section of a course,
and the GRADE_REPORT file stores the grades that students receive in the various sections they have
completed.
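One possible way to picture the four files is as four related tables. The sketch below uses Python's sqlite3 module. The file names come from the example above, but every column name and every sample row is a hypothetical choice made only for illustration.

```python
import sqlite3


def build_university():
    # Table names follow the notes; all columns and rows are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE STUDENT (student_number INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE COURSE (course_number TEXT PRIMARY KEY, course_name TEXT);
        CREATE TABLE SECTION (
            section_id INTEGER PRIMARY KEY,
            course_number TEXT REFERENCES COURSE(course_number),
            semester TEXT
        );
        CREATE TABLE GRADE_REPORT (
            student_number INTEGER REFERENCES STUDENT(student_number),
            section_id INTEGER REFERENCES SECTION(section_id),
            grade TEXT
        );
        INSERT INTO STUDENT VALUES (17, 'Smith');
        INSERT INTO COURSE VALUES ('CS1310', 'Intro to Computer Science');
        INSERT INTO SECTION VALUES (112, 'CS1310', 'Fall');
        INSERT INTO GRADE_REPORT VALUES (17, 112, 'B');
        """
    )
    return conn


def transcript(conn, student_number):
    # Join GRADE_REPORT, SECTION, and COURSE to list one student's grades.
    return conn.execute(
        """
        SELECT c.course_name, s.semester, g.grade
        FROM GRADE_REPORT AS g
        JOIN SECTION AS s ON s.section_id = g.section_id
        JOIN COURSE AS c ON c.course_number = s.course_number
        WHERE g.student_number = ?
        """,
        (student_number,),
    ).fetchall()


print(transcript(build_university(), 17))
```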
Database users are categorized based on their interaction with the database. There are seven types
of database users in a DBMS.
1. Database administrator (DBA): The DBA is responsible for authorizing access to the database,
coordinating and monitoring its use, and acquiring software and hardware resources as needed. The DBA
is also responsible for the security of the database, monitors backup and recovery, and provides
technical support. The DBA repairs damage caused by hardware and/or software failures and is the one
who grants privileges to users.
2. Naïve users: Naïve users have no DBMS knowledge but frequently use database applications in
their daily life. For example, users of a railway ticket booking system are naïve users.
3. Database designers: Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
4. System analysts: System analysts determine the requirements of end users, especially naive and
parametric end users, and develop specifications for standard canned transactions that meet these
requirements.
5. Application programmers: Application programmers implement these specifications as programs; then
they test, debug, document, and maintain these canned transactions.
6. Sophisticated users: Sophisticated users can be engineers, scientists, or business analysts who are
familiar with the database.
7. Casual users / temporary users: Casual users access the database only occasionally, but they may
require different information each time they access it.
*********************
Workers behind the scene:
In addition to those who design, use, and administer a database, others are associated with the design,
development, and operation of the DBMS software and system environment. These persons are typically not
interested in the database content itself. We call them the workers behind the scene.
DBMS system designers and implementers design and implement the DBMS modules and interfaces
as a software package. A DBMS is a very complex software system that consists of many components,
or modules, including modules for implementing the catalog, query language processing, interface
processing, accessing and buffering data, controlling concurrency, and handling data recovery and
security. The DBMS must interface with other system software such as the operating system and
compilers for various programming languages.
The network model in DBMS extends the hierarchical model so that it can represent many-to-many
relationships among records. It is often described as a simple and easy-to-construct database model.
The network model is based on a set of nodes (records) and links (relationships between them).
Relational Databases:
The relational data model introduced high-level query languages that provided an alternative to
programming language interfaces, making it much faster to write new queries. In the relational model,
data and relationships are represented by a collection of inter-related tables. Each table is a group of
columns and rows, where each column represents an attribute of an entity and each row represents a record.
Student-id   Student-name   Branch   Phone number
101          Rekha          MCA      9999555522
102          Geetha         MBA      9933446688
103          Kumar          MCA      8844998877
104          Veeru          MBA      1702271702
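The student table above can be loaded into a relational database and queried with a high-level query language. The following sketch assumes Python's built-in sqlite3 module. The column names student_id, student_name, branch, and phone_number are adaptations of the table headings, and the helper function is hypothetical.

```python
import sqlite3

# Rows copied from the student table in the notes.
STUDENTS = [
    (101, "Rekha", "MCA", "9999555522"),
    (102, "Geetha", "MBA", "9933446688"),
    (103, "Kumar", "MCA", "8844998877"),
    (104, "Veeru", "MBA", "1702271702"),
]


def students_in_branch(branch):
    conn = sqlite3.connect(":memory:")
    # Each column is an attribute; each inserted row is a record.
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY,"
        " student_name TEXT,"
        " branch TEXT,"
        " phone_number TEXT)"
    )
    conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", STUDENTS)
    # A declarative query: state what is wanted, not how to find it.
    cursor = conn.execute(
        "SELECT student_name FROM student WHERE branch = ? ORDER BY student_id",
        (branch,),
    )
    names = [row[0] for row in cursor]
    conn.close()
    return names


print(students_in_branch("MCA"))  # ['Rekha', 'Kumar']
```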
Object-Oriented Applications:
Real-world entities and situations are represented as objects in the object-oriented database model.
In this model, both the data and the relationships are held in a single structure known as an object,
and two or more objects are connected through links.
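The idea of objects that hold both data and links to other objects can be pictured with ordinary Python classes. This is a loose analogy rather than an object-oriented DBMS, and the Student and Course classes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Course:
    title: str


@dataclass
class Student:
    name: str
    # Links to other objects are stored inside the object itself.
    courses: List[Course] = field(default_factory=list)

    def enroll(self, course: Course) -> None:
        # Behaviour (a method) lives in the same structure as the data.
        self.courses.append(course)

    def course_titles(self) -> List[str]:
        return [course.title for course in self.courses]


student = Student("Rekha")
student.enroll(Course("DBMS"))
print(student.course_titles())  # ['DBMS']
```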
****************************
Schemas and Instances:
Schemas:
A schema is the overall description of a database. In simpler words, the schema is the basic structure
that defines how data is stored in the database. There are basically two types of schema: physical
schema and logical schema.
Physical Schema – This schema describes the database designed at a physical level.
Logical Schema – This schema describes the database designed at a logical level.
Instances:
The data stored in a database at a particular moment in time is called an instance of the database. The
database schema defines the attributes of the database in a particular DBMS, and the values of those
attributes at a particular moment in time form an instance of the database.
The major differences between schema and instance are as follows:
1. Database Schema: It is the definition (description) of the database.
   Database Instance: It is a snapshot of the database at a specific moment.
2. Database Schema: It corresponds to a variable declaration in a programming language.
   Database Instance: It corresponds to the value of the variable in a program at a point in time.
3. Database Schema: It rarely changes.
   Database Instance: It changes frequently.
4. Database Schema: It defines the basic structure of the database, i.e., how the data will be stored.
   Database Instance: It is the set of information stored at a particular time.
5. Database Schema: It is the same for the whole database.
   Database Instance: Its data can be changed by addition, deletion, and updating.
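The difference can be observed directly in a small experiment. The sketch below assumes Python's built-in sqlite3 module and a hypothetical student table: the schema is read with SQLite's PRAGMA table_info, the instance is read with SELECT, and only the instance changes when data is inserted or updated.

```python
import sqlite3


def table_schema(conn):
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]


def table_instance(conn):
    return conn.execute("SELECT * FROM student ORDER BY student_id").fetchall()


def schema_vs_instance():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY, student_name TEXT, branch TEXT)"
    )
    conn.execute("INSERT INTO student VALUES (101, 'Rekha', 'MCA')")
    schema_before, instance_before = table_schema(conn), table_instance(conn)

    # Addition and updating change the instance, not the schema.
    conn.execute("INSERT INTO student VALUES (102, 'Geetha', 'MBA')")
    conn.execute("UPDATE student SET branch = 'MBA' WHERE student_id = 101")
    schema_after, instance_after = table_schema(conn), table_instance(conn)
    conn.close()
    return schema_before, instance_before, schema_after, instance_after


before_schema, before_rows, after_schema, after_rows = schema_vs_instance()
print(before_schema == after_schema)  # True: the schema did not change
print(before_rows == after_rows)      # False: the instance did
```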
***********************
Three-Schema Architecture:
The goal of the three-schema architecture is to separate the user applications from the physical
database. In this architecture, schemas can be defined at the following three levels:
1. Internal Level
2. Conceptual Level
3. External Level
**********************
Data Independence:
Data Independence can be defined as the capacity to change the schema at one level of a database
system without having to change the schema at the next higher level.
We can define two types of data independence:
1. Logical data independence.
2. Physical data independence.
Logical data independence:
Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs. We may change the conceptual schema to expand the
database (by adding a record type or data item), to change constraints, or to reduce the database (by
removing a record type or data item).
Physical data independence:
Physical data independence is the capacity to change the internal schema without having to change the
conceptual schema. Hence, the external schemas need not be changed either. Changes to the internal
schema may be needed because some physical files were reorganized, for example, by creating
additional access structures to improve the performance of retrieval or update. If the same data as before
remains in the database, we should not have to change the conceptual schema.
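Both kinds of data independence can be imitated on a small scale. In the sketch below, which assumes Python's built-in sqlite3 module, a view plays the role of an external schema, the base table plays the role of the conceptual schema, and an index plays the role of an internal-level access structure. The table, view, and index names are hypothetical, and this mapping is a simplification of the three-schema architecture.

```python
import sqlite3


def independence_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY, student_name TEXT, branch TEXT)"
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(101, "Rekha", "MCA"), (102, "Geetha", "MBA")],
    )
    # External schema (assumption): applications only ever query this view.
    conn.execute(
        "CREATE VIEW student_names AS SELECT student_id, student_name FROM student"
    )
    query = "SELECT student_name FROM student_names ORDER BY student_id"
    before = conn.execute(query).fetchall()

    # Logical data independence: expand the conceptual schema with a data item.
    conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
    after_logical = conn.execute(query).fetchall()

    # Physical data independence: add an access structure at the internal level.
    conn.execute("CREATE INDEX idx_student_name ON student (student_name)")
    after_physical = conn.execute(query).fetchall()

    conn.close()
    return before, after_logical, after_physical


print(independence_demo())
```

The application query is never edited, yet it returns the same rows after each change.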
*************************
Once the design of a database is completed and a DBMS is chosen to implement the database, the first
step is to specify conceptual and internal schemas for the database and any mappings between the two.
The database languages are:
1. Data Definition Language (DDL).
2. Data Manipulation Language (DML).
3. Data Control Language (DCL).
4. Transaction Control Language (TCL).
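A short sketch can show three of these languages in action, assuming Python's built-in sqlite3 module and a hypothetical account table. SQLite does not implement DCL statements such as GRANT and REVOKE, so DCL appears only as a comment.

```python
import sqlite3


def languages_demo():
    conn = sqlite3.connect(":memory:")

    # DDL: define the structure of the data.
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

    # DML: insert data (SELECT, UPDATE, and DELETE are also DML).
    conn.execute("INSERT INTO account VALUES (1, 500)")

    # TCL: make the change permanent.
    conn.commit()

    # DML inside a new transaction, followed by TCL to undo it.
    conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
    conn.rollback()

    # DCL (not supported by SQLite) would look like this in a server DBMS:
    #   GRANT SELECT ON account TO some_user;

    balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
    conn.close()
    return balance


print(languages_demo())  # 500, because the UPDATE was rolled back
```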
Database Users:
Users are differentiated by the way they expect to interact with the system:
Application programmers:
Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces.
Sophisticated users:
Sophisticated users interact with the system through application programs and database languages.
Sophisticated users have knowledge of multiple tools.
Naïve users:
Naive users are unsophisticated users who interact with the system without any DBMS knowledge by
invoking one of the application programs that have been written previously.
DBMS M L REKHA BVCITS-AMALAPURAM
Database Administrator:
Coordinates all the activities of the database system. Database administrator's duties include:
Schema definition: The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
Storage structure and access method definition.
Schema and physical organization modification
Granting user authority to access the database: By granting different types of authorization, the
database administrator can regulate which parts of the database various users can access.
Specifying integrity constraints.
Monitoring performance and responding to changes in requirements.
Backup and recovery of the data.
Providing security.
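Of the duties listed above, specifying integrity constraints is easy to illustrate. The sketch below assumes Python's built-in sqlite3 module. The student table, its constraints, and the try_insert helper are hypothetical examples, not a prescribed design.

```python
import sqlite3


def make_student_table():
    conn = sqlite3.connect(":memory:")
    # Integrity constraints are declared once, in the schema definition.
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY,"
        " student_name TEXT NOT NULL,"
        " branch TEXT CHECK (branch IN ('MCA', 'MBA')))"
    )
    return conn


def try_insert(conn, row):
    # The DBMS, not the application, rejects rows that violate a constraint.
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


conn = make_student_table()
print(try_insert(conn, (101, "Rekha", "MCA")))   # accepted
print(try_insert(conn, (101, "Geetha", "MBA")))  # rejected: duplicate primary key
print(try_insert(conn, (102, None, "MBA")))      # rejected: name is NULL
print(try_insert(conn, (103, "Kumar", "XYZ")))   # rejected: unknown branch
```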
Query Processor:
The query processor accepts a query from the user and answers it by accessing the database.
Parts of Query processor:
DDL interpreter
This interprets DDL statements and records the definitions in the data dictionary.
DML compiler
This translates DML statements in a query language into low-level instructions that the
query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans that give the
same result. The DML compiler performs query optimization by selecting the lowest-cost plan among them.
Query evaluation engine
This engine executes the low-level instructions generated by the DML compiler.
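The choice between alternative evaluation plans can be observed in SQLite, which reports its chosen plan through the EXPLAIN QUERY PLAN statement. The sketch below assumes Python's built-in sqlite3 module. The student table and the index name idx_student_branch are hypothetical, and the exact wording of the plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)


def compare_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY, student_name TEXT, branch TEXT)"
    )
    sql = "SELECT student_name FROM student WHERE branch = ?"

    # Without an index, the only available plan reads the whole table.
    plan_without_index = plan_for(conn, sql, ("MCA",))

    # With an index on branch, the optimizer has a second plan to choose from.
    conn.execute("CREATE INDEX idx_student_branch ON student (branch)")
    plan_with_index = plan_for(conn, sql, ("MCA",))

    conn.close()
    return plan_without_index, plan_with_index


for plan in compare_plans():
    print(plan)
```

The query text is identical in both runs. Only the plan selected by the query processor changes.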
Storage Manager/Storage Management:
A storage manager is a program module that acts as the interface between the data stored in a database
and the application programs and queries submitted to the system. Thus, the storage manager is responsible for
storing, retrieving and updating data in the database.
The storage manager components include:
Authorization and integrity manager: Checks for integrity constraints and authority of users to access
data.
Transaction manager: Ensures that the database remains in a consistent state despite system
failures.
File manager: Manages the allocation of space on disk storage and the data structures used to represent
information stored on disk.
Buffer manager: It is responsible for retrieving data from disk storage into main memory. It enables the
database to handle data sizes that are much larger than the size of main memory.
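The role of the buffer manager can be sketched as a small page cache. The notes do not say which replacement policy is used, so the least-recently-used (LRU) policy below is an assumption, as are the BufferPool class and the dictionary that stands in for disk storage. Writing modified pages back to disk is left out.

```python
from collections import OrderedDict


class BufferPool:
    """Keeps at most `capacity` pages in memory (hypothetical sketch)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: page_id -> contents, stands in for disk
        self.capacity = capacity
        self.frames = OrderedDict()   # pages currently held in main memory
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Memory is full: evict the least recently used page (assumed policy).
            self.frames.popitem(last=False)
        # Miss: fetch the page from disk into main memory.
        self.disk_reads += 1
        self.frames[page_id] = self.disk[page_id]
        return self.frames[page_id]


disk = {i: f"page-{i}" for i in range(5)}   # five pages on "disk"
pool = BufferPool(disk, capacity=2)         # room for only two in memory
for page_id in [0, 1, 0, 2, 1]:
    pool.get_page(page_id)
print(pool.disk_reads)  # 4: one request was served from memory
```

All five pages remain reachable even though only two fit in memory at once, which is the sense in which the database can be larger than main memory.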
Memory storage:
The storage manager implements the following data structures:
Data files: Store the database itself.
Data dictionary: Stores metadata about the structure of the database.
Indices: Provide fast access to data items.
Statistical data: Stores statistical information about the data in the database.
*********************
Classification of Database Management Systems:
The classification of database management systems is based on various parameters, such as the kind of
data model used to construct the DBMS, the number of users that will be using the database system, and the
way in which the database is distributed:
1. Based on Data Model
2. Based on Number of Users
3. Based on Database Distribution
4. Based on Cost of Database
5. Based on Usage
Based on Data Model:
Depending upon how the data is structured, data models are classified into:
Relational Data Model:
In the relational data model, we use tables to represent data and the relationships among that data. Each
table in the relational data model has a unique name. A table has multiple columns, and each column
name is unique within the table. A table holds records, each of which has a value for every column of
the table. The relational model is currently the most widely used data model.
Entity-Relationship Model
The Entity-Relationship model (E-R data model) represents data using objects and the relationship
among these objects. These objects are referred to as entities that represent the real ‘thing’ or ‘object’ in
the real world.
Object-Based Data Model
The object-based data model is an extension of the E-R model that also includes notions of
encapsulation and methods. There is also an object-relational data model, which combines the
object-oriented data model and the relational data model.
Semi-structured Data Model
In the semi-structured data model, unlike the models described above, data items or objects of the same
kind may have different sets of attributes. The Extensible Markup Language (XML) is widely used to
represent semi-structured data.
Hierarchical and Network Data Models
The hierarchical data model stores data in the form of records and uses a tree structure to represent
these records. The network data model was introduced later and allows a single child record to have
multiple parent records.
Based on Number of Users:
A database management system can also be classified on the basis of its number of users: a DBMS can
either be used by a single user or by multiple users.
A database system that can be used by only one user at a time is referred to as a single-user system.
A database system that can be used by multiple users at a time is referred to as a multi-user system.
Based on Database Distribution:
Depending on the distribution of the database over numerous sites we can classify the database as:
Centralized DBMS:
In the centralized DBMS, the entire database is stored at a single computer site. Although a centralized
database supports multiple users, the DBMS software and the data are both stored at a single
computer site.
Distributed DBMS
In the distributed DBMS (DDBMS) the database and the DBMS software are distributed over many
computer sites. These computer sites are connected via a computer network. The DDBMS is further
classified as homogeneous DDBMS and heterogeneous DDBMS.
Homogeneous DDBMS: The homogeneous DDBMS has the same DBMS software at all the
distributed sites.
Heterogeneous DDBMS: The heterogeneous DDBMS has different DBMS software for
different sites.
Based on Cost of Database:
It is difficult to classify DBMSs on the basis of cost, because free open-source DBMS products such as
MySQL and PostgreSQL are now available. A personal version of an RDBMS can cost up to $100, while the
installation and maintenance of a large database system may cost millions of dollars.
Based on Usage:
On the basis of usage, a DBMS can be classified as a general-purpose DBMS or a special-purpose DBMS.
The general-purpose DBMS is designed to meet the needs of as many applications as possible.
The special-purpose DBMS is designed for a specific application and cannot be used for another
application without major changes.
Online transaction processing (OLTP) systems are a common example of special-purpose systems: an OLTP
system must support a large number of concurrent transactions without excessive delay.
*****************************