
RDBMS

This document provides an overview of Database Management Systems (DBMS), including their purpose, architecture, and advantages over traditional file systems. It discusses the components of DBMS, types of architectures, data abstraction levels, and various database languages. Additionally, it highlights the importance of storage and query processors in managing large amounts of data efficiently.

Uploaded by

merita.281984
Copyright
© © All Rights Reserved

RDBMS UNIT - I

Introduction to DBMS– Data and Information - Database – Database Management System – Objectives
- Advantages – Components - Architecture. ER Model: Building blocks of ER Diagram – Relationship
Degree – Classification – ER diagram to Tables – ISA relationship – Constraints – Aggregation and
Composition – Advantages

Introduction to DBMS
A database is a collection of data, and a management system is a set of programs to store and
retrieve that data. Based on this we can define a DBMS like this: a DBMS is a collection of inter-
related data and a set of programs to store and access that data in an easy and effective manner.

What is the need of DBMS?


Database systems are developed mainly to handle large amounts of data. When dealing with
a huge amount of data, there are two things that require optimization: storage of data
and retrieval of data.

Storage:
According to the principles of database systems, the data is stored in such a way that it
occupies far less space, because redundant (duplicate) data is removed before storage.
Purpose of Database Systems
The main purpose of database systems is to manage data. Consider a university that
keeps the data of students, teachers, courses, books etc. To manage this data we need to store it
somewhere where we can add new data, delete unused data, update outdated data and retrieve
data. To perform these operations we need a database management system that allows us
to store the data in such a way that all these operations can be performed on it
efficiently.
Database Applications – DBMS
Applications where we use Database Management Systems are:

1. Telecom:
A database keeps track of the information regarding calls made, network
usage, customer details etc. Without database systems it is hard to maintain that huge amount
of data, which keeps updating every millisecond.
2. Industry:
Whether it is a manufacturing unit, warehouse or distribution centre, each one needs a
database to keep the records of ins and outs. For example, a distribution centre should keep track
of the product units that were supplied into the centre as well as the products that were delivered out
from the distribution centre on each day; this is where a DBMS comes into the picture.

3. Banking System:
For storing customer info, tracking day to day credit and debit transactions, generating
bank statements etc. All this work is done with the help of database management
systems.

4. Sales:
To store customer information, production information and invoice details.

5. Airlines:
To travel through airlines, we make early reservations; this reservation information along
with the flight schedule is stored in a database.
6. Education sector:
Database systems are frequently used in schools and colleges to store and retrieve the
data regarding student details, staff details, course details, exam details, payroll data, attendance
details, fees details etc. There is a large amount of inter-related data that needs to be stored
and retrieved in an efficient manner.

7. Online shopping:
Online shopping websites such as Amazon, Flipkart etc. store the product
information, customer addresses and preferences, and credit details, and provide the relevant list of
products based on a query. All this involves a database management system.

Advantages of DBMS over file system


This section discusses what a file processing system is and how database management systems
are better than file processing systems.

Drawbacks of File system


1. Data redundancy:
Data redundancy refers to the duplication of data. Let’s say we are managing the data of a
college where a student is enrolled for two courses. The same student details will in that case be
stored twice, which takes more storage than needed. Data redundancy often leads to higher
storage costs and poor access time.

2. Data inconsistency:
Data redundancy leads to data inconsistency. Let’s take the same example as
above: a student is enrolled for two courses and we have the student address stored twice. Now
let’s say the student requests a change of address. If the address is changed in one place and not in
all the records, the data becomes inconsistent.

3. Data Isolation:
Because data are scattered in various files, and files may be in different formats, writing
new application programs to retrieve the appropriate data is difficult.

4. Dependency on application programs:


Changing files would lead to change in application programs.

5. Atomicity issues:
Atomicity of a transaction means “all or nothing”: either all the
operations in a transaction execute or none of them do. A file system gives no such
guarantee, so a failure midway through an update can leave the files in a half-updated state.

DBMS Architecture
The architecture of a DBMS depends on the computer system on which it runs. For
example, in a client-server DBMS architecture, the database system at the server machine can serve
several requests made by client machines.

Types of DBMS Architecture


There are three types of DBMS architecture:
1. Single tier architecture
2. Two tier architecture
3. Three tier architecture

1. Single tier architecture:


In this type of architecture, the database is readily available on the client machine, so a
request made by the client doesn’t require a network connection to perform the action on the
database.
For example, let’s say you want to fetch the records of an employee from the database and
the database is available on your computer system. The request to fetch the employee details will
be made by your computer and the records will be fetched from the database by your computer as
well. This type of system is generally referred to as a local database system.

2. Two tier architecture:

In two-tier architecture, the database system is present at the server machine and the
DBMS application is present at the client machine. These two machines are connected with each
other through a reliable network as shown in the above diagram.

Whenever the client machine makes a request to access the database present at the server using a
query language like SQL, the server performs the request on the database and returns the
result back to the client. Application connection interfaces such as JDBC and ODBC are used
for the interaction between server and client.

3. Three tier architecture


In three-tier architecture, another layer is present between the client machine and the server
machine. In this architecture, the client application doesn’t communicate directly with the
database system present at the server machine; rather, the client application communicates with a
server application and the server application internally communicates with the database system
present at the server.

DBMS Three Level Architecture Diagram:

This architecture has three levels:


1. External level
2. Conceptual level
3. Internal level

1. External level
It is also called the view level. This level is called “view” because several users
can view their desired data from this level, which is internally fetched from the database with the
help of conceptual and internal level mapping.
The user doesn’t need to know database schema details such as data structures, table
definitions etc. The user is only concerned with the data that is returned to the view level
after it has been fetched from the database (present at the internal level).
The external level is the “top level” of the Three Level DBMS Architecture.

2. Conceptual level:
It is also called the logical level. The whole design of the database, such as the relationships
among data, the schema of the data etc., is described at this level.
Database constraints and security are also implemented at this level of the architecture.
This level is maintained by the DBA (database administrator).

3. Internal level:
This level is also known as physical level. This level describes how the data is actually
stored in the storage devices. This level is also responsible for allocating space to the data. This
is the lowest level of the architecture.

View of Data in DBMS

Abstraction is one of the main features of database systems. Hiding irrelevant details
from users and providing them with an abstract view of the data helps in easy and efficient user-database
interaction. In the previous tutorial, we discussed the three levels of DBMS architecture. The top
level of that architecture is the “view level”. The view level provides the “view of data” to the users
and hides irrelevant details such as data relationships, database schema, constraints, security
etc. from the user.

1. Data abstraction
2. Instance and schema

Data Abstraction in DBMS

Database systems are made up of complex data structures. To ease user interaction with the database, the developers
hide irrelevant internal details from users. This process of hiding irrelevant details from the user is called data
abstraction. We have three levels of abstraction:
Physical level:

This is the lowest level of data abstraction. It describes how data is actually stored in
database. You can get the complex data structure details at this level.

Logical level:
This is the middle level of 3-level data abstraction architecture. It describes what data is
stored in database.

View level:
Highest level of data abstraction. This level describes the user interaction with database
system.

Example:
Let’s say we are storing customer information in a customer table. At the physical
level these records can be described as blocks of storage (bytes, gigabytes, terabytes etc.) in
memory. These details are often hidden from the programmers.
At the logical level these records can be described as fields and attributes along with their
data types, and the relationships among them can be logically implemented. The programmers
generally work at this level because they are aware of such things about database systems.
At the view level, users just interact with the system with the help of a GUI and enter the details on
the screen; they are not aware of how the data is stored and what data is stored, as such details are
hidden from them. Hiding such details from the user is called data abstraction.
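
The logical level and the view level of this example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the column names and the view name customer_contact are made up here, and the physical level is left entirely to SQLite.

```python
import sqlite3

# Physical level: SQLite decides how the bytes are laid out; we never see it.
conn = sqlite3.connect(":memory:")

# Logical level: the programmer describes fields and their data types.
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, name TEXT, city TEXT, credit_limit REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Chennai', 50000.0)")

# View level: a view exposes only the part of the data a given user needs
# (hypothetical view name).
conn.execute("CREATE VIEW customer_contact AS SELECT name, city FROM customer")

view_rows = conn.execute("SELECT * FROM customer_contact").fetchall()
print(view_rows)
```

A user who queries customer_contact sees names and cities only; the credit_limit column and the storage details stay hidden.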

DBMS Schema
Definition of schema:
Design of a database is called the schema. Schema is of three types: Physical schema,
logical schema and view schema.

For example:
In the following diagram, we have a schema that shows the relationship between three
tables: Course, Student and Section. The diagram only shows the design of the database; it
doesn’t show the data present in those tables. A schema is only a structural view (design) of a
database, as shown in the diagram below.

The design of a database at the physical level is called the physical schema; how the data is stored
in blocks of storage is described at this level.

The design of a database at the logical level is called the logical schema. Programmers and database
administrators work at this level. Here data can be described as certain types of data
records stored in data structures; however, internal details such as the implementation of those data
structures are hidden at this level (they are available at the physical level).

The design of a database at the view level is called the view schema. This generally describes end user
interaction with database systems.
To learn more about these schemas, refer to the 3 level data abstraction architecture.

DBMS Instance
Definition of instance:
The data stored in a database at a particular moment of time is called an instance of the database.
The database schema defines the variable declarations in the tables that belong to a particular database;
the value of these variables at a moment of time is called the instance of that database.

For example, let’s say we have a single table student in the database. Today the table has
100 records, so today the instance of the database has 100 records. Let’s say we are going to
add another 100 records to this table by tomorrow, so the instance of the database tomorrow will
have 200 records in the table. In short, the data stored in the database at a particular moment is called
the instance, and it changes over time as we add or delete data.
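
The difference between schema and instance can be sketched as follows, assuming Python's sqlite3 module and a made-up student table with two columns. The helper names schema_of and instance_size are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema_of(table):
    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]

def instance_size(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

schema_before = schema_of("student")

# "Today": 100 records.
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(1, 101)])
size_today = instance_size("student")

# "Tomorrow": another 100 records are added.
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(i, f"student{i}") for i in range(101, 201)])
size_tomorrow = instance_size("student")

schema_after = schema_of("student")
print(size_today, size_tomorrow, schema_before == schema_after)
```

The instance grows from 100 to 200 records while the schema stays exactly the same.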

DBMS languages
Database languages are used to read, update and store data in a database. There are
several such languages that can be used for this purpose; one of them is SQL (Structured Query
Language).

Types of DBMS languages:


Data Definition Language (DDL)
DDL is used for specifying the database schema. It is used for creating tables, schemas,
indexes, constraints etc. in a database. Let’s see the operations that we can perform on a database
using DDL:
1. To create a database or its objects (tables, indexes, views etc.) – CREATE
2. To alter the structure of the database – ALTER
3. To drop a database – DROP
4. To remove all the records from a table while keeping the table itself – TRUNCATE
5. To rename database objects – RENAME
6. To drop objects from the database, such as tables – DROP
7. To add comments on database objects – COMMENT

All of these commands either define or update the database schema; that’s why they
come under Data Definition Language.
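
A few of these DDL commands can be tried out with Python's sqlite3 module, as in the sketch below. The table names are made up. Note that SQLite is only one possible DBMS: it spells RENAME as ALTER TABLE ... RENAME TO and has no TRUNCATE or COMMENT statement, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def tables():
    # sqlite_master is SQLite's own catalogue of schema objects.
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

# CREATE: define a new table.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE student ADD COLUMN branch TEXT")

# RENAME: give the table a new name.
conn.execute("ALTER TABLE student RENAME TO learner")
after_rename = tables()
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]

# DROP: remove the table and its definition.
conn.execute("DROP TABLE learner")
after_drop = tables()

print(after_rename, columns, after_drop)
```

Every statement here changes the schema only; no student data is read or written.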

Data Manipulation Language (DML)


DML is used for accessing and manipulating data in a database. The following operations
on a database come under DML:
1. To read records from table(s) – SELECT
2. To insert record(s) into the table(s) – INSERT
3. To update the data in table(s) – UPDATE
4. To delete record(s) from the table(s) – DELETE
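
The four DML operations can be sketched with Python's sqlite3 module. The student table and its rows are sample data made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, branch TEXT)")

# INSERT: add records.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(401, "Akon", "CSE"), (402, "Bkon", "CSE")])

# UPDATE: change data in existing records.
conn.execute("UPDATE student SET branch = 'ECE' WHERE roll_no = 402")

# DELETE: remove only the records that match the condition.
conn.execute("DELETE FROM student WHERE roll_no = 401")

# SELECT: read records back.
remaining = conn.execute("SELECT roll_no, name, branch FROM student").fetchall()
print(remaining)
```

Without a WHERE clause, UPDATE and DELETE would apply to every record in the table.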

Data Control language (DCL)


DCL is used for granting and revoking user access on a database –
1. To grant access to user – GRANT
2. To revoke access from user – REVOKE

In practice, the data definition language, data manipulation language and data control
language are not separate languages; rather, they are parts of a single database language such
as SQL.

Transaction Control Language (TCL)


The changes that we make to the database using DML commands are either made permanent or rolled
back using TCL.

1. To persist the changes made by DML commands in the database – COMMIT
2. To roll back the changes made to the database – ROLLBACK
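
COMMIT and ROLLBACK can be sketched with Python's sqlite3 module, where they are available as conn.commit() and conn.rollback(). The account table, the CHECK constraint and the transfer function are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "acc_no INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts; either both updates persist or neither."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                     (amount, dst))
        conn.execute("UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                     (amount, src))
        conn.commit()      # COMMIT: make both changes permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # ROLLBACK: undo the credit that was already applied
        return False

def balances():
    return conn.execute(
        "SELECT acc_no, balance FROM account ORDER BY acc_no").fetchall()

ok = transfer(1, 2, 200)
after_commit = balances()
failed = transfer(1, 2, 1000)   # would make account 1 negative
after_rollback = balances()
print(ok, after_commit, failed, after_rollback)
```

In the second transfer the credit to account 2 succeeds first, the debit from account 1 then violates the CHECK constraint, and the rollback removes the credit again, so the balances are unchanged.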

Data Base Architecture:


A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly divided
into the storage manager and the query processor components.

The storage manager is important because databases typically require a large amount of
storage space. Corporate databases range in size from hundreds of gigabytes to, for the largest
databases, terabytes of data. A gigabyte is approximately 1000 megabytes (strictly, 1024), or about 1
billion bytes, and a terabyte is about 1 million megabytes, or about 1 trillion bytes. Since the main memory of
computers cannot store this much information, the information is stored on disks. Data are
moved between disk storage and main memory as needed. Since the movement of data to and
from disk is slow relative to the speed of the central processing unit, it is imperative that the
database system structure the data so as to minimize the need to move data between disk and
main memory.

The query processor is important because it helps the database system to simplify and
facilitate access to data. The query processor allows database users to obtain good
performance while being able to work at the view level and not be burdened with understanding
the physical-level details of the implementation of the system. It is the job of the database system
to translate updates and queries written in a nonprocedural language, at the logical level, into an
efficient sequence of operations at the physical level.

Storage Manager
The storage manager is the component of a database system that provides the interface
between the low-level data stored in the database and the application programs and queries
submitted to the system. The storage manager is responsible for the interaction with the file
manager. The raw data are stored on the disk using the file system provided by the operating
system. The storage manager translates the various DML statements into low-level file-system
commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in
the database.
The storage manager components include:
1. Authorization and integrity manager, which tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
2. Transaction manager, which ensures that the database remains in a consistent (correct)
state despite system failures, and that concurrent transaction executions proceed
without conflicting.
3. File manager, which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
4. Buffer manager, which is responsible for fetching data from disk storage into main
memory, and deciding what data to cache in main memory. The buffer manager is a
critical part of the database system, since it enables the database to handle data sizes that
are much larger than the size of main memory.

The storage manager implements several data structures as part of the physical system
implementation:
1. Data files, which store the database itself.
2. Data dictionary, which stores metadata about the structure of the database, in particular
the schema of the database.
3. Indices, which can provide fast access to data items. Like the index in a textbook, a
database index provides pointers to those data items that hold a particular value. For
example, we could use an index to find the instructor record with a particular ID, or all
instructor records with a particular name. Hashing is an alternative to indexing that is
faster in some but not all cases.
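
The idea of an index as a set of pointers to records, and of hashing as an alternative, can be sketched in plain Python. This is a toy model, not how a real storage manager is built: the instructor records are made-up sample data, a sorted list stands in for an ordered index, and a dict stands in for a hash index.

```python
from bisect import bisect_left, bisect_right

# The "data file": records in arrival order (made-up sample data).
records = [(22222, "Einstein"), (10101, "Srinivasan"), (45565, "Katz"), (12121, "Wu")]

# Ordered index: (ID, position) pairs kept sorted by ID.
ordered_index = sorted((rec[0], pos) for pos, rec in enumerate(records))
keys = [key for key, _ in ordered_index]

def find_by_id(instructor_id):
    """Binary search in the ordered index, then follow the pointer."""
    i = bisect_left(keys, instructor_id)
    if i < len(keys) and keys[i] == instructor_id:
        return records[ordered_index[i][1]]
    return None

def ids_between(low, high):
    """Range lookups are easy on an ordered index."""
    return keys[bisect_left(keys, low):bisect_right(keys, high)]

# Hash index: ID -> position; fast for exact matches, no help for ranges.
hash_index = {rec[0]: pos for pos, rec in enumerate(records)}

def find_by_id_hashed(instructor_id):
    pos = hash_index.get(instructor_id)
    return records[pos] if pos is not None else None

print(find_by_id(45565), find_by_id_hashed(12121), ids_between(10000, 20000))
```

This also shows why hashing is faster in some but not all cases: an exact-match lookup is a single probe into the hash index, but a range query such as ids_between needs the keys in sorted order.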

The Query Processor:


The query processor components include:
1. DDL interpreter, which interprets DDL statements and records the definitions in the
data dictionary.
2. DML compiler, which translates DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans
that all give the same result. The DML compiler also performs query optimization; that is, it
picks the lowest cost evaluation plan from among the alternatives.
3. Query evaluation engine, which executes low-level instructions generated by the DML
compiler.
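
One way to watch a query processor choose between evaluation plans is SQLite's EXPLAIN QUERY PLAN statement, used here through Python's sqlite3 module. This is a sketch of one particular DBMS; the table and index names are made up, and the exact wording of the plan text differs between SQLite versions, so it is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " | ".join(row[-1] for row in rows)

query = "SELECT id, dept FROM instructor WHERE name = 'Katz'"

# With no index on name, the only available plan is to scan the whole table.
plan_without_index = plan(query)

# After an index is created, a cheaper plan exists and the optimizer picks it.
conn.execute("CREATE INDEX idx_instructor_name ON instructor (name)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both calls; only the plan chosen for it changes, which is exactly the work the DML compiler hides from the user.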

Database design
Database design mainly involves the design of the database schema. The design of a
complete database application environment that meets the needs of the enterprise being modeled
requires attention to a broader set of issues. In this text, we focus initially on the writing of
database queries and the design of database schemas.

Design Process
A high-level data model provides the database designer with a conceptual framework in
which to specify the data requirements of the database users, and how the database will be
structured to fulfill these requirements. The initial phase of database design, then, is to
characterize fully the data needs of the prospective database users. The database designer needs
to interact extensively with domain experts and users to carry out this task. The outcome of this
phase is a specification of user requirements.

ENTITY RELATIONSHIP (E-R) MODELLING


Entity Relationship Diagram – ER Diagram in DBMS
An Entity–relationship model (ER model) describes the structure of a database with the help of a
diagram, which is known as Entity Relationship Diagram (ER Diagram). An ER model is a design or
blueprint of a database that can later be implemented as a database. The main components of E-R model
are: entity set and relationship set.

What is an Entity Relationship Diagram (ER Diagram)


An ER diagram shows the relationships among entity sets. An entity set is a group of similar
entities, and these entities can have attributes. In terms of DBMS, an entity set typically maps to a table
and its attributes map to the columns of that table, so by showing the relationships among tables and
their attributes, an ER diagram shows the complete logical structure of a database. Let’s have a look at a
simple ER diagram to understand this concept.

A simple ER Diagram:
In the following diagram we have two entities Student and College and their relationship. The
relationship between Student and College is many to one as a college can have many students however a
student cannot study in multiple colleges at the same time. Student entity has attributes such as Stu_Id,
Stu_Name & Stu_Addr and College entity has attributes such as Col_ID & Col_Name.
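
This diagram can be turned into tables roughly as follows, using Python's sqlite3 module. The sketch assumes the usual mapping for a many to one relationship, a foreign key on the "many" side, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# One table per entity set; attributes become columns.
conn.execute("CREATE TABLE college (col_id INTEGER PRIMARY KEY, col_name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    "stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL, stu_addr TEXT, "
    # Many to one: each student row points to at most one college.
    "col_id INTEGER REFERENCES college (col_id))"
)

conn.execute("INSERT INTO college VALUES (1, 'City College')")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)",
                 [(101, "Ravi", "Salem", 1), (102, "Meena", "Erode", 1)])

# A student cannot refer to a college that does not exist.
try:
    conn.execute("INSERT INTO student VALUES (103, 'Kumar', 'Madurai', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

students_in_college = conn.execute(
    "SELECT COUNT(*) FROM student WHERE col_id = 1").fetchone()[0]
print(students_in_college, orphan_rejected)
```

Because col_id is a single column in student, a student can belong to only one college, while many student rows may share the same col_id.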

Here are the geometric shapes and their meaning in an E-R Diagram. We will discuss these terms
in detail in the next section (Components of an ER Diagram) of this guide, so don’t worry too much about
these terms now; just go through them once.
Rectangle: Represents Entity sets.
Ellipses: Attributes
Diamonds: Relationship Set
Lines: They link attributes to Entity Sets and Entity sets to Relationship Set
Double Ellipses: Multi valued Attributes
Dashed Ellipses: Derived Attributes
Double Rectangles: Weak Entity Sets
Double Lines: Total participation of an entity in a relationship set

Components of an ER Diagram

As shown in the above diagram, an ER diagram has three main components:


1. Entity
2. Attribute
3. Relationship
1. Entity
An entity is an object or component of data. An entity is represented as rectangle in an ER
diagram.

For example:
In the following ER diagram we have two entities, Student and College, and these two entities
have a many to one relationship, as many students study in a single college. We will read more about
relationships later; for now focus on entities.

Student (M) ---- Study ---- (1) College

Weak Entity:
An entity that cannot be uniquely identified by its own attributes and relies on its relationship
with another entity is called a weak entity. A weak entity is represented by a double rectangle. For example
– a bank account cannot be uniquely identified without knowing the bank to which the account belongs,
so bank account is a weak entity.

Bank account (weak entity) ---- Bank

2. Attribute
An attribute describes the property of an entity. An attribute is represented as Oval in an ER
diagram. There are four types of attributes:
a) Key attribute
b) Composite attribute
c) Multivalued attribute
d) Derived attribute

a) Key attribute:

A key attribute can uniquely identify an entity from an entity set. For example, student roll number can uniquely
identify a student from a set of students. Key attribute is represented by oval same as other attributes however the text
of key attribute is underlined.

b) Composite attribute:

An attribute that is a combination of other attributes is known as composite attribute.


For example, in the student entity, the student address is a composite attribute, as an address is
composed of other attributes such as pincode, state and country.
c) Multivalued attribute:
An attribute that can hold multiple values is known as a multivalued attribute. It is represented with
double ovals in an ER Diagram. For example – a person can have more than one phone number, so the
phone number attribute is multivalued.

d) Derived attribute:
A derived attribute is one whose value is dynamic and derived from another attribute. It is
represented by dashed oval in an ER Diagram. For example – Person age is a derived attribute as it
changes over time and can be derived from another attribute (Date of birth).
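
A derived attribute is normally computed when it is needed instead of being stored. A small Python sketch of the age example, where the Person class and the age_on method are made up for illustration:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Person:
    name: str
    date_of_birth: date   # stored attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: age in completed years on the given day."""
        years = today.year - self.date_of_birth.year
        # One year less if this year's birthday has not been reached yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

p = Person("Asha", date(2000, 6, 15))
age_before_birthday = p.age_on(date(2024, 6, 14))
age_on_birthday = p.age_on(date(2024, 6, 15))
print(age_before_birthday, age_on_birthday)
```

Only the date of birth is stored; the age changes from 23 to 24 on the birthday without any update to the stored data.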
E-R diagram with multivalued and derived attributes:

3. Relationship
A relationship is represented by diamond shape in ER diagram, it shows the relationship among entities.

There are four types of relationships:


1. One to One
2. One to Many
3. Many to One
4. Many to Many
1. One to One Relationship

When a single instance of an entity is associated with a single instance of another entity then it is
called a one to one relationship. For example, a person has only one passport and a passport is given to one
person.

Person (1) ---- has ---- (1) Passport

2. One to Many Relationship

When a single instance of an entity is associated with more than one instance of another
entity then it is called a one to many relationship. For example – a customer can place
many orders but an order cannot be placed by many customers.

Customer (1) ---- Placed ---- (M) Order

3. Many to One Relationship


When more than one instance of an entity is associated with a single
instance of another entity then it is called a many to one relationship. For
example – many students can study in a single college but a student cannot
study in many colleges at the same time.

Student (M) ---- Study ---- (1) College

4. Many to Many Relationship

When more than one instance of an entity is associated with more than one
instance of another entity then it is called a many to many relationship. For example,
a student can be assigned to many projects and a project can be assigned to many students.

Student (M) ---- Assigned ---- (M) Project
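
When such a diagram is converted to tables, a many to many relationship is commonly given a table of its own. The sketch below assumes that mapping and uses Python's sqlite3 module; the column names and sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT);
CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);

-- The relationship set becomes its own table: one row per (student, project) pair.
CREATE TABLE assigned (
    stu_id  INTEGER REFERENCES student (stu_id),
    proj_id INTEGER REFERENCES project (proj_id),
    PRIMARY KEY (stu_id, proj_id)
);

INSERT INTO student VALUES (1, 'Ravi'), (2, 'Meena');
INSERT INTO project VALUES (10, 'Library app'), (20, 'Weather station');
INSERT INTO assigned VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many projects.
projects_of_ravi = [row[0] for row in conn.execute(
    "SELECT p.title FROM project p JOIN assigned a ON a.proj_id = p.proj_id "
    "WHERE a.stu_id = 1 ORDER BY p.title")]

# One project, many students.
students_on_library_app = [row[0] for row in conn.execute(
    "SELECT s.stu_name FROM student s JOIN assigned a ON a.stu_id = s.stu_id "
    "WHERE a.proj_id = 10 ORDER BY s.stu_name")]

print(projects_of_ravi, students_on_library_app)
```

The composite primary key on assigned prevents the same student from being assigned to the same project twice.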

Total Participation of an Entity set


Total participation of an entity set means that each entity in the entity set
must take part in at least one relationship in a relationship set. For example: in the below
diagram each college must have at least one associated Student.

Extended Entity - Relationship (EE-R) Model
The EE-R model incorporates extensions to the original ER model. Enhanced ER diagrams are
high level models that represent the requirements and complexities of a complex
database.

In addition to ER model concepts EE-R includes:


a. Subclasses and Super classes.
b. Specialization and Generalization.
c. Category or union type.
d. Aggregation.

These concepts are used to create E-R diagrams.

Subclasses and Super classes:
A super class is an entity that can be divided into further subtypes. For example, consider a Shape super
class.

The super class Shape has sub groups: Triangle, Square and Circle.

Sub classes are groups of entities with some unique attributes. A sub class
inherits the properties and attributes of its super class.
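
Class inheritance in a programming language is a close analogy. In the Python sketch below the colour attribute and the area methods are assumptions added for illustration, and only two of the three sub groups are shown.

```python
import math
from dataclasses import dataclass

@dataclass
class Shape:                 # super class: attributes common to every shape
    colour: str

@dataclass
class Circle(Shape):         # sub class: inherits colour, adds radius
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2

@dataclass
class Square(Shape):         # sub class: inherits colour, adds side
    side: float

    def area(self) -> float:
        return self.side ** 2

sq = Square(colour="red", side=3.0)
print(sq.colour, sq.area(), isinstance(sq, Shape))
```

Square never declares colour itself; it gets that attribute from Shape, just as a sub class in an EE-R diagram inherits the attributes of its super class.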

Specialization and Generalization:


Generalization is the process of combining several entities that share common
attributes or properties into a single generalized entity.
It is a bottom up process, i.e. consider that we have 3 sub entities Car, Truck
and Motorcycle. These three entities can be generalized into one super class
named Vehicle.

Specialization is the process of identifying subsets of an entity that
share some distinguishing characteristic. It is a top down approach in which one entity is
broken down into lower level entities.
In the above example, the Vehicle entity can be a Car, Truck
or Motorcycle.

Category or Union:
Relationship of one super or sub class with more than one super class.

Owner is the subset of two super classes:

Aggregation:
Represents the relationship between a whole object and its components.

Consider a ternary relationship Works_On between Employee, Branch and
Manager. The best way to model this situation is to use aggregation, so the
relationship-set Works_On is treated as a higher level entity-set. Such an entity-set is treated
in the same manner as any other entity-set. We can create a binary relationship,
Manager, between Works_On and Manager to represent who manages what tasks.
Unit :3

Normalization of Database
Database Normalization is a technique of organizing the data in the
database. Normalization is a systematic approach of decomposing tables to eliminate
data redundancy (repetition) and undesirable characteristics like Insertion, Update
and Deletion Anomalies. It is a multi-step process that puts data into tabular form,
removing duplicated data from the relation tables.

Normalization is used for mainly two purposes:
1. Eliminating redundant (useless) data.
2. Ensuring data dependencies make sense, i.e. data is logically stored.

Problems without Normalization
If a table is not properly normalized and has data redundancy then it will not
only eat up extra memory space but will also make it difficult to handle and update
the database without facing data loss. Insertion, Updation and Deletion Anomalies
are very frequent if the database is not normalized. To understand these anomalies let us
take an example of a Student table.

Rollno   Name   Branch   Hod     office_tel
401      Akon   CSE      Mr. X   53337
402      Bkon   CSE      Mr. X   53337
403      Ckon   CSE      Mr. X   53337
404      Dkon   CSE      Mr. X   53337

In the table above, we have data of 4 Computer Sci. students. As we can
see, data for the fields branch, hod (Head of Department) and office_tel is repeated
for the students who are in the same branch in the college; this is Data Redundancy.
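
The usual remedy is to split the table so that the branch details are stored only once. A minimal Python sketch using the four rows above; the variable names are made up, and the split into a student part and a branch part is one reasonable decomposition, not the only one.

```python
# The unnormalized Student table from the text, one tuple per row:
# (rollno, name, branch, hod, office_tel)
rows = [
    (401, "Akon", "CSE", "Mr. X", "53337"),
    (402, "Bkon", "CSE", "Mr. X", "53337"),
    (403, "Ckon", "CSE", "Mr. X", "53337"),
    (404, "Dkon", "CSE", "Mr. X", "53337"),
]

# Decompose: student details in one table, branch details in another.
student = [(rollno, name, branch) for rollno, name, branch, _, _ in rows]
branch_table = {branch: (hod, tel) for _, _, branch, hod, tel in rows}

# The HOD and office telephone are now stored once instead of four times.
print(len(rows), "copies before,", len(branch_table), "copy after")

# Nothing is lost: joining the two tables gives back the original rows.
rejoined = [(r, n, b) + branch_table[b] for r, n, b in student]
print(rejoined == rows)

# A change of HOD is now a single update.
branch_table["CSE"] = ("Mr. Y", "53337")
updated = [(r, n, b) + branch_table[b] for r, n, b in student]
```

After the split, changing the HOD touches one entry in branch_table, and every student row picks up the new value when the tables are joined.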

Insertion Anomaly
Suppose for a new admission, until and unless a student opts for a branch, data of the student
cannot be inserted, or else we will have to set the branch information

First Normal Form (1NF)


For a table to be in the First Normal Form, it should follow the
following 4 rules:
1. It should only have single (atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain.
3. All the columns in a table should have unique names.
4. The order in which data is stored does not matter.
In the next tutorial, we will discuss the First Normal Form in detail.

Second Normal Form (2NF)


For a table to be in the Second Normal Form:
1. It should be in the First Normal Form.
2. It should not have Partial Dependency.
To understand what Partial Dependency is and how to normalize a table to the 2nd
normal form, jump to the Second Normal Form tutorial.

Third Normal Form (3NF)


A table is said to be in the Third Normal Form when:
1. It is in the Second Normal Form.
2. It doesn't have Transitive Dependency.
Here is the Third Normal Form tutorial. But we suggest you first study the second
normal form and then head over to the third normal form.

Boyce and Codd Normal Form (BCNF)


Boyce and Codd Normal Form is a higher version of the Third Normal Form.
This form deals with a certain type of anomaly that is not handled by 3NF. A 3NF
table which does not have multiple overlapping candidate keys is said to be in
BCNF. For a table R to be in BCNF, the following conditions must be satisfied:
1. R must be in the 3rd Normal Form.
2. For each functional dependency ( X → Y ), X should be a super key.
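
The second condition can be checked mechanically by computing attribute closures. The Python sketch below is a simplified check that looks only at the functional dependencies it is given. The dependencies used for the Student table (rollno → name, branch and branch → hod, office_tel) are assumptions read off the sample data, and the function names are made up.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, relation, fds):
    return closure(attrs, fds) >= relation

def is_bcnf(relation, fds):
    """True if every non-trivial X -> Y in fds has a super key X."""
    return all(rhs <= lhs or is_superkey(lhs, relation, fds) for lhs, rhs in fds)

# The single Student table with its assumed dependencies.
student_all = {"rollno", "name", "branch", "hod", "office_tel"}
fds_all = [({"rollno"}, {"name", "branch"}),
           ({"branch"}, {"hod", "office_tel"})]
print(is_bcnf(student_all, fds_all))   # branch is not a super key here

# After splitting into student(rollno, name, branch) and
# branch(branch, hod, office_tel), each dependency has a super key on the left.
print(is_bcnf({"rollno", "name", "branch"}, [({"rollno"}, {"name", "branch"})]))
print(is_bcnf({"branch", "hod", "office_tel"}, [({"branch"}, {"hod", "office_tel"})]))
```

The single table fails the check because branch determines hod and office_tel without determining the whole row, while both smaller tables pass.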

Fourth Normal Form (4NF)


A table is said to be in the Fourth Normal Form when:
1. It is in the Boyce-Codd Normal Form.
2. It doesn't have Multi-Valued Dependency.
