Notes Unit-1DBMS

Uploaded by

bhartivandana198
Copyright
© © All Rights Reserved

Noida International University

Department of Computer Science


Course-B.Tech Sem-V
Subject-Database Management System
Subject Code-

UNIT-I

Database Management System

DBMS stands for Database Management System. The name breaks down as DBMS = Database +
Management System: a database is a collection of data, and a management system is a set of
programs to store and retrieve that data. A DBMS can therefore be defined as a collection of
inter-related data together with a set of programs to store and access that data in an easy and
effective manner.

What is the need of DBMS?


Database systems are developed mainly to handle large amounts of data. When dealing with huge
amounts of data, two things require optimization: storage of data and retrieval of data.

Storage: According to the principles of database systems, data is stored in such a way that it
takes up far less space, because redundant (duplicate) data is removed before storage.
Let’s take a simple example to understand this:
In a banking system, suppose a customer has two accounts: a saving account and a salary
account. Say the bank stores saving account data in one place (these places are called tables;
we will learn about them later) and salary account data in another. If customer information
such as name and address is stored in both places, storage is wasted (redundancy, or
duplication of data). To organize the data better, the information should be stored in one place
and both accounts should be linked to it. This is what a DBMS achieves.
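
The banking example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample values are assumptions made for the sketch and do not come from any real banking system.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    CREATE TABLE account (
        account_id   INTEGER PRIMARY KEY,
        account_type TEXT NOT NULL,          -- 'saving' or 'salary'
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")

# The customer's details are stored exactly once...
conn.execute("INSERT INTO customer VALUES (1, 'Asha', '12 MG Road')")
# ...and both accounts simply point to that one row.
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(101, "saving", 1), (102, "salary", 1)],
)

# Joining the tables brings the shared details back for each account.
rows = conn.execute("""
    SELECT a.account_type, c.name, c.address
    FROM account AS a JOIN customer AS c USING (customer_id)
    ORDER BY a.account_id
""").fetchall()
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(rows, customer_count)
```

If the address changes, only the single customer row needs to be updated, and both accounts see the new value.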

Fast Retrieval of data: Along with storing the data in an optimized and systematic manner, it is
also important that we retrieve the data quickly when needed. Database systems ensure that the
data is retrieved as quickly as possible.

Purpose of Database Systems


The main purpose of database systems is to manage data. Consider a university that keeps
data about students, teachers, courses, books, etc. To manage this data we need to store it
somewhere we can add new data, delete unused data, update outdated data and retrieve data.
To perform these operations we need a database management system that stores the data in
such a way that all of them can be carried out efficiently.

Database System vs File System

The following are the differences between a DBMS and a file system:

DBMS | File System
-----|------------
A DBMS is a collection of data. In a DBMS, the user is not required to write the procedures. | A file system is a collection of data. In this system, the user has to write the procedures for managing the database.
A DBMS gives an abstract view of data that hides the details. | A file system exposes the details of the data representation and storage of data.
A DBMS provides a crash recovery mechanism, i.e., it protects the user from system failure. | A file system doesn't have a crash recovery mechanism, i.e., if the system crashes while some data is being entered, the content of the file may be lost.
A DBMS provides a good protection mechanism. | It is very difficult to protect a file under the file system.
A DBMS contains a wide variety of sophisticated techniques to store and retrieve the data. | A file system can't store and retrieve the data efficiently.
A DBMS takes care of concurrent access to data using some form of locking. | In a file system, concurrent access causes many problems, such as one user reading a file while another is deleting or updating some information in it.

DBMS Architecture
o The design of a DBMS depends upon its architecture. The basic client/server architecture is
used to deal with a large number of PCs, web servers, database servers and other
components that are connected over a network.
o The client/server architecture consists of many PCs and a workstation, which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get their
requests served.

3-Tier Architecture
o The 3-tier architecture contains another layer between the client and the server. In this
architecture, the client can't communicate directly with the server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server.
The database also has no idea about any user beyond the application.
o The 3-tier architecture is used for large web applications.
Fig: 3-tier Architecture
Data Models

Data models define how the logical structure of a database is modeled. Data models are
fundamental entities for introducing abstraction in a DBMS. They define how data items are
connected to each other and how they are processed and stored inside the system.
The earliest data models were flat: all the data was kept in the same plane. These early
models were not very systematic, so they were prone to a lot of duplication and to
update anomalies.

Some of the Data Models in DBMS are:

1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model

Hierarchical Model
The Hierarchical Model was the first DBMS model. This model organizes the data in a
hierarchical tree structure. The hierarchy starts from the root, which holds the root data, and
then expands in the form of a tree by adding child nodes to parent nodes. This model easily
represents some real-world relationships, such as food recipes or the sitemap of a website.
Example: The relationship between the shoes on a shopping website can be represented as a
tree rooted at 'shoes', with child nodes 'women's shoes' and 'men's shoes', where 'sneakers'
is a child of 'men's shoes'.
Features of a Hierarchical Model

1. One-to-many relationship: The data is organized in a tree-like structure with
one-to-many relationships between the data types. There is only one path from the
root to any node. Example: In the above example, the only path to the sneakers node
is through the men's shoes node.
2. Parent-Child Relationship: Each child node has exactly one parent node, but a parent
node can have more than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted, its child nodes are automatically
deleted with it.
4. Pointers: Pointers are used to link the parent node with the child node and to
navigate between the stored data. Example: In the above example the 'shoes' node
points to two other nodes, the 'women's shoes' node and the 'men's shoes' node.
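
As a minimal sketch of these features, the tree below is built in plain Python rather than in a real hierarchical DBMS. The Node class and the names_under helper are hypothetical names introduced only for this illustration.

```python
class Node:
    """A record in a hierarchical model: at most one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # pointer to the single parent (None for the root)
        self.children = []            # pointers to the child nodes
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the single path from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))

    def delete(self):
        """Detach this node from its parent; its whole subtree goes with it."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None


def names_under(node):
    """Collect the names of a node and all of its descendants."""
    found = [node.name]
    for child in node.children:
        found.extend(names_under(child))
    return found


shoes = Node("shoes")
women = Node("women's shoes", shoes)
men = Node("men's shoes", shoes)
sneakers = Node("sneakers", men)

sneakers_path = sneakers.path()   # only one path exists: via "men's shoes"
men.delete()                      # deleting the parent also removes "sneakers"
remaining = names_under(shoes)
print(sneakers_path, remaining)
```

Because every node stores a single parent pointer, there is no way to give 'sneakers' a second parent, which is exactly the limitation listed among the disadvantages of this model.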
Advantages of the Hierarchical Model

o It is very simple and fast to traverse a tree-like structure.
o Any change in the parent node is automatically reflected in the child node, so the
integrity of the data is maintained.
Disadvantages of the Hierarchical Model

o Complex relationships are not supported.
o A child node cannot have more than one parent, so a relationship in which a child
node needs two parent nodes cannot be represented using this model.
o If a parent node is deleted, its child nodes are automatically deleted as well.

Network Model
This model is an extension of the hierarchical model. It was the most popular model before the
relational model. It is the same as the hierarchical model except that a record can have more
than one parent, so the hierarchical tree is replaced with a graph. Example: In the example
below, the node Student has two parents, CSE Department and Library. This was not possible
in the hierarchical model.
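
A minimal sketch of this idea in plain Python follows. The Student, CSE Department and Library records come from the example above; the root record "College" and the helper names link and paths_to are assumptions added only so the sketch has a starting point.

```python
from collections import defaultdict

# owner -> members and member -> owners; a member may have several owners.
children = defaultdict(list)
parents = defaultdict(list)


def link(owner, member):
    """Record a parent-child link between two records."""
    children[owner].append(member)
    parents[member].append(owner)


link("College", "CSE Department")    # "College" is a hypothetical root record
link("College", "Library")
link("CSE Department", "Student")
link("Library", "Student")           # second parent: not allowed in a tree


def paths_to(record):
    """Enumerate every path from a root (a record with no owner) to `record`."""
    if not parents[record]:
        return [[record]]
    return [path + [record] for owner in parents[record] for path in paths_to(owner)]


student_paths = paths_to("Student")
print(student_paths)
```

The Student record is reachable through either parent, which is the "many paths" feature described for this model.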

Features of a Network Model

1. Ability to Merge more Relationships: Because this model allows more relationships, the
data is more closely related. The model can manage one-to-one relationships as
well as many-to-many relationships.
2. Many paths: Because there are more relationships, there can be more than one path to the
same record. This makes data access fast and simple.
3. Circular Linked List: Operations on the network model are carried out with the help of
a circular linked list. The current position is maintained by a program, and this
position moves through the records according to the relationships.
Advantages of Network Model

o Data can be accessed faster than in the hierarchical model. This is because
the data is more closely related in the network model and there can be more than one
path to a particular node, so the data can be accessed in many ways.
o Because there is a parent-child relationship, data integrity is present. Any change in a
parent record is reflected in the child record.
Disadvantages of Network Model

o As more and more relationships need to be handled, the system may become complex.
A user must therefore have detailed knowledge of the model to work with it.
o Any change, such as an update, deletion or insertion, is very complex.

Entity-Relationship Model

Entity-Relationship (ER) Model is based on the notion of real-world entities and the
relationships among them. When a real-world scenario is formulated as a database model, the
ER Model creates entity sets, relationship sets, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on:
o Entities and their attributes.
o Relationships among entities.
These concepts are explained below.

o Entity: An entity in an ER Model is a real-world entity having properties
called attributes. Every attribute is defined by its set of values, called its domain. For
example, in a school database, a student is considered an entity. A student has various
attributes like name, age, class, etc.
o Relationship: The logical association among entities is called a relationship.
Relationships are mapped with entities in various ways. Mapping cardinalities define the
number of associations between two entities.
Mapping cardinalities:

   o one to one
   o one to many
   o many to one
   o many to many

Relational Model

The most popular data model in DBMS is the Relational Model. It is a more scientific model
than the others. This model is based on first-order predicate logic and defines a table as an
n-ary relation.

The main highlights of this model are:

o Data is stored in tables called relations.
o Relations can be normalized.
o In normalized relations, the values saved are atomic values.
o Each row (tuple) in a relation is unique.
o Each column in a relation contains values from the same domain.

DBMS Schema

Definition of schema: The design of a database is called its schema. There are three types of
schema: physical schema, logical schema and view schema.

For example: In the following diagram, we have a schema that shows the relationship between
three tables: Course, Student and Section. The diagram shows only the design of the database;
it doesn’t show the data present in those tables. A schema is only a structural view (design)
of a database, as shown in the diagram below.

The design of a database at the physical level is called the physical schema. This level
describes how the data is stored in blocks of storage.

The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described in terms of the types of
data records stored in data structures, while internal details such as the implementation of
those data structures are hidden (they are available at the physical level).

The design of a database at the view level is called the view schema. It generally describes
end-user interaction with the database system.

DBMS Instance

Definition of instance: The data stored in a database at a particular moment in time is called
an instance of the database. The database schema defines the variable declarations in the
tables that belong to a particular database; the values of these variables at a given moment
are called the instance of that database.

For example, let's say we have a single table, student, in the database. Today the table has
100 records, so today the instance of the database has 100 records. If we add another 100
records to this table by tomorrow, the instance of the database tomorrow will have 200
records in the table. In short, the data stored in the database at a particular moment is
called the instance, and it changes over time as we add or delete data.
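
The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The helper names schema and instance_size are hypothetical, and the student rows are made-up sample data matching the 100 and 200 record counts in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def schema():
    """The stored table definition: it changes only when the design changes."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]


def instance_size():
    """Number of rows stored right now: one property of the current instance."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


schema_before = schema()

# "Today": the table holds 100 records.
conn.executemany("INSERT INTO student (name) VALUES (?)",
                 [(f"student {i}",) for i in range(100)])
size_today = instance_size()

# "Tomorrow": another 100 records are added.
conn.executemany("INSERT INTO student (name) VALUES (?)",
                 [(f"student {i}",) for i in range(100, 200)])
size_tomorrow = instance_size()

schema_after = schema()
print(size_today, size_tomorrow, schema_before == schema_after)
```

The instance grew from 100 to 200 rows while the schema text stayed exactly the same.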

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one
level of the database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence

o Logical data independence refers to the characteristic of being able to change the
conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we make any changes in the conceptual view of the data, the user view of the data
is not affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence

o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence
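
Both kinds of data independence can be approximated in a small sqlite3 sketch. It assumes that a view stands in for the external schema, the base table for the conceptual schema, and an index for a change at the internal level; the table, view and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO student VALUES (1, 'Ravi', 'Noida');
    -- External (user) view: exposes only part of the conceptual schema.
    CREATE VIEW student_names AS SELECT id, name FROM student;
""")

before = conn.execute("SELECT * FROM student_names").fetchall()

# Change at the conceptual level: add a column to the base table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
# Change at the internal level: add an index, which affects storage and access only.
conn.execute("CREATE INDEX idx_student_city ON student(city)")

after = conn.execute("SELECT * FROM student_names").fetchall()
print(before, after)
```

The query against the view returns the same result before and after both changes, so a program written against the view does not need to be modified.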

Database Language
o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.
Types of Database Language

1. Data Definition Language


o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of
tables and schemas, their names, indexes, columns in each table, constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to add comments to the data dictionary.

These commands are used to update the database schema, which is why they come under Data
Definition Language.
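
A few of these DDL tasks can be tried out with Python's built-in sqlite3 module, as sketched below. The course table and the helper names tables and columns are assumptions for the illustration. SQLite has no TRUNCATE or COMMENT statement, so those two tasks are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def tables():
    """Names of the tables currently defined in the schema."""
    return [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


def columns(table):
    """Column names of a table, read from SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Create: define a new object.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
# Alter: change the structure of an existing object.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
# Rename: give the object a new name.
conn.execute("ALTER TABLE course RENAME TO subject")
after_rename = tables()
subject_columns = columns("subject")

# Drop: delete the object from the database.
conn.execute("DROP TABLE subject")
after_drop = tables()
print(after_rename, subject_columns, after_drop)
```

Every statement here changes the schema only; no rows of data are inserted or read.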
2. Data Manipulation Language

DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table (all of them if no condition is given).
o Merge: It performs the UPSERT operation, i.e., insert or update operations.
o Call: It is used to call a PL/SQL or Java subprogram.
o Explain Plan: It describes the access path that will be used to execute a statement.
o Lock Table: It controls concurrency.
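
The four most common DML tasks can be sketched with Python's built-in sqlite3 module. The student table and its rows are made-up sample data; Merge, Call, Explain Plan and Lock Table are not shown because their syntax differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Insert: add new rows.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ravi", 20), (2, "Meena", 21), (3, "Arjun", 22)],
)
# Update: change existing data in the rows that match the condition.
conn.execute("UPDATE student SET age = 23 WHERE id = 3")
# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM student WHERE id = 2")
# Select: retrieve data.
remaining = conn.execute("SELECT id, name, age FROM student ORDER BY id").fetchall()

# A DELETE without a WHERE clause removes every row but keeps the table itself.
conn.execute("DELETE FROM student")
rows_left = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(remaining, rows_left)
```

Unlike the DDL statements, none of these statements changes the structure of the table.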

3. Data Control Language


o DCL stands for Data Control Language. It is used to control access to the data stored in
the database by granting and withdrawing user privileges.
o In some database systems, DCL statements are transactional and can be rolled back.

(In Oracle Database, however, the execution of data control language cannot be rolled
back.)

Here are some tasks that come under DCL:

o Grant: It is used to give a user access privileges to a database.
o Revoke: It is used to take back permissions from the user.

The privileges that can be granted or revoked include:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language

TCL is used to manage the changes made by DML statements. DML statements can be grouped
into a logical transaction.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction to the database.
o Rollback: It is used to restore the database to its state as of the last Commit.
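
Commit and Rollback can be sketched with Python's built-in sqlite3 module, which exposes them as the commit() and rollback() methods of a connection. The account table, its balance and the balance helper are assumptions for this illustration, and the sketch relies on the module's default behaviour of opening a transaction implicitly before a DML statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.commit()                      # Commit: the opening balance is now saved


def balance():
    """Current balance of account 1 as seen by this connection."""
    return conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]


# A change that is rolled back disappears.
conn.execute("UPDATE account SET balance = balance - 200 WHERE id = 1")
uncommitted = balance()
conn.rollback()                    # Rollback: back to the state of the last commit
after_rollback = balance()

# A change that is committed survives a later rollback.
conn.execute("UPDATE account SET balance = balance - 200 WHERE id = 1")
conn.commit()
conn.rollback()                    # nothing left to undo
after_commit = balance()
print(uncommitted, after_rollback, after_commit)
```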
ER model
o ER model stands for Entity-Relationship model. It is a high-level data model. This
model is used to define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and
easy-to-design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an
entity-relationship diagram.

For example, suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity with
attributes like city, street name, pin code, etc., and there will be a relationship between them.

Components of an ER Diagram
1. Entity:

An entity may be any object, class, person or place. In an ER diagram, an entity is
represented as a rectangle.

Consider an organization as an example: manager, product, employee, department, etc. can
each be taken as an entity.

a. Weak Entity
An entity that depends on another entity is called a weak entity. A weak entity doesn't
contain any key attribute of its own. It is represented by a double rectangle.

2. Attribute

An attribute is used to describe a property of an entity. An ellipse is used to represent an
attribute.

For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristic of an entity. It represents a
primary key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute

An attribute that is composed of other attributes is known as a composite attribute. The
composite attribute is represented by an ellipse, and the ellipses of its component attributes
are connected to that ellipse.

c. Multivalued Attribute

An attribute can have more than one value. Such an attribute is known as a multivalued
attribute. A double oval is used to represent a multivalued attribute.

For example, a student can have more than one phone number.
d. Derived Attribute

An attribute that can be derived from another attribute is known as a derived attribute. It is
represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from another attribute such
as date of birth.
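
A derived attribute can be sketched in Python as a value that is computed on demand instead of being stored. The Person class, its field names and the sample date are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    date_of_birth: date            # stored attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


person = Person("Ravi", date(2000, 6, 15))
age_day_before_birthday = person.age(date(2024, 6, 14))
age_on_birthday = person.age(date(2024, 6, 15))
print(age_day_before_birthday, age_on_birthday)
```

Because the age is recomputed from the date of birth each time, it can never become out of date.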

3. Relationship

A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.
Types of relationship are as follows:

a. One-to-One Relationship

When only one instance of each entity is associated with the relationship, it is known as a
one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity on
the right are associated with the relationship, it is known as a one-to-many relationship.

For example, a scientist can make many inventions, but each invention is made by one specific
scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity on
the right are associated with the relationship, it is known as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many students.
d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the entity
on the right are associated with the relationship, it is known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have many employees.

Notation of ER diagram

A database can be represented using notations. In an ER diagram, many notations are used to
express cardinality. These notations are as follows:
Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to which
another entity can be related via a relationship set.
o It is most useful in describing binary relationship sets, although it can also help
describe relationship sets that involve more than two entity sets.
o For a binary relationship set R between entity sets E1 and E2, there are four possible
mapping cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)

One-to-one

In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity
in E2 is associated with at most one entity in E1.
One-to-many

In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an
entity in E2 is associated with at most one entity in E1.

Many-to-one

In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an
entity in E2 is associated with any number of entities in E1.

Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and
an entity in E2 is associated with any number of entities in E1.
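
The four definitions above can be turned into a small Python check. This is only a sketch: the function name mapping_cardinality and the sample pairs are hypothetical, and the function classifies the pairs it is given, whereas a real mapping constraint is a rule that every possible instance must obey.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship set given as (e1, e2) pairs."""
    pairs = set(pairs)
    left = Counter(e1 for e1, _ in pairs)    # how many E2 entities each E1 entity has
    right = Counter(e2 for _, e2 in pairs)   # how many E1 entities each E2 entity has
    e1_to_many = any(count > 1 for count in left.values())
    e2_to_many = any(count > 1 for count in right.values())
    if e1_to_many and e2_to_many:
        return "many-to-many"
    if e1_to_many:
        return "one-to-many"
    if e2_to_many:
        return "many-to-one"
    return "one-to-one"


one_to_one = mapping_cardinality([("a1", "b1"), ("a2", "b2")])
one_to_many = mapping_cardinality([("dept", "emp1"), ("dept", "emp2")])
many_to_one = mapping_cardinality([("stu1", "course"), ("stu2", "course")])
many_to_many = mapping_cardinality([("emp1", "p1"), ("emp1", "p2"), ("emp2", "p1")])
print(one_to_one, one_to_many, many_to_one, many_to_many)
```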

Keys
o Keys play an important role in a relational database.
o A key is used to uniquely identify any record or row of data in a table. It is also used to
establish and identify relationships between tables.

For example: In the Student table, ID is used as a key because it is unique for each student.
In the PERSON table, passport_number, license_number and SSN are keys since they are unique
for each person.
Types of key:

1. Primary key
o It is the first key, used to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The key
that is most suitable from that list becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In
the EMPLOYEE table, we could even select License_Number or Passport_Number as the
primary key since they are also unique.
o For each entity, the selection of the primary key is based on the requirements and the
developers.
2. Candidate key
o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o The primary key is chosen from among the candidate keys. The candidate keys that are not
chosen are as strong as the primary key.

For example: In the EMPLOYEE table, ID is best suited for the primary key. The other
attributes that are unique for each employee, such as SSN, Passport_Number and
License_Number, are also candidate keys.
3. Super Key

A super key is a set of attributes that can uniquely identify a tuple. A super key is a
superset of a candidate key.

For example: In the above EMPLOYEE table, consider (EMPLOYEE_ID, EMPLOYEE_NAME): the names
of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this
combination can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
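
The relation between super keys and candidate keys can be sketched in Python. The sample rows and the function names is_super_key and candidate_keys are hypothetical, and the check only looks at the rows it is given; in a real database, keys are declared constraints that hold for all possible data.

```python
from itertools import combinations

# Made-up sample rows: two employees share a name, but IDs and passports are unique.
employees = [
    {"employee_id": 1, "employee_name": "Asha", "passport_number": "P100"},
    {"employee_id": 2, "employee_name": "Asha", "passport_number": "P200"},
    {"employee_id": 3, "employee_name": "Ravi", "passport_number": "P300"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: super keys with no smaller key contained in them."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if is_super_key(rows, combo) and not contains_smaller_key:
                keys.append(combo)
    return keys


name_alone = is_super_key(employees, ("employee_name",))
id_and_name = is_super_key(employees, ("employee_id", "employee_name"))
minimal_keys = candidate_keys(employees)
print(name_alone, id_and_name, minimal_keys)
```

The pair (employee_id, employee_name) is a super key but not a candidate key, because employee_id alone already identifies each row.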

4. Foreign key
o A foreign key is a column of a table that points to the primary key of
another table.
o In a company, every employee works in a specific department, and employee and
department are two different entities. So we should not store the information about the
department in the employee table. Instead, we link the two tables through the primary
key of one of them.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute
in the EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and the two tables are
related.
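
The EMPLOYEE and DEPARTMENT example can be sketched with Python's built-in sqlite3 module. The column names other than Department_Id and the sample rows are assumptions. Note that SQLite enforces foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(department_id)
    );
    INSERT INTO department VALUES (10, 'Accounts');
    INSERT INTO employee VALUES (1, 'Asha', 10);
""")

# An employee row must point at a department that exists.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")   # no department 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key is what lets the two tables be joined.
employee_departments = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e JOIN department AS d ON d.department_id = e.department_id
""").fetchall()
print(rejected, employee_departments)
```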

Generalization
o Generalization is a bottom-up approach in which two or more lower-level entities
combine to form a higher-level entity if they have some attributes in common.
o In generalization, a higher-level entity can also combine with lower-level entities
to form an entity of a still higher level.
o Generalization resembles a subclass and superclass system; the only difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e., subclasses
are combined to make a superclass.

For example, the Faculty and Student entities can be generalized to create a higher-level
entity, Person.
Aggregation

In aggregation, the relation between two entities is treated as a single entity. A
relationship together with its corresponding entities is aggregated into a higher-level entity.

For example: The Center entity offering the Course entity acts as a single entity in a
relationship with another entity, Visitor. In the real world, a visitor to a coaching
center will never enquire about the course alone or about the center alone; he or she will
enquire about both.
Reduction of ER diagram to Table

A database can be represented using ER notation, and that notation can be reduced to a
collection of tables.

In the database, every entity set or relationship set can be represented in tabular form.

The ER diagram is given below:


o Entity type becomes a table.

In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual
tables.

o Every single-valued attribute becomes a column of the table.

In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE table, and so
on.

o The key attribute of an entity type is represented by the primary key.

In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID and LECTURE_ID are
the key attributes of their entities.

o A multivalued attribute is represented by a separate table.

In the STUDENT table, hobby is a multivalued attribute, so it is not possible to represent its
multiple values in a single column of the STUDENT table. Hence we create a table STUD_HOBBY
with the columns STUDENT_ID and HOBBY. Using both columns, we create a composite key.
o A composite attribute is represented by its components.

In the given ER diagram, student address is a composite attribute. It contains CITY, PIN,
DOOR#, STREET and STATE. In the STUDENT table, these components can be added as
individual columns.

o Derived attributes are not stored in the table.

In the STUDENT table, Age is a derived attribute. It can be calculated at any point in time
as the difference between the current date and the Date of Birth.

Using these rules, you can convert the ER diagram to tables and columns and assign the mapping
between the tables. Table structure for the given ER diagram is as below:
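
As a partial sketch only, the rules above can be applied to the STUDENT entity with Python's built-in sqlite3 module. It does not reproduce the full table structure of the ER diagram. The column types, the dob column name and the sample values are assumptions, and DOOR# is written as door because # is not convenient in a column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,   -- key attribute becomes the primary key
        student_name TEXT NOT NULL,         -- single-valued attribute becomes a column
        dob          TEXT NOT NULL,         -- stored; Age is derived, so no age column
        door TEXT, street TEXT, city TEXT, state TEXT, pin TEXT   -- composite address
    );
    CREATE TABLE stud_hobby (               -- multivalued attribute gets its own table
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        hobby      TEXT NOT NULL,
        PRIMARY KEY (student_id, hobby)     -- composite key over both columns
    );
""")

conn.execute(
    "INSERT INTO student VALUES (1, 'Meena', '2003-04-09', '7', 'Park St', 'Noida', 'UP', '201301')"
)
conn.executemany("INSERT INTO stud_hobby VALUES (?, ?)", [(1, "chess"), (1, "music")])

hobbies = [hobby for (hobby,) in conn.execute(
    "SELECT hobby FROM stud_hobby WHERE student_id = 1 ORDER BY hobby")]

# The composite key rejects the same hobby being recorded twice for one student.
try:
    conn.execute("INSERT INTO stud_hobby VALUES (1, 'chess')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(hobbies, duplicate_rejected)
```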

Relationship of higher degree

The degree of relationship can be defined as the number of occurrences in one entity that are
associated with the number of occurrences in another entity.

There are three degrees of relationship:

1. One-to-one (1:1)
2. One-to-many (1:M)
3. Many-to-many (M:N)

1. One-to-one
o In a one-to-one relationship, one occurrence of an entity relates to only one occurrence in
another entity.
o A one-to-one relationship rarely exists in practice.
o For example: if an employee is allocated a company car then that car can only be driven
by that employee.
o Therefore, employee and company car have a one-to-one relationship.

2. One-to-many
o In a one-to-many relationship, one occurrence in an entity relates to many occurrences in
another entity.
o For example: An employee works in one department, but a department has many
employees.
o Therefore, department and employee have a one-to-many relationship.

3. Many-to-many
o In a many-to-many relationship, many occurrences in an entity relate to many
occurrences in another entity.
o Like a one-to-one relationship, a many-to-many relationship is rarely implemented
directly in practice; it is usually resolved into two one-to-many relationships through a
linking table.
o For example: At the same time, an employee can work on several projects, and a project
has a team of many employees.
o Therefore, employee and project have a many-to-many relationship.
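
The employee and project example can be sketched with Python's built-in sqlite3 module by using a linking table. The table name works_on, the column names and the sample rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE project  (project_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Linking table: one row per (employee, project) pair.
    CREATE TABLE works_on (
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        project_id  INTEGER NOT NULL REFERENCES project(project_id),
        PRIMARY KEY (employee_id, project_id)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee works on several projects...
projects_of_asha = [title for (title,) in conn.execute("""
    SELECT p.title
    FROM works_on AS w JOIN project AS p USING (project_id)
    WHERE w.employee_id = 1
    ORDER BY p.title
""")]

# ...and one project has a team of several employees.
team_of_payroll = [name for (name,) in conn.execute("""
    SELECT e.name
    FROM works_on AS w JOIN employee AS e USING (employee_id)
    WHERE w.project_id = 10
    ORDER BY e.name
""")]
print(projects_of_asha, team_of_payroll)
```

Each side of the many-to-many relationship is a one-to-many relationship with the linking table.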
