Notes Unit-1DBMS
Notes Unit-1DBMS
UNIT-I
DBMS stands for Database Management System. We can break it like this DBMS = Database +
Management System. Database is a collection of data and Management System is a set of
programs to store and retrieve those data. Based on this we can define DBMS like this: DBMS is
a collection of inter-related data and set of programs to store & access those data in an easy and
effective manner.
Storage: According to the principles of database systems, the data is stored in such a way that it
acquires lot less space as the redundant data (duplicate data) has been removed before storage.
Let’s take a layman example to understand this:
In a banking system, suppose a customer is having two accounts, one is saving account and
another is salary account. Let’s say bank stores saving account data at one place (these places are
called tables we will learn them later) and salary account data at another place, in that case if the
customer information such as customer name, address etc. are stored at both places then this is
just a wastage of storage (redundancy/ duplication of data), to organize the data in a better way
the information should be stored at one place and both the accounts should be linked to that
information somehow. The same thing we achieve in DBMS.
Fast Retrieval of data: Along with storing the data in an optimized and systematic manner, it is
also important that we retrieve the data quickly when needed. Database systems ensure that the
data is retrieved as quickly as possible.
DBMS is a collection of File system is a collection of data. In this system, the user has to write the
data. In DBMS, the user is procedures for managing the database.
not required to write the
procedures.
DBMS gives an abstract File system provides the detail of the data representation and storage of
view of data that hides the data.
details.
DBMS provides a crash File system doesn't have a crash mechanism, i.e., if the system crashes
recovery mechanism, i.e., while entering some data, then the content of the file will lost.
DBMS protects the user
from the system failure.
DBMS provides a good It is very difficult to protect a file under the file system.
protection mechanism.
DBMS contains a wide File system can't efficiently store and retrieve the data.
variety of sophisticated
techniques to store and
retrieve the data.
DBMS takes care of In the File system, concurrent access has many problems like redirecting
Concurrent access of data the file while other deleting some information or updating some
using some form of information.
locking.
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is
used to deal with a large number of PCs, web servers, database servers and other
components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are
connected via the network.
o DBMS architecture depends upon how users are connected to the database to get their
request done.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this
architecture, client can't directly communicate with the server.
o The application on the client-end interacts with an application server which further
communicates with the database system.
o End user has no idea about the existence of the database beyond the application server.
The database also has no idea about any other user beyond the application.
o The 3-Tier architecture is used in case of large web application.
Fig: 3-tier Architecture
Data Models
Data models define how the logical structure of a database is modeled. Data Models are
fundamental entities to introduce abstraction in a DBMS. Data models define how data is
connected to each other and how they are processed and stored inside the system.
The first data model could be flat, where all the data used are kept in the same plane. Earlier
data models were not so scientific, hence they were prone to introduce lots of duplication and
update anomalies.
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model
Hierarchical Model
Hierarchical Model was the first DBMS model. This model organizes the data in the
hierarchical tree structure. The hierarchy starts from the root which has root data and then it
expands in the form of a tree adding a child node to the parent node. This model easily
represents some of the real-world relationships like food recipes, sitemap of a website
etc. Example: We can represent the relationship between the shoes present on a shopping
website in the following way:
Features of a Hierarchical Model
1. One-to-many relationship: The data here is organized in a tree-like structure where the
one-to-many relationship is between the data types. Also, there can be only one path
from the parent to any node. Example: In the above example, if we want to go to the
node sneakers we only have one path to reach there i.e. through the men's shoe node.
2. Parent-Child Relationship: Each child node has a parent node but a parent node can
have more than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then the child node is automatically
deleted.
4. Pointers: Pointers are used to link the parent node with the child node and are used to
navigate between the stored data. Example: In the above example the 'shoes' node
points to the two other nodes 'women's shoes' node and the 'men's shoes' node.
Advantages of the Hierarchical Model
Network Model
This model is an extension of the hierarchical model. It was the most popular model before the
relational model. This model is the same as the hierarchical model, the only difference is that a
record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that node student has two parents i.e. CSE
Department and Library. This was earlier not possible in the hierarchical model.
1. Ability to Merge more Relationships: In this model, as there are more relationships so
data is more related. This model has the ability to manage one-to-one relationships as
well as many-to-many relationships.
2. Many paths: As there are more relationships so there can be more than one path to the
same record. This makes data access fast and simple.
3. Circular Linked List: The operations on the network model are done with the help of
the circular linked list. The current position is maintained with the help of a program
and this position navigates through the records according to the relationship.
Advantages of Network Model
The data can be accessed faster as compared to the hierarchical model. This is because
the data is more related in the network model and there can be more than one path to
reach a particular node. So the data can be accessed in many ways.
As there is a parent-child relationship so data integrity is present. Any change in parent
record is reflected in the child record.
Disadvantages of Network Model
As more and more relationships need to be handled the system might get complex. So, a
user must be having detailed knowledge of the model to work with the model.
Any change like updation, deletion, insertion is very complex.
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model
creates entity set, relationship set, general attributes and constraints.
ER Model is best used for the conceptual design of a database.
ER Model is based on −
Entities and their attributes.
Relationships among entities.
These concepts are explained below.
o one to one
o one to many
o many to one
o many to many
Relational Model
The most popular data model in DBMS is the Relational Model. It is more scientific a model
than others. This model is based on first-order predicate logic and defines a table as an n-ary
relation.
DBMS Schema
Definition of schema: Design of a database is called the schema. Schema is of three types:
Physical schema, logical schema and view schema.
For example: In the following diagram, we have a schema that shows the relationship between
three tables: Course, Student and Section. The diagram only shows the design of the database, it
doesn’t show the data present in those tables. Schema is only a structural view(design) of a
database as shown in the diagram below.
The design of a database at physical level is called physical schema, how the data stored in
blocks of storage is described at this level.
Design of database at logical level is called logical schema, programmers and database
administrators work at this level, at this level data can be described as certain types of data
records gets stored in data structures, however the internal details such as implementation of data
structure is hidden at this level (available at physical level).
Design of database at view level is called view schema. This generally describes end user
interaction with database systems.
DBMS Instance
Definition of instance: The data stored in database at a particular moment of time is called
instance of database. Database schema defines the variable declarations in tables that belong to a
particular database; the value of these variables at a moment of time is called the instance of that
database.
For example, lets say we have a single table student in the database, today the table has 100
records, so today the instance of the database has 100 records. Lets say we are going to add
another 100 records in this table by tomorrow so the instance of database tomorrow will have
200 records in table. In short, at a particular moment the data stored in database is called the
instance, that changes over time when we add or delete data from the database.
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers characteristic of being able to modify the schema at one level
of the database system without altering the schema at the next higher level.
o Logical data independence refers characteristic of being able to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we do any changes in the conceptual view of the data, then the user view of the data
would not be affected.
o Logical data independence occurs at the user interface level.
o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we do any changes in the storage size of the database system server, then the
Conceptual structure of the database will not be affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.
Database Language
o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.
Types of Database Language
These commands are used to update the database schema that's why they come under Data
definition language.
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.
(But in Oracle database, the execution of data control language does not have the feature
of rolling back.)
There are the following operations which have the authorization of Revoke:
TCL is used to run the changes made by the DML statement. TCL can be grouped into a logical
transaction.
For example, Suppose we design a school database. In this database, the student will be an entity
with attributes like address, name, id, age, etc. The address can be another entity with attributes
like city, street name, pin code, etc and there will be a relationship between them.
Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity can be
represented as rectangles.
a. Weak Entity
An entity that depends on another entity called a weak entity. The weak entity doesn't contain
any key attribute of its own. The weak entity is represented by a double rectangle.
2. Attribute
The attribute is used to describe the property of an entity. Eclipse is used to represent an
attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary
key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute
An attribute that composed of many other attributes is known as a composite attribute. The
composite attribute is represented by an ellipse, and those ellipses are connected with an ellipse.
c. Multivalued Attribute
An attribute can have more than one value. These attributes are known as a multivalued attribute.
The double oval is used to represent multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from other attribute is known as a derived attribute. It can be
represented by a dashed ellipse.
For example, A person's age changes over time and can be derived from another attribute like
Date of birth.
3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to
represent the relationship.
Types of relationship are as follows:
a. One-to-One Relationship
When only one instance of an entity is associated with the relationship, then it is known as one to
one relationship.
For example, A female can marry to one male, and a male can marry to one female.
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of an entity on the
right associates with the relationship then this is known as a one-to-many relationship.
For example, Scientist can invent many inventions, but the invention is done by the only specific
scientist.
Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of an entity on the
right associates with the relationship then it is known as a many-to-one relationship.
For example, Student enrolls for only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of an entity on
the right associates with the relationship then it is known as a many-to-many relationship.
For example, Employee can assign by many projects and project can have many employees.
Notation of ER diagram
Database can be represented using the notations. In ER diagram, many notations are used to
express the cardinality. These notations are as follows:
Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to which
another entity can be related via a relationship set.
o It is most useful in describing the relationship sets that involve more than two entity sets.
o For binary relationship set R on an entity set A and B, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)
One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity
in E2 is associated with at most one entity in E1.
One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an
entity in E2 is associated with at most one entity in E1.
Many-to-one
In one-to-many mapping, an entity in E1 is associated with at most one entity in E2, and an
entity in E2 is associated with any number of entities in E1.
Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and
an entity in E2 is associated with any number of entities in E1.
Keys
o Keys play an important role in the relational database.
o It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.
For example: In Student table, ID is used as a key because it is unique for each student. In
PERSON table, passport_number, license_number, SSN are keys since they are unique for each
person.
Types of key:
1. Primary key
o It is the first key which is used to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys as we saw in PERSON table. The key
which is most suitable from those lists become a primary key.
o In the EMPLOYEE table, ID can be primary key since it is unique for each employee. In
the EMPLOYEE table, we can even select License_Number and Passport_Number as
primary key since they are also unique.
o For each entity, selection of the primary key is based on requirement and developers.
2. Candidate key
o A candidate key is an attribute or set of an attribute which can uniquely identify a tuple.
o The remaining attributes except for primary key are considered as a candidate key. The
candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. Rest of the
attributes like SSN, Passport_Number, and License_Number, etc. are considered as a candidate
key.
3. Super Key
Super key is a set of an attribute which can uniquely identify a tuple. Super key is a superset of a
candidate key.
4. Foreign key
o Foreign keys are the column of the table which is used to point to the primary key of
another table.
o In a company, every employee works in a specific department, and employee and
department are two different entities. So we can't store the information of the department
in the employee table. That's why we link these two tables through the primary key of
one table.
o We add the primary key of the DEPARTMENT table, Department_Id as a new attribute
in the EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.
Generalization
o Generalization is like a bottom-up approach in which two or more entities of lower level
combine to form a higher level entity if they have some attributes in common.
o In generalization, an entity of a higher level can also combine with the entities of the
lower level to form a further higher level entity.
o Generalization is more like subclass and superclass system, but the only difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e., subclasses
are combined to make a superclass.
For example, Faculty and Student entities can be generalized and create a higher level entity
Person.
Aggregation
In aggregation, the relation between two entities is treated as a single entity. In aggregation,
relationship with its corresponding entities is aggregated into a higher level entity.
For example: Center entity offers the Course entity act as a single entity in the relationship which
is in a relationship with another entity visitor. In the real world, if a visitor visits a coaching
center then he will never enquiry about the Course only or just about the Center instead he will
ask the enquiry about both.
Reduction of ER diagram to Table
The database can be represented using the notations, and these notations can be reduced to a
collection of tables.
In the database, every entity set or relationship set can be represented in tabular form.
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE forms individual
tables.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the column of STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the column of COURSE table and so
on.
In the student table, a hobby is a multivalued attribute. So it is not possible to represent multiple
values in a single column of STUDENT table. Hence we create a table STUD_HOBBY with
column name STUDENT_ID and HOBBY. Using both the column, we create a composite key.
o Composite attribute represented by components.
In the given ER diagram, student address is a composite attribute. It contains CITY, PIN,
DOOR#, STREET, and STATE. In the STUDENT table, these attributes can merge as an
individual column.
In the STUDENT table, Age is the derived attribute. It can be calculated at any point of time by
calculating the difference between current date and Date of Birth.
Using these rules, you can convert the ER diagram to tables and columns and assign the mapping
between the tables. Table structure for the given ER diagram is as below:
The degree of relationship can be defined as the number of occurrences in one entity that is
associated with the number of occurrences in another entity.
1. One-to-one
o In a one-to-one relationship, one occurrence of an entity relates to only one occurrence in
another entity.
o A one-to-one relationship rarely exists in practice.
o For example: if an employee is allocated a company car then that car can only be driven
by that employee.
o Therefore, employee and company car have a one-to-one relationship.
2. One-to-many
o In a one-to-many relationship, one occurrence in an entity relates to many occurrences in
another entity.
o For example: An employee works in one department, but a department has many
employees.
o Therefore, department and employee have a one-to-many relationship.
3. Many-to-many
o In a many-to-many relationship, many occurrences in an entity relate to many
occurrences in another entity.
o Same as a one-to-one relationship, the many-to-many relationship rarely exists in
practice.
o For example: At the same time, an employee can work on several projects, and a project
has a team of many employees.
o Therefore, employee and project have a many-to-many relationship.