DBMS Complete Notes
Database: A database is a collection of related data organized for easy access, management, and updating.
DBMS: A database management system (DBMS) is software for creating and managing
databases.
Examples:
Oracle RDBMS
IBM DB2
Microsoft SQL Server
MySQL
Advantages of DBMS:
Improved Data Sharing: DBMS enables data sharing among multiple users and applications.
Multiple users can access the same data simultaneously, without interfering with each other's
work.
Data Integration: DBMS allows the integration of data from multiple sources. This means that
data can be collected from various systems and combined into a single database, which makes it
easier to access and analyze.
Data Security: DBMS provides various security features such as access control, authentication,
and encryption to safeguard data from unauthorized access and prevent data loss or corruption.
Data Consistency: DBMS enforces consistency in data by ensuring that data is accurate,
complete, and up-to-date. This helps to avoid data inconsistencies and errors.
Data Integrity: DBMS ensures that data is stored and retrieved without any loss or corruption. It
provides mechanisms such as transaction management and recovery.
Reduced Data Redundancy (duplication): Separate parts of the database are linked to one another,
so the same data does not need to be stored in several places. Eliminating duplicate data reduces
redundancy and makes data updates more efficient.
Disadvantages of DBMS:
High Cost: Implementing a DBMS can be expensive due to the cost of licensing, hardware, and
maintenance.
Complexity: DBMS is a complex software that requires a significant amount of technical
expertise to install, configure, and maintain.
System Overhead: DBMS requires system resources such as memory, CPU, and disk space,
which can lead to system overhead and reduced system performance.
Data Dependency: DBMS stores data in a particular format, which can create data dependency
issues. If the format changes, it can affect the application that uses the data, leading to additional
maintenance and development costs.
Single Point of Failure: DBMS represents a single point of failure for an organization's data. If
the DBMS fails, it can lead to significant data loss and downtime.
Database Administrators:
➢ The DBA is a single person or a group of persons responsible for implementing the database
system within an organisation.
➢ The DBA has centralized control of the database management system (both data and application
programs).
➢ The main function of the DBA is to provide a convenient and efficient environment in which
database users can access the data.
➢ The entire control of the DBMS is in the hands of the DBA.
The DBA has the following roles and responsibilities regarding database management.
1. Providing schema definition and modification: The overall design of the database is known as
the schema. The DBA is responsible for providing and modifying the schema (logical and physical)
of the database.
2. Granting user authorization: The DBA can grant authorization of data access to different users so
that the database will not be accessible by unauthorized users.
3. Backup and recovery: The DBA is responsible for taking database backups periodically in order
to be able to recover data after any failure (such as a virus attack).
4. New software installation: Installing new DBMS software, application software, and other
related software.
5. Monitoring performance: The DBA monitors the CPU and memory usage of the computer and
the performance of the database.
Database Designers: Database Designers are responsible for identifying the data to be stored
in the database and for choosing appropriate structures to represent and store this data.
Database designers typically interact with each potential group and user and develop a view of
the database that meets the data and processing requirements of these groups.
End Users: End users are the people whose jobs require access to the database for querying,
updating, and generating reports; the database primarily exists for their use. There are several
categories of end users:
Casual end users: They occasionally access the database, but they may need different information
each time. They are typically middle- or high-level managers or other occasional browsers.
Naive or Parametric end users: Their main job function revolves around constantly querying
and updating the database, using standard types of queries and updates that have been carefully
programmed and tested. Bank tellers and reservation clerks for airlines, hotels, etc. are
examples of naive end users.
Sophisticated end users: Sophisticated end users include engineers, scientists, business
analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS so
as to implement their applications to meet their complex requirements.
Stand-alone users: They maintain personal databases by using ready-made program
packages that provide easy-to-use menu-based or graphics-based interfaces.
Software Engineers: System analysts determine the requirements of end users, especially
naive and parametric end users, and develop specifications for canned transactions that meet
these requirements. Application programmers implement these specifications as programs; then
they test, debug, document, and maintain these canned transactions. Such analysts and
programmers are called Software Engineers.
Data Models
➢ A data model is a collection of concepts that can be used to describe the structure of a
database.
➢ By structure of a database, we mean the data types, relationships, and constraints that should
hold for the data.
➢ Most data models also include a set of basic operations for specifying retrievals and updates
on the database.
The most often used data models are:
1. Hierarchical Model
2. Network Model
3. Relational Model
4. Object-Oriented Model
1. Hierarchical Model:
1. A Hierarchical Database Model is a data model in which the data is organized into a tree
structure.
2. The data is stored as records which are connected to one another through links.
3. A record is a collection of fields, with each field containing only one value.
4. At the top of the hierarchy there is only one entity, which is called the root.
Advantages:
1. The model allows easy addition and deletion of new information.
2. Data at the top of the hierarchy is very fast to access.
3. The model relates very well to natural hierarchies such as assembly plants and employee
organization in corporations.
Disadvantages:
1. Complex implementation.
2. Predefined tree structure reduces flexibility.
3. The database can be very slow when searching for information on the lower entities.
2. Network Model:
1. The Network Model is a database model conceived as a flexible way of representing
objects and their relationships.
2. Its distinguishing feature is that the schema is viewed as a graph in which object types are
nodes and relationship types are arcs.
3. The Objects or entities can be accessed through several paths.
Advantages:
1. The network model is also conceptually simple and easy to design.
2. The network model can handle one-to-many and many-to-many relationships, which is a
real help in modeling real-life situations.
3. Data access is easier and more flexible than in the hierarchical model.
4. The network model does not allow a member to exist without an owner.
5. The network model is better than the hierarchical model in isolating the programs from
the complex physical storage details.
Disadvantages:
1. All the records are maintained using pointers, and hence the whole database
structure becomes very complex.
2. The insertion, deletion, and updating operations of any
record require a large number of pointer adjustments.
3. Structural changes to the database are very difficult.
3. Relational Model:
1. The Relational Model for database management is an approach to managing data using a
structure and language described in 1969 by Edgar F. Codd.
2. In the relational model of a database, all data is represented in terms of tuples, grouped
into relations.
3. A database organized in terms of the relational model is a relational database.
Advantages:
1. Tables consisting of rows and columns are much easier to understand.
2. Different tables from which information has to be linked and extracted can be easily
manipulated by operators such as project and join to give information in the form in
which it is desired.
3. The use of relational algebra and relational calculus in the manipulation of the
relations between the tables ensures that there is no ambiguity.
4. Security control and authorization can also be implemented more easily by moving
sensitive attributes in a given table into a separate relation with its own authorization
controls.
5. Data independence is achieved more easily with the normalized structure used in a
relational database than in the more complicated tree or network structure.
Disadvantages:
1. Machine performance is a major constraint, and therefore a disadvantage, in the use of
relational database systems.
2. Physical storage consumption.
3. Slow extraction of meaning from data.
4. Object-Oriented Model
1. Object-oriented models define a database as a collection of objects with features and
methods.
2. An object database, also called an Object-Oriented Database Management System (OODBMS), is
a Database Management System in which information is represented in the form of
objects as used in object-oriented programming.
Advantages:
1. Applicability to advanced database applications.
2. Improved performance.
3. Support for schema evolution.
4. Enriched modeling capabilities.
5. Extensibility.
6. Capable of handling a large variety of data types.
7. Removal of impedance mismatch.
8. More expressive query language.
Disadvantages:
1. Lack of a universal data model.
2. Query optimization compromises encapsulation.
3. Lack of support for security.
Schema
The skeleton of the database is created by the attributes, and this skeleton is called the schema.
The schema specifies logical constraints such as tables, primary keys, etc.
The schema does not represent the data type of the attributes.
Database Schema
A database schema is a logical representation of data that shows how the data in a database
should be stored logically. It shows how the data is organized and the relationship between the
tables.
1. A database schema contains tables, fields, views, and the relations between different keys
such as primary keys and foreign keys.
2. Data stored in the form of unstructured files is difficult to access. To resolve this issue,
the data is organized in a structured way with the help of a database schema.
3. A database schema provides the organization of data and the relationship between the
stored data.
4. A database schema defines a set of guidelines that control the database, and it
provides information about the way the data is accessed and modified.
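The points above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the table and column names (department, student, dept_id) are made up for this example and are not part of the notes. The sketch declares a schema and then reads it back from the database catalog before any data has been stored.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema: tables, fields, and the keys that relate them.
# All names here are hypothetical, chosen only for illustration.
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
""")

# The schema can be inspected even though no rows exist yet.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(tables)     # table names defined by the schema
print(columns)    # field names of the student table
print(row_count)  # number of stored rows
```

The schema describes the organization of the data, while the row count shows that the data itself is separate from it.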
Database Architecture
A Database Architecture is a representation of DBMS design. It helps to design, develop,
implement, and maintain the database management system. A DBMS architecture allows
dividing the database system into individual components that can be independently modified,
changed, replaced, and altered. It also helps to understand the components of a database.
2-Tier Architecture
A 2 Tier Architecture in DBMS is a Database architecture where the presentation layer runs on
a client (PC, Mobile, Tablet, etc.), and data is stored on a server called the second tier. Two tier
architecture provides added security to the DBMS as it is not exposed to the end-user directly. It
also provides direct and faster communication.
In a 2 Tier client-server architecture of a database management system, one server is connected
directly to several clients (for example, clients 1, 2, and 3).
Two Tier Architecture Example:
A Contact Management System created using MS Access.
3-Tier Architecture
A 3 Tier Architecture is the most popular client-server architecture in DBMS. In it, the
development and maintenance of functional processes, logic, data access, data storage,
and user interface are done independently as separate modules. Three Tier architecture contains a
presentation layer, an application layer, and a database server.
The application layer resides between the user and the DBMS. It is responsible for
communicating the user’s request to the DBMS and sending the response from the DBMS back to
the user. The application layer (business logic layer) also processes functional logic, constraints,
and rules before passing data to the user or down to the DBMS.
Data Independence
Data independence can be explained using the three-schema architecture.
Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.
External level:
1. It is also called the view level.
2. The user does not need to know database schema details such as data structures, table
definitions, etc. The user is only concerned with the data that is returned to the view
level after it has been fetched from the database.
3. This level is called the “view” level because several users can view their desired
data from this level; the data is fetched internally from the database with the help of the
conceptual and internal level mappings.
4. The external level is the “top level”.
Database Languages
Interfaces
A database management system (DBMS) interface is a user interface that allows users
to submit queries to a database without using the query language itself. User-friendly interfaces
provided by a DBMS may include the following:
1. Menu-Based Interfaces
2. Forms-Based Interfaces
3. Graphical User Interfaces
4. Interfaces for the Database Administrator (DBA)
Menu-Based Interfaces
These interfaces present the user with lists of options (called menus) that lead the user through
the formation of a request. The basic advantage of using menus is that they remove the need
to memorize the specific commands and syntax of a query language. The query is
composed step by step by picking options from menus displayed by the
system. Pull-down menus are a very popular technique in Web-based interfaces.
Forms-Based Interfaces
A forms-based interface displays a form to each user. Users can fill out all of the form entries
to insert new data, or they can fill out only certain entries, in which case the DBMS will
retrieve matching data for the remaining entries.
Example: SQL Forms is a form-based language that specifies queries using a form designed in
conjunction with the relational database schema.
Graphical User Interface
A GUI typically displays a schema to the user in diagrammatic form. The user can then specify
a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most
GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema
diagram.
Interfaces for Database Administrators (DBA)
Most database systems contain privileged commands that can be used only by the DBA’s staff.
These include commands for creating accounts, setting system parameters, granting account
authorization, changing a schema, and reorganizing the storage structures of databases.
Classification of DBMS
History of ER models
Peter Chen proposed ER Diagrams in 1976 to create a uniform convention that can be used as a
conceptual modeling tool. Many models were presented and discussed, but none were suitable. The
data structure diagrams offered by Charles Bachman also inspired his model.
ER diagrams are useful for the following reasons:
➢ An ER diagram helps you conceptualize the database and lets you know which fields need to be
included for a particular entity.
➢ It gives a better understanding of the information to be stored in a database.
➢ It reduces complexity and allows database designers to build databases quickly.
➢ It helps to describe elements using Entity-Relationship models.
➢ It allows users to get a preview of the logical structure of the database.
Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a Database
System.
Entity
An entity is a “thing” or “object” in the real world, such as a place, a person, or an
animal.
Entity Set: An entity is an object of an entity type, and the set of all entities of that type is
called an entity set. For example, E1 is an entity having entity type Student, and the set of all
students is called an entity set. In an ER diagram, an entity type is represented by a rectangle.
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A strong entity does not depend on
any other entity in the schema. It has a primary key that helps in identifying it uniquely, and it is
represented by a rectangle. Such entity types are called Strong Entity Types.
2. Weak Entity
An entity type normally has a key attribute that uniquely identifies each entity in the entity set. But
some entity types exist for which key attributes cannot be defined. These are called Weak Entity
types.
For example, a company may store the information of dependents (parents, children,
spouse) of an employee. But the dependents cannot exist without the employee. So
Dependent will be a Weak Entity Type and Employee will be the identifying entity type for
Dependent, which means Employee is a Strong Entity Type.
A weak entity type is represented by a double rectangle. The participation of weak entity
types is always total. The relationship between the weak entity type and its identifying strong
entity type is called the identifying relationship, and it is represented by a double diamond.
Attributes
Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB,
Age, Address, and Mobile_No are the attributes that define entity type Student. In ER diagram,
the attribute is represented by an oval.
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key attribute.
For example, Roll_No will be unique for each student. In an ER diagram, the key attribute is
represented by an oval with the attribute name underlined.
2. Composite Attribute
An attribute composed of many other attributes is called a composite attribute. For example,
the Address attribute of the Student entity type consists of Street, City, State, and Country. In
an ER diagram, the composite attribute is represented by an oval connected to further ovals, one
for each component.
3. Multivalued Attribute
An attribute consisting of more than one value for a given entity. For example, Phone_No (can
be more than one for a given student). In ER diagram, a multivalued attribute is represented by
a double oval.
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived
attribute, e.g., Age (can be derived from DOB). In an ER diagram, the derived attribute is
represented by a dashed oval.
The Complete Entity Type Student with its Attributes can be represented as:
Domains Of Attributes
The set of values that may be assigned to an attribute for each individual entity in
an entity set is called the value set or domain of that attribute.
Example 1: For the employee entity, if the age limit is 20 to 58, then the value set (domain) of
the attribute age consists of the integers from 20 to 58.
Age: Domain is [20 – 58]
Example 2: Similarly, the value set for the name attribute could be a set of alphabetic characters
and some special characters.
Name: Domain is [a-z], [A-Z], blank space
Example 3: For the salary attribute, the value set may again be a range, from a minimum of
5000 to a maximum of 50000.
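As a minimal sketch, the three example domains can be written as membership tests in Python. The dictionary DOMAINS and the function in_domain are hypothetical names introduced only for this illustration, and treating age and salary as integers is an assumption.

```python
import re

# Hypothetical domain definitions based on the three examples above.
DOMAINS = {
    # Age: integers from 20 to 58.
    "age": lambda v: isinstance(v, int) and 20 <= v <= 58,
    # Name: letters a-z, A-Z and blank spaces only.
    "name": lambda v: isinstance(v, str) and re.fullmatch(r"[A-Za-z ]+", v) is not None,
    # Salary: from a minimum of 5000 to a maximum of 50000.
    "salary": lambda v: isinstance(v, int) and 5000 <= v <= 50000,
}

def in_domain(attribute, value):
    """Return True if value belongs to the value set of the attribute."""
    return DOMAINS[attribute](value)

print(in_domain("age", 35))           # inside the range 20 to 58
print(in_domain("age", 19))           # below the lower limit
print(in_domain("name", "Ram Kumar")) # letters and a blank space
print(in_domain("salary", 60000))     # above the upper limit
```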
Degree of relationship
The number of different entity sets participating in a relationship set is called the degree of a
relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relationship, the
relationship is called a unary relationship. For example, one person is married to only one
person.
2. Binary Relationship:
A binary relationship is a relationship between instances of two entity types.
The degree of the relationship is two.
3. Ternary Relationship:
A ternary relationship is a relationship between instances of three entity types.
The degree of the relationship is three.
Recursive relationship
A relationship between two entities of the same entity type is called a recursive relationship.
Here the same entity type participates more than once in a relationship type, with a different
role each time. In other words, a relationship usually holds between occurrences of
two different entity types, but occurrences of the same entity type can also be related to each
other. Such a relationship is termed a recursive relationship.
Attributes of Relationship Types
Just like entities, relationship types can also have attributes. For example, in the
works-for relationship, we can add an attribute “hours”, which describes the relationship
as works-for so many hours.
Structural Constraints
To understand structural constraints, we must take a look at cardinality ratios and
participation constraints.
Cardinality Ratios of relationships: In an ER diagram, entities are denoted by
rectangles and relationships by diamonds.
The numbers (represented by M and N) written above the lines which connect
relationships and entities are called cardinality ratios. They represent the maximum
number of entities that can be associated with each other through a relationship R.
1. One-to-one (1:1) – When each entity in each entity set takes part at most once in the
relationship, the cardinality is one-to-one.
2. One-to-many (1:N) – If entities in the first entity set take part in the relationship set at
most once and entities in the second entity set take part many times (at least twice), the
cardinality is said to be one-to-many.
3. Many-to-one (N:1) – If entities in the first entity set take part in the relationship set many
times (at least twice), while entities in the second entity set take part at most once, the
cardinality is said to be many-to-one.
4. Many-to-many (M:N) – The cardinality is said to be many-to-many if entities in both
entity sets take part many times (at least twice) in the relationship set.
Participation Constraints: Participation constraints tell us whether the participation of an
entity set in a relationship is total or partial. A participation constraint is applied to the
entity set participating in the relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship.
If each student must enroll in a course, the participation of students will be total. Total
participation is shown by a double line in the ER diagram.
2. Partial Participation – An entity in the entity set may or may NOT participate in the
relationship. If some courses have no students enrolled in them, the participation of the
course will be partial.
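The two structural constraints can be checked on a concrete relationship instance. The sketch below is only an illustration: the entity names (S1, C1, ...) and the helper functions max_participation and is_total are assumptions, and a relationship instance is modeled as a set of (student, course) pairs.

```python
from collections import Counter

def max_participation(pairs):
    """Return (left_max, right_max): the most times any single entity on
    each side takes part in the relationship instance."""
    left = Counter(l for l, _ in pairs)
    right = Counter(r for _, r in pairs)
    return max(left.values(), default=0), max(right.values(), default=0)

def is_total(entity_set, pairs, side):
    """True if every entity of entity_set appears on the given side of
    the pairs (side 0 = first entity set, side 1 = second entity set)."""
    participating = {pair[side] for pair in pairs}
    return set(entity_set) <= participating

# Hypothetical data: students enroll in courses.
students = {"S1", "S2", "S3"}
courses = {"C1", "C2", "C3"}
enrolls = {("S1", "C1"), ("S2", "C1"), ("S3", "C2"), ("S1", "C2")}

print(max_participation(enrolls))      # both sides take part more than once
print(is_total(students, enrolls, 0))  # every student is enrolled
print(is_total(courses, enrolls, 1))   # course C3 has no students
```

Here entities on both sides take part more than once, so this instance is many-to-many; students participate totally, while courses participate only partially.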
ER diagram
[Figure: ER diagram of student]
E.F. Codd proposed the relational model to model data in the form of relations or tables. After
designing the conceptual model of the database using an ER diagram, we need to convert the
conceptual model into a relational model, which can be implemented using
any RDBMS such as Oracle, MySQL, etc.
The relational model represents how data is stored in relational databases. A relational
database consists of a collection of tables, each of which is assigned a unique name. Consider a
relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE, shown
in the table.
Table Student
ROLL_NO  NAME    ADDRESS  PHONE  AGE
4        SURESH  DELHI    –      18
Important Terminologies
1. Attribute: Attributes are the properties that define an entity, e.g., ROLL_NO, NAME,
ADDRESS.
2. Relation Schema: A relation schema defines the structure of the relation and represents the
name of the relation with its attributes, e.g., STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE)
is the relation schema for STUDENT. If a schema has more than one relation, it is called a
relational schema.
3. Tuple: Each row in the relation is known as a tuple.
4. Relation Instance: The set of tuples of a relation at a particular instant of time is called a
relation instance. The Student table above shows the relation instance of STUDENT at a
particular time. It can change whenever there is an insertion, deletion, or update in the database.
5. Degree: The number of attributes in the relation is known as the degree of the relation.
The STUDENT relation defined above has degree 5.
6. Cardinality: The number of tuples in a relation is known as its cardinality.
7. Column: A column represents the set of values for a particular attribute. For example, the
column ROLL_NO can be extracted from the relation STUDENT.
A database contains several relations, and each relation must be uniquely
identified; otherwise there would be a lot of confusion. The following
characteristics, when followed, make each relation distinct in a database.
1. Each relation in a database must have a distinct or unique name which separates it
from the other relations in the database.
2. A relation must not have two attributes with the same name. Each attribute must have a
distinct name.
3. Duplicate tuples must not be present in a relation.
4. Each tuple must have exactly one data value for each attribute. For example, if Roll_No 265
were assigned to two students, Jhoson and Charles, the relation would not be valid. We must
have only one student for one Roll_No.
5. Tuples in a relation do not have to follow any particular order, as a relation is not
order-sensitive.
6. Similarly, the attributes of a relation do not have to follow any particular order; it is up to the
developer to decide the ordering of attributes.
Domain Constraints
In a database table, domain constraints are rules that specify the acceptable values for a
certain attribute or field. These restrictions guarantee data consistency and help prevent the
entry of inaccurate or inconsistent data into the database. The following are some examples of
domain constraints in a Relational Database Model −
1. Data type constraints − These limitations define the kinds of data that can be kept in a
column. A column created as VARCHAR can take string values, but a column specified as
INTEGER can only accept integer values.
2. Length Constraints − These limitations define the largest amount of data that may be put in
a column. For instance, a column with the definition VARCHAR(10) may only take strings
that are up to 10 characters long.
3. Range constraints − The allowed range of values for a column is specified by range
restrictions. A column designated as DECIMAL(5,2), for example, may only take decimal
values up to 5 digits long, including 2 decimal places.
4. Nullability constraints − Constraints on a column's capacity to accept NULL values are
known as nullability constraints. For instance, a column that has the NOT NULL definition
cannot take NULL values.
5. Unique constraints − Constraints that require the presence of unique values in a column or
group of columns are known as unique constraints. For instance, duplicate values are not
allowed in a column with the UNIQUE definition.
6. Check constraints − These constraints specify a condition
that must hold for any data placed into the column. For instance, a column with the definition
CHECK (age > 0) can only accept ages that are greater than zero.
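Several of these constraints can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the student table and the helper try_insert are invented for the example, and because SQLite does not enforce a declared length such as VARCHAR(10), the length rule is written here as an explicit CHECK on length(name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table combining nullability, unique, length and check constraints.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER NOT NULL UNIQUE,
        name    TEXT NOT NULL CHECK (length(name) <= 10),
        age     INTEGER CHECK (age > 0)
    )
""")

def try_insert(roll_no, name, age):
    """Insert one row; report whether the database accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?)",
                         (roll_no, name, age))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(1, "Ram", 18),           # satisfies every constraint
    try_insert(1, "Mohan", 19),         # duplicate roll_no violates UNIQUE
    try_insert(2, None, 20),            # missing name violates NOT NULL
    try_insert(3, "Geeta", -5),         # negative age violates CHECK (age > 0)
    try_insert(4, "Ramachandran", 20),  # 12 characters violates the length check
]
print(results)
```

Only the first row is stored; each of the others breaks exactly one constraint.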
Key Constraints
Key constraints are rules that a Relational Database Model uses to ensure data accuracy
and consistency in a database. They define how the values in one or more columns of a table
identify its rows and how they relate to the values in other tables, making sure that the data
remains correct.
In the Relational Database Model, there are several kinds of key constraints, including −
1. Primary Key Constraint − A primary key is a unique identifier for each
record in a table. The constraint guarantees that each row contains a distinct value, or
combination of values, that cannot be null and that identifies the row.
2. Foreign Key Constraint − A foreign key is a reference to the primary key of another table.
The constraint ensures that the values of a column or set of columns in one table correspond to
values of the primary key column(s) in another table.
3. Unique Constraint − A unique constraint ensures that no two rows have the same value in a
column or collection of columns.
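The three key constraints can be sketched with Python's built-in sqlite3 module. The department and employee tables and the helper rejected are hypothetical names used only for this illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables: employee.dept_id refers to department.dept_id.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (10, 'Sales');
INSERT INTO employee VALUES (1, 10);
""")

def rejected(sql):
    """Run one statement; return True if a key constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_pk = rejected("INSERT INTO department VALUES (10, 'HR')")     # primary key 10 already exists
missing_parent = rejected("INSERT INTO employee VALUES (2, 99)")        # no department with dept_id 99
duplicate_name = rejected("INSERT INTO department VALUES (11, 'Sales')")  # name must be unique
valid_row = rejected("INSERT INTO employee VALUES (3, 10)")             # refers to an existing department

print(duplicate_pk, missing_parent, duplicate_name, valid_row)
```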
Integrity constraint
Integrity constraints in DBMS can be defined as a collection of rules used to ensure the
quality of information. They ensure that data insertion, modification, and other operations do
not compromise data integrity. There are mainly four types of integrity constraints, namely,
domain constraint, referential integrity constraint, entity integrity constraint, and key
constraint.
Relational Algebra
Relational Algebra is a procedural query language that accepts relations as input and
outputs another relation. Relational algebra provides the theoretical foundation for relational
databases and SQL.
1. Selection(σ): Selection chooses the tuples of a relation that satisfy a given condition.
Example: Consider the relation R.
A B C
1 2 4
2 2 3
3 2 3
4 3 4
For the above relation, σ(c>3)R will select the tuples which have c more than 3.
A B C
1 2 4
4 3 4
Note: The selection operator only selects the required tuples but does not display them. For
display, the projection operator is used.
2. Projection(π): Projection keeps only the listed attributes of a relation.
Example: For the above relation, π(B,C)R gives:
B C
2 4
2 3
3 4
Note: By default, projection removes duplicate data.
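Selection and projection can be sketched in a few lines of Python. The sketch assumes that the relation R in the example has attributes A, B, and C with the four rows shown; the functions select and project are hypothetical helpers, not part of any library.

```python
# Relation R from the example, one dictionary per tuple.
R = [
    {"A": 1, "B": 2, "C": 4},
    {"A": 2, "B": 2, "C": 3},
    {"A": 3, "B": 2, "C": 3},
    {"A": 4, "B": 3, "C": 4},
]

def select(relation, predicate):
    """Selection: keep the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicates."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:  # duplicate rows are removed
            result.append(row)
    return result

selected = select(R, lambda t: t["C"] > 3)   # tuples with C greater than 3
projected = project(R, ["B", "C"])           # columns B and C only

print(selected)
print(projected)
```

The rows (2, 2, 3) and (3, 2, 3) both project to (2, 3), which is why the projection has three rows rather than four.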
3. Union(U): Union operation in relational algebra is the same as union operation in set theory.
Example: Consider the following tables of students having different optional subjects in their course.
FRENCH
Student_Name  Roll_Number
Ram           01
Mohan         02
Vivek         13
Geeta         17
GERMAN
Student_Name  Roll_Number
Vivek         13
Geeta         17
Shyam         21
Rohan         25
π(Student_Name)FRENCH U π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Vivek
Geeta
Shyam
Rohan
Note: The only constraint in the union of two relations is that both relations must have the
same set of Attributes.
4. Set Difference(-): Set Difference in relational algebra is the same set difference operation as
in set theory.
Example: From the above table of FRENCH and GERMAN, Set Difference is used as follows
π(Student_Name)FRENCH - π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Note: The only constraint in the Set Difference between two relations is that both relations
must have the same set of Attributes.
5. Set Intersection(∩): Set Intersection in relational algebra is the same as the set intersection
operation in set theory.
Example: From the above table of FRENCH and GERMAN, the Set Intersection is used as
follows
π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN
Student_Name
Vivek
Geeta
Note: The only constraint in the Set Intersection between two relations is that both relations
must have the same set of Attributes.
7. Cross Product(X): The cross product of two relations, say A and B, written A X B,
contains all the attributes of A followed by all the attributes of B.
Each record of A is paired with every record of B.
Example:
A
Name Age Sex
Ram 14 M
Sona 15 F
Kim 20 M
B
ID Course
1 DS
2 DBMS
AXB
Name Age Sex ID Course
Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS
Note: If A has ‘n’ tuples and B has ‘m’ tuples then A X B will have ‘ n*m ‘ tuples.
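The set operations and the cross product map directly onto Python sets and itertools.product. The following is a small sketch using the FRENCH, GERMAN, A, and B tables from the examples; the helper project_names is a hypothetical name introduced here.

```python
from itertools import product

# Each tuple is (Student_Name, Roll_Number), taken from the example tables.
FRENCH = {("Ram", "01"), ("Mohan", "02"), ("Vivek", "13"), ("Geeta", "17")}
GERMAN = {("Vivek", "13"), ("Geeta", "17"), ("Shyam", "21"), ("Rohan", "25")}

def project_names(relation):
    """Projection on Student_Name; a set removes duplicates automatically."""
    return {name for name, _ in relation}

french_names = project_names(FRENCH)
german_names = project_names(GERMAN)

union = french_names | german_names         # students taking either subject
difference = french_names - german_names    # students taking French only
intersection = french_names & german_names  # students taking both subjects

# Cross product: every tuple of A paired with every tuple of B.
A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]
B = [(1, "DS"), (2, "DBMS")]
cross = [a + b for a, b in product(A, B)]

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
print(len(cross))  # n * m tuples
```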
JOINS
Join is an operation in a DBMS (Database Management System) that combines the rows of two or
more tables based on related columns between them. The main purpose of a join is to retrieve
data from multiple tables; in other words, a join is used to perform a multi-table query. It is
denoted by ⨝.
Syntax
R3 <- ⨝(R1) <join_condition> (R2)
where R1 and R2 are the two relations to be joined and R3 is the relation that will hold the result
of the join operation.
Example
Temp <- ⨝(student) S.roll==E.roll(Exam)
where S and E are aliases of student and exam respectively.
Types of Join
1. Inner Join
2. Outer Join
Inner Join
Inner join is a join operation in DBMS that combines two or more tables based on related
columns and returns only the rows that have matching values in the tables. Inner join is of two types:
1. Equi Join
2. Natural Join
Equi Join
Equi Join is a type of inner join in which the join condition uses the equality (‘=’) operator.
Example:
Table A
Column A Column B
a a
a b
Table B
Column A Column B
a a
a c
A ⨝ A.Column B = B.Column B (B)
Result:
Column A Column B
a a
Natural Join
Natural join is a type of inner join that needs no explicit comparison operator. The tables are
joined on the columns that have the same name and domain, so there should be at least one
common attribute between the two tables.
Example:
Table A
Number Square
2 4
3 9
Table B
Number Cube
2 8
3 27
A⨝B
Number Square Cube
2 4 8
3 9 27
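A minimal Python sketch of the natural join above, assuming each relation is held as a list of dictionaries. The helper name natural_join is made up for this illustration. It joins on every attribute name the two relations share.

```python
def natural_join(r, s):
    """Join two relations (lists of dicts) on all attribute names they share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attributes present in both relations
    result = []
    for a in r:
        for b in s:
            # Keep the pair only if it agrees on every common attribute.
            if all(a[c] == b[c] for c in common):
                result.append({**a, **b})
    return result

A = [{"Number": 2, "Square": 4}, {"Number": 3, "Square": 9}]
B = [{"Number": 2, "Cube": 8}, {"Number": 3, "Cube": 27}]

joined = natural_join(A, B)
for row in joined:
    print(row)
```

The common column Number appears only once in each result row, which is what distinguishes a natural join from an equi join.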
Outer Join
Outer join is a type of join that retrieves matching as well as non-matching records from related
tables.
There are three types of outer join:
1. Left outer join
2. Right outer join
3. Full outer join
Left Outer Join
It is also called left join. This type of outer join retrieves all records from the left table and the
matching records from the right table.
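Example:
Table A
Number Square
2 4
3 9
4 16
Table B
Number Cube
2 8
3 27
5 125
A⟕B
Result:
Number Square Cube
2 4 8
3 9 27
4 16 –
Right Outer Join
It is also called right join. It retrieves all records from the right table and the matching
records from the left table.
A⟖B
Result:
Number Square Cube
2 4 8
3 9 27
5 – 125
Full Outer Join
It retrieves all records from both tables. Where a record has no match, the missing value is
shown as –.
A⟗B
Result:
Number Square Cube
2 4 8
3 9 27
4 16 –
5 – 125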
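The three outer joins can be sketched in Python as follows. Each table is modelled as a dictionary keyed by Number, which is an assumption made for brevity, and None stands for the missing value. The last row of B is entered as (5, 125), since 5 cubed is 125.

```python
A = {2: 4, 3: 9, 4: 16}     # Number -> Square
B = {2: 8, 3: 27, 5: 125}   # Number -> Cube

def left_outer(a, b):
    # Every row of the left table; Cube is None when B has no match.
    return [(n, a[n], b.get(n)) for n in sorted(a)]

def right_outer(a, b):
    # Every row of the right table; Square is None when A has no match.
    return [(n, a.get(n), b[n]) for n in sorted(b)]

def full_outer(a, b):
    # Every Number that appears in either table.
    return [(n, a.get(n), b.get(n)) for n in sorted(set(a) | set(b))]

print(left_outer(A, B))
print(right_outer(A, B))
print(full_outer(A, B))
```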
Example: Employee
EmpId Empname Empsal
1001 sundar 25000
1002 manoj 20000
1003 Bharath 10000
1004 Prakash 5000
1005 Shreedhar 100000
3) Max(): This function is used to find the maximum value of a numerical column.
Syntax: select max(column_name) from <table name>;
Example: select max(Empsal) from employee;
Max(Empsal)
100000
5) Count():
• This function is used to return the number of records in the table.
Syntax: select count(exp) from <table name> where <condition>;
Example: select count(Empsal) from employee;
Count(Empsal)
5
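To try the two aggregate functions without an Oracle or MySQL server, the sketch below loads the Employee table into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for convenience; the SQL itself is the same as in the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (EmpId INTEGER, Empname TEXT, Empsal INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1001, "sundar", 25000),
        (1002, "manoj", 20000),
        (1003, "Bharath", 10000),
        (1004, "Prakash", 5000),
        (1005, "Shreedhar", 100000),
    ],
)

# MAX returns the largest salary; COUNT returns the number of non-null salaries.
max_sal = conn.execute("SELECT MAX(Empsal) FROM employee").fetchone()[0]
row_count = conn.execute("SELECT COUNT(Empsal) FROM employee").fetchone()[0]
print(max_sal, row_count)
```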
Views
A view is a virtual table based on one or more tables.
• A view can be created from one or more tables, depending on the SQL query written to create
the view.
• A view can contain all rows of a table or selected rows from a table.
• Most of the operations that can be carried out on a table can be carried out on a view.
• Any update to the rows of a table is automatically reflected in the view.
Syntax: create view <view name> as select <col-list> from <table name> where <condition>;
Student
Rno Name Total
1 anu 700
2 ashu 900
3 chinnu 850
4 priya 700
5 Chitti 680
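The following sketch creates a view over the Student table using Python's sqlite3 module with an in-memory database. The view name toppers and the condition Total > 800 are hypothetical choices for this illustration. It also shows that an update to the base table is reflected when the view is queried again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (Rno INTEGER, Name TEXT, Total INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "anu", 700), (2, "ashu", 900), (3, "chinnu", 850),
     (4, "priya", 700), (5, "Chitti", 680)],
)

# Hypothetical view: students whose total is above 800.
conn.execute(
    "CREATE VIEW toppers AS SELECT Rno, Name, Total FROM student WHERE Total > 800"
)
before = conn.execute("SELECT Name FROM toppers ORDER BY Rno").fetchall()

# Updating the base table changes what the view returns.
conn.execute("UPDATE student SET Total = 880 WHERE Rno = 1")
after = conn.execute("SELECT Name FROM toppers ORDER BY Rno").fetchall()

print(before)
print(after)
```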
Nested Subquery: An SQL query that has another query enclosed in its WHERE, FROM, or
SELECT clause.
Example: SELECT product_name, price FROM products WHERE price > (SELECT AVG(price)
FROM products);
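The nested query can be run against a small sample table. The sketch below uses Python's sqlite3 module, and the three product rows are hypothetical. The inner query computes the average price first, and the outer query keeps only the products priced above it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, price REAL)")
# Hypothetical sample rows; the average price is 50.0.
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("pen", 10.0), ("book", 50.0), ("bag", 90.0)],
)

rows = conn.execute(
    "SELECT product_name, price FROM products "
    "WHERE price > (SELECT AVG(price) FROM products)"
).fetchall()
print(rows)
```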
PL/SQL (intro, characteristics, features)
SQL is declarative: it defines what needs to be done, rather than how things need to be done.
PL/SQL is procedural: it defines how things need to be done.
Structure:
DECLARE
declaration statements;
BEGIN
executable statements;
EXCEPTION
exception handling statements;
END;
1. The declaration section starts with the DECLARE keyword. Variables, constants, records and
cursors, which store data temporarily, can be declared here. It basically consists of the
definitions of PL/SQL identifiers. This part of the code is optional.
2. The execution section starts with the BEGIN keyword and ends with the END keyword. This is a
mandatory section, and the program logic, such as loops and conditional statements, is written
here. It supports DML commands and SQL built-in functions; DDL commands have to be run
through dynamic SQL (EXECUTE IMMEDIATE).
3. The exception section starts with the EXCEPTION keyword. This section is optional and
contains statements that are executed when a run-time error occurs. Any exceptions can be
handled in this section.
--PL/SQL code to print sum of two numbers taken from the user.
--In SQL*Plus, run SET SERVEROUTPUT ON first so that the dbms_output text is displayed.
DECLARE
a integer := &a ;
b integer := &b ;
c integer ;
BEGIN
c := a + b ;
dbms_output.put_line('Sum of '||a||' and '||b||' is = '||c);
END;
/
Output
Enter value for a: 2
Enter value for b: 3
Sum of 2 and 3 is = 5
Anomalies: problems or inconsistencies that arise during the operations performed
on a table. Anomalies can occur for several reasons; for example, when data is
stored multiple times unnecessarily in the database (i.e. redundant data is present) or when
all the data is stored in a single table. Normalization is used to overcome anomalies.
Example
Consider a manufacturing firm that keeps worker information in a table called employee, which
has four columns: w_id for the employee’s id, w_name for the employee’s name, w_address for
the employee’s address, and w_dept for the employee’s department. The table will look like this
at some point:
w_id w_name w_address w_dept
201 David Delhi F001
201 David Delhi F002
223 Mike Agra F890
266 Berry Chennai F900
266 Berry Chennai F004
The table above has not been normalized. We’ll look at the issues that arise when the table isn’t
normalized.
Type of Anomalies
1. Update
2. Insert
3. Delete
1. Update Anomaly
Employee David has two rows in the table given above since he works in two different
departments. If we want to change David’s address, we must do so in two rows, else the data
would become inconsistent.
If the proper address is updated in one of the departments but not in another, David will have two
different addresses in the database, which is incorrect and leads to inconsistent data.
2. Insert Anomaly: If a new worker joins the firm and is currently unassigned to any
department, we will be unable to put the data into the table because the w_dept field does not
allow nulls.
3. Delete Anomaly: If the corporation closes the department F890 at some point in the future,
deleting the rows with w_dept as F890 will also erase the information of employee Mike, who is
solely assigned to this department.
Decomposition: The process of breaking up or dividing a single relation into two or more sub
relations is called decomposition of a relation.
<DeptDetails>
Dept_ID Emp_ID Dept_Name
Dpt1 E001 Operations
Dpt2 E002 HR
Dpt3 E003 Finance
2. Lossy Decomposition- As the name suggests, when a relation is decomposed into two or
more relational schemas, the loss of information is unavoidable when the original relation
is retrieved.
<DeptDetails>
Dept_ID Dept_Name
Dpt1 Operations
Dpt2 HR
Dpt3 Finance
Now, you won’t be able to join the above tables, since Emp_ID isn’t part of
the DeptDetails relation.
Therefore, the above relation has lossy decomposition.
Functional Dependency
A functional dependency is a relationship that exists between two attributes (or sets of
attributes). It typically exists between the primary key and a non-key attribute within a table.
X → Y
The left side of the FD is known as the determinant, and the right side is known as the
dependent.
For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address. Here the
Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table, because if we
know the Emp_Id, we can tell the employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
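Whether a dependency X → Y holds in a given set of rows can be tested mechanically: no two rows may agree on X and differ on Y. The sketch below is one way to write that check in Python. The function name holds and the sample employee rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must match every later value for that key.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data for the employee table.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Agra"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Pune"},
]

print(holds(employees, ["Emp_Id"], ["Emp_Name"]))   # Emp_Id -> Emp_Name
print(holds(employees, ["Emp_Name"], ["Emp_Id"]))   # Emp_Name -> Emp_Id
```

A check like this can only show that a dependency is violated by the data at hand. Whether a dependency holds in general is a design decision about the meaning of the attributes.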
Normalization
Types of Normalization
Normal Form   Description
2NF   A relation will be in 2NF if it is in 1NF and all non-key attributes are fully
      functionally dependent on the primary key.
4NF   A relation will be in 4NF if it is in Boyce Codd's normal form and has no
      multi-valued dependency.
5NF   A relation is in 5NF if it is in 4NF and does not contain any join
      dependency; joining should be lossless.
Advantages of Normalization
1. Normalization helps to minimize data redundancy.
2. Greater overall database organization.
3. Data consistency within the database.
4. Much more flexible database design.
5. Enforces the concept of relational integrity.
Disadvantages of Normalization
1. You cannot start building the database before knowing what the user needs.
2. The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
3. It is very time-consuming and difficult to normalize relations of a higher degree.
4. Careless decomposition may lead to a bad database design, leading to serious problems.
14 John 7272826385, 9064738238 UP
14 John 7272826385 UP
14 John 9064738238 UP
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
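A quick way to see that this decomposition loses no information is to join the two tables back on TEACHER_ID and compare the result with the original rows. The Python sketch below does this with sets of tuples; the variable names are chosen for this illustration.

```python
# Original rows: (TEACHER_ID, SUBJECT, TEACHER_AGE)
teacher = [
    (25, "Chemistry", 30), (25, "Biology", 30), (47, "English", 35),
    (83, "Math", 38), (83, "Computer", 38),
]

# Decompose by projection; sets remove the duplicate (id, age) pairs.
teacher_detail = {(tid, age) for tid, _, age in teacher}
teacher_subject = {(tid, subj) for tid, subj, _ in teacher}

# Natural join of the two projections on TEACHER_ID.
rejoined = {
    (tid, subj, age)
    for tid, subj in teacher_subject
    for tid2, age in teacher_detail
    if tid == tid2
}

print(sorted(teacher_detail))
print(rejoined == set(teacher))
```

The age of each teacher is now stored once instead of once per subject, and the join returns exactly the original five rows.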
Example:
EMPLOYEE_DETAIL table:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY
EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, this is in BCNF because the left side of both functional dependencies is a key.
TRANSACTION :
1. A transaction is a logical unit of work of database processing that includes one or more
database access operations.
2. A transaction can be defined as an action or series of actions that is carried out by a single
user or application program to access the contents of the database.
The operations can include retrieval (read), insertion (write), deletion and modification. A
transaction must be either completed or aborted.
3. It can either be embedded within an application program or be specified interactively via
a high-level query language such as SQL. Its execution preserves the consistency of the
database. Each transaction should access shared data without interfering with other
transactions, and whenever a transaction successfully completes its execution, its effect
should be permanent.
This basic abstraction frees the database application programmer from the following
concerns :
1. Inconsistencies caused by conflicting updates from concurrent users.
2. Partially completed transactions in the event of systems failure.
3. User-directed undoing of transactions.
A transaction is a sequence of READ and WRITE actions that are grouped together to form a
database access. A transaction may consist of a simple SELECT operation to generate a list of
table contents, or it may consist of a series of related UPDATE command sequences.
A transaction that changes the contents of the database must alter the database from
one consistent state to another. A consistent database state is one in which all data integrity
constraints are satisfied.
To ensure database consistency, every transaction must begin with the database in a known
consistent state.
❖ A single user system in DBMS refers to a database management system that allows only one
user to access and manipulate the database at a time. In other words, a single user system
provides exclusive access to the database, and no other user can make any changes or
modifications to the data until the first user has finished.
❖ These systems are typically used in small organizations.
❖ A multi-user system in DBMS is a database management system that allows multiple users to
access and manipulate the database simultaneously. In this type of system, multiple users can
perform different operations on the database at the same time, such as adding, modifying or
deleting data. The system is designed to manage concurrent access and ensure data consistency,
so that multiple users can work on the same database without conflicting with each other.
• Collaboration: Multi-user DBMS allows multiple users to access the same data simultaneously,
making it easier for teams to work together and collaborate on projects.
• Scalability: Multi-user DBMS is designed to handle large amounts of data and multiple users,
making it suitable for organizations that need to scale up their operations.
• Data sharing: Multi-user DBMS makes it possible for multiple users to access and share data,
allowing for better data sharing and collaboration between team members.
The basic unit of data transfer from the disk to the computer's main memory is one disk block.
A lost update problem occurs due to the update of the same record by two different transactions at
the same time.
In simple words, when two transactions are updating the same record at the same time in a DBMS
then a lost update problem occurs. The first transaction updates a record and the second
transaction updates the same record again, which nullifies the update of the first transaction. As
the update by the first transaction is lost this concurrency problem is known as the lost update
problem.
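The interleaving behind a lost update can be replayed step by step in plain Python. The record, its starting value of 1000 and the two update amounts are hypothetical. Both transactions read the same starting value, so the write that happens last overwrites the other.

```python
db = {"balance": 1000}  # hypothetical shared record

# Step 1: both transactions read the record before either one writes.
t1_local = db["balance"]
t2_local = db["balance"]

# Step 2: T1 adds 500 and writes its result.
db["balance"] = t1_local + 500

# Step 3: T2 subtracts 200 from the value it read earlier and writes.
# This overwrites T1's update, which is now lost.
db["balance"] = t2_local - 200

serial_result = 1000 + 500 - 200  # what a serial execution would give
print(db["balance"], serial_result)
```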
A dirty read occurs when a transaction reads a value written by another transaction that has not
yet committed. For example, transaction A reads the value of data DT as 1000 and modifies it to
1500, which is stored in the temporary buffer. Transaction B then reads DT as 1500 and
commits, so the value of DT is permanently changed to 1500 in the database DB. If a server
error then occurs in transaction A and it rolls back to its initial value, i.e., 1000, transaction B
has used a value that was never committed, and the dirty read problem occurs.
TYPES OF FAILURES:
Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:
• A computer failure (system crash): A hardware, software, or network error occurs in the
computer system during transaction execution. Hardware crashes are usually media failures, for
example, main memory failure.
• A transaction or system error: Some operation in the transaction may cause it to fail, such as
integer overflow or division by zero. Transaction failure may also occur because of erroneous
parameter values or because of a logical programming error. In addition, the user may interrupt
the transaction during its execution.
• Local errors or exception conditions detected by the transaction: During transaction
execution, certain conditions may occur that necessitate cancellation of the transaction. For
example, data for the transaction may not be found. Notice that an exception condition, such as
insufficient account balance in a banking database, may cause a transaction, such as a fund
withdrawal, to be canceled. This exception should be programmed in the transaction itself, and
hence would not be considered a failure.
• Concurrency control enforcement: The concurrency control method may decide to abort the
transaction, to be restarted later, because it violates serializability or because several transactions
are in a state of deadlock.
• Disk failure: Some disk blocks may lose their data because of a read or write malfunction or
because of a disk read/write head crash. This may happen during a read or a write operation of
the transaction.
• Physical problems and catastrophes: This refers to an endless list of problems that includes
power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and
mounting of a wrong tape by the operator.
STATES OF TRANSACTION
Active state
The active state is the first state of every transaction. In this state, the transaction is being
executed.
For example: Insertion or deletion or updating a record is done here. But all the records are still
not saved to the database.
Partially committed
In the partially committed state, a transaction executes its final operation, but the data is still not
saved to the database.
In the total mark calculation example, a final display of the total marks step is executed in this
state.
Committed
A transaction is said to be in a committed state if it executes all its operations successfully. In
this state, all the effects are now permanently saved on the database system.
Failed state
If any of the checks made by the database recovery system fails, then the transaction is said to be
in the failed state.
In the example of total mark calculation, if the database is not able to fire a query to fetch the
marks, then the transaction will fail to execute.
Aborted
If any of the checks fail and the transaction has reached a failed state, the database recovery
system makes sure that the database is in its previous consistent state. If it is not, the
recovery system aborts or rolls back the transaction to bring the database into a consistent state.
If the transaction fails in the middle of its execution, all the operations it has already
executed are rolled back so that the database returns to its previous consistent state.
After aborting the transaction, the database recovery module will select one of the two
operations:
Re-start the transaction
Kill the transaction
Consider the following transaction T consisting of T1 and T2: transfer of 100 from
account X to account Y.
If the transaction fails after completion of T1 but before completion of T2 (say,
after write(X) but before write(Y)), then the amount has been deducted from X but not added
to Y. This results in an inconsistent database state. Therefore, the transaction must be executed
in its entirety in order to ensure the correctness of the database state.
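The all-or-nothing behaviour can be tried out with Python's sqlite3 module and an in-memory database. This is a sketch under stated assumptions: the table layout, the transfer function and its fail_midway flag are hypothetical, and the starting balances X = 500 and Y = 200 are chosen to match the totals used in the consistency discussion. A simulated failure after write(X) triggers a rollback, so the deduction from X is undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 500), ("Y", 200)])
conn.commit()

def transfer(conn, amount, fail_midway=False):
    """Move amount from X to Y; undo everything if a failure occurs midway."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'X'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated failure after write(X)")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'Y'", (amount,))
        conn.commit()      # both writes become permanent together
    except RuntimeError:
        conn.rollback()    # the partial write to X is undone

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

transfer(conn, 100, fail_midway=True)
after_failure = balances(conn)   # unchanged: the failed transfer left no trace

transfer(conn, 100)
after_success = balances(conn)   # both updates applied

print(after_failure, after_success)
```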
Consistency:
This means that integrity constraints must be maintained so that the database is consistent
before and after the transaction. It refers to the correctness of a database. Referring to the
example above,
The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails.
As a result, T is incomplete.
Isolation:
This property ensures that multiple transactions can occur concurrently without leading to the
inconsistency of the database state. Transactions occur independently without interference.
Changes occurring in a particular transaction will not be visible to any other transaction until
that particular change in that transaction is written to memory or has been committed. This
property ensures that the execution of transactions concurrently will result in a state that is
equivalent to a state achieved if these were executed serially in some order.
Let X = 500, Y = 500.
Consider two transactions T and T’’. The values below imply that T multiplies X by 100 and then
subtracts 50 from Y, while T’’ reads X and Y and computes their sum.
Suppose T has been executed till Read(Y) and then T’’ starts. As a result, interleaving of
operations takes place, due to which T’’ reads the correct value of X but the incorrect value
of Y, and the sum computed by
T’’: (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take
place in isolation and changes should be visible only after they have been made to the main
memory.
Durability:
This property ensures that once the transaction has completed execution, the updates and
modifications to the database are stored in and written to disk and they persist even if a system
failure occurs. These updates now become permanent and are stored in non-volatile memory.
The effects of the transaction, thus, are never lost.
LOCKS
Binary Locks: A binary lock on a data item can be in one of two states: locked or unlocked.
Shared/exclusive: This type of locking mechanism separates the locks in DBMS based on their
uses. If a lock is acquired on a data item only to read it, it is called a shared (S) lock. If a
lock is acquired on a data item to perform a write operation, it is called an exclusive (X)
lock.
For example, suppose a transaction needs to update the account balance of a person. You can
allow this transaction by placing an X lock on the balance. Then, when a second transaction
wants to read or write the balance, the exclusive lock prevents the operation.
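The shared/exclusive rule can be sketched as a small lock table in Python. The class and method names are hypothetical, "S" stands for a shared (read) lock and "X" for an exclusive (write) lock. Lock upgrades and waiting queues are left out to keep the sketch short. A request is granted only when the item is free or when both the held lock and the requested lock are shared.

```python
class LockManager:
    """Toy lock table: item -> (mode, set of transactions holding the lock)."""

    def __init__(self):
        self.locks = {}

    def acquire(self, txn, item, mode):
        """Request an 'S' or 'X' lock; return True if it is granted."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Only shared locks are compatible with each other.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        return False

    def release(self, txn, item):
        held = self.locks.get(item)
        if held is None:
            return
        held[1].discard(txn)
        if not held[1]:
            del self.locks[item]

lm = LockManager()
print(lm.acquire("T1", "balance", "X"))  # T1 gets the exclusive lock
print(lm.acquire("T2", "balance", "S"))  # T2 cannot read while T1 holds X
lm.release("T1", "balance")
print(lm.acquire("T2", "balance", "S"))  # now the shared lock is granted
```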
TIMESTAMP-BASED PROTOCOLS
1. A timestamp-based protocol in DBMS is an algorithm which uses the system time or a
logical counter as a timestamp to serialize the execution of concurrent transactions. The
timestamp-based protocol ensures that all conflicting read and write operations are
executed in timestamp order.
2. The older transaction is always given priority in this method. It uses the system time to
determine the timestamp of the transaction. This is a commonly used concurrency
protocol.
3. Lock-based protocols manage the order between conflicting transactions
at the time they execute. Timestamp-based protocols manage conflicts as soon as an operation
is created.
Example:
Suppose there are three transactions T1, T2, and T3.
T1 has entered the system at time 0010
T2 has entered the system at 0020
T3 has entered the system at 0030
Priority will be given to transaction T1, then transaction T2 and lastly Transaction T3.
RULE 1: A transaction Ti issues a read(X) operation.
1. If TS(Ti) < W-timestamp(X)
The operation is rejected; Ti is aborted and rolled back.
2. If TS(Ti) >= W-timestamp(X)
The operation is executed and R-timestamp(X) is set to max{R-timestamp(X), TS(Ti)}.
RULE 2: A transaction Ti issues a write(X) operation.
1. If TS(Ti) < R-timestamp(X) or TS(Ti) < W-timestamp(X)
The operation is rejected and Ti is rolled back.
2. Otherwise, the operation is executed and W-timestamp(X) is set to TS(Ti).
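The sketch below implements the basic timestamp-ordering checks in their standard textbook form: a read is rejected if a younger transaction has already written the item, and a write is rejected if a younger transaction has already read or written it. The class and function names are hypothetical, and the timestamps 10, 20 and 30 correspond to T1, T2 and T3 from the example.

```python
class Item:
    """A data item with its read and write timestamps (0 = never accessed)."""

    def __init__(self):
        self.r_ts = 0
        self.w_ts = 0

def read(item, ts):
    """Return True if the read is allowed; False means reject and roll back."""
    if ts < item.w_ts:
        return False  # a younger transaction has already written the item
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    """Return True if the write is allowed; False means reject and roll back."""
    if ts < item.r_ts or ts < item.w_ts:
        return False  # a younger transaction has already read or written it
    item.w_ts = ts
    return True

x = Item()
print(write(x, 20))  # T2 writes X
print(read(x, 10))   # T1 is older than the last writer, so its read is rejected
print(read(x, 30))   # T3 reads X, R-timestamp(X) becomes 30
print(write(x, 20))  # T2 is older than the last reader, so its write is rejected
```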
DEADLOCK
A deadlock occurs when two or more processes each need a resource held by another of those
processes in order to complete their execution.
In the above diagram, the process 1 has resource 1 and needs to acquire resource 2. Similarly
process 2 has resource 2 and needs to acquire resource 1. Process 1 and process 2 are in deadlock
as each of them needs the other’s resource to complete their execution but neither of them is
willing to relinquish their resources.
A deadlock will only occur if the four Coffman conditions hold true. These conditions are not
necessarily mutually exclusive. They are given as follows −
Mutual Exclusion
Mutual exclusion implies there should be a resource that can only be held by one process at a
time. This means that the resources should be non-sharable.
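One common way to detect a deadlock, not described in these notes but widely used, is to build a wait-for graph and look for a cycle in it. The Python sketch below assumes the graph is given as a dictionary that maps each process to the set of processes it is waiting on. The function name has_deadlock is hypothetical.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(p):
        if p in done:
            return False
        if p in visiting:
            return True  # reached a process already on the current path: a cycle
        visiting.add(p)
        found = any(visit(q) for q in wait_for.get(p, ()))
        visiting.discard(p)
        done.add(p)
        return found

    return any(visit(p) for p in list(wait_for))

# Process 1 waits for a resource held by process 2, and vice versa.
print(has_deadlock({"P1": {"P2"}, "P2": {"P1"}}))
# Process 1 waits for process 2, but process 2 is not waiting on anyone.
print(has_deadlock({"P1": {"P2"}, "P2": set()}))
```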
STARVATION
Starvation occurs if a process is indefinitely postponed. This may happen if the process requires
a resource for execution that it is never allotted, or if the process is never given the processor
for some reason.