REVISED DBMS ALL MODULES

The document provides an overview of database systems, including their applications, purposes, and architecture. It covers key concepts such as data models, database languages, and the roles of database users and administrators, as well as the importance of database design and ER diagrams. Additionally, it introduces the relational model, integrity constraints, and logical database design, emphasizing the significance of views and querying relational data.


Module-I

Introduction to Database Systems and database design.


>Introduction to Database Systems:
Database system applications, purpose of database systems, view of data: data abstraction,
instances and schemas, data models; database languages: Data Definition Language, Data
Manipulation Language; database architecture, database users and administrators.
-------------------------------------------------------------------------------------------------------------------------------

Introduction to Database Systems:


In the realm of modern computing, Database Systems play an integral role, acting as the foundational
bedrock for managing and organizing vast amounts of data efficiently. A Database System encompasses
a structured collection of data that is organized, stored, and accessed electronically, often through a set
of software applications.
APPLICATIONS OF DATABASE SYSTEM:
Banking: Manages customer accounts, transactions, and financial records.
Airlines: Tracks flight schedules, reservations, and ticketing.
E-commerce: Manages product catalogs, customer orders, and inventory.
Universities: Handles student records, course registrations, and grades.

Purpose of Database System:


1. To ensure data integrity and consistency across the system.
2. To provide efficient data retrieval and manipulation capabilities.
3. To reduce data redundancy and control unauthorized access to data.
4. To allow multiple users to access and work with the data concurrently.

View of Data (Data Abstraction):


 Physical level: Describes how data is stored on physical storage devices.
 Logical level: Describes what data is stored and the relationships between data.
 View level: Describes how users interact with the data and the interfaces provided.
 Data abstraction simplifies interaction with the database by hiding complex details

Instances and Schemas:


Instance: The actual data stored in the database at a specific point in time.
Schema: The structure of the database, defining how data is organized.
Database schema: Includes tables, fields, relationships, and constraints.
Instances can change frequently, while schemas are more stable.

Data Models:
 Hierarchical model: Organizes data in a tree-like structure with parent-child relationships.
 Network model: Uses graph structures to represent data and relationships.
 Relational model: Organizes data into tables with rows and columns, using keys to define
relationships.
Data models provide a framework for structuring and managing data in databases.


Database Languages in Database Systems:
Database languages can be used to read, store and update the data in the database.
DBMS Languages are classified into
 DDL: Used to define database structures (e.g., CREATE TABLE, ALTER TABLE).
 DML: Used to manipulate data (e.g., SELECT, INSERT, UPDATE, DELETE).

1.Data Definition Language (DDL):


Data Definition Language (DDL) is a subset of SQL (Structured Query Language) used to define and
manage the structure and schema of a database. It allows database administrators and users to perform
operations that define and modify the structure of the database.
DDL Commands:
*CREATE: Defines a new database object (table, index, view, etc.).
Syntax: CREATE TABLE table_name (column1 datatype, column2 datatype, ...);
Example: CREATE TABLE students (id INT, name VARCHAR(50), age INT);
*ALTER: Modifies an existing database object.
Syntax: ALTER TABLE table_name ADD column_name datatype;
Example: ALTER TABLE students ADD COLUMN grade VARCHAR(2);
*DROP: Deletes an existing database object.
Syntax: DROP TABLE table_name;
Example: DROP TABLE students;
*TRUNCATE: Removes all records from a table, but not the table itself.
Syntax: TRUNCATE TABLE table_name;
Example: TRUNCATE TABLE students;
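The DDL commands above can be tried end to end. The sketch below uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption; any SQL database would do, and note that SQLite has no TRUNCATE, so DELETE FROM serves the same purpose there):

```python
# A minimal sketch of CREATE, ALTER, and DROP against an in-memory SQLite database.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a new table
cur.execute("CREATE TABLE students (id INT, name VARCHAR(50), age INT)")

# ALTER: add a column to the existing table
cur.execute("ALTER TABLE students ADD COLUMN grade VARCHAR(2)")

# Inspect the schema: the table now has four columns
cols = [row[1] for row in cur.execute("PRAGMA table_info(students)")]
print(cols)  # ['id', 'name', 'age', 'grade']

# DROP: delete the table entirely
cur.execute("DROP TABLE students")
```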
2.Data Manipulation Language (DML):
Data Manipulation Language (DML) is a subset of SQL responsible for managing and manipulating data
within the database. It enables users to retrieve, insert, update, and delete data from the database
tables.
DML Commands:
*SELECT: Retrieves data from one or more tables.
Syntax: SELECT column1, column2, ... FROM table_name;
Example: SELECT name, age FROM students;
*INSERT: Adds new records to a table.
Syntax: INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
Example: INSERT INTO students (id, name, age) VALUES (1, 'John Doe', 20);
*UPDATE: Modifies existing records in a table.
Syntax: UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;
Example: UPDATE students SET age = 21 WHERE id = 1;
*DELETE: Removes existing records from a table.
Syntax: DELETE FROM table_name WHERE condition;
Example: DELETE FROM students WHERE id = 1;
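A minimal runnable walk-through of these DML commands, again using Python's sqlite3 module as an assumed stand-in DBMS:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INT, name VARCHAR(50), age INT)")

# INSERT: add a new record
cur.execute("INSERT INTO students (id, name, age) VALUES (1, 'John Doe', 20)")

# UPDATE: modify the existing record
cur.execute("UPDATE students SET age = 21 WHERE id = 1")

# SELECT: read the record back
row = cur.execute("SELECT name, age FROM students WHERE id = 1").fetchone()
print(row)  # ('John Doe', 21)

# DELETE: remove the record
cur.execute("DELETE FROM students WHERE id = 1")
count = cur.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(count)  # 0
```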

Database Architecture:
Database Architecture refers to the arrangement of components that constitute a Database
Management System (DBMS). It outlines the structure, organization, and interaction of various modules
within the DBMS, defining how data is stored, accessed, and managed.
Components of Database Architecture:
 Storage Structures: This component deals with how data is physically stored on storage devices
like hard drives or SSDs. It includes mechanisms for organizing data into files, pages, and blocks
for efficient storage and retrieval.
 Query Processor and Optimizer: The query processor interprets user queries written in a
database language (like SQL) and converts them into an execution plan. The optimizer chooses
the most efficient way to execute the query by considering various factors such as indexes, join
methods, and access paths.
 Transaction Manager: It ensures the ACID properties (Atomicity, Consistency, Isolation,
Durability) of transactions. It oversees the execution of concurrent transactions, managing their
execution and ensuring data integrity.
 Database Buffer and Cache Manager: These components manage the memory buffers that
store frequently accessed data and control caching strategies to optimize data retrieval.
 Database Communication Interfaces: These interfaces facilitate communication between
applications or users and the database system. They allow external programs to interact with
the database through APIs or network protocols.
 Security and Authorization: Database architecture includes mechanisms to control access to the
database, ensuring that only authorized users have the appropriate permissions to perform
specific operations on the data.
Types of DBMS Architecture:

Database architecture can be seen as single-tier or multi-tier. Logically,
however, database architecture is of two types:
2-tier architecture and 3-tier architecture.

1-Tier Architecture:
In this architecture, the database is directly available to the user: the
user works directly on the DBMS.
Any changes made here are applied directly to the database itself. It does
not provide a handy tool for end users.
The 1-tier architecture is used for development of local applications,
where programmers can communicate directly with the database for a
quick response.
2-Tier Architecture:
The 2-tier architecture is essentially a basic client-server setup. In the
two-tier architecture, applications on the client end communicate directly
with the database at the server side. APIs such as ODBC and JDBC are
used for this interaction.
The user interfaces and application programs run on the client side, while
the server side is responsible for functionality such as query processing
and transaction management.
To communicate with the DBMS, the client-side application establishes a
connection with the server side.

3-Tier Architecture:
The 3-tier architecture contains another layer between the client and the
server. In this architecture, the client cannot communicate directly with
the database server.
The application on the client end interacts with an application server,
which in turn communicates with the database system.
The end user is unaware of the database beyond the application server,
and the database is unaware of any user beyond the application server.
The 3-tier architecture is used for large web applications.

Database Users and Administrators:


Database Users and Administrators play distinct roles in utilizing and managing a Database System.
Database Users:
End users: People who interact with the database through applications.
Database administrators (DBAs): Manage the database system, ensuring performance,
security, and availability.
Application programmers: Develop applications that interact with the database.
System analysts: Design the database structure and ensure it meets business requirements.
Database Administrators (DBAs):
Database Administrators are responsible for managing, configuring, and maintaining the database
system. Their roles and responsibilities include:
 Database Design: Designing the database schema, defining structures, relationships, and
constraints.
 Security Management: Setting up user access permissions, roles, and ensuring data security.
 Performance Tuning: Monitoring and optimizing database performance, indexing, and query
optimization.
 Backup and Recovery: Creating and maintaining backups of the database and planning for
disaster recovery.
 Routine Maintenance: Performing routine tasks like software updates, patches, and ensuring
data consistency and integrity.

Introduction to Database Design:


 Database design involves creating a detailed data model of the database.
 Ensures data is organized efficiently and logically.
 Focuses on defining the data elements, structures, and relationships.
 Helps in reducing redundancy and improving data integrity.

ER Diagrams:
 ER (Entity-Relationship) diagrams: Visual representation of the database structure.
 ER diagrams help in designing the database at a conceptual level.
 Show entities, attributes, and relationships.
 Aid in understanding and communicating database design.
Entities, Attributes, and Entity Sets:
 Entity: An object with a distinct existence (e.g., a person, a book).
 Attributes: Characteristics of an entity (e.g., name, age).
 Entity set: A collection of similar entities (e.g., all students).
Entities and attributes are fundamental building blocks of database design.

Relationships and Relationship Sets


Relationships in the Entity-Relationship (ER) model establish connections between entities. They illustrate how
entities are associated or linked within a database.
 One-to-One Relationship: A single entity in one entity set is related to exactly one entity in another entity
set.
 One-to-Many Relationship: A single entity in one entity set is related to multiple entities in another entity
set.
 Many-to-One Relationship: Multiple entities in one entity set are related to exactly one entity in another
entity set.
 Many-to-Many Relationship: Multiple entities in one entity set are related to multiple entities in another
entity set.
Relationship Sets group similar relationships together. For instance:
If "Employee" is related to "Department" through "Works In" relationship, the set of all "Works In" relationships
forms the "Works In" relationship set.

Additional Features of the ER Model:


Weak Entities:
 Represent entities that cannot be uniquely identified by their attributes alone and depend on a related
entity for identification. They typically have a partial key, requiring a strong entity for identification.
Subtypes and Supertypes:
 Allow entities to be classified into subtypes based on specific attributes or characteristics. This feature
supports hierarchy, inheritance, and specialization within the database.
Aggregation:
 Enables the representation of a higher-level entity that is composed of multiple related entities or
relationships. It simplifies complex relationships by treating them as a single unit.

Conceptual Design with ER Model:


The ER model serves as a vital tool in the conceptual design phase of database development.
Steps in Conceptual Design:
 Requirements Gathering: Gather and analyze business requirements, identifying entities, attributes,
relationships, and constraints.
 Entity Identification: Identify relevant entities, attributes, and their relationships based on the analyzed
requirements.
 ER Modeling: Create an ER diagram depicting entities, attributes, relationships, and any additional
features based on the identified business rules.
 Normalization: Apply normalization techniques to ensure data integrity, reducing redundancy and
anomalies in the database design.
 Refinement and Validation: Review and refine the ER model, ensuring it accurately represents the
business domain. Validate against business rules and stakeholders' feedback.
MODULE-II
RELATIONAL MODEL, RELATIONAL ALGEBRA AND
TUPLE CALCULUS.
>Relational Model: Creating and modifying relations, Integrity constraints over relations, enforcing
integrity constraints, querying relational data, Logical database design, Introduction to views,
Destroying/altering tables and views.
>Relational Algebra: Preliminaries, Relational Algebra operators.
--------------------------------------------------------------------------------------------------------------------------------------

RELATIONAL MODEL:
The relational model represents how data is stored in Relational Databases. A
relational database consists of a collection of tables, each of which is assigned a
unique name.

Note:
Relation (Table): A relation is usually represented as a table, organized into
rows and columns.
Attribute (Column names): Attributes are the properties that define an entity
Tuple: Each row in the relation is known as a tuple.
Relation Schema: A relation schema defines the structure of the relation and
represents the name of the relation with its attributes.
CREATING AND MODIFYING RELATIONS:
A database query is a request to get data or information from a database. This
is done using a special language, like SQL, which stands for Structured Query
Language. SQL is the most common language used to work with databases, and
it organizes the results into rows and columns. SQL is used to retrieve and
manage data in a relational database.
With SQL, you can:
 Access data in relational databases.
 Describe the data.
 Change or modify the data.
 Create or delete databases and tables.
 Make views, stored procedures, and functions in the database.
 Set permissions for tables, procedures, and views.

 CREATE TABLE:
Syntax: CREATE TABLE table_name (column1 datatype, column2 datatype, ...);
Example: CREATE TABLE students (id INT, name VARCHAR(50), age INT);

 INSERT, UPDATE & DELETE IN TABLE:
Syntax: INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
UPDATE table_name SET column1 = value1 WHERE condition;
DELETE FROM table_name WHERE condition;

Integrity constraints over relations:

 DOMAIN CONSTRAINT:
In DBMS, Domain constraints can be defined as a set of values that are valid for
an attribute. The domain’s data type includes string, character, integer, time
etc. The value must be in the corresponding domain of the attribute.
 ENTITY INTEGRITY CONSTRAINTS:
 The entity integrity constraint states that primary key value can't be null.
 This is because the primary key value is used to identify individual rows
in relation and if the primary key has a null value, then we can't identify
those rows.
 Columns other than the primary key field may contain null values.

 REFERENTIAL INTEGRITY CONSTRAINT:


This constraint is defined between two tables. In the Referential integrity
constraints, if a foreign key in Table 1 refers to the Primary Key of Table 2,
then every value of the Foreign Key in Table 1 must be null or be available in
Table 2.
 KEY CONSTRAINTS:
 A key is a set of attributes used to uniquely identify an entity within its
entity set.
 An entity set can have multiple keys, out of which one is chosen as the
primary key. A primary key must contain unique values and cannot
contain null values in the relational table.

Enforcing Integrity Constraints:


These constraints are enforced as follows:
 DOMAIN CONSTRAINT:
 Data Type Check: The DBMS enforces domain constraints by checking
that the data entered matches the specified data type (e.g., integer,
string, date).
 Range Check: Values are checked to ensure they fall within the specified
range or set (e.g., age between 0 and 120).
 Pattern Check: For string data, patterns (e.g., regular expressions) can
enforce formats like email addresses or phone numbers.

 ENTITY INTEGRITY CONSTRAINTS:
 Primary Key Constraint: The DBMS enforces uniqueness by disallowing
duplicate values in the primary key column(s).
 NOT NULL Constraint: Ensures that no null values are allowed in the
primary key column(s).

 REFERENTIAL INTEGRITY CONSTRAINT:


 Foreign Key Constraint: The DBMS checks that the foreign key value
in one table corresponds to a primary key value in the referenced
table.
 Cascading Actions: The DBMS can enforce referential integrity
through cascading actions such as ON DELETE CASCADE or ON
UPDATE CASCADE, which automatically propagate changes to foreign
keys when the referenced primary key changes.
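The checks described above can be observed directly. The sketch below assumes SQLite (via Python's sqlite3 module), where foreign-key enforcement must first be switched on with a PRAGMA:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
conn.execute("CREATE TABLE students (id INT PRIMARY KEY)")
conn.execute("""CREATE TABLE enrollments (
    id INT PRIMARY KEY,
    student_id INT REFERENCES students(id) ON DELETE CASCADE)""")

conn.execute("INSERT INTO students VALUES (1)")
conn.execute("INSERT INTO enrollments VALUES (10, 1)")

# An enrollment that references a non-existent student is rejected
try:
    conn.execute("INSERT INTO enrollments VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting the student cascades to its enrollment
conn.execute("DELETE FROM students WHERE id = 1")
left = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]
print(rejected, left)  # True 0
```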

Querying Relational Data:


General syntax of SELECT:
SELECT column1, column2, ... FROM table_name WHERE condition;

Example:
SELECT name, age FROM students WHERE age > 18;
Logical Database Design:
Logical Database Design refers to the process of creating a blueprint or
structure for how data will be organized in a database. It focuses on defining
the logical relationships between data elements without considering the
physical aspects of how the data will be stored.
Key Steps in Logical Database Design:
1. Identify Entities: Determine the objects or entities that need to be
represented in the database (e.g., Customers, Orders, Products).
2. Define Relationships: Establish how these entities are related to each
other (e.g., one-to-many, many-to-many).
3. Define Attributes: List the key pieces of information (attributes) for each
entity (e.g., Customer Name, Product Price).
4. Normalize the Data: Ensure that data is organized efficiently to reduce
redundancy and improve consistency by applying normalization rules.
5. Create Entity-Relationship (ER) Diagrams: Visualize the design through ER
diagrams, which show entities, attributes, and their relationships.
A simple Entity-Relationship Diagram (ERD) for such a design would show
entities like Students and Courses, along with their attributes and the
relationship between them.

Introduction to views:
 Views in SQL are considered virtual tables. A view also contains rows
and columns.
 To create the view, we can select the fields from one or more tables
present in the database.
 A view can either have specific rows based on certain condition or all the
rows of a table.
 Single table view:
Syntax:
CREATE VIEW view_name AS SELECT column1, column2 FROM
table_name WHERE condition;
Ex:
CREATE VIEW EmployeeView AS SELECT EmployeeName, Department
FROM Employees WHERE Department = 'HR';
 Multiple table view:
Syntax:
CREATE VIEW view_name AS SELECT t1.column1, t2.column2 FROM
table1 t1 JOIN table2 t2 ON t1.common_column = t2.common_column
WHERE condition;

Ex:
CREATE VIEW OrderDetailsView AS SELECT Customers.CustomerName,
Orders.OrderID, Orders.OrderDate FROM Customers JOIN Orders ON
Customers.CustomerID = Orders.CustomerID;

 Deleting view:
Syntax:
DROP VIEW view_name;
Ex:
DROP VIEW EmployeeView;
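A runnable sketch of creating, querying, and dropping a single-table view, assuming SQLite via Python's sqlite3 module and hypothetical employee data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EmployeeName TEXT, Department TEXT)")
conn.executemany("INSERT INTO Employees VALUES (?, ?)",
                 [("Asha", "HR"), ("Ravi", "IT")])

# Create a single-table view restricted to the HR department
conn.execute("""CREATE VIEW EmployeeView AS
                SELECT EmployeeName, Department
                FROM Employees WHERE Department = 'HR'""")

# Query the view exactly like a table
rows = conn.execute("SELECT EmployeeName FROM EmployeeView").fetchall()
print(rows)  # [('Asha',)]

# Drop the view; the underlying Employees table is untouched
conn.execute("DROP VIEW EmployeeView")
```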

Destroying/Altering tables and views:
 DROP TABLE table_name; destroys a table together with all of its data.
 ALTER TABLE table_name ADD column_name datatype; alters the structure of an existing table.
 DROP VIEW view_name; destroys a view without affecting the underlying base tables.

Relational Algebra and Tuple Calculus:
Note:

1. Unary operation involves one operand


a. selection (σ):
 The select operation selects tuples that satisfy a given predicate.
 It is denoted by sigma (σ).
 Notation: σ p(r)


b. Projection (∏):
 This operation shows the list of those attributes that we wish to appear in the result. The rest of the
attributes are eliminated from the table.
 It is denoted by ∏.
 Notation: ∏A1, A2, ..., An(r)

2. Binary operation involves two operands.


a. Union operation (∪):
 Suppose there are two relations R and S. The union operation contains all
the tuples that are either in R or in S or in both R and S.
 It eliminates duplicate tuples. It is denoted by ∪.
 Notation: R ∪ S

b. Difference operation (−):
 Suppose there are two relations R and S. The set difference operation contains all tuples that are in R
but not in S.
 It is denoted by the minus sign (−).
 Notation: R − S

c. Cartesian product operation(X):


 The Cartesian product is used to combine each row in one table with each row in the other table. It is
also known as a cross product.
 It is denoted by X.
 Notation: E X D

d. Intersection operation (∩):


 Suppose there are two relations R and S. The set intersection operation contains all tuples that are in
both R & S.
 It is denoted by intersection ∩.
 Notation: R ∩ S
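The set-based operators above can be sketched with plain Python sets of tuples. This is an illustrative analogy only, not how a DBMS implements them; the relations R and S and their schema are hypothetical:

```python
# Relations modeled as sets of tuples with schema (id, name, dept)
R = {(1, "A", "CS"), (2, "B", "EE"), (3, "C", "CS")}
S = {(2, "B", "EE"), (4, "D", "ME")}

# Selection σ dept='CS' (R): keep tuples satisfying the predicate
selection = {t for t in R if t[2] == "CS"}

# Projection ∏ name (R): keep only the listed attribute (duplicates collapse)
projection = {(t[1],) for t in R}

# Union, difference, and intersection map onto plain set operations
union = R | S
difference = R - S          # tuples in R but not in S
intersection = R & S        # tuples in both R and S

# Cartesian product R × S: every pairing of a tuple from R with one from S
product = {r + s for r in R for s in S}

print(len(union), len(difference), len(intersection), len(product))
# 4 2 1 6
```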

e. Join operation (⋈):

A Join operation combines related tuples from different relations, if and
only if a given join condition is satisfied. It is denoted by ⋈.
1. Natural Join:
 A natural join is the set of tuples of all combinations in R and S that are
equal on their common attribute names.
 It is denoted by ⋈.


2.Outer Join:
 The outer join operation is an extension of the join operation. It is used
to deal with missing information.

a. Left Outer Join:

 The left outer join contains all combinations of tuples in R and S that are
equal on their common attribute names, together with the tuples in R that
have no matching tuples in S.
 It is denoted by ⟕.


b. Right Outer Join:
 The right outer join contains all combinations of tuples in R and S that are
equal on their common attribute names, together with the tuples in S that
have no matching tuples in R.
 It is denoted by ⟖.



c. Full Outer Join:

 The full outer join is like a left or right outer join except that it contains
all rows from both tables: the matching combinations, the tuples in R with
no match in S, and the tuples in S with no match in R, on their common
attribute names.
 It is denoted by ⟗.

3. Equi Join:
 It is also known as an inner join. It is the most common join. It is based
on matched data as per an equality condition; the equi join uses the
comparison operator (=).

MODULE-III
SQL AND PL/SQL
>SQL: Form of basic SQL query, Nested queries, Aggregate operators, Null values, Complex integrity
constrains in SQL, Triggers and active databases.

>PL/SQL: Generic PL/SQL block, PL/SQL data types, Control structure, Procedures and functions, Cursors,
Database triggers.

SQL:
Basic SQL queries are fundamental in interacting with a database using SQL
(Structured Query Language). Here’s an overview of the main forms of SQL
queries, explained in detail, with visual elements to enhance
understanding.
1. SELECT Queries:
A SELECT query is used to retrieve data from a database.
Syntax:
SELECT column1, column2, ... FROM table_name;
Example:
SELECT name, age FROM students;
This retrieves the name and age columns from the students table.
Visualization:
Think of a SELECT query as a way to ask the database to "show" specific
columns in a table. Imagine the table like a spreadsheet, and you're
selecting certain columns.

2. INSERT Queries
An INSERT query is used to add new data into a table.
Syntax:
INSERT INTO table_name (column1, column2, ...) VALUES (value1,
value2, ...);
Example:
INSERT INTO students (name, age) VALUES ('Alice', 23);

3. UPDATE Queries
An UPDATE query modifies existing records in a table.
Syntax:
UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE
condition;
Example:
UPDATE students SET age = 24 WHERE name = 'Alice';
Visualization:
Think of UPDATE as editing an existing cell in your table.

4. DELETE Queries
A DELETE query removes data from a table.
Syntax:
DELETE FROM table_name WHERE condition;
Example:
DELETE FROM students WHERE name = 'Bob';
5. WHERE Clause
The WHERE clause is used to filter records. It’s used with SELECT, UPDATE,
or DELETE queries to specify conditions.
Syntax:
SELECT column1, column2 FROM table_name WHERE condition;
Example:
SELECT name FROM students WHERE age > 20;
This retrieves names of students whose age is greater than 20.

6. JOIN Queries
JOIN is used to combine rows from two or more tables based on a related
column.
Syntax:
SELECT columns FROM table1
JOIN table2 ON table1.column = table2.column;
Example:
SELECT students.name, courses.course_name
FROM students
JOIN enrollments ON students.id = enrollments.student_id
JOIN courses ON enrollments.course_id = courses.id;
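The three-table join above can be run as-is. The sketch below assumes SQLite via Python's sqlite3 module, with small hypothetical tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INT, name TEXT);
CREATE TABLE courses (id INT, course_name TEXT);
CREATE TABLE enrollments (student_id INT, course_id INT);
INSERT INTO students VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO courses VALUES (10, 'DBMS');
INSERT INTO enrollments VALUES (1, 10);
""")

# Only enrolled (student, course) pairs survive the join
rows = conn.execute("""
    SELECT students.name, courses.course_name
    FROM students
    JOIN enrollments ON students.id = enrollments.student_id
    JOIN courses ON enrollments.course_id = courses.id
""").fetchall()
print(rows)  # [('Alice', 'DBMS')]
```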

Nested Queries:
Nested queries, also known as subqueries, are SQL queries embedded inside
another query. They allow you to retrieve data based on the results of another
query. Nested queries can be used with SELECT, INSERT, UPDATE, and DELETE
statements.
Types of Nested Queries:
1. Single row subqueries: Return only one row of data.
2. Multiple row subqueries: Return multiple rows of data.
3. Multiple column subqueries: Return multiple columns of data.

1. Single Row Subquery:


A subquery that returns a single value, typically used with comparison
operators like =, <, >, etc.
Example:
Find the name of the student whose age is the maximum in the students table.
SELECT name
FROM students
WHERE age = (SELECT MAX(age) FROM students);
Explanation:
The inner query (SELECT MAX(age) FROM students) retrieves the maximum
age.
The outer query selects the name where age matches the maximum age.

2. Multiple-Row Subquery:
A subquery that returns multiple rows, typically used with operators like IN,
ANY, or ALL.
Example:
Select the names of students who are enrolled in courses with a fee greater
than 500.
SELECT name
FROM students
WHERE id IN (SELECT student_id FROM enrollments WHERE course_fee > 500);
Explanation:
The inner query retrieves student_id from the enrollments table where the
course_fee is greater than 500.
The outer query retrieves the names of students whose id is in the result of
the inner query.

3. Multiple Column Subquery:


A subquery that returns more than one column, typically used in complex
queries.
Example:
Find students whose age and city match specific criteria from another table
(graduates).
SELECT name
FROM students
WHERE (age, city) IN (SELECT age, city FROM graduates WHERE degree =
'MBA');
Explanation:
The inner query selects the age and city from the graduates table where the
degree is MBA.
The outer query retrieves students where both age and city match the result
of the inner query.
4. Correlated Subquery:
A correlated subquery is a subquery that depends on the outer query for its
values. It is executed once for each row of the outer query.
Example:
Find students who have enrolled in more than 2 courses.
SELECT name
FROM students s
WHERE (SELECT COUNT(*) FROM enrollments e WHERE e.student_id = s.id) >
2;
Explanation:
The inner query (SELECT COUNT(*) FROM enrollments e WHERE e.student_id
= s.id) counts the courses each student has enrolled in.
The outer query selects the names of students who have enrolled in more
than 2 courses.
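The correlated subquery above can be demonstrated end to end, assuming SQLite via Python's sqlite3 module and hypothetical sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INT, name TEXT);
CREATE TABLE enrollments (student_id INT, course TEXT);
INSERT INTO students VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO enrollments VALUES (1, 'DBMS'), (1, 'OS'), (1, 'CN'), (2, 'DBMS');
""")

# Correlated subquery: the inner COUNT(*) re-runs for each outer row,
# using that row's s.id
rows = conn.execute("""
    SELECT name FROM students s
    WHERE (SELECT COUNT(*) FROM enrollments e
           WHERE e.student_id = s.id) > 2
""").fetchall()
print(rows)  # [('Alice',)]
```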

Aggregate Operators in SQL


Aggregate operators perform calculations on multiple rows of a table and return a
single result. Commonly used aggregate functions are:
COUNT(): Returns the number of rows.
SUM(): Returns the sum of a numeric column.
AVG(): Returns the average value of a numeric column.
MAX(): Returns the maximum value in a column.
MIN(): Returns the minimum value in a column.
Example:
Find the average age of students.
SELECT AVG(age) FROM students;
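A quick demonstration of the aggregate functions, assuming SQLite via Python's sqlite3 module and hypothetical student rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [("Alice", 20), ("Bob", 22), ("Cara", 24)])

# Each aggregate collapses the three rows into a single result
avg_age = conn.execute("SELECT AVG(age) FROM students").fetchone()[0]
count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
oldest = conn.execute("SELECT MAX(age) FROM students").fetchone()[0]
print(avg_age, count, oldest)  # 22.0 3 24
```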

Null Values:
NULL represents a missing or undefined value in SQL. It is important to note that
NULL is different from 0 or an empty string. SQL provides several functions to
handle NULL values:
IS NULL: Checks if a value is NULL.
IS NOT NULL: Checks if a value is not NULL.
COALESCE(): Returns the first non-null value.
Example:
Find the names of students whose age is not known (i.e., NULL).
SELECT name FROM students WHERE age IS NULL;
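IS NULL and COALESCE can be demonstrated as follows, assuming SQLite via Python's sqlite3 module (Python's None plays the role of SQL NULL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INT)")
conn.executemany("INSERT INTO students VALUES (?, ?)",
                 [("Alice", 20), ("Bob", None)])

# IS NULL finds rows where the value is missing
unknown = conn.execute(
    "SELECT name FROM students WHERE age IS NULL").fetchall()

# COALESCE substitutes a default for NULL
ages = conn.execute(
    "SELECT name, COALESCE(age, 0) FROM students ORDER BY name").fetchall()
print(unknown, ages)  # [('Bob',)] [('Alice', 20), ('Bob', 0)]
```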
Complex Integrity Constraints:
 Integrity constraints are rules that ensure the accuracy and consistency of
data in a database. Common types include:
 Primary Key Constraint: Ensures each row in a table has a unique identifier.
 Foreign Key Constraint: Ensures the referential integrity between two
related tables.
 Unique Constraint: Ensures that all values in a column are unique.
 Check Constraint: Ensures that values in a column satisfy a specific
condition.
Example (Foreign Key): Ensure that each enrollment references a valid student.
CREATE TABLE enrollments (
id INT PRIMARY KEY,
student_id INT,
course_name VARCHAR(50),
CONSTRAINT fk_student FOREIGN KEY (student_id) REFERENCES students(id)
);
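The UNIQUE and CHECK constraints can be exercised the same way; the sketch below assumes SQLite via Python's sqlite3 module and a hypothetical age range:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE students (
    id INT PRIMARY KEY,
    name TEXT UNIQUE,
    age INT CHECK (age BETWEEN 0 AND 120))""")

conn.execute("INSERT INTO students VALUES (1, 'Alice', 20)")

# A row violating the CHECK constraint is rejected by the DBMS
try:
    conn.execute("INSERT INTO students VALUES (2, 'Bob', 150)")
    ok = True
except sqlite3.IntegrityError:
    ok = False
print(ok)  # False
```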

Triggers:
A trigger is a set of actions executed automatically when a specified database
event occurs, such as an INSERT, UPDATE, or DELETE. Triggers are often used to
enforce business rules, maintain data integrity, or log changes.
Example:
CREATE TRIGGER update_age
BEFORE UPDATE ON students
FOR EACH ROW
WHEN (NEW.birthday <> OLD.birthday)
BEGIN
:NEW.age := :NEW.age + 1;
END;
Explanation:
This trigger fires before an UPDATE on the students table when the birthday
column is changed, and increases the age by 1 for the affected row. A BEFORE
trigger is used so the row can be modified through :NEW, rather than issuing
an UPDATE against the same table from inside its own row trigger (which
Oracle rejects as a mutating-table error).
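As a runnable counterpart, here is a similar trigger in SQLite syntax (an assumption, since the example above is Oracle-style) that logs age changes to a hypothetical audit table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (id INT, age INT);
CREATE TABLE age_log (student_id INT, old_age INT, new_age INT);

-- Fires automatically after any UPDATE that changes a student's age
CREATE TRIGGER log_age_change
AFTER UPDATE ON students
FOR EACH ROW WHEN NEW.age <> OLD.age
BEGIN
    INSERT INTO age_log VALUES (OLD.id, OLD.age, NEW.age);
END;

INSERT INTO students VALUES (1, 20);
UPDATE students SET age = 21 WHERE id = 1;
""")

log = conn.execute("SELECT * FROM age_log").fetchall()
print(log)  # [(1, 20, 21)]
```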

Active Databases:
An active database responds to events automatically through triggers or rules
without user intervention. The concept of active databases is closely related to
triggers and constraints, where the database system is programmed to react to
changes or certain conditions.
Example Use Case:
In an ecommerce application, an active database could automatically:
Trigger a reorder of stock when inventory is low.
Log changes in an order's status (e.g., shipped, delivered).
Send an alert when a certain event occurs, such as an unauthorized data access.

PL/SQL:
Generic PL/SQL block:
In PL/SQL, the code is not executed in single line format, but it is always
executed by grouping the code into a single element called Blocks.
• Blocks contain both PL/SQL as well as SQL instruction. All these instructions
will be executed as a whole rather than executing a single instruction at a
time.
• PL/SQL blocks have a predefined structure in which the code is to be
grouped. Below are different sections of PL/SQL blocks.
– Declaration section
– Execution section
– Exception Handling section

Declaration Section
• This is the first section of the PL/SQL blocks.
• This section is an optional part.
• This is the section in which the declaration of variables, cursors, exceptions,
subprograms, instructions and collections that are needed in the block will
be declared.
• This section should always be followed by execution section.

Execution Section
• Execution part is the main and mandatory part which actually executes the
code that is written inside it.
• This can contain both PL/SQL code and SQL code.
• This can contain one or many blocks inside it as a nested block.
• This section starts with the keyword ‘BEGIN’.
• This section should be followed either by ‘END’ or Exception Handling
section (if present)

Exception Handling Section:


• Exceptions occur at runtime and cannot always be avoided; to handle them,
Oracle provides an exception-handling section in blocks.
• This is an optional section of the PL/SQL blocks.
• This is the section where the exception raised in the execution block is
handled.
• This section is the last part of the PL/SQL block.
• Control from this section can never return to the execution block.
• This section starts with the keyword ‘EXCEPTION’.
• This section should always be followed by the keyword ‘END’.
• The Keyword ‘END’ marks the end of PL/SQL block.


Develop a PL/SQL program that displays the name and address of a
student whose ID is given. If there is no student with the given student ID in
the database, the program should raise a runtime exception
NO_DATA_FOUND, which should be captured in the EXCEPTION block.
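A sketch of a solution, assuming a students table with columns sid, sname, and address (the table and column names are illustrative):

```sql
DECLARE
  v_name    students.sname%TYPE;
  v_address students.address%TYPE;
  v_id      students.sid%TYPE := &student_id;  -- ID supplied at run time
BEGIN
  -- SELECT INTO raises NO_DATA_FOUND automatically when no row matches
  SELECT sname, address
    INTO v_name, v_address
    FROM students
   WHERE sid = v_id;
  DBMS_OUTPUT.PUT_LINE('Name: ' || v_name || ', Address: ' || v_address);
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    DBMS_OUTPUT.PUT_LINE('No student found with ID ' || v_id);
END;
/
```

Because SELECT INTO raises NO_DATA_FOUND by itself, no explicit RAISE is needed; the EXCEPTION section simply captures it.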

PL/SQL data types:


Providing a datatype specifies how any data will be stored and processed
by Oracle when any PL/SQL code block is executed.

Scalar Types: These are basic datatypes which generally holds a single value
like a number or a string of characters. Scalar types have 4 different
categories which are listed in the diagram above, namely Number
Types, Character and String, Boolean Types and Date and Time etc.
• LOB Types: This datatype deals with large objects and is used to specify the
location of large objects such as text files and images, which are generally
stored outside the database table (out of line).
Reference Types: This datatype is used to hold pointer values which
generally stores address of other program items.
• Composite Types:As the name suggests this type of data is a composition of
individual data which can be manipulated/processed separatel as well.

NUMBER(p,s)
• Range: p = 1 to 38, s = -84 to 127
• This datatype is used to store numeric data. Here, p is precision (the total
number of digits) and s is scale (the number of digits after the decimal point).
Example:
• Age NUMBER(2); where Age is a variable that can store 2 digits.
• percentage NUMBER(4,2); where percentage is a variable that can store 4
digits in total, of which 2 are after the decimal point (so up to 2 digits
before the decimal and 2 after).

CHAR(size)
• Range: 1 to 2000 bytes
• This datatype is used to store alphabetical string of fixed length.
• Its value is quoted in single quotes.
• Occupies the whole declared size of memory even if the space is not
utilized by the data.
Example:
• rank CHAR(10); where rank is a variable that can store up to 10 characters.
If the length of the data (characters) stored in rank is 5, it will still
occupy all 10 spaces: 5 spaces in memory are used and the remaining blank
memory spaces are wasted.

VARCHAR(size)
• Range: 1 to 2000 bytes
• This datatype is used to store alphanumeric string of variable length.
• Its value is quoted in single quotes.
• Occupies the whole declared size of memory even if the space is not
utilized by the data.
Example:
• address VARCHAR(10); where, address is a variable that can occupy
maximum 10 bytes of memory space and can store alphanumeric value in
it. Unused spaces are wasted.

VARCHAR2(size)
• Range: 1 to 4000 bytes
• This datatype is used to store alphanumeric string of variable length.
• Its value is quoted in single quotes.
• It releases the unused space in memory, hence saving the unused space.
Example:
• name VARCHAR2(10); where, name is a variable that can occupy maximum
10 bytes of memory to store an alphanumeric value. The unused memory
space is released.


%TYPE
• %TYPE declares a variable that inherits its datatype from an existing table
column (or another variable). It is useful when the column's datatype is
unknown or may change.
• Because its value is generally retrieved from an existing table in the
database, the variable takes the datatype of the column for which it is used.

Example:
• sno Student.sno%TYPE; where Student is the name of a table in the database
and sno is a variable that inherits the datatype of the Student.sno column.

DECLARE
surname employees.last_name%TYPE;
BEGIN
DBMS_OUTPUT.PUT_LINE('surname=' || surname);
END;
BOOLEAN
• This datatype is used in conditional statements.
• It stores logical values.
• It can be either TRUE or FALSE
• Example:
• isAdmin BOOLEAN; where, isAdmin is a variable whose value can be TRUE
or FALSE depending upon the condition being checked.

Control Structures:
According to the structure theorem, any computer program can be written
using the basic control structures.

IF-THEN Statement:
• The simplest form of IF statement associates a condition with a sequence of
statements enclosed by the keywords THEN and END IF (not ENDIF), as
follows:
• IF condition THEN
sequence_of_statements
END IF;

IF-THEN-ELSE Statement:
• The second form of IF statement adds the keyword ELSE followed by an
alternative sequence of statements, as follows:
• IF condition THEN
sequence_of_statements1
ELSE
sequence_of_statements2
END IF;

IF-THEN-ELSIF Statement:
• Sometimes you want to select an action from several mutually exclusive
alternatives. The third form of IF statement uses the
keyword ELSIF (not ELSEIF) to introduce additional conditions, as follows:
• IF condition1 THEN
sequence_of_statements1
ELSIF condition2 THEN
sequence_of_statements2
ELSE
sequence_of_statements3
END IF;
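Combining these forms, a minimal sketch that classifies a mark into a result (the variable name and thresholds are illustrative):

```sql
DECLARE
  marks NUMBER := 72;
BEGIN
  IF marks >= 75 THEN
    DBMS_OUTPUT.PUT_LINE('Distinction');
  ELSIF marks >= 50 THEN
    DBMS_OUTPUT.PUT_LINE('Pass');   -- chosen here, since 50 <= 72 < 75
  ELSE
    DBMS_OUTPUT.PUT_LINE('Fail');
  END IF;
END;
/
```

The conditions are tested top to bottom, and only the first true branch executes.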

CASE Statement:
• Like the IF statement, the CASE statement selects one sequence of
statements to execute.
• However, to select the sequence, the CASE statement uses a selector rather
than multiple Boolean expressions.

[<<label_name>>]
CASE selector
WHEN expression1 THEN
sequence_of_statements1;
WHEN expression2 THEN
sequence_of_statements2;
...
WHEN expressionN THEN
sequence_of_statementsN;
END CASE [label_name];

Write a PL/SQL program to display the description against a student’s grade
using CASE statement.
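One possible solution using the CASE statement (the grade letters and descriptions shown are assumptions):

```sql
DECLARE
  grade CHAR(1) := 'B';
BEGIN
  CASE grade
    WHEN 'A' THEN DBMS_OUTPUT.PUT_LINE('Excellent');
    WHEN 'B' THEN DBMS_OUTPUT.PUT_LINE('Very Good');
    WHEN 'C' THEN DBMS_OUTPUT.PUT_LINE('Good');
    WHEN 'D' THEN DBMS_OUTPUT.PUT_LINE('Pass');
    ELSE DBMS_OUTPUT.PUT_LINE('No such grade');
  END CASE;
END;
/
```

The selector (grade) is evaluated once and compared against each WHEN expression in turn.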
Iterative Control: LOOP and EXIT Statements:
• LOOP statements let you execute a sequence of statements multiple times.
• There are three forms of LOOP statements: LOOP, WHILE-LOOP, and FOR-LOOP.

LOOP:
• The simplest form of LOOP statement is the basic (or infinite) loop, which
encloses a sequence of statements between the
keywords LOOP and END LOOP, as follows:
• LOOP
sequence_of_statements
END LOOP;
• If further processing is undesirable or impossible, you can use
an EXIT statement to complete the loop.
• There are two forms of EXIT statements: EXIT and EXIT-WHEN.
WHILE-LOOP
• The WHILE-LOOP statement associates a condition with a sequence of
statements enclosed by the keywords LOOP and END LOOP, as follows:
• WHILE condition LOOP
sequence_of_statements
END LOOP;
FOR-LOOP
• The number of iterations through a FOR loop is known before the loop is
entered.
• FOR loops iterate over a specified range of integers.
• The range is part of an iteration scheme, which is enclosed by the
keywords FOR and LOOP.
• A double dot (..) serves as the range operator. The syntax follows:
• FOR counter IN
lower_bound..higher_bound LOOP
sequence_of_statements
END LOOP;
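For example, a minimal FOR-LOOP that prints the numbers 1 through 5:

```sql
BEGIN
  FOR counter IN 1..5 LOOP
    -- counter is declared implicitly by the iteration scheme
    DBMS_OUTPUT.PUT_LINE('Iteration ' || counter);
  END LOOP;
END;
/
```

The loop index counter needs no DECLARE entry; it exists only inside the loop.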

PL/SQL Subprograms:
• A PL/SQL block of code that accepts parameters and can be invoked is
called a subprogram.
• There are two types of subprograms: Procedures and Functions.
• A function is used for calculating value and a procedure is used to do an
action.
• A subprogram can be considered as a module that is integrated to build a
large program. This helps to give a modular architecture.
• A subprogram consists of the following sections:
• 1) Declarative Section
2) Executable Section
3) Exception Handling Section
• Modes of Parameter in PL/SQL Subprograms
• Modes of parameter are of three types: IN, OUT, and IN OUT.
• 1) IN − This is a constant (read-only) parameter inside a subprogram. It
allows passing a value to a subprogram.
• 2) OUT − This parameter returns values to the calling subprogram.
• 3) IN OUT − This parameter can be used for getting both input and output
from the subprogram.
Parameter Passing Methods In PL/SQL:
• Parameter passing in PL/SQL can be done by Positional Notation, Named
Notation, and Mixed Notation.
• 1) Positional Notation: Here the first actual parameter is acting as the first
formal parameter, and the second actual parameter is acting as the second
formal parameter, and so on.
• Syntax: findAdd(i, j, k);
• 2) Named Notation: Here the actual parameter is linked with the formal
parameter with the help of arrow notation (=>).
• Syntax: findAdd(num1=>i, num2=>j, sum=>k);
• 3) Mixed Notation: Here, we can have a mixture of both positional and
named notation.
• Syntax:findAdd(i, num2=>j, sum=>k);
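The three notations can be seen with a small findAdd procedure (a hypothetical example; the names num1, num2, and sum follow the syntax shown above):

```sql
CREATE OR REPLACE PROCEDURE findAdd
  (num1 IN NUMBER, num2 IN NUMBER, sum OUT NUMBER) IS
BEGIN
  sum := num1 + num2;
END findAdd;
/

DECLARE
  i NUMBER := 10;
  j NUMBER := 20;
  k NUMBER;
BEGIN
  findAdd(i, j, k);                        -- positional notation
  findAdd(num1 => i, num2 => j, sum => k); -- named notation
  findAdd(i, num2 => j, sum => k);         -- mixed notation
  DBMS_OUTPUT.PUT_LINE('Sum = ' || k);
END;
/
```

In mixed notation, positional arguments must come first; once a named argument appears, the rest must also be named.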

PL/SQL Procedures:
• The procedure is a part of the PL/SQL subprogram which performs a
particular task.
• All procedures possess a unique name and are an independent block of
code.
• A procedure may contain a nested block or can be described inside other
packages or blocks.
• A procedure has parameters included while calling a procedure.
• We can build a procedure with a CREATE OR REPLACE statement.
PL/SQL Functions:
• The functions are similar to procedures in PL/SQL except for the fact that it
has the ability to return a value (specified with keyword RETURN) and
performs computation tasks.
• It has a unique name and acts as an independent block of code.
• The data type of a function is set at the time of the creation of function.
• A function may contain a nested block or describe inside other packages or
blocks.
• A function can return values with the help of the RETURN keyword and OUT
parameter.

The syntax for creating a function:
CREATE [OR REPLACE] FUNCTION name
[(parameter_name [IN | OUT | IN OUT] type [, ...])]
RETURN return_datatype
{IS | AS}
BEGIN
Block of code
EXCEPTION
Exception handling
END name;
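As a sketch of this syntax, a function that adds two numbers (the function and parameter names are illustrative):

```sql
CREATE OR REPLACE FUNCTION add_two
  (num1 IN NUMBER, num2 IN NUMBER)
RETURN NUMBER
IS
BEGIN
  RETURN num1 + num2;
END add_two;
/

-- Calling the function from an anonymous block:
BEGIN
  DBMS_OUTPUT.PUT_LINE('Result = ' || add_two(10, 20));
END;
/
```

Unlike a procedure, the function declares its return datatype (RETURN NUMBER) at creation time and must execute a RETURN statement.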
Differences between functions and procedures:

  | Function                              | Procedure
1 | RETURN is used to return a value.     | OUT parameters are used to return values.
2 | Always returns a value.               | Does not always return values.
3 | The datatype of the return value is   | The datatype of a return value is not
  | specified at the time of creation.    | specified at the time of creation.
4 | It is mainly used for computation     | It is mainly used for performing
  | purposes.                             | certain processes.
5 | It can take a single input parameter. | It can take multiple or zero input
  |                                       | parameters.
6 | A try-catch block for exceptions      | A try-catch block for exceptions can
  | cannot be used.                       | be used.
7 | It can only have input parameters.    | It can have both input and output
  |                                       | parameters.
Here, ‘name’ is the function name. ‘OR REPLACE’ keyword informs the compiler
about updating a function. ‘AS’ keyword is used if the function is standalone.
‘IS’ keyword is used if the function is a nested one.
‘Block of code’ is the actual implementation logic of the program.
‘Exception handling’ contains error checking at runtime.

• Default Functions In PL/SQL

1) PL/SQL String Functions


• LENGTH(string): To get the length of the string.
• LOWER(string): To get the lower case of the string.
• UPPER(string): To get the upper case of the string.
• LTRIM (string): To get the string without leading white spaces.
• RTRIM (string): To get the string without trailing white spaces.
• SUBSTR (string, begin, length): To get the string from the beginning point
to the length of the substring.
• INSTR (string, search, begin, occurrence): To get the position of the search
string in the given string.
• TRIM (string): To get the string without leading or trailing white spaces.

2) PL/SQL Conversion Functions
• TO_DATE (string, format): Conversion to the specified date format.
• TO_CHAR (): Conversion to the character data type.
• TO_NUMBER(string, format): Conversion from string to the specified
number format.
3) PL/SQL Date Functions
• SYSDATE: To get the present date and time of the server.
• TRUNC: To get the date rounded to the lower value.
• ROUND: To get the date to the closest range (high or low).
• ADD_MONTHS (date, months): To get a date by adding months to the
date.
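A few of these built-in functions in action (the output comments follow directly from the definitions above):

```sql
BEGIN
  DBMS_OUTPUT.PUT_LINE(LENGTH('Database'));        -- 8
  DBMS_OUTPUT.PUT_LINE(UPPER('plsql'));            -- PLSQL
  DBMS_OUTPUT.PUT_LINE(SUBSTR('Database', 1, 4));  -- Data
  DBMS_OUTPUT.PUT_LINE(TO_CHAR(SYSDATE, 'DD-MON-YYYY'));  -- current date
  DBMS_OUTPUT.PUT_LINE(ADD_MONTHS(SYSDATE, 2));    -- date two months ahead
END;
/
```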

PL/SQL – Cursors:
Oracle creates a memory area, known as the context area, for processing
an SQL statement, which contains all the information needed for processing
the statement;
for example, the number of rows processed, etc.
• A cursor is a pointer to this context area.
• PL/SQL controls the context area through a cursor.
• A cursor holds the rows (one or more) returned by a SQL statement.
• The set of rows the cursor holds is referred to as the active set.
• You can name a cursor so that it could be referred to in a program to fetch
and process the rows returned by the SQL statement, one at a time.
• There are two types of cursors −
1.Implicit cursors 2.Explicit cursors
Implicit Cursors
• Implicit cursors are automatically created by Oracle whenever an SQL
statement is executed. Programmers cannot control the implicit cursors
and the information in it.
• Whenever a DML statement (INSERT, UPDATE and DELETE) is issued, an
implicit cursor is associated with this statement. For INSERT operations, the
cursor holds the data that needs to be inserted. For UPDATE and DELETE
operations, the cursor identifies the rows that would be affected.
• In PL/SQL, you can refer to the most recent implicit cursor as the SQL
cursor, which always has attributes such as %FOUND, %ISOPEN,
%NOTFOUND, and %ROWCOUNT.
%FOUND
Returns TRUE if an INSERT, UPDATE, or DELETE statement affected one or more
rows or a SELECT INTO statement returned one or more rows. Otherwise, it returns
FALSE.
%NOTFOUND
The logical opposite of %FOUND. It returns TRUE if an INSERT, UPDATE, or DELETE
statement affected no rows, or a SELECT INTO statement returned no rows.
Otherwise, it returns FALSE.
%ISOPEN
Always returns FALSE for implicit cursors, because Oracle closes the SQL cursor
automatically after executing its associated SQL statement.
%ROWCOUNT
Returns the number of rows affected by an INSERT, UPDATE, or DELETE statement,
or returned by a SELECT INTO statement.


The following program will update the table and increase the salary of each
customer by 500 and use the SQL%ROWCOUNT attribute to determine the
number of rows affected
• DECLARE
total_rows number(2);
BEGIN
UPDATE customers
SET salary = salary + 500;
IF sql%notfound THEN
dbms_output.put_line('no customers selected');
ELSIF sql%found THEN
total_rows := sql%rowcount;
dbms_output.put_line( total_rows || '
customers selected ');
END IF;
END;
/

Explicit Cursors
• Explicit cursors are programmer-defined cursors for gaining more control
over the context area.
• An explicit cursor should be defined in the declaration section of the PL/SQL
Block.
• It is created on a SELECT Statement which returns more than one row.
• The syntax for creating an explicit cursor is −
CURSOR cursor_name IS select_statement;

Declaring the Cursor
• Declaring the cursor defines the cursor with a name and the associated
SELECT statement. For example −
CURSOR c_customers IS
SELECT id, name, address FROM customers;
• Opening the Cursor
• Opening the cursor allocates the memory for the cursor and makes it ready
for fetching the rows returned by the SQL statement into it. For example −
OPEN c_customers;

Fetching the Cursor
• Fetching the cursor involves accessing one row at a time. For example−
FETCH c_customers INTO c_id, c_name, c_addr;
• Closing the Cursor
• Closing the cursor means releasing the allocated memory. For example−
CLOSE c_customers;
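Putting the four steps together, a sketch that prints every customer (assuming a customers table with id, name, and address columns, as in the declaration above):

```sql
DECLARE
  c_id   customers.id%TYPE;
  c_name customers.name%TYPE;
  c_addr customers.address%TYPE;
  CURSOR c_customers IS
    SELECT id, name, address FROM customers;
BEGIN
  OPEN c_customers;
  LOOP
    FETCH c_customers INTO c_id, c_name, c_addr;
    EXIT WHEN c_customers%NOTFOUND;  -- stop when no more rows remain
    DBMS_OUTPUT.PUT_LINE(c_id || ' ' || c_name || ' ' || c_addr);
  END LOOP;
  CLOSE c_customers;
END;
/
```

The %NOTFOUND attribute of the explicit cursor drives the EXIT-WHEN, so the loop terminates exactly after the last fetched row.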

Database Triggers:
Triggers are stored programs, which are automatically executed or fired
when some events occur. Triggers are, in fact, written to be executed in
response to any of the following events −
• A database manipulation (DML) statement (DELETE, INSERT, or UPDATE)
• A database definition (DDL) statement (CREATE, ALTER, or DROP).
• A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or
SHUTDOWN).
• Triggers can be defined on the table, view, schema, or database with which
the event is associated.
CREATE [OR REPLACE] TRIGGER trigger_name − Creates or replaces an existing
trigger with the trigger_name.
• {BEFORE | AFTER | INSTEAD OF} − This specifies when the trigger will be
executed. The INSTEAD OF clause is used for creating a trigger on a view.
• {INSERT [OR] | UPDATE [OR] | DELETE} − This specifies the DML operation.
• [OF col_name] − This specifies the column name that will be updated.
• [ON table_name] − This specifies the name of the table associated with the
trigger.
• [REFERENCING OLD AS o NEW AS n] − This allows you to refer new and old
values for various DML statements, such as INSERT, UPDATE, and DELETE.
• [FOR EACH ROW] − This specifies a row-level trigger, i.e., the trigger will be
executed for each row being affected. Otherwise the trigger will execute
just once when the SQL statement is executed, which is called a table-level
trigger.
• WHEN (condition) − This provides a condition for rows for which the trigger
would fire. This clause is valid only for row-level triggers.

The following program creates a row-level trigger for the CUSTOMERS table
that would fire for INSERT, UPDATE, or DELETE operations performed on
the CUSTOMERS table.
• This trigger will display the salary difference between the old values and
new values −
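The trigger described above can be sketched as follows (the CUSTOMERS columns follow the INSERT statement used later in this section):

```sql
CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE DELETE OR INSERT OR UPDATE ON customers
FOR EACH ROW
WHEN (NEW.id > 0)
DECLARE
  sal_diff NUMBER;
BEGIN
  -- :OLD.salary is NULL on INSERT; :NEW.salary is NULL on DELETE
  sal_diff := :NEW.salary - :OLD.salary;
  DBMS_OUTPUT.PUT_LINE('Old salary: ' || :OLD.salary);
  DBMS_OUTPUT.PUT_LINE('New salary: ' || :NEW.salary);
  DBMS_OUTPUT.PUT_LINE('Salary difference: ' || sal_diff);
END;
/
```

On an INSERT the old salary is NULL, so the difference also comes out NULL, which matches the output shown below.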

Let us perform some DML operations on the CUSTOMERS table.


• Here is one INSERT statement, which will create a new record in the table −
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (7, 'Kriti', 22, 'HP', 7500.00 );
• When a record is created in the CUSTOMERS table, the above created
trigger, display_salary_changes will be fired and it will display the following
result −
Old salary:
New salary: 7500
Salary difference:
• Because this is a new record, old salary is not available and the above result
comes as null.
Let us now perform one more DML operation on the CUSTOMERS table.

• The UPDATE statement will update an existing record in the table −


UPDATE customers
SET salary = salary + 500
WHERE id = 2;

• When a record is updated in the CUSTOMERS table, the above created


trigger, display_salary_changes will be fired and it will display the following
result −
Old salary: 1500
New salary: 2000
Salary difference: 500
MODULE-IV
SCHEMA REFINEMENT AND TRANSACTIONS
>Schema Refinement:
Problems caused by redundancy, Decompositions, Problems related to decomposition, Functional
dependencies, Reasoning about FDs, First normal form, Second normal form, Third normal form,
Boyce-Codd normal form, Multivalued dependencies, Fourth normal form, Join dependencies, Fifth
normal form.
------------------------------------------------------------------------------------------------------------------------------------------

Schema Refinement:
Schema refinement refers to the process of improving a database schema to
ensure it is wellstructured, efficient, and free of potential anomalies. It involves
analyzing the schema to optimize its design, usually by normalizing it to eliminate
redundancy, minimize update anomalies, and ensure data integrity.
What is Redundancy in DBMS?
Redundancy in DBMS is the problem that arises when the database is not
normalized. It is the concept of storing multiple copies of the same data in
different parts of the database.
Problems caused by redundancy in Database
Redundancy in DBMS can cause several problems while performing operations on
data such as insert, delete, and update. Let's use the below student table to
understand insertion, updation, and deletion anomalies.
student_id | student_name | student_age | dept_id | dept_name        | dept_head
1          | Tony Stark   | 18          | 100     | Computer Science | Steve Rogers
2          | Thor Odinson | 18          | 100     | Computer Science | Steve Rogers
3          | Bruce Banner | 18          | 101     | Mechanical       | Natasha Romanoff
Insertion Anomaly
An insertion anomaly occurs when specific details cannot be inserted into the
database without the other details.
Example: Without knowing the department details, we cannot insert the student
details in the above table. The student details (student_id, student_name, and
student_age) depend on the department details (dept_id, dept_name, and
dept_head).
Deletion Anomaly
Deletion anomaly occurs when deleting specific details loses some unrelated
information from the database.
Example:
If we delete the student with student_id 3 from the above student table, we also
lose the department details with dept_id 101. Deleting the student details thus
results in losing unrelated department details.
Updation Anomaly
Updation anomaly occurs when there is data inconsistency resulting from a partial
data update.
Example: If we want to update the dept_head to Peter Parker for dept_id 101,
we need to update it in all the places it appears. If the update does not occur
everywhere (a partial update), it may result in data inconsistency.
Decomposition
Decomposition in Database Management System is to break a relation into
multiple relations to bring it into an appropriate normal form. It helps to remove
redundancy, inconsistencies, and anomalies from a database. The decomposition
of a relation R in a relational schema is the process of replacing the original
relation R with two or more relations in a relational schema.

PROBLEMS RELATED TO DECOMPOSITION:


Decomposition in database design refers to the process of breaking down a
database schema into smaller, more manageable pieces, often to achieve
normalization and eliminate redundancy. However, improper decomposition can
lead to several problems:
1. Loss of Information
 Problem: If decomposition is not done carefully, it might lead to the loss of
some information.
 Example: Decomposing a table that contains a multivalued attribute into
separate tables without a proper key to link them might result in losing the
relationship between the values.
2. Dependency Preservation
 Problem: Decomposition can lead to the loss of some functional
dependencies, making it difficult to enforce certain constraints.
 Example: If a functional dependency A → B exists, but after
decomposition A and B end up in different tables, enforcing this
dependency becomes challenging.
3. Join Dependency
 Problem: After decomposition, the original table can be reassembled by
joining the decomposed tables. If the decomposition isn't done properly, it
might result in an incorrect join or loss of data.
 Example: If a relation is decomposed into two tables, but these tables do
not contain a common key to join them accurately, reconstructing the
original relation may not be possible.
4. Redundancy and Anomalies
 Problem: If decomposition is not done according to normalization
principles, it might introduce redundancy, leading to insertion, deletion,
and update anomalies.
 Example: If a table is decomposed but the same data appears in multiple
tables, any change in the data requires updates in all places, leading to
potential inconsistency.
5. Performance Issues
 Problem: Excessive decomposition can lead to performance problems due
to the need for frequent joins, especially in large databases.
 Example: Querying a highly decomposed schema might require multiple
joins, which can slow down query performance.
6. Complexity
 Problem: Decomposition can make the schema more complex, making it
harder to understand and manage.
 Example: A single table decomposed into many small tables can make
writing queries and maintaining the database more challenging for
developers.
What is Functional Dependency in DBMS?
Let us try to understand the concept thoroughly. If X is a relation that has
attributes P and Q, then a functional dependency between them is represented
by an arrow sign (→).

Thus, the following represents the functional dependency between the
attributes:

P → Q

The left side of this arrow is the Determinant; the right side is the
Dependent. P is the primary key attribute, while Q is a dependent non-key
attribute from the same table. The dependency shows that the non-key
attribute Q is functionally dependent on the primary key attribute P. In
simpler words, if the column P of a table uniquely identifies the column Q of
the very same table, then the functional dependency of column Q on column P
is symbolised as P → Q.
Let us look at an example that makes it easier to comprehend functional
dependency. Suppose we have a <Student> table with two separate attributes −
Stu_Id and Stu_Name.
Stu_Id = Student ID
Stu_Name = Student Name

Stu_Id is our primary key, and it uniquely identifies the Stu_Name attribute:
if someone wants to know the student's name, they need the Stu_Id first.

Stu_Id | Stu_Name
011    | Marketing
022    | HR
033    | Finance
044    | Accounting
055    | Sales
066    | Telecom

The functional dependency above between Stu_Id and Stu_Name can be specified
as: Stu_Name is functionally dependent on Stu_Id (Stu_Id → Stu_Name).
Reasoning about Functional Dependencies
Reasoning about Functional Dependencies (FDs) is a critical aspect
of database design, particularly when normalizing a database
schema. Functional dependencies help identify the relationships
between attributes in a relational database and are used to enforce
data integrity and eliminate redundancy.
Reasoning about Functional Dependencies:
1. Closure of a Set of Attributes (A⁺):
o The closure of a set of attributes A, denoted A⁺, is the set of all
attributes that can be functionally determined by A using a given
set of functional dependencies.
o How to Compute the Closure:
§ Start with A⁺ = A.
§ For every functional dependency X → B whose left side X is
contained in A⁺, add B to A⁺.
§ Repeat until no more attributes can be added.
o Use Case: The closure helps determine whether a functional
dependency holds in a relation or whether an attribute set is a
candidate key.
o Worked example: for R(A, B, C) with FDs A → B and B → C, computing
{A}⁺ gives {A} → {A, B} (using A → B) → {A, B, C} (using B → C);
since {A}⁺ contains every attribute of R, A is a candidate key.
2. Finding Candidate Keys:
o A candidate key is a minimal set of attributes that can
uniquely identify a tuple in a relation.
o How to Determine:
§ Compute the closure of different combinations of
attributes.
§ The smallest set of attributes whose closure
includes all attributes in the relation is a
candidate key.
3. Minimal Basis (Canonical Cover):
o A minimal basis for a set of functional dependencies is a
simplified set of dependencies that preserves the
original dependencies and is minimal in terms of the
number of dependencies.
o Steps to Find a Minimal Basis:
§ Decompose: Split dependencies with composite right-hand
sides, e.g., A → BC into A → B and A → C.
§ Remove Extraneous Attributes: Reduce left-hand sides where
possible, e.g., AB → C becomes A → C if B is extraneous.
§ Remove Redundancy: Eliminate any functional dependency
that can be inferred from the remaining ones.
§ Minimality: Ensure that each remaining dependency is
minimal; removing any part of it would change the set of
dependencies implied.
4. Decomposition Using Functional Dependencies:
o Decompose a relation into smaller relations based on
functional dependencies to achieve normalization (e.g.,
2NF, 3NF, BCNF).
o Preserve Functional Dependencies: Ensure that the
original set of functional dependencies is preserved or
can be inferred from the decomposed relations.
o Lossless Join: The decomposition should allow the
original relation to be reconstructed without loss of
information.
5. Normalization:
o Normalization involves decomposing a relation into
smaller relations to eliminate redundancy and
anomalies.
o Functional dependencies guide the process by
identifying which attributes should be grouped together
in a relation.
First Normal Form (1NF)
A relation is in 1NF if every attribute is a single-valued attribute, i.e.,
it does not contain any multi-valued or composite attribute; every
attribute is atomic.
If there is a composite or multi-valued attribute, it violates 1NF. To
solve this, we can create a new row for each value of the multi-valued
attribute to convert the table into 1NF.
Let's take an example of a relational table <EmployeeDetail> that
contains the details of the employees of the company.
Second Normal Form (2NF)
2NF builds on 1NF by requiring that each non-primary-key column in a
table is fully functionally dependent on the primary key. This means
that a table should not have partial dependencies, where a
non-primary-key column depends on only part of the primary key.
Third Normal Form (3NF)
A relation is in third normal form when, for every non-trivial functional
dependency P → Q, at least one of the following conditions holds:
1. P is a super key, or
2. Q is a prime attribute, i.e., every element of Q is part of some
candidate key.
BCNF (Boyce-Codd Normal Form)
It is the advanced version of 3NF. A table is in BCNF if, for every functional
dependency X → Y, X is a super key of the table. That is, for BCNF the table
should be in 3NF and the LHS of every FD must be a super key.
Example
Consider a relation R with attributes (student, subject, teacher).

Student | Teacher  | Subject
Jhansi  | P.Naresh | Database
Jhansi  | K.Das    | C
Subbu   | P.Naresh | Database
Subbu   | R.Prasad | C

F: { (Student, Teacher) → Subject,
     (Student, Subject) → Teacher,
     Teacher → Subject }
Candidate keys are (student, teacher) and (student, subject).
The above relation is in 3NF [since there is no transitive
dependency]. A relation R is in BCNF if for every non-trivial FD X → Y,
X must be a key.
The above relation is not in BCNF, because in the FD
Teacher → Subject, Teacher is not a key. This relation suffers from
anomalies −
For example, if we try to delete the student Subbu, we will lose the information
that R.Prasad teaches C. These difficulties are caused by the fact that Teacher
is a determinant but not a candidate key.

What is Multivalued Dependency?


If the following requirements are met, a table is said to have a multivalued
dependency:
· For a single value of A in the dependency A →→ B, multiple values of B
exist.
· The table should have at least 3 columns.
· For the relation R(A, B, C), if A and B have a multivalued dependency,
then B and C should be independent of each other.
Let's have an example to understand multivalued dependency :
The below table shows the details of an office department
exchange event having the columns, EMPLOYEE_ID,
DEPARTMENT, and HOBBY.
EMPLOYEE_ID | DEPARTMENT | HOBBY
E901        | HR         | Badminton
E901        | Sales      | Reading
E902        | Marketing  | Cricket
E903        | Finance    | Football

As you can see in the above table, employee E901 is interested in two
departments, HR and Sales, and has two hobbies, Badminton and Reading.
This will result in multiple records for E901:
EMPLOYEE_ID | DEPARTMENT | HOBBY
E901        | HR         | Badminton
E901        | Sales      | Reading
E901        | HR         | Reading
E901        | Sales      | Badminton

In the above table, you can see that for the Employee E901
multiple records exist in the DEPARTMENT and the HOBBY
attribute. Hence the multivalued dependencies are,

EMPLOYEE_ID →→ DEPARTMENT and
EMPLOYEE_ID →→ HOBBY
Also, the DEPARTMENT and HOBBY attributes are independent
of each other thus leading to a multivalued dependency in the
above table.
Introduction to 4NF in DBMS
4NF in DBMS stands for Fourth Normal Form. A relation is said to be in 4NF
if the relation is in Boyce-Codd Normal Form (BCNF) and has no multivalued
dependency.
Now let's have a brief recap of BCNF.
A relation is said to be in BCNF if the relation's attributes contain only
atomic/single values, all the non-key attributes are fully functionally
dependent on the primary key, there is no transitive dependency for non-key
attributes, and for every functional dependency X → Y, X is a super key of
the table.
We will understand this better once we go through the examples.
Join Dependency
Join Dependency is similar to MultiValued Dependency as Join
Dependency is also a constraint.
Let R be a relation schema and let R1, R2, R3, ..., Rn be decompositions of
R. R is said to satisfy the join dependency if and only if every instance r
of R is equal to the join of its projections on R1, R2, R3, ..., Rn.
5th Normal Form
· A relation is in 5NF if it is in 4NF, contains no join dependency, and
joining is lossless.
· 5NF is satisfied when all the tables are broken into as many tables as
possible in order to avoid redundancy.
· 5NF is also known as Project-Join Normal Form (PJ/NF).
Transactions in DBMS
•Transactions are a set of operations used to perform a logical set of work.
•A transaction is an action or series of actions.
•It is performed by a single user to perform operations for accessing the
contents of the database.
•A transaction usually means that the data in the database has changed.
•One of the major uses of a DBMS is to protect the user’s data from system
failures.
It does this by ensuring that all the data is restored to a consistent state
when the computer is restarted after a crash.
Atomicity:
•It states that all operations of the transaction take place at once; if not, the
transaction is aborted.
•There is no midway, i.e., the transaction cannot occur partially. Each transaction
is treated as one unit and either runs to completion or is not executed at all.
Atomicity involves the following two operations:
•Abort: If a transaction aborts then all the changes made are not visible.
•Commit: If a transaction commits then all the changes made are visible.
Durability:
•The database should be durable enough to hold all its latest updates even if the
system fails or restarts.
•If a transaction updates a chunk of data in a database and commits, then the
database will hold the modified data.
•If a transaction commits but the system fails before the data could be written
onto the disk, then that data will be updated once the system springs back into
action.
Serializability:
•When multiple transactions are running concurrently then there is a possibility
that the database may be left in an inconsistent state.
•Serializability is a concept that helps to identify which non-serial schedules
are correct and will maintain the consistency of the database.
Serializable Schedules:
If a given non-serial schedule of ‘n’ transactions is equivalent to some serial
schedule of ‘n’ transactions, then it is called a serializable schedule.
Types of Serializability:
1.Conflict Serializability
2.View Serializability
Recoverability:
Recoverable Schedule:
•Schedules in which transactions commit only after all transactions whose
changes they read commit are called recoverable schedules.
• In other words, if some transaction Tj is reading value updated or written by
some other transaction Ti, then the commit of Tj must occur after the commit
of Ti.
Isolation:
•It shows that the data which is used at the time of execution of a transaction
cannot be used by the second transaction until the first one is completed.
•In isolation, if the transaction T1 is being executed and using the data item X,
then that data item can't be accessed by any other transaction T2 until the
transaction T1 ends.
Testing for Serializability
•The serializability of a schedule is tested using a serialization graph.
•Given a schedule S, we can construct its directed graph, called a precedence graph.
•A graph G is a pair G=(V,E) where V is a set of vertices and E is a set of edges.
•The set of vertices consists of all the transactions participating in the schedule.
•The set of edges consists of all edges Ti → Tj for which one of the following
three conditions holds:
i. Ti executes W(A) before Tj executes R(A)
ii. Ti executes R(A) before Tj executes W(A)
iii. Ti executes W(A) before Tj executes W(A)
•If the precedence graph contains a cycle, the schedule is not conflict
serializable; if it is acyclic, an equivalent serial order can be read off by
topologically sorting the graph.
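The precedence-graph test can be sketched in a few lines of Python (the schedule encoding and function names here are illustrative): build an edge Ti → Tj for every conflicting pair of operations on the same item, then report the schedule as conflict serializable only if the graph has no cycle.

```python
# Build a precedence graph from a schedule and test conflict serializability:
# the schedule is serializable iff the graph is acyclic. Each schedule entry
# is (transaction, operation, data_item), e.g. ("T1", "W", "A").
def is_conflict_serializable(schedule):
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # A conflict needs the same item, different transactions,
            # and at least one write (the W-R, R-W, W-W cases above).
            if item_i == item_j and ti != tj and "W" in (op_i, op_j):
                edges.add((ti, tj))

    nodes = {t for t, _, _ in schedule}

    def has_cycle(node, visiting, done):
        # Depth-first search; a back edge into `visiting` means a cycle.
        visiting.add(node)
        for u, v in edges:
            if u == node:
                if v in visiting or (v not in done and has_cycle(v, visiting, done)):
                    return True
        visiting.discard(node)
        done.add(node)
        return False

    return not any(has_cycle(n, set(), set()) for n in nodes)

# T1 reads A, T2 writes A, then T1 writes A: edges T1->T2 and T2->T1, a cycle.
s = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(s))  # False
```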
MODULE-V
Concurrency control, Storage and Indexing
>Concurrency Control: Lock Based Protocols, Timestamp Based Protocols, Validation Based
Protocols, Multiple Granularity, Deadlock Handling.
>Storage and Indexing: Data on external storage, File organizations and indexing – Clustered indexes,
Primary and secondary indexes; Index data structures – Hash based indexing, Tree based indexing;
Comparison of file organizations
--------------------------------------------------------------------------------------------------------------------------------------
Concurrency control:
Lock based protocols:
Lock-based protocols help control how transactions access data in a database.
Here are the common types of lock-based protocols explained simply:
1. Simple Locking Protocol:
What it does: A transaction can lock data when it wants to read or write, and it
releases the lock when done.
Issue: It might not always ensure proper transaction order, leading to
inconsistencies.
2. Two-Phase Locking (2PL):
What it does: A transaction goes through two stages:
Growing Phase: It can only acquire locks.
Shrinking Phase: It releases locks but can’t acquire new ones after that.
Why it's good: Ensures transactions are done in the right order, preventing
problems like data conflicts.
3. Strict Two-Phase Locking (Strict 2PL):
What it does: Same as 2PL, but the transaction holds all its exclusive (write)
locks until it is completely finished (committed or rolled back).
Why it's useful: Prevents other transactions from accessing data until
everything is finalized, reducing errors.

4. Rigorous Two-Phase Locking:

What it does: All locks (both shared and exclusive) are held until the
transaction is fully completed.
Why it's extra safe: It further reduces the chance of conflicts but can make
other transactions wait longer.
These protocols ensure that transactions don’t interfere with each other,
keeping the database consistent and safe.
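The growing/shrinking rule at the heart of 2PL can be modeled with a small class. This is an illustrative Python sketch (not a real lock manager — the class and method names are made up) that rejects any lock request made after the transaction's first release:

```python
# Minimal model of the two-phase rule: a transaction records whether it has
# entered its shrinking phase; acquiring a lock after any release is an error.
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire {item} in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True   # growing phase is over for good
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("A")
t.lock("B")      # growing phase: acquiring is allowed
t.unlock("A")    # first release starts the shrinking phase
try:
    t.lock("C")  # violates 2PL
except RuntimeError as e:
    print(e)
```

Strict and rigorous 2PL would simply never call `unlock` until commit or rollback, which is why they avoid cascading problems at the cost of longer waits.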

Time Stamp based protocol:


 The Timestamp Ordering Protocol is used to order the transactions
based on their timestamps. The order of the transactions is simply the
ascending order of their creation times.
 The older transaction has higher priority, which is why it executes first.
To determine the timestamp of a transaction, this protocol uses the
system time or a logical counter.
 Lock-based protocols manage the order between conflicting pairs of
transactions at execution time, but timestamp-based protocols start
working as soon as a transaction is created.
 Let's assume there are two transactions T1 and T2. Suppose transaction
T1 entered the system at time 007 and transaction T2 entered the system
at time 009. T1 has the higher priority, so it executes first, as it
entered the system first.
 The timestamp ordering protocol also maintains the timestamps of the
last 'read' and 'write' operations on each data item.
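The per-item read and write timestamps drive the protocol's accept/reject decisions. The sketch below shows the basic timestamp-ordering rules in a common textbook formulation (the `Item` class and return strings are illustrative): an operation from a transaction that is "too old" relative to what has already happened on the item is rolled back.

```python
# Basic timestamp-ordering rules. Each data item keeps the timestamps of the
# last read and last write; an operation arriving from a transaction older
# than one that already acted on the item is rejected.
class Item:
    def __init__(self):
        self.r_ts = 0  # timestamp of the latest read
        self.w_ts = 0  # timestamp of the latest write

def read(ts, item):
    if ts < item.w_ts:               # a younger transaction already wrote it
        return "rollback"
    item.r_ts = max(item.r_ts, ts)
    return "ok"

def write(ts, item):
    if ts < item.r_ts or ts < item.w_ts:  # a younger transaction read or wrote it
        return "rollback"
    item.w_ts = ts
    return "ok"

x = Item()
print(write(9, x))   # ok: T2 (ts=9) writes X
print(read(7, x))    # rollback: T1 (ts=7) is older than the last writer
```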
Validations based protocol:
 The validation-based protocol is also known as the optimistic
concurrency control technique. In the validation-based protocol, the
transaction is executed in the following three phases:
 Read phase: In this phase, the transaction T is read and executed. It is
used to read the values of the various data items and store them in
temporary local variables. It can perform all the write operations on
temporary variables without updating the actual database.
 Validation phase: In this phase, the temporary variable values are
validated against the actual data to see if they violate serializability.
 Write phase: If the transaction passes validation, then the temporary
results are written to the database or system; otherwise, the transaction
is rolled back.
 Here each phase has the following different timestamps:
 Start(Ti): It contains the time when Ti started its execution.
 Validation (Ti): It contains the time when Ti finishes its read phase and
starts its validation phase.
 Finish(Ti): It contains the time when Ti finishes its write phase.
This protocol determines the timestamp of the transaction for
serialization using the timestamp of the validation phase, as that is the
phase which actually determines whether the transaction will commit or
roll back.
 Hence TS(T) = Validation(T).
 Serializability is determined during the validation process; it can't be
decided in advance.
 While executing transactions, this protocol ensures a greater degree of
concurrency and fewer conflicts.
 Thus it leads to transactions with fewer rollbacks.
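A simplified version of the validation test can be written directly from the three timestamps above. In this sketch (the dictionary layout and field names are illustrative, and the test is the simplified two-condition form), Tj passes validation against an earlier-validated Ti if Ti finished before Tj started, or if Ti finished before Tj's validation and Ti's write set does not overlap Tj's read set:

```python
# Simplified validation test between an earlier-validated Ti and a later Tj.
def validate(ti, tj):
    if ti["finish"] < tj["start"]:
        return True  # Ti completed entirely before Tj began
    if ti["finish"] < tj["validation"] and not (ti["writes"] & tj["reads"]):
        return True  # overlap is harmless: Tj never read what Ti wrote
    return False

t1 = {"start": 1, "validation": 4, "finish": 5, "reads": {"A"}, "writes": {"A"}}
t2 = {"start": 3, "validation": 6, "finish": 8, "reads": {"A"}, "writes": {"B"}}
print(validate(t1, t2))  # False: T1 wrote A, which T2 read while T1 was active
```

When the test fails, the later transaction is the one rolled back, which is exactly the write-phase rule stated above.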

Multiple Granularity protocol:


The Multiple Granularity protocol organizes the database into a hierarchical
structure to improve concurrency and reduce lock overhead. It involves
breaking the database into blocks, with each level representing different
granularities of data. The protocol manages locks efficiently by tracking what
and how to lock.

Hierarchical Structure Example:


1. Database (Top level): The entire database.
2. Area: Represents sections within the database.
3. File: Each area consists of files, and no file exists in more than one area.
4. Record: Files contain records, with each record belonging to only one file.
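In multiple-granularity locking, a transaction that wants an exclusive lock on a fine-grained node first takes intention locks on every ancestor. The sketch below (node names and the intention-exclusive mode label "IX" follow the standard textbook scheme, but the hierarchy itself is a made-up example) computes that top-down lock sequence:

```python
# Intention locking in a granularity hierarchy: before locking a record in
# X (exclusive) mode, take IX (intention-exclusive) locks on every ancestor
# from the database root down to the record's file.
hierarchy = {
    "record42": "file7",
    "file7": "area2",
    "area2": "database",
    "database": None,   # the root has no parent
}

def locks_for_exclusive(node):
    path = []
    parent = hierarchy[node]
    while parent is not None:
        path.append(parent)
        parent = hierarchy[parent]
    # Ancestors are locked top-down: database first, then area, then file.
    return [(n, "IX") for n in reversed(path)] + [(node, "X")]

print(locks_for_exclusive("record42"))
# [('database', 'IX'), ('area2', 'IX'), ('file7', 'IX'), ('record42', 'X')]
```

The intention locks let another transaction see, at the database or area level, that something below is exclusively locked without scanning every record.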

Deadlock Handling:


1. Deadlock Prevention:
This method prevents deadlocks by controlling how transactions acquire locks.
A common approach is pre-acquisition, where a transaction grabs all needed
locks before it starts executing. If a transaction can't get all of its locks,
it waits until all are free, preventing deadlock.

2. Deadlock Avoidance:
In this method, the system avoids deadlocks by analyzing whether granting a
lock will cause a deadlock.
If a lock is unavailable, the system runs algorithms to decide whether to let a
transaction wait or abort it:
Wait-Die: If an older transaction requests a lock held by a younger one, it
waits; if the requester is younger, it is aborted.
Wound-Wait: If an older transaction requests a lock held by a younger one, the
younger transaction is aborted; if the requester is younger, it waits.

3. Deadlock Detection and Removal:


This method detects deadlocks periodically.
It doesn’t check for deadlocks when locks are requested but runs a detection
algorithm at intervals. If a deadlock is found, it removes it by aborting a
transaction.
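The two avoidance schemes reduce to simple decision functions on transaction timestamps (smaller timestamp = older). This Python sketch uses illustrative function names and return strings:

```python
# Wait-Die and Wound-Wait as decision functions. The requester asks for a
# lock currently held by `holder`; a smaller timestamp means an older
# transaction.
def wait_die(requester_ts, holder_ts):
    # Older requester waits; younger requester "dies" (is aborted).
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    # Older requester "wounds" (aborts) the holder; younger requester waits.
    return "abort holder" if requester_ts < holder_ts else "wait"

print(wait_die(5, 8))    # wait: the older transaction is allowed to wait
print(wound_wait(5, 8))  # abort holder: the older transaction preempts
```

In both schemes only younger transactions are ever aborted, so the oldest transaction always makes progress and deadlock cycles cannot form.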

Storage and Indexing:


Data on External Storage:
The storage system in a DBMS is a hierarchical structure used to store,
manage, and retrieve data efficiently. The hierarchy varies based on factors
like speed, capacity, cost, and volatility.

1. Registers: Fastest, smallest, located in the CPU, holds data currently being
processed.
2. Cache Memory (L1, L2, L3): Near CPU, fast but small, stores frequently
accessed data.
3. Main Memory (RAM): Fast, volatile, holds active data.
4. Flash Storage (SSD): Fast, nonvolatile, durable.
5. Magnetic Disks (HDD): Slower, nonvolatile, large capacity.
6. Optical Disks (CD/DVD): Slower, portable, used for media distribution.
7. Magnetic Tapes: High capacity, slow access, used for backups.
8. Remote Storage/Cloud: Scalable, accessed via the internet, dependent
on network speed.

Types of Storage:
Primary Storage: Registers, cache, RAM for quick access.
Secondary Storage: HDDs, SSDs for nonvolatile storage.
Tertiary Storage: Magnetic tapes, optical disks for backups.
Quaternary Storage: Remote/cloud storage for scalable and remote data
access.

File Organization and Indexing:


1. Sequential File Organization:
Description: Records are stored in sequence based on a key field.
Advantages: Simple design, efficient for batch processing, low overhead.
Disadvantages: Inefficient for random access, slow insertions and
deletions, potential redundancy.
Example: Adding a new record requires shifting existing records to
maintain order.

2. Direct (Hashed) File Organization:


Description: Uses a hash function to determine the record's storage
address.
Advantages: Fast access, uniform record distribution, efficient searching.
Disadvantages: Collisions, dependence on hash function, rehashing
needed for dynamic growth.
Example: Books indexed by ISBN with a hash function to locate storage
addresses.

3. Indexed File Organization:


Description: Uses an index to map key values to record locations.
Advantages: Quick random access, flexible searches, supports ordered
access.
Disadvantages: Maintenance and space overhead, increased complexity.
Example: Student records indexed by student ID for quick lookup.

4. Indexed Sequential Access Method (ISAM):


Description: Indexed file organization with sequential storage and a static
primary index.
Advantages: Efficient sequential access, static indexing.
Disadvantages: Overflow areas may need merging.
Example: Sequentially ordered file with overflow areas for new records.
Indexing
Structure:
Search Key: Column with key values.
Data Reference: Pointers to record locations.

Types of Indexes:
Single Level: Direct pointers to data.
Multilevel: Hierarchical index for fewer disk accesses.
Dense Index: Entry for every key value.
Sparse Index: Fewer entries, each pointing to multiple records.
Primary Index: Ordered by primary key, with pointers.
Secondary Index: Index for secondary key values.
Clustered Index: Data rows stored in index order.
Non-Clustered Index: Data rows not stored in index order.
Bitmap Index: Uses bitmaps for limited distinct values.
B-Trees/B+ Trees: Balanced trees for efficient access.

Index Data Structures:


1. Single-Level Index:
Description: A simple index with a single table of keys and pointers to
data records.
Structure:
Search Key: Column of indexed values.
Pointer: Address of the data record.
Use Case: Basic indexing for quick lookups.

2. Multi-Level Index:
Description: A hierarchical index to reduce the number of accesses
needed.
Structure:
Top-Level Index: Points to middle-level indexes.
Middle-Level Indexes: Point to lower-level indexes or directly to data.
Use Case: Efficient for large datasets with high access requirements.

3. Dense Index:
Description: Contains an index entry for every search key value.
Structure:
Index Entry: Each entry corresponds to a record in the main data file.
Use Case: Suitable for manageable index sizes.

4. Sparse Index:
Description: Contains fewer index entries, each pointing to multiple
records.
Structure:
Index Entry: Points to a range of records or blocks.
Use Case: Efficient when records are stored in contiguous blocks.
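The dense/sparse distinction is easy to demonstrate. The sketch below (block contents and names are illustrative) builds a sparse index holding only the first key of each contiguous block; a lookup binary-searches the small index and then scans just one block:

```python
# Sparse index: one index entry per block (the block's first key). A lookup
# binary-searches the index, then scans only within the selected block.
import bisect

blocks = [
    [(1, "Ann"), (4, "Bob"), (7, "Cy")],
    [(10, "Dee"), (12, "Ed"), (15, "Flo")],
    [(20, "Gus"), (22, "Hal")],
]
index = [blk[0][0] for blk in blocks]  # sparse: first key of each block

def lookup(key):
    i = bisect.bisect_right(index, key) - 1  # last block starting <= key
    if i < 0:
        return None                          # key precedes every block
    for k, v in blocks[i]:
        if k == key:
            return v
    return None

print(lookup(12))  # Ed
print(lookup(13))  # None: key falls inside block 1 but is absent
```

A dense index would instead carry one entry per record (eight entries here rather than three), trading index size for a direct pointer to each record.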

5. Primary Index:
Description: An index based on the primary key of a table.
Structure:
Primary Key: Field used as the key.
Pointer: Points to the data block where the record is stored.
Use Case: Ensures uniqueness and efficient data access.

6. Secondary Index:
Description: Provides access based on non-primary key fields.
Structure:
Secondary Key: Field used for indexing.
Pointer: Points to the primary index or data block.
Use Case: Facilitates searches on non-primary key fields.

7. Clustered Index:
Description: The data in the database matches the order of the index.
Structure:
Data Storage: Rows are stored in the same order as the index.
Use Case: Optimal for range queries and ordered data retrieval.

8. Non-Clustered Index:
Description: The index is separate from the data storage order.
Structure:
Index: Points to data locations but doesn’t affect storage order.
Use Case: Allows multiple indexes on the same table.

9. Bitmap Index:
Description: Uses bitmaps to represent the presence of values.
Structure:
Bitmap: Bit array where each bit represents a value’s presence.
Use Case: Effective for columns with limited distinct values (e.g., gender,
status).
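A bitmap index on a low-cardinality column is small enough to build by hand. In this illustrative sketch (the column values and helper names are made up), each distinct value gets one bit array, with bit i set when row i holds that value; boolean queries then become bitwise operations:

```python
# Bitmap index on a low-cardinality column: one bit array per distinct
# value, stored here as a Python integer used as a bit set.
rows = ["F", "M", "F", "F", "M"]          # a gender column with 5 rows
bitmaps = {}
for i, val in enumerate(rows):
    bitmaps.setdefault(val, 0)
    bitmaps[val] |= 1 << i                # set bit i in that value's bitmap

def matching_rows(bitmap):
    # Decode a bitmap back into the list of matching row numbers.
    return [i for i in range(len(rows)) if bitmap >> i & 1]

print(matching_rows(bitmaps["F"]))        # [0, 2, 3]
```

A compound predicate such as "F AND status=active" would be answered by ANDing the two bitmaps, which is why bitmap indexes suit analytic queries over columns with few distinct values.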

10. B-Tree Index:


Description: A balanced tree structure with all leaf nodes at the same
level.
Structure:
Nodes: Contain keys and pointers to child nodes.
Use Case: Suitable for dynamic datasets with frequent updates.

11. B+ Tree Index:


Description: An extension of the B-Tree where all values are in leaf nodes and
internal nodes store only keys.
Structure:
Leaf Nodes: Contain data pointers.
Internal Nodes: Store keys and pointers to child nodes.
Use Case: Efficient for range queries and ordered data retrieval.

Each index type has its specific use cases, benefits, and limitations, allowing
for efficient data retrieval based on the needs of the database system.
