DBMS Answer Key
ANSWER KEY
Part - A
DBMS facilitates seamless data access, ensures data security, and enables real-time decision-
making, making it an indispensable asset for modern businesses and organizations.
2. State the different types of integrity constraints used in designing a relational database.
The main types of integrity constraints are domain constraints, key constraints, entity integrity constraints (no primary key value may be null), and referential integrity (foreign key) constraints.
The dependency of an attribute on a set of attributes is known as a trivial functional dependency if the set of attributes includes that attribute.
For example: consider a table with two columns, Student_id and Student_Name; the dependency {Student_id, Student_Name} → Student_id is trivial.
4. If an object is created without any reference to it, how can that object be deleted?
The object can be deleted by garbage collection or by using the DELETE command if there is
no reference to it.
SELECT B1.BANKNAME
FROM BANK AS B1, BANK AS B2
WHERE B1.ASSETS > B2.ASSETS
AND B2.BANKLOCATION = 'TAMILNADU';
Query processing involves three main steps: the first step is parsing, the second is query optimization, and the final step is query execution.
A transaction is an action or series of actions that are being performed by a single user or application
program, which reads or updates the contents of the database.
A transaction can be defined as a logical unit of work on the database. This may be an entire program,
a piece of a program, or a single command (like the SQL commands such as INSERT or UPDATE),
and it may engage in any number of operations on the database. In the database context, the execution
of an application program can be thought of as one or more transactions with non-database processing
taking place in between.
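For illustration, a simple transaction can be written in SQL as below (a minimal sketch; the Account table and column names are assumptions, and exact transaction syntax varies slightly between systems):

-- Transfer 100 from account 1 to account 2 as one logical unit of work
BEGIN TRANSACTION;
UPDATE Account SET balance = balance - 100 WHERE acc_no = 1;
UPDATE Account SET balance = balance + 100 WHERE acc_no = 2;
COMMIT;   -- both updates become permanent together, or neither does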
8. State the benefits of strict two-phase locking.
The strict 2PL mechanism has the advantage of guaranteeing cascadeless, and therefore recoverable, schedules: because a transaction holds its exclusive locks until it commits, no other transaction can read its uncommitted writes.
Total Rollback: A total rollback undoes all the changes made by a transaction, returning the
database to its state before the transaction began. It is used when a transaction fails entirely or
encounters an error that requires undoing all its operations.
Partial Rollback: A partial rollback undoes changes only up to a certain point within a
transaction. It is used when only a part of the transaction fails, allowing the successful parts to
be committed while undoing the unsuccessful operations.
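The difference can be illustrated with SQL savepoints (a hedged sketch; the Account table and the savepoint name are assumptions):

BEGIN TRANSACTION;
UPDATE Account SET balance = balance - 100 WHERE acc_no = 1;
SAVEPOINT before_bonus;
UPDATE Account SET balance = balance + 500 WHERE acc_no = 2;
-- Partial rollback: undo only the work done after the savepoint
ROLLBACK TO SAVEPOINT before_bonus;
COMMIT;
-- A total rollback would instead issue ROLLBACK;, undoing every change
-- made by the transaction since it began.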
10. State denormalization.
Denormalization is the process of intentionally reintroducing redundancy into a normalized database, for example by combining tables or duplicating columns, in order to speed up read-heavy queries at the cost of extra storage and more complex updates.
Part – B
A database stores a lot of critical information that must be accessed quickly and securely, so selecting the correct architecture is important for efficient data management. DBMS architecture determines how users' requests reach the database. The architecture is chosen based on several factors such as the size of the database, the number of users, and the relationships among them. Two kinds of database models are generally used, the logical model and the physical model. Several types of DBMS architecture are used depending on the requirements; they are discussed below.
1-Tier Architecture
2-Tier Architecture
3-Tier Architecture
1-Tier Architecture
In 1-Tier Architecture the database is directly available to the user: the client, the server, and the database are all present on the same machine. For example, to learn SQL we may set up an SQL server and the database on the local system; this lets us interact with the relational database and execute operations directly. Industry rarely uses this architecture; production systems normally go for 2-tier or 3-tier architecture.
Easy to Implement: 1-Tier Architecture can be easily deployed, and hence it is mostly used in
small projects.
2-Tier Architecture
The 2-tier architecture is similar to a basic client-server model. The application at the client
end directly communicates with the database on the server side. APIs like ODBC and JDBC
are used for this interaction. The server side is responsible for providing query processing and
transaction management functionalities. On the client side, the user interfaces and application
programs are run. The application on the client side establishes a connection with the server
side to communicate with the DBMS.
Advantages of this model are that it is easier to maintain and understand and that it is compatible with existing systems. However, it performs poorly when there is a large number of users.
Easy to Access: 2-Tier Architecture provides easy access to the database, which makes data retrieval fast.
Scalable: We can scale the database easily, by adding clients or upgrading hardware.
Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier Architecture
Simple: 2-Tier Architecture is easily understandable as well as simple because of only two
components.
3-Tier Architecture
In 3-Tier Architecture, there is another layer between the client and the server. The client does
not directly communicate with the server. Instead, it interacts with an application server which
further communicates with the database system and then the query processing and transaction
management takes place. This intermediate layer acts as a medium for the exchange of
partially processed data between the server and the client. This type of architecture is used in
the case of large web applications.
Data Integrity: 3-Tier Architecture maintains Data Integrity. Since there is a middle layer
between the client and the server, data corruption can be avoided/removed.
Security: 3-Tier Architecture Improves Security. This type of model prevents direct
interaction of the client with the server thereby reducing access to unauthorized data.
Disadvantages of 3-Tier Architecture
Difficult to Interact: Communication between client and server becomes more complicated because of the presence of the middle layer.
11(B). Construct an E-R diagram for a hospital management system with a set of patients and a set of doctors. Associate with each patient a log of the various tests and examinations conducted.
(ii) For each entity set and relationship set used, indicate the primary key and whether the mapping is one-to-one, many-to-one, or one-to-many.
12(a) Distinguish between procedural and non-procedural languages. Is relational algebra procedural or non-procedural? Explain the operations with examples.
1.Procedural Language:
Definition:
A procedural language specifies how to perform a task, detailing the exact steps required to
achieve the desired result. The user must outline a sequence of operations to be executed.
Characteristics:
Control: The user has full control over the process and specifies the exact sequence of actions.
Step-by-Step Execution: It requires the user to describe the algorithm or procedure to achieve
the outcome.
Example in DBMS:
A typical SQL stored procedure or function that involves loops, conditionals, and multiple
SQL statements is procedural. For instance, calculating the total sales for each department by
iterating over records and summing the values in PL/SQL.
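For instance, a hedged PL/SQL sketch of such a procedural computation (the Sales table, its columns, and the department id are assumptions):

DECLARE
   v_total NUMBER := 0;
BEGIN
   -- Iterate over the sales rows of one department and sum the amounts
   FOR rec IN (SELECT amount FROM Sales WHERE dept_id = 10) LOOP
      v_total := v_total + rec.amount;
   END LOOP;
   DBMS_OUTPUT.PUT_LINE('Total sales for department 10: ' || v_total);
END;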
2. Non-Procedural Language:
Definition:
A non-procedural language specifies what the task is without requiring the user to detail how
it should be done. The system determines the most efficient way to execute the query.
Characteristics:
Control: The user specifies what they want to achieve, and the DBMS decides the best
method to retrieve or process the data.
Declarative Nature: Users only declare the conditions or relationships they are interested in
without specifying how to fulfill them.
Simplicity: Easier to use and requires less detailed knowledge of the underlying processes.
Example in DBMS:
SQL is a non-procedural language where users can write queries like SELECT * FROM
Employee WHERE age > 30; without worrying about how the database will search for the
records.
Relational algebra is a procedural query language: a query is written as a sequence of operations that specify how the result is to be computed. Its basic operations are:
Selection (σ): Retrieves tuples that satisfy a given condition.
Example: σ_{age > 30}(Employee) selects all employees older than 30.
Projection (π): Retrieves only the specified attributes of a relation.
Example: π_{name, salary}(Employee) retrieves the names and salaries of all employees.
Union (∪): Combines the tuples of two union-compatible relations.
Set Difference (−): Retrieves tuples that are in one relation but not in another.
Cartesian Product (×): Combines tuples from two relations in all possible ways.
Example: Employee × Department gives all possible pairs of employee and department tuples.
Each of these operations forms the basis of constructing complex queries in a procedural manner within relational algebra.
12(B) Consider the following relational database: Employee (person_name, street, city), Manager (person_name, manager_name). Give an SQL DDL definition of this database. Identify the referential integrity constraints that should hold, and include them in the DDL definition.
CREATE TABLE Employee (
    Person_name VARCHAR(100) PRIMARY KEY,
    Street VARCHAR(100),
    City VARCHAR(100)
);
CREATE TABLE Company (
    Company_name VARCHAR(100) PRIMARY KEY,
    City VARCHAR(100)
);
CREATE TABLE Works (
    Person_name VARCHAR(100) REFERENCES Employee(Person_name),
    Company_name VARCHAR(100) REFERENCES Company(Company_name),
    PRIMARY KEY (Person_name)
);
CREATE TABLE Manager (
    Person_name VARCHAR(100) REFERENCES Employee(Person_name),
    Manager_name VARCHAR(100) REFERENCES Employee(Person_name),
    PRIMARY KEY (Person_name)
);
13(a) Consider the relation R(A, B, C, D, E) with functional dependencies {A→BC, CD→E, B→D, E→A}. Identify the superkeys. Find F+.
- A+ = {A, B, C, D, E} (using A→BC, then B→D and CD→E)
- B+ = {B, D} (using B→D)
- C+ = {C} and D+ = {D}
- E+ = {E, A, B, C, D} (using E→A, then as for A)
- (BC)+ = {B, C, D, E, A} and (CD)+ = {C, D, E, A, B}
From the closures calculated above, the candidate keys are A, E, BC and CD, since their closures contain every attribute of R. Every superset of a candidate key is a superkey, for example:
- AB, AC, AD, AE
- BC, BCD, BCE
- CD, CDE, ACD
- ABCDE (and every other superset of A, E, BC or CD)
F+ is the set of all functional dependencies that can be derived from the given dependencies using Armstrong's axioms (reflexivity, augmentation and transitivity).
After analyzing the given dependencies, we can add dependencies such as A→D, A→E, E→B, E→C, E→D, CD→A, CD→B and BC→E to F+, together with all trivial dependencies.
F+ includes all the original dependencies and the additional ones derived above.
13(b) Discuss the procedure used for loss-less decomposition with an example.
The original relation and the relation reconstructed by joining the decomposed relations must contain the same set of tuples; if tuples are added or lost, the result is a lossy join decomposition.
Lossless join decomposition ensures that spurious tuples are never generated: for every value of the join attributes there is a unique tuple in at least one of the decomposed relations.
Lossless join decomposition is a decomposition of a relation R into relations R1, and R2 such
that if we perform a natural join of relation R1 and R2, it will return the original relation R.
This is effective in removing redundancy from databases while preserving the original data.
Decompositions performed during normalization (into 1NF, 2NF, 3NF, and BCNF) should be lossless join decompositions.
In a lossless decomposition, we select a common attribute, and the criterion for selecting the common attribute is that it must be a candidate key or super key in relation R1, R2, or both. Formally, the decomposition of R into R1 and R2 is lossless if R1 ∩ R2 → R1 or R1 ∩ R2 → R2 holds, as illustrated by the sketch below.
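A small SQL illustration (a sketch; the Employee relation and its columns are assumptions): decomposing Employee(emp_id, emp_name, dept_id, dept_name) on the common attribute dept_id, which is a key of the department side, keeps the join lossless.

-- Decompose Employee(emp_id, emp_name, dept_id, dept_name) into R1 and R2
CREATE TABLE R1 (emp_id INT PRIMARY KEY, emp_name VARCHAR(50), dept_id INT);
CREATE TABLE R2 (dept_id INT PRIMARY KEY, dept_name VARCHAR(50));

-- dept_id is a key of R2, so the natural join reconstructs the original
-- relation without spurious tuples:
SELECT * FROM R1 NATURAL JOIN R2;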
Reduced Redundancy: Each decomposed relation contains only data that is relevant to that relation. This can help to reduce data inconsistencies and errors.
Improved Flexibility: Lossless decomposition can improve the flexibility of the database
system by allowing for easier modification of the schema.
Increased Complexity: Lossless decomposition can increase the complexity of the database
system, making it harder to understand and manage.
Increased Processing Overhead: The process of decomposing a relation into smaller relations
can result in increased processing overhead. This can lead to slower query performance and
reduced efficiency.
Join Operations: Lossless decomposition may require additional join operations to retrieve
data from the decomposed relations. This can also result in slower query performance.
Costly: Decomposing relations can be costly, especially if the database is large and complex.
This can require additional resources, such as hardware and personnel.
•Recoverable schedules are desirable because the failure of a transaction might otherwise bring the system into an irreversibly inconsistent state.
•A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads data items previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.
Nonrecoverable schedules may sometimes be needed when updates must be made visible early because of time constraints, even if they have not yet been committed, which may be required for very long duration transactions.
b) What is the need for concurrency control mechanisms? Explain the working of lock-based
protocol.
→ Concurrency control is a very important concept of DBMS: it ensures that the simultaneous execution or manipulation of data by several processes or users does not result in data inconsistency.
Concurrency control provides a procedure that is able to control concurrent execution of the
operations in the database.
1. Data Consistency: It prevents data from getting corrupted when multiple users are changing the same information simultaneously (see the lost-update sketch after this list).
2. Isolation: Ensures that each transaction is treated as if it’s the only one happening, so one
person’s work doesn’t interfere with another’s.
3. Deadlock Prevention: Helps avoid situations where transactions get stuck waiting for each
other, ensuring smooth processing.
4. Efficiency: Lets multiple transactions happen at the same time without slowing things down
unnecessarily.
5. Durability and Atomicity: Makes sure all changes in a transaction are either fully completed
or not done at all, protecting the integrity of the data.
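For instance, without concurrency control two sessions can silently overwrite each other's changes (a lost update). A hedged sketch, assuming a hypothetical Account table, with the interleaving of the second session shown in comments:

-- Session 1
BEGIN TRANSACTION;
SELECT balance FROM Account WHERE acc_no = 1;   -- reads 500
-- Session 2 now runs the same SELECT and also reads 500
UPDATE Account SET balance = 500 - 100 WHERE acc_no = 1;
COMMIT;
-- Session 2 then writes 500 - 50 = 450, overwriting Session 1's update:
-- the withdrawal of 100 is lost. A concurrency control mechanism such as
-- locking prevents this interleaving.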
→ In a database management system (DBMS), lock-based concurrency control is used to control the access of multiple transactions to the same data item. This protocol helps to maintain data consistency and integrity across multiple users. In this protocol, transactions acquire locks on data items to control their access and prevent conflicts between concurrent transactions.
A lock is a variable associated with a data item that describes the status of the data item to possible
operations that can be applied to it. They synchronize the access by concurrent transactions to the
database items. It is required in this protocol that all the data items must be accessed in a mutually
exclusive manner. Two common lock modes are used: a shared (S) lock, which allows a transaction to read a data item and can be held by several transactions at once, and an exclusive (X) lock, which is required to write a data item and can be held by only one transaction at a time. The main lock-based protocols are described below.
Simplistic Lock Protocol: It is the simplest method for locking data during a transaction. Simple lock-based protocols require every transaction to obtain a lock on the data before inserting, deleting, or updating it. The data item is unlocked once the transaction is completed.
Pre-claiming Lock Protocols assess transactions to determine which data elements require locks.
Before executing the transaction, it asks the DBMS for a lock on all of the data elements. If all locks
are given, this protocol will allow the transaction to start. When the transaction is finished, it releases
all locks. If all of the locks are not provided, this protocol allows the transaction to be reversed and
waits until all of the locks are granted.
The two-phase locking protocol divides the execution phase of the transaction into three parts.
•In the first part, when the execution of the transaction starts, it seeks permission for the lock it
requires.
•In the second part, the transaction acquires all the locks. The third phase is started as soon as the
transaction releases its first lock.
•In the third phase, the transaction cannot demand any new locks. It only releases the acquired locks.
A transaction is said to follow the Two-Phase Locking protocol if Locking and Unlocking can be done
in two phases:
Growing Phase: New locks on data items may be acquired but none can be released.
Shrinking Phase: Existing locks may be released but no new locks can be acquired.
Strict Two-Phase Locking requires that, in addition to the 2PL rules, all exclusive (X) locks held by the transaction not be released until after the transaction commits. The first phase of Strict-2PL is the same as that of 2PL. The difference between 2PL and Strict-2PL is that Strict-2PL does not release an exclusive lock immediately after using it.
•Strict-2PL waits until the whole transaction commits, and then it releases all the locks at once.
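To illustrate, a hedged SQL sketch of a transaction that acquires exclusive row locks and, in the spirit of strict 2PL, holds them until commit (SELECT ... FOR UPDATE is widely supported but its syntax varies by DBMS; the Account table is an assumption):

BEGIN TRANSACTION;

-- Growing phase: acquire exclusive locks on the rows to be updated
SELECT balance FROM Account WHERE acc_no = 1 FOR UPDATE;
SELECT balance FROM Account WHERE acc_no = 2 FOR UPDATE;

UPDATE Account SET balance = balance - 100 WHERE acc_no = 1;
UPDATE Account SET balance = balance + 100 WHERE acc_no = 2;

-- Strict 2PL: all exclusive locks are held until the transaction commits
COMMIT;   -- locks are released here, all at once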
To represent complex real-world problems there was a need for a data model that is closely related to the real world. The Object Oriented Data Model represents such problems easily.
In the Object Oriented Data Model, data and their relationships are contained in a single structure which is referred to as an object. In this model, real-world problems are represented as objects with different attributes, and objects have multiple relationships between them. Basically, it is a combination of Object Oriented Programming and the Relational Database Model, as is clear from the following figure:
Objects –
An object is an abstraction of a real-world entity, or we can say it is an instance of a class. Objects encapsulate data and code into a single unit, which provides data abstraction by hiding the implementation details from the user. For example: instances of Student, Doctor, Engineer in the above figure.
Attribute –
An attribute describes the properties of an object. For example: the object is STUDENT and its attributes are Roll_no and Branch in the Student class.
Methods –
A method represents the behavior of an object; basically, it represents a real-world action. For example: setting a STUDENT's marks, shown in the above figure as Setmarks().
Class –
A class is a collection of similar objects with shared structure i.e. attributes and behavior i.e. methods.
An object is an instance of class. For example: Person, Student, Doctor, Engineer in above figure.
class Student {
    char name[20];
    int roll_no;
    // ...
public:
    void search();
    void update();
};
In this example, Student refers to the class, and S1 and S2 are objects of the class that can be created in the main function (for example, Student S1, S2;).
Inheritance –
By using inheritance, new class can inherit the attributes and methods of the old class i.e. base class.
For example: as classes Student, Doctor and Engineer are inherited from the base class Person.
•Easily understandable.
•Cost of maintenance can be reduced due to the reusability of attributes and functions because of inheritance.
•Hbase is an open-source, distributed, scalable NoSQL database modeled after Google’s Bigtable, and it runs on top of Hadoop’s HDFS (Hadoop Distributed File System).
• The Hbase database is column-oriented, which distinguishes it from most other databases. One of the unique qualities of Hbase is that it is loosely typed: different rows can store values of different data types in the same column.
•It contains different sets of tables that maintain the data in key-value format. Hbase is best suited for sparse data sets, which are very common in the case of big data.
•The Hbase data model is quite different from traditional relational databases.
Row Key: Each row in an Hbase table is identified by a unique row key, which is a byte array. Row
keys are stored in lexicographical order, so data access is fast and efficient when rows are accessed by
key.
Column Families: Columns in Hbase are grouped into column families, which are predefined and
stored together on disk. A table must have at least one column family, and each column family can
have multiple columns. Column families are defined at table creation and are stored as separate files,
which makes retrieval of columns from the same family efficient.
Columns: Columns are identified by their column family and a qualifier. The full name of a column is
formed as column_family:qualifier.
Cell: The intersection of a row key and a column (column family + qualifier) forms a cell. Each cell
stores a versioned value, where the version is identified by a timestamp (by default) or a custom
version number. The latest version is returned by default when querying.
Timestamp: Hbase allows storing multiple versions of data within a cell. Each version is identified
by a timestamp, which can either be automatically generated by Hbase or provided by the user. The
latest version is always returned by default unless specified otherwise.
Row Key   Personal:name   Personal:age   Contact:email       Contact:phone
User 1    John Doe        30             [email protected]   1234567890
User 2    Jane Smith      25             [email protected]   0987654321
In this example:
•The personal column family contains the columns name and age.
•The contact column family contains the columns email and phone.
16(a) Describe normalisation up to 3NF and BCNF with examples. State the desirable properties of decomposition.
Normalization:
→Normalization is used to minimize the redundancy from a relation or set of relations. It is also used
to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies.
→Normalization divides the larger table into smaller tables and links them using relationships.
→The normal form is used to reduce redundancy from the database table.
Example:
EMPLOYEE TABLE:
Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
TEACHER TABLE
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Maths
83 Computer
Example:
• Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on the super key (EMP_ID), which violates the rule of third normal form.
• That is why we need to move EMP_CITY and EMP_STATE to a new EMPLOYEE_ZIP table, with EMP_ZIP as its primary key.
EMPLOYEE table:
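A hedged SQL sketch of the two relations after this 3NF decomposition (the column data types and the EMP_NAME column are assumptions):

CREATE TABLE EMPLOYEE_ZIP (
    EMP_ZIP   VARCHAR(10) PRIMARY KEY,
    EMP_STATE VARCHAR(30),
    EMP_CITY  VARCHAR(30)
);

CREATE TABLE EMPLOYEE (
    EMP_ID   INT PRIMARY KEY,
    EMP_NAME VARCHAR(50),
    EMP_ZIP  VARCHAR(10) REFERENCES EMPLOYEE_ZIP(EMP_ZIP)
);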
Example: Let’s assume there is a company where employees work in more than one department.
•The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
•To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
364 UK
EMP_DEPT table:
EMP_DEPT_MAPPING table:
DEPT_TYPE EMP_DEPT_TYPE
D394 283
D394 300
D283 232
D283 549
PROPERTIES OF DECOMPOSITION:
Decomposition refers to the division of a table into multiple tables to produce consistency in the data. Its desirable properties are listed below.
PROPERTIES:
Lossless: Every decomposition that we perform in a database management system should be lossless: no information should be lost when the sub-relations are joined to reconstruct the original relation. This helps to remove redundant data from the database.
Lack of Data Redundancy: Data redundancy is generally termed duplicate or repeated data. This property states that the decomposition performed should not suffer from redundant data. It helps us to get rid of unwanted data and focus only on the useful data or information.
• The query optimizer is a critical component of a Database Management System (DBMS) that
determines the most efficient way to execute a given query.
• It generates various query plans and chooses the one with the least cost, ensuring optimal
performance in terms of time and resource utilization.
Steps of Query Optimization:
1. Query Submission:
The process starts when a user or application submits an SQL query to the DBMS. This query specifies the data to be retrieved or manipulated.
2. Parsing:
Parsing is the first step where the SQL query is checked for syntax errors and then translated into an
internal representation, typically an Abstract Syntax Tree (AST).
Steps in Parsing:
•Lexical Analysis: The query is broken down into tokens such as keywords, identifiers, and operators.
•Syntactic Analysis: The sequence of tokens is checked against the SQL grammar rules.
•Semantic Analysis: The query is checked for semantic correctness, such as verifying that the tables
and columns exist.
3. Logical Plan Generation:
Once the query is parsed, a Logical Query Plan is generated. This plan describes the sequence of high-level operations (e.g., select, project, join) needed to execute the query.
•Transformation Rules: The logical plan is transformed using a set of rules to optimize the sequence
of operations (e.g., pushing selections down to reduce the size of intermediate results).
4.Cost Estimation:
The Cost Estimator evaluates different possible execution strategies for the query and assigns a cost to
each. The cost usually considers factors like I/O operations, CPU usage, and memory consumption.
•Cost Metrics: Costs are usually estimated in terms of disk I/O (number of page reads and writes),
CPU usage, and response time.
•Plan Alternatives: For each logical operation (e.g., join), multiple physical methods (e.g., nested
loop join, hash join) are considered and their costs are estimated.
5. Physical Plan Generation:
The logical plan is converted into a Physical Query Plan. This plan specifies the actual algorithms and data access methods that will be used to execute the query.
Physical Operators:
•Table Access Methods: Methods like table scan, index scan, or index seek are chosen based on the
data and indexes available
•Join Algorithms: The optimizer selects between different join methods (e.g., nested loop join, sort-
merge join, hash join) based on cost and data characteristics.
•Sorting and Aggregation: Operations like sorting, grouping, and aggregation are optimized using
efficient algorithms like quicksort, hash aggregation, etc.
6. Plan Selection:
The optimizer compares the costs of all possible physical plans and selects the one with the lowest
estimated cost.
•Search Space: The optimizer explores a large search space of possible execution plans, considering
various transformations and join orders.
•Heuristics vs. Exhaustive Search: Some optimizers use heuristics to limit the search space (e.g.,
limiting the number of joins considered), while others may use exhaustive search methods like
dynamic programming.
7. Result of query:
The selected query execution plan is passed to the Query Executor, which actually retrieves the data
from the database.
•Materialized Execution: Intermediate results are stored temporarily, and subsequent operations are performed on these stored results.
•Pipelined Execution: Results of one operator are passed directly to the next operator without being written to temporary storage.
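As a concrete way to inspect the plan the optimizer finally chose, most systems expose an EXPLAIN command (a sketch; the exact syntax and output format vary by DBMS, and the Employee table is an assumption):

EXPLAIN
SELECT name, salary
FROM Employee
WHERE age > 30;
-- Typical output describes the chosen physical plan, for example whether
-- a sequential scan or an index scan on age is used and its estimated cost.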