DBMS Tutorial - 3 Solutions Final
Soln:
A multivalued dependency X →→ Y specified on relation schema R, where X and Y are both subsets of R,
specifies the following constraint on any relation state r of R: If two tuples t1 and t2 exist in r such that
t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where we use
Z to denote (R − (X ∪ Y)):
t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y] and t4[Y] = t2[Y]
t3[Z] = t2[Z] and t4[Z] = t1[Z]
Here, StudentID →→ Course and StudentID →→ Hobby are multivalued dependencies. To achieve
4NF, we decompose the relation into two:
i) STUDENT_COURSE(StudentID, Course)
ii) STUDENT_HOBBY(StudentID, Hobby)
STUDENT_COURSE
StudentID   Course
1           Maths
1           Science

STUDENT_HOBBY
StudentID   Hobby
1           Football
1           Basketball
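The tuple-based definition above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical state of the original STUDENT(StudentID, Course, Hobby) relation (the source does not show it) in which student 1 takes both courses and has both hobbies. The helper names mvd_holds and project are illustrative, not part of any library.

```python
from itertools import product

def mvd_holds(rows, x, y):
    """Check X ->> Y on a relation given as a list of dicts (one dict per tuple)."""
    if not rows:
        return True
    z = [a for a in rows[0] if a not in x and a not in y]  # Z = R - (X u Y)
    present = {tuple(sorted(r.items())) for r in rows}
    # Ordered pairs (t1, t2) cover both t3 and t4 of the definition.
    for t1, t2 in product(rows, repeat=2):
        if all(t1[a] == t2[a] for a in x):
            # t3 takes its X and Y values from t1 and its Z values from t2.
            t3 = {a: (t2[a] if a in z else t1[a]) for a in t1}
            if tuple(sorted(t3.items())) not in present:
                return False
    return True

def project(rows, attrs):
    """Projection with duplicate elimination, returned as sorted tuples."""
    return sorted({tuple(r[a] for a in attrs) for r in rows})

# Assumed sample state: every course of student 1 paired with every hobby.
student = [
    {"StudentID": 1, "Course": c, "Hobby": h}
    for c in ("Maths", "Science")
    for h in ("Football", "Basketball")
]

student_course = project(student, ["StudentID", "Course"])
student_hobby = project(student, ["StudentID", "Hobby"])
print(mvd_holds(student, ["StudentID"], ["Course"]))
print(student_course)
print(student_hobby)
```

Dropping any one of the four STUDENT tuples makes the check fail, which is why the combined relation must store every course/hobby combination while the two projections do not.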
2) Which normal form is based on transitive functional dependency and full functional dependency?
Explain the same with an example.
Soln:
3NF is based on the concept of transitive dependency; because a relation is tested for 2NF first, it also
relies on the concept of full functional dependency, on which 2NF is based.
A functional dependency X → Y is a full functional dependency if removal of any attribute A from X means
that the dependency does not hold anymore; that is, for any attribute A ∈ X, (X − {A}) does not
functionally determine Y.
A functional dependency X → Y in a relation schema R is a transitive dependency if there exists a set of
attributes Z in R that is neither a candidate key nor a subset of any key of R, and both X → Z and Z → Y
hold.
Definition: A relation schema R is in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively
dependent on the primary key.
Alternative Definition: A relation schema R is in 3NF if every nonprime attribute of R meets both of the
following conditions:
a) It is fully functionally dependent on every key of R.
b) It is non-transitively dependent on every key of R.
Intuitively, we can see that any functional dependency in which the left-hand side is part of the primary key,
or any functional dependency in which the left-hand side is a non-key attribute, is a problematic FD.
2NF and 3NF normalization remove these problem FDs by decomposing the original relation into new
relations.
In terms of the normalization process, 3NF has been defined with the assumption that a relation is tested for
2NF first before it is tested for 3NF.
Ex:
The relation schema EMP_DEPT is in 2NF, since no partial dependencies on a key exist.
However, EMP_DEPT is not in 3NF because of the transitive dependency of Dmgr_ssn and also Dname
on Ssn via Dnumber.
We can normalize EMP_DEPT by decomposing it into the two 3NF relation schemas ED1 and ED2.
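The decomposition can be illustrated with a small sketch. The attribute split is an assumption, since the schema figure is not reproduced here: ED1 is taken to keep the employee attributes plus Dnumber, and ED2 the department attributes. The sample rows and the Ename attribute are hypothetical.

```python
# Hypothetical EMP_DEPT state: Dname and Dmgr_ssn repeat for every employee
# of the same department (Ssn -> Dnumber and Dnumber -> Dname, Dmgr_ssn).
emp_dept = [
    {"Ssn": "111", "Ename": "Asha", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "333"},
    {"Ssn": "222", "Ename": "Ravi", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "333"},
    {"Ssn": "444", "Ename": "Meena", "Dnumber": 4, "Dname": "Admin", "Dmgr_ssn": "555"},
]

def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    out = []
    for r in rows:
        p = {a: r[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

ed1 = project(emp_dept, ["Ssn", "Ename", "Dnumber"])
ed2 = project(emp_dept, ["Dnumber", "Dname", "Dmgr_ssn"])

# Natural join on Dnumber recovers the original relation.
rejoined = [{**e, **d} for e in ed1 for d in ed2 if e["Dnumber"] == d["Dnumber"]]

def as_set(rows):
    return {tuple(sorted(r.items())) for r in rows}

print(len(ed2))                              # one row per department
print(as_set(rejoined) == as_set(emp_dept))  # the decomposition loses nothing
```

Each department's name and manager are now stored once in ED2, so changing a manager touches a single row.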
3) What is a functional dependency? Write an algorithm to find the minimal cover for a set of functional
dependencies. Construct the minimal cover M for the set of functional dependencies
E: {B → A, D → A, AB → D}.
Soln:
A functional dependency is a constraint between two sets of attributes from the database.
A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are subsets of
R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that,
for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].
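The constraint can be tested directly on a relation state. This is a minimal sketch; the function name fd_holds and the sample rows are assumptions made for illustration.

```python
def fd_holds(rows, x, y):
    """Return True if X -> Y holds in rows (a list of dicts, one per tuple)."""
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in x)
        y_val = tuple(t[a] for a in y)
        # Tuples that agree on X must also agree on Y.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True

# Hypothetical relation state.
rows = [
    {"Ssn": "111", "Dnumber": 5, "Dname": "Research"},
    {"Ssn": "222", "Dnumber": 5, "Dname": "Research"},
    {"Ssn": "444", "Dnumber": 4, "Dname": "Admin"},
]
print(fd_holds(rows, ["Dnumber"], ["Dname"]))  # Dnumber -> Dname
print(fd_holds(rows, ["Dname"], ["Ssn"]))      # Dname -> Ssn
```

A state can only refute a dependency, not prove it: a functional dependency is a property of the schema, so a True result means only that this particular state does not violate it.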
Algorithm:
Input: A set of functional dependencies E.
Step 1: Set F := E.
Step 2: Replace each functional dependency X → {A1, A2, …, An} in F by the n functional dependencies
X → A1, X → A2, …, X → An.
Step 3: For each functional dependency X → A in F
    for each attribute B that is an element of X
        if {{F − {X → A}} ∪ {(X − {B}) → A}} is equivalent to F
        then replace X → A with (X − {B}) → A in F.
Step 4: For each remaining functional dependency X → A in F
    if {F − {X → A}} is equivalent to F,
    then remove X → A from F.
Problem Solution:
Given E: {B → A, D → A, AB → D}.
All of the above dependencies are in canonical form (that is, they have only one attribute on the right-hand
side), so steps 1 and 2 of the algorithm are already complete and we can proceed to step 3.
In step 3 we need to determine if AB → D has any redundant attribute on the left-hand side; that is,
can it be replaced by B → D or A → D?
Since B → A, by augmenting with B on both sides (IR2), we have BB → AB, or B → AB (i).
However, AB → D as given (ii).
Hence by the transitive rule (IR3), we get from (i) and (ii), B → D. Thus AB → D may be
replaced by B → D.
We now have a set equivalent to the original E, say E′: {B → A, D → A, B → D}. No further
reduction is possible in step 3 since all FDs have a single attribute on the left-hand side.
In step 4 we look for a redundant FD in E′. By using the transitive rule on B → D and D → A, we
derive B → A. Hence B → A is redundant in E′ and can be eliminated.
Therefore, the minimal cover of E is F: {B → D, D → A}.
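The algorithm and the worked result can be cross-checked with a short sketch. It assumes single-letter attribute names and tests equivalence through attribute closure; the function names closure and minimal_cover are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs frozenset, rhs attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result

def minimal_cover(e):
    # Steps 1 and 2: copy E and split every right-hand side into single attributes.
    f = list(dict.fromkeys((frozenset(lhs), a) for lhs, rhs in e for a in rhs))
    # Step 3: B is redundant in X -> A if A is in the closure of (X - {B}) under F.
    for i in range(len(f)):
        lhs, a = f[i]
        for b in sorted(lhs):
            reduced = f[i][0] - {b}
            if reduced and a in closure(reduced, f):  # empty left-hand sides are not produced
                f[i] = (reduced, a)
    f = list(dict.fromkeys(f))
    # Step 4: X -> A is redundant if it follows from the remaining dependencies.
    for fd in list(f):
        rest = [g for g in f if g != fd]
        if fd[1] in closure(fd[0], rest):
            f = rest
    return f

# E from the problem: each pair is (left-hand side, right-hand side).
E = [("B", "A"), ("D", "A"), ("AB", "D")]
cover = minimal_cover(E)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", rhs)
```

A minimal cover is not unique in general; a different processing order may give a different but equivalent set for other inputs.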
4) Explain the usage of aggregate function in SQL. Write an SQL query to find sum of salaries of the
employees, maximum salary, minimum salary and average salary by renaming the columns in a
single row table.
Soln:
Aggregate functions are used to summarize information from multiple tuples into a single-tuple summary.
Grouping is used to create subgroups of tuples before summarization. Grouping and aggregation are
required in many database applications.
A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG.
The COUNT function returns the number of tuples or values as specified in a query.
The functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and
return, respectively, the sum, maximum value, minimum value, and average (mean) of those values.
These functions can be used in the SELECT clause or in a HAVING clause.
Problem Solution
We could use AS to rename the column names in the resulting single-row table.
SELECT
SUM(Salary) AS TotalSalary,
MAX(Salary) AS MaxSalary,
MIN(Salary) AS MinSalary,
AVG(Salary) AS AvgSalary
FROM Employees;
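To see the single-row result, the query can be run against an in-memory SQLite database. This is a sketch only: the Employees rows are hypothetical and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary REAL)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("A", 30000), ("B", 45000), ("C", 60000)],
)

cur = conn.execute("""
    SELECT SUM(Salary) AS TotalSalary,
           MAX(Salary) AS MaxSalary,
           MIN(Salary) AS MinSalary,
           AVG(Salary) AS AvgSalary
    FROM Employees
""")
columns = [d[0] for d in cur.description]  # the names given with AS
summary = dict(zip(columns, cur.fetchone()))
print(summary)
```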
5) How are assertions and triggers defined in SQL? Explain with examples. Explain transaction
support in SQL.
Soln:
Assertions:
Assertions are constraints that apply to the database as a whole, rather than to individual tables.
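An assertion is defined with CREATE ASSERTION and a CHECK condition over the database, as in the
example below.
Ex:
CREATE ASSERTION SalaryCheck
CHECK (NOT EXISTS (SELECT * FROM Employees
                   WHERE Salary < 0));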
Triggers:
A trigger is procedural code that executes automatically in response to certain events on a particular
table or view.
A trigger is defined as shown in the example below.
Ex:
CREATE TRIGGER SalaryUpdate
BEFORE UPDATE ON Employees
FOR EACH ROW
BEGIN
    IF NEW.Salary < 0 THEN
        SET NEW.Salary = 0;
    END IF;
END;
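Trigger syntax differs between systems. As a runnable sketch, the same rule can be tried in SQLite through Python. Note two assumptions: SQLite cannot assign to NEW.Salary inside a trigger, so this variant rejects the update instead of resetting the salary to 0, and the sample row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary REAL)")
conn.execute("INSERT INTO Employees VALUES ('A', 1000)")

# SQLite variant of SalaryUpdate: abort the statement rather than fix the value.
conn.execute("""
    CREATE TRIGGER SalaryUpdate
    BEFORE UPDATE ON Employees
    FOR EACH ROW
    WHEN NEW.Salary < 0
    BEGIN
        SELECT RAISE(ABORT, 'Salary must not be negative');
    END
""")

try:
    conn.execute("UPDATE Employees SET Salary = -50 WHERE Name = 'A'")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True  # the trigger fired and the update was undone

salary = conn.execute("SELECT Salary FROM Employees WHERE Name = 'A'").fetchone()[0]
print(rejected, salary)
```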
Transaction Support
The basic definition of an SQL transaction is similar to our already defined concept of a transaction. That
is, it is a logical unit of work and is guaranteed to be atomic.
A single SQL statement is always considered to be atomic: either it completes execution without an error
or it fails and leaves the database unchanged.
Transaction initiation is done implicitly when particular SQL statements are encountered.
However, every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK. Every transaction has certain characteristics attributed to it.
These characteristics are specified by a SET TRANSACTION statement in SQL.
The characteristics are the Access mode, Diagnostic area size and Isolation level.
i) Access mode:
It can be specified as READ ONLY or READ WRITE. The default is READ WRITE, unless the
isolation level of READ UNCOMMITTED is specified (see below), in which case READ ONLY
is assumed.
A mode of READ WRITE allows select, update, insert, delete, and create commands to be executed.
A mode of READ ONLY, as the name implies, is simply for data retrieval.
ii) Diagnostic area size :
The option DIAGNOSTIC SIZE n specifies an integer value n, which indicates the number of conditions
that can be held simultaneously in the diagnostic area.
These conditions supply feedback information (errors or exceptions) to the user or program on the n
most recently executed SQL statements.
iii) Isolation level:
This option is specified using the statement ISOLATION LEVEL <isolation>, where the value for
<isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or
SERIALIZABLE.
The default isolation level is SERIALIZABLE, although some systems use READ COMMITTED
as their default.
The use of the term SERIALIZABLE here is based on not allowing violations that cause dirty read,
unrepeatable read, and phantoms, and it is thus not identical to the way serializability is formally
defined for schedules.
If a transaction executes at a lower isolation level than SERIALIZABLE, then one or more of the
following three violations may occur:
a) Dirty read:
A transaction T1 may read the update of a transaction T2, which has not yet committed. If T2
fails and is aborted, then T1 would have read a value that does not exist and is incorrect.
b) Nonrepeatable read:
A transaction T1 may read a given value from a table. If another transaction T2 later updates
that value and T1 reads that value again, T1 will see a different value.
c) Phantoms:
A transaction T1 may read a set of rows from a table, perhaps based on some condition
specified in the SQL WHERE-clause.
Now suppose that a transaction T2 inserts a new row r that also satisfies the WHERE-
clause condition used in T1, into the table used by T1.
The record r is called a phantom record because it was not there when T1 starts but is
there when T1 ends.
T1 may or may not see the phantom, a row that previously did not exist. If the equivalent
serial order is T1 followed by T2, then the record r should not be seen; but if it is T2
followed by T1, then the phantom record should be in the result given to T1.
If the system cannot ensure the correct behavior, then it does not deal with the phantom
record problem.
Ex:
BEGIN TRANSACTION;
UPDATE Employees
SET Salary = Salary * 1.10
WHERE Project = 'IoT';
COMMIT;
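The effect of the explicit end statement can be observed with a small sketch using Python's sqlite3 module, which starts a transaction implicitly before an UPDATE in its default mode. The table contents are hypothetical, and SQLite is only a stand-in for the DBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Project TEXT, Salary REAL)")
conn.execute("INSERT INTO Employees VALUES ('A', 'IoT', 1000)")
conn.commit()

def salary_of_a():
    return conn.execute("SELECT Salary FROM Employees WHERE Name = 'A'").fetchone()[0]

raise_sql = "UPDATE Employees SET Salary = Salary * 1.10 WHERE Project = 'IoT'"

# The transaction begins implicitly with the UPDATE; ROLLBACK undoes it.
conn.execute(raise_sql)
conn.rollback()
after_rollback = salary_of_a()

# The same update ended with COMMIT becomes permanent.
conn.execute(raise_sql)
conn.commit()
after_commit = salary_of_a()

print(after_rollback, round(after_commit, 2))
```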
6) With Example Explain Correlated and Non-Correlated Queries.
Soln:
A correlated query is a subquery that references columns from the outer query.
Ex:
SELECT e1.Name
FROM Employees e1
WHERE e1.Salary > (SELECT AVG(e2.Salary)
FROM Employees e2
WHERE e2.Department = e1.Department);
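By contrast, a non-correlated query is a subquery that does not reference the outer query, so it can be evaluated once on its own. The sketch below runs both forms on hypothetical Employees rows in SQLite: the correlated form compares each salary with the average of that employee's department, the non-correlated form with the overall average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT, Salary REAL)")
# Hypothetical sample data: Sales averages 150, IT averages 400, overall 275.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("A", "Sales", 100), ("B", "Sales", 200), ("C", "IT", 300), ("D", "IT", 500)],
)

# Correlated: the inner query refers to e1.Department from the outer query.
correlated = [r[0] for r in conn.execute("""
    SELECT e1.Name
    FROM Employees e1
    WHERE e1.Salary > (SELECT AVG(e2.Salary)
                       FROM Employees e2
                       WHERE e2.Department = e1.Department)
    ORDER BY e1.Name
""")]

# Non-correlated: the inner query is independent of the outer row.
non_correlated = [r[0] for r in conn.execute("""
    SELECT Name
    FROM Employees
    WHERE Salary > (SELECT AVG(Salary) FROM Employees)
    ORDER BY Name
""")]

print(correlated)
print(non_correlated)
```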
Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose that
their operations are interleaved as shown in Figure (a); then the final value of item X is incorrect
because T2 reads the value of X before T1 changes it in the database, and hence the updated value
resulting from T1 is lost.
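The lost update can be reproduced with a tiny simulation. The figure is not available here, so the concrete operations are assumptions: T1 subtracts 5 from X, T2 adds 4, and X starts at 80.

```python
db = {"X": 80}  # hypothetical initial value of item X

# Interleaved schedule: both transactions read X before either one writes it.
t1_x = db["X"]        # T1: read_item(X)
t2_x = db["X"]        # T2: read_item(X), before T1 changes X in the database
db["X"] = t1_x - 5    # T1: write_item(X)
db["X"] = t2_x + 4    # T2: write_item(X) overwrites T1's update

interleaved_result = db["X"]
serial_result = 80 - 5 + 4  # what any serial order of T1 and T2 would give

print(interleaved_result, serial_result)
```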
The above figure shows an example where T1 updates item X and then fails before completion, so the
system must roll back X to its original value.
Before it can do so, however, transaction T2 reads the temporary value of X, which will not be recorded
permanently in the database because of the failure of T1.
The value of item X that is read by T2 is called dirty data because it has been created by a transaction
that has not completed and committed yet; hence, this problem is also known as the dirty read
problem.
A transaction is an atomic unit of work that should either be completed in its entirety or not done at all.
For recovery purposes, the system needs to keep track of when each transaction starts, terminates, and
commits, or aborts. Therefore, the recovery manager of the DBMS needs to keep track of the following
operations:
a) BEGIN_TRANSACTION: This marks the beginning of transaction execution.
b) READ or WRITE: These specify read or write operations on the database items that are executed as
part of a transaction.
c) END_TRANSACTION: This specifies that READ and WRITE transaction operations have ended and
marks the end of transaction execution. However, at this point it may be necessary to check whether
the changes introduced by the transaction can be permanently applied to the database (committed).
d) COMMIT_TRANSACTION: This signals a successful end of the transaction so that any changes
(updates) executed by the transaction can be safely committed to the database and will not be undone.
e) ROLLBACK (or ABORT): This signals that the transaction has ended unsuccessfully, so that any
changes or effects that the transaction may have applied to the database must be undone.
Active state: A transaction goes into an active state immediately after it starts execution, where it can
execute its READ and WRITE operations.
Partially committed state: When the transaction ends, it moves to the partially committed state. At this
point, some types of concurrency control protocols may do additional checks to see if the transaction can
be committed or not. Also, some recovery protocols need to ensure that a system failure will not result in
an inability to record the changes of the transaction permanently (usually by recording changes in the
system log).
Committed state: If the above checks are successful, the transaction is said to have reached its commit
point and enters the committed state.
Failed state: A transaction can go to the failed state if one of the checks fails or if the transaction is aborted
during its active state. The transaction may then have to be rolled back to undo the effect of its WRITE
operations on the database.
Terminated state: The terminated state corresponds to the transaction leaving the system. The transaction
information that is maintained in system tables while the transaction has been running is removed when
the transaction terminates.
Failed or aborted transactions may be restarted later—either automatically or after being resubmitted by
the user—as brand new transactions.
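The states and the moves between them can be summarized as a small state machine. This is a sketch: the event names (end_transaction, commit, abort, terminate) are labels chosen here for the transitions described above, not DBMS commands.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("active", "end_transaction"): "partially_committed",
    ("active", "abort"): "failed",
    ("partially_committed", "commit"): "committed",
    ("partially_committed", "abort"): "failed",  # one of the checks failed
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}

def run(events, state="active"):
    """Follow the events from the given state and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state

print(run(["end_transaction", "commit", "terminate"]))  # successful transaction
print(run(["abort", "terminate"]))                      # aborted while active
```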
10) Explain different types of locks used in concurrency control
Soln:
Two types of locks are used in concurrency control:
i) Binary Locks:
A binary lock has two states or values: locked (1) and unlocked (0).
A distinct lock is associated with each database item X.
If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests it.
If the value of the lock on X is 0, item X can be accessed when requested, and
the lock value is changed to 1.
It includes two operations, lock_item and unlock_item.
If the binary locking system is used, then every transaction must obey the following rules:
a) A transaction T must issue the operation lock_item(X) before any read_item(X) or write_item(X)
operations are performed in T.
b) A transaction T must issue the operation unlock_item(X) after all read_item(X) and write_item(X)
operations are completed in T.
c) A transaction T will not issue a lock_item(X) operation if it already holds the lock on item X.
d) A transaction T will not issue an unlock_item(X) operation unless it already holds the lock on
item X.
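The binary locking rules can be sketched as a small lock table. This is a single-threaded illustration with a hypothetical class name; a real lock manager would place a blocked transaction in a wait queue, whereas here lock_item simply returns False.

```python
class BinaryLockManager:
    """Lock table for binary locks: an item is either locked (1) or unlocked (0)."""

    def __init__(self):
        self.holder = {}  # item -> transaction holding the lock; absent means unlocked

    def lock_item(self, txn, item):
        if self.holder.get(item) == txn:
            raise RuntimeError("rule c: transaction already holds this lock")
        if item in self.holder:
            return False          # LOCK(X) = 1: the requester would have to wait
        self.holder[item] = txn   # LOCK(X) changes from 0 to 1
        return True

    def unlock_item(self, txn, item):
        if self.holder.get(item) != txn:
            raise RuntimeError("rule d: transaction does not hold this lock")
        del self.holder[item]     # LOCK(X) changes back to 0

locks = BinaryLockManager()
print(locks.lock_item("T1", "X"))  # T1 acquires the lock
print(locks.lock_item("T2", "X"))  # T2 is refused while T1 holds it
locks.unlock_item("T1", "X")
print(locks.lock_item("T2", "X"))  # now T2 can lock X
```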
b. UPDATE EMPLOYEE E
SET Salary = Salary * 1.10
WHERE E.SSN IN (SELECT W.SSN FROM WORKS_ON W
JOIN PROJECT P ON W.PNo = P.PNo
WHERE P.PName = 'IoT');
c. SELECT
SUM(E.Salary) AS TotalSalary,
MAX(E.Salary) AS MaxSalary,
MIN(E.Salary) AS MinSalary,
AVG(E.Salary) AS AvgSalary
FROM EMPLOYEE E
JOIN DEPARTMENT D ON E.DNo = D.DNo
WHERE D.DName = 'Accounts';
iii) Consistency:
MongoDB provides tunable consistency levels. By default, it offers strong consistency for reads and writes with
the option to configure read preferences for eventual consistency in specific scenarios.
v) Distributed Data:
Data is distributed across multiple servers or data centers, allowing for better load balancing, improved read and
write performance, and geographic distribution.
vii) Elasticity:
MongoDB's architecture allows for easy scaling up or down. Additional nodes can be added to the cluster
without significant downtime, and data is rebalanced automatically.