DBMS Tutorial - 3 Solutions Final

study material

Uploaded by

Mahesh .v
Copyright
© © All Rights Reserved

1) Define multi-valued dependency. Explain 4NF with an example.

Soln:
• A multivalued dependency X →→ Y specified on relation schema R, where X and Y are both subsets of R,
specifies the following constraint on any relation state r of R: if two tuples t1 and t2 exist in r such that
t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where we use
Z to denote (R − (X ∪ Y)):
▪ t3[X] = t4[X] = t1[X] = t2[X]
▪ t3[Y] = t1[Y] and t4[Y] = t2[Y]
▪ t3[Z] = t2[Z] and t4[Z] = t1[Z]

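4th Normal Form (4NF)

• A relation is in 4NF if it is in Boyce-Codd Normal Form (BCNF) and, for every non-trivial multivalued
dependency X →→ Y that holds on it, X is a superkey. Informally, it has no non-trivial multivalued
dependencies other than those implied by its keys.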
 Example:
 Consider a relation STUDENT with attributes StudentID, Course, and Hobby.
 A student can be enrolled in multiple courses and have multiple hobbies independently of each other.
STUDENT
StudentID   Course    Hobby
1           Maths     Football
1           Maths     Basketball
1           Science   Football
1           Science   Basketball

• Here, StudentID →→ Course and StudentID →→ Hobby are multivalued dependencies, and StudentID is not a
superkey of STUDENT (the key consists of all three attributes), so STUDENT is not in 4NF. To achieve
4NF, we decompose the relation into two:
i) STUDENT_COURSE(StudentID, Course)
ii) STUDENT_HOBBY(StudentID, Hobby)

STUDENT_COURSE            STUDENT_HOBBY
StudentID   Course        StudentID   Hobby
1           Maths         1           Football
1           Science       1           Basketball
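
The example can be checked mechanically. The Python sketch below is only an illustration: the helper names `mvd_holds` and `project` are assumptions, not part of the original solution. It tests the multivalued-dependency definition against the STUDENT rows and confirms that joining the two projections on StudentID gives back the original relation.

```python
from itertools import product

# STUDENT(StudentID, Course, Hobby) rows copied from the example table.
STUDENT = {
    (1, "Maths", "Football"),
    (1, "Maths", "Basketball"),
    (1, "Science", "Football"),
    (1, "Science", "Basketball"),
}


def pick(row, cols):
    """Project one tuple onto the given column positions."""
    return tuple(row[i] for i in cols)


def mvd_holds(rows, x, y):
    """Return True if X ->> Y holds; x and y are tuples of column positions."""
    if not rows:
        return True
    width = len(next(iter(rows)))
    z = tuple(i for i in range(width) if i not in x and i not in y)
    for t1, t2 in product(rows, repeat=2):
        if pick(t1, x) != pick(t2, x):
            continue
        # The definition requires a tuple t3 with t1's Y-values and t2's Z-values.
        # Because every ordered pair is visited, t4 is covered by the swapped pair.
        if not any(
            pick(t3, x) == pick(t1, x)
            and pick(t3, y) == pick(t1, y)
            and pick(t3, z) == pick(t2, z)
            for t3 in rows
        ):
            return False
    return True


def project(rows, cols):
    """Relational projection (duplicates disappear because the result is a set)."""
    return {pick(row, cols) for row in rows}


student_course = project(STUDENT, (0, 1))
student_hobby = project(STUDENT, (0, 2))

# Natural join of the two projections on StudentID.
rejoined = {
    (s, c, h)
    for (s, c) in student_course
    for (s2, h) in student_hobby
    if s == s2
}
```

Removing any single row from STUDENT makes `mvd_holds` return False, which is exactly the update anomaly that the 4NF decomposition avoids.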
2) Which normal form is based on transitive functional dependency and full functional dependency?
Explain the same with an example.
Soln:
• 2NF is based on the concept of full functional dependency, and 3NF is based on the concept of transitive
dependency. Because a relation must satisfy 2NF before it can be in 3NF, 3NF rests on both concepts.
• A functional dependency X → Y is a full functional dependency if removal of any attribute A from X means
that the dependency does not hold anymore; that is, for any attribute A ∈ X, (X − {A}) does not
functionally determine Y.
 A functional dependency X → Y in a relation schema R is a transitive dependency if there exists a set of
attributes Z in R that is neither a candidate key nor a subset of any key of R, and both X → Z and Z → Y
hold.
 Definition: A relation schema R is in 3NF if it satisfies 2NF and no nonprime attribute of R is transitively
dependent on the primary key.
 Alternative Definition: A relation schema R is in 3NF if every nonprime attribute of R meets both of the
following conditions:
 It is fully functionally dependent on every key of R.
 It is non-transitively dependent on every key of R.
 Intuitively, we can see that any functional dependency in which the left-hand side is part of the primary key,
or any functional dependency in which the left-hand side is a non-key attribute, is a problematic FD.
 2NF and 3NF normalization remove these problem FDs by decomposing the original relation into new
relations.
 In terms of the normalization process, 3NF has been defined with the assumption that a relation is tested for
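2NF first before it is tested for 3NF.
• Ex: Consider the relation schema EMP_DEPT with key Ssn, which stores each employee together with the
Dnumber, Dname, and Dmgr_ssn of the employee's department.
• The relation schema EMP_DEPT is in 2NF, since no partial dependencies on a key exist.
• However, EMP_DEPT is not in 3NF because of the transitive dependency of Dmgr_ssn and also Dname
on Ssn via Dnumber: Ssn → Dnumber and Dnumber → {Dname, Dmgr_ssn} both hold, and Dnumber is neither
a key nor a subset of a key.
• We can normalize EMP_DEPT by decomposing it into the two 3NF relation schemas ED1 and ED2: ED1 keeps
Ssn, the remaining employee attributes, and Dnumber, while ED2 holds Dnumber, Dname, and Dmgr_ssn.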
3) What is a functional dependency? Write an algorithm to find the minimal cover for a set of functional
dependencies. Construct the minimal cover M for the set of functional dependencies
E: {B → A, D → A, AB → D}.

Soln:
 A functional dependency is a constraint between two sets of attributes from the database.
 A functional dependency, denoted by X → Y, between two sets of attributes X and Y that are subsets of
R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that,
for any two tuples t1 and t2 in r that have t1[X] = t2[X], they must also have t1[Y] = t2[Y].

 Algorithm:
Input: A set of functional dependencies E.
Step 1: Set F := E.
Step 2: Replace each functional dependency X → {A1, A2, …, An} in F by the n functional dependencies
X → A1, X → A2, …, X → An.
Step 3: For each functional dependency X → A in F
for each attribute B that is an element of X
if {{F − {X → A}} ∪ {(X − {B}) → A}} is equivalent to F
then replace X → A with (X − {B}) → A in F.
Step 4: For each remaining functional dependency X → A in F
if {F − {X → A}} is equivalent to F,
then remove X → A from F.

• Problem Solution:
▪ Given E: {B → A, D → A, AB → D}.
▪ All of the above dependencies are in canonical form (that is, they have only one attribute on the right-hand
side), so steps 1 and 2 of the algorithm are already complete and we can proceed to step 3.
▪ In step 3 we need to determine if AB → D has any redundant attribute on the left-hand side; that is,
can it be replaced by B → D or A → D?
▪ Since B → A, by augmenting with B on both sides (IR2), we have BB → AB, or B → AB (i).
However, AB → D as given (ii).
▪ Hence by the transitive rule (IR3), we get from (i) and (ii), B → D. Thus AB → D may be
replaced by B → D.
▪ We now have a set equivalent to the original E, say E′: {B → A, D → A, B → D}. No further
reduction is possible in step 3 since all FDs have a single attribute on the left-hand side.
▪ In step 4 we look for a redundant FD in E′. By using the transitive rule on B → D and D → A, we
derive B → A. Hence B → A is redundant in E′ and can be eliminated.
▪ Therefore, the minimal cover of E is F: {B → D, D → A}.
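
The algorithm and the worked result can be cross-checked with a short Python sketch. This is one possible rendering under stated assumptions, not the textbook's code: the function names `closure` and `minimal_cover` are illustrative, each FD is written as a pair of strings, and the equivalence tests of steps 3 and 4 are carried out with attribute closures.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (frozenset lhs, single rhs attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """fds: list of (lhs, rhs) strings, e.g. ("AB", "D") for AB -> D."""
    # Steps 1 and 2: copy E and split every right-hand side into single attributes.
    f = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]

    # Step 3: drop an attribute B from X when (X - {B}) still determines A under F.
    for i in range(len(f)):
        a = f[i][1]
        for b in sorted(f[i][0]):
            reduced = f[i][0] - {b}
            if reduced and a in closure(reduced, f):
                f[i] = (reduced, a)

    # Step 4: drop an FD X -> A when A still follows from the remaining FDs.
    i = 0
    while i < len(f):
        lhs, a = f[i]
        rest = f[:i] + f[i + 1:]
        if a in closure(lhs, rest):
            f = rest
        else:
            i += 1
    return f


E = [("B", "A"), ("D", "A"), ("AB", "D")]
cover = minimal_cover(E)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", rhs)
```

For the set E of this problem the sketch returns the two dependencies B → D and D → A, matching the hand derivation. A set of FDs can have more than one minimal cover, so for other inputs the result may depend on the order in which attributes and dependencies are examined.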
4) Explain the usage of aggregate function in SQL. Write an SQL query to find sum of salaries of the
employees, maximum salary, minimum salary and average salary by renaming the columns in a
single row table.
Soln:

 Aggregate functions are used to summarize information from multiple tuples into a single-tuple summary.
 Grouping is used to create subgroups of tuples before summarization. Grouping and aggregation are
required in many database applications.
• A number of built-in aggregate functions exist: COUNT, SUM, MAX, MIN, and AVG.
 The COUNT function returns the number of tuples or values as specified in a query.
 The functions SUM, MAX, MIN, and AVG can be applied to a set or multiset of numeric values and
return, respectively, the sum, maximum value, minimum value, and average (mean) of those values.
 These functions can be used in the SELECT clause or in a HAVING clause.

Problem Solution
• We can use AS to rename the columns in the resulting single-row table:
SELECT
    SUM(Salary) AS TotalSalary,
    MAX(Salary) AS MaxSalary,
    MIN(Salary) AS MinSalary,
    AVG(Salary) AS AvgSalary
FROM Employees;
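
To see the single-row result, the query can be run against a small in-memory SQLite database from Python. This is a minimal sketch: the three employee rows are made-up sample data, and only the column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary REAL)")
# Hypothetical sample rows, used only to exercise the query.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Asha", 1000), ("Ravi", 2000), ("Meena", 3000)],
)

cur = conn.execute(
    """
    SELECT SUM(Salary) AS TotalSalary,
           MAX(Salary) AS MaxSalary,
           MIN(Salary) AS MinSalary,
           AVG(Salary) AS AvgSalary
    FROM Employees
    """
)
# cursor.description carries the renamed column headers of the result.
columns = [d[0] for d in cur.description]
summary = dict(zip(columns, cur.fetchone()))
print(summary)
```

The dictionary keys are the aliases given with AS, which shows that the renaming applies to the result table and not to the stored table.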
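5) How are assertions and triggers defined in SQL? Explain with examples. Explain transaction
support in SQL.
Soln:
Assertions:
• Assertions are constraints that apply to the database as a whole, rather than to individual tables.
• An assertion is defined with CREATE ASSERTION followed by a CHECK condition. The condition is usually
written as a NOT EXISTS query that selects the violating tuples, as in the example below.
• Ex:
CREATE ASSERTION SalaryCheck
CHECK (NOT EXISTS (SELECT * FROM Employees
                   WHERE Salary < 0));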

Triggers:
 Triggers are procedural code that automatically executes in response to certain events on a particular
table or view.
 It is defined as shown in below example.
 Ex:
CREATE TRIGGER SalaryUpdate
BEFORE UPDATE ON Employees
FOR EACH ROW
BEGIN
IF NEW.Salary < 0 THEN
SET NEW.Salary = 0;
END IF;
END;

Transaction Support
 The basic definition of an SQL transaction is similar to our already defined concept of a transaction. That
is, it is a logical unit of work and is guaranteed to be atomic.
• A single SQL statement is always considered to be atomic: either it completes execution without an error
or it fails and leaves the database unchanged.
 Transaction initiation is done implicitly when particular SQL statements are encountered.
 However, every transaction must have an explicit end statement, which is either a COMMIT or a
ROLLBACK. Every transaction has certain characteristics attributed to it.
 These characteristics are specified by a SET TRANSACTION statement in SQL.
 The characteristics are the Access mode, Diagnostic area size and Isolation level.
i) Access mode:
 It can be specified as READ ONLY or READ WRITE. The default is READ WRITE, unless the
isolation level of READ UNCOMMITTED is specified (see below), in which case READ ONLY
is assumed.
 A mode of READ WRITE allows select, update, insert, delete, and create commands to be executed.
 A mode of READ ONLY, as the name implies, is simply for data retrieval.
ii) Diagnostic area size:
▪ The option DIAGNOSTIC SIZE n specifies an integer value n, which indicates the number of conditions
that can be held simultaneously in the diagnostic area.
▪ These conditions supply feedback information (errors or exceptions) to the user or program on the n
most recently executed SQL statements.
iii) Isolation level:
 This option is specified using the statement ISOLATION LEVEL <isolation>, where the value for
<isolation> can be READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or
SERIALIZABLE.
 The default isolation level is SERIALIZABLE, although some systems use READ COMMITTED
as their default.
▪ The use of the term SERIALIZABLE here is based on not allowing violations that cause dirty read,
unrepeatable read, and phantoms, and it is thus not identical to the way serializability is defined
for schedules in concurrency control theory.
 If a transaction executes at a lower isolation level than SERIALIZABLE, then one or more of the
following three violations may occur:
a) Dirty read:
A transaction T1 may read the update of a transaction T2, which has not yet committed. If T2
fails and is aborted, then T1 would have read a value that does not exist and is incorrect.
b) Nonrepeatable read:
A transaction T1 may read a given value from a table. If another transaction T2 later updates
that value and T1 reads that value again, T1 will see a different value.
c) Phantoms:
 A transaction T1 may read a set of rows from a table, perhaps based on some condition
specified in the SQL WHERE-clause.
 Now suppose that a transaction T2 inserts a new row r that also satisfies the WHERE-
clause condition used in T1, into the table used by T1.
 The record r is called a phantom record because it was not there when T1 starts but is
there when T1 ends.
 T1 may or may not see the phantom, a row that previously did not exist. If the equivalent
serial order is T1 followed by T2, then the record r should not be seen; but if it is T2
followed by T1, then the phantom record should be in the result given to T1.
 If the system cannot ensure the correct behavior, then it does not deal with the phantom
record problem.
 Ex:
BEGIN TRANSACTION;
UPDATE Employees
SET Salary = Salary * 1.10
WHERE Project = 'IoT';
COMMIT;
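
The effect of COMMIT and ROLLBACK can be tried out with Python's built-in sqlite3 module. The sketch below rests on assumptions: the table layout, the sample rows, and the helper `give_raise` are invented for illustration, and SQLite's BEGIN statement stands in for BEGIN TRANSACTION.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries are exactly the BEGIN/COMMIT/ROLLBACK issued below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Employees (Name TEXT, Project TEXT, Salary REAL)")
# Hypothetical sample rows.
conn.execute("INSERT INTO Employees VALUES ('Asha', 'IoT', 1000)")
conn.execute("INSERT INTO Employees VALUES ('Ravi', 'Web', 2000)")


def give_raise(conn, project, fail=False):
    """Raise salaries on one project by 10 percent as a single atomic unit."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Employees SET Salary = Salary * 1.10 WHERE Project = ?",
            (project,),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")
        conn.execute("COMMIT")
    except Exception:
        # Undo every change made since BEGIN, then report the error.
        conn.execute("ROLLBACK")
        raise


def salary_of(conn, name):
    row = conn.execute(
        "SELECT Salary FROM Employees WHERE Name = ?", (name,)
    ).fetchone()
    return row[0]
```

Calling `give_raise(conn, "IoT", fail=True)` leaves the salary unchanged because the update is rolled back, while `give_raise(conn, "IoT")` makes the raise permanent.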
6) With Example Explain Correlated and Non-Correlated Queries.
Soln:
 A correlated query is a subquery that references columns from the outer query.
 Ex:
SELECT e1.Name
FROM Employees e1
WHERE e1.Salary > (SELECT AVG(e2.Salary)
FROM Employees e2
WHERE e2.Department = e1.Department);

 A non-correlated query is a subquery that is independent of the outer query.


 Ex:
SELECT Name
FROM Employees
WHERE Salary > (SELECT AVG(Salary) FROM Employees);
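
The difference shows up when both queries run on the same data. The following Python sketch uses an in-memory SQLite database with four made-up employees (the rows are assumptions for illustration). An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT, Salary REAL)")
# Hypothetical rows: Sales averages 200, R&D averages 1500, overall average is 850.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Asha", "Sales", 100),
        ("Ravi", "Sales", 300),
        ("Meena", "R&D", 1000),
        ("Kiran", "R&D", 2000),
    ],
)

# Correlated: the inner query is re-evaluated for each outer row's department.
correlated = [
    row[0]
    for row in conn.execute(
        """
        SELECT e1.Name
        FROM Employees e1
        WHERE e1.Salary > (SELECT AVG(e2.Salary)
                           FROM Employees e2
                           WHERE e2.Department = e1.Department)
        ORDER BY e1.Name
        """
    )
]

# Non-correlated: the inner query is evaluated once, independent of the outer row.
non_correlated = [
    row[0]
    for row in conn.execute(
        """
        SELECT Name
        FROM Employees
        WHERE Salary > (SELECT AVG(Salary) FROM Employees)
        ORDER BY Name
        """
    )
]

print(correlated)
print(non_correlated)
```

Ravi earns more than the Sales average but less than the overall average, so he appears only in the correlated result. Meena is in the opposite situation and appears only in the non-correlated result.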
7) Discuss the acid properties of the database transaction.
Soln:
The database transaction possesses some properties known as the ACID properties, which are enforced by the
concurrency control and recovery methods of the DBMS.
The following are the ACID properties:
i) Atomicity (A):
A transaction is an atomic unit of processing; it is either performed entirely or not performed at all.
ii) Consistency preservation (C):
A transaction should be consistency preserving, i.e., if it is completely executed from beginning to
end without interference from other transactions, it should take the database from one consistent state
to another.
iii) Isolation (I):
A transaction should appear as though it is being executed in isolation from other transactions.
That is, the execution of a transaction should not be interfered with by any other transactions that
are executing concurrently.
iv) Durability (D):
The changes applied to the database by a committed transaction must persist in the database. These
changes must not be lost because of any failure.
8) What anomalies occur due to interleaved execution? Explain them with examples.
Soln:
The anomalies that occur due to interleaved execution are:
i) The Lost update problem:
 This problem occurs when two transactions that access the same database items have their operations
interleaved in a way that makes the value of some database items incorrect.

 Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose that
their operations are interleaved as shown in Figure (a); then the final value of item X is incorrect
because T2 reads the value of X before T1 changes it in the database, and hence the updated value
resulting from T1 is lost.
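
The lost update can be replayed with a few lines of Python. This is a simplified simulation under stated assumptions: the amounts N = 5 and M = 4 and the starting value X = 80 are hypothetical, T1 subtracts N from X, T2 adds M to X, and the interleaving follows the description above, in which T2 reads X before T1 writes it.

```python
N, M = 5, 4  # hypothetical amounts used by T1 and T2


def subtract_n(value):
    return value - N


def add_m(value):
    return value + M


def run(schedule, initial_x):
    """Execute (transaction, action) steps and return the final stored value of X."""
    db = {"X": initial_x}
    local = {}  # each transaction's private copy of X
    for txn, action in schedule:
        if action == "read_item":
            local[txn] = db["X"]
        elif action == "write_item":
            db["X"] = local[txn]
        else:
            # A callable that computes on the transaction's private copy.
            local[txn] = action(local[txn])
    return db["X"]


INTERLEAVED = [
    ("T1", "read_item"),
    ("T1", subtract_n),
    ("T2", "read_item"),   # T2 reads X before T1 has written it
    ("T2", add_m),
    ("T1", "write_item"),
    ("T2", "write_item"),  # overwrites T1's update
]

SERIAL = [
    ("T1", "read_item"),
    ("T1", subtract_n),
    ("T1", "write_item"),
    ("T2", "read_item"),
    ("T2", add_m),
    ("T2", "write_item"),
]

print(run(INTERLEAVED, 80), run(SERIAL, 80))
```

With X = 80 the serial schedule ends with 79, while the interleaved schedule ends with 84 because the subtraction made by T1 is overwritten.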

ii) The temporary update (dirty read) problem:


 This problem occurs when one transaction updates a database item and then the transaction fails for
some reason.
 Meanwhile, the updated item is accessed (read) by another transaction before it is changed back (or
rolled back) to its original value.

 The above figure shows an example where T1 updates item X and then fails before completion, so the
system must roll back X to its original value.
 Before it can do so, however, transaction T2 reads the temporary value of X, which will not be recorded
permanently in the database because of the failure of T1.
 The value of item X that is read by T2 is called dirty data because it has been created by a transaction
that has not completed and committed yet; hence, this problem is also known as the dirty read
problem.

iii) The Incorrect summary problem:


 If one transaction is calculating an aggregate summary function on a number of database items while
other transactions are updating some of these items, the aggregate function may calculate some values
before they are updated and others after they are updated.
 For example, suppose that a transaction T3 is calculating the total number of reservations on all the
flights; meanwhile, transaction T1 is executing.
 If the interleaving of operations shown in Figure (c) occurs, the result of T3 will be off by an amount N
because T3 reads the value of X after N seats have been subtracted from it but reads the value of Y
before those N seats have been added to it.

iv) The unrepeatable read problem:

• Another problem that may occur is called unrepeatable read, where a transaction T reads
the same item twice and the item is changed by another transaction T′ between the two reads. Hence, T
receives different values for its two reads of the same item. (A phantom, by contrast, is a new row that
appears when a query is repeated, as described under isolation levels in question 5.)
 This may occur, for example, if during an airline reservation transaction, a customer inquires about seat
availability on several flights.
 When the customer decides on a particular flight, the transaction then reads the number of seats on that
flight a second time before completing the reservation, and it may end up reading a different value for
the item.
9) With neat diagram, explain different states of transaction.
Soln:

 A transaction is an atomic unit of work that should either be completed in its entirety or not done at all.
 For recovery purposes, the system needs to keep track of when each transaction starts, terminates, and
commits, or aborts. Therefore, the recovery manager of the DBMS needs to keep track of the following
operations:
a) BEGIN_TRANSACTION: This marks the beginning of transaction execution.
b) READ or WRITE: These specify read or write operations on the database items that are executed as
part of a transaction.
c) END_TRANSACTION: This specifies that READ and WRITE transaction operations have ended and
marks the end of transaction execution. However, at this point it may be necessary to check whether
the changes introduced by the transaction can be permanently applied to the database (committed).
d) COMMIT_TRANSACTION: This signals a successful end of the transaction so that any changes
(updates) executed by the transaction can be safely committed to the database and will not be undone.
e) ROLLBACK (or ABORT): This signals that the transaction has ended unsuccessfully, so that any
changes or effects that the transaction may have applied to the database must be undone.
 Active state: A transaction goes into an active state immediately after it starts execution, where it can
execute its READ and WRITE operations.
 Partially committed state: When the transaction ends, it moves to the partially committed state. At this
point, some types of concurrency control protocols may do additional checks to see if the transaction can
be committed or not. Also, some recovery protocols need to ensure that a system failure will not result in
an inability to record the changes of the transaction permanently (usually by recording changes in the
system log).
 Committed state: If the above checks are successful, the transaction is said to have reached its commit
point and enters the committed state.
 Failed state: A transaction can go to the failed state if one of the checks fails or if the transaction is aborted
during its active state. The transaction may then have to be rolled back to undo the effect of its WRITE
operations on the database.
 Terminated state: The terminated state corresponds to the transaction leaving the system. The transaction
information that is maintained in system tables while the transaction has been running is removed when
the transaction terminates.
 Failed or aborted transactions may be restarted later—either automatically or after being resubmitted by
the user—as brand new transactions.
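
The states and the operations that move a transaction between them can be written down as a small transition table. The Python sketch below is an illustration only: the enum, the event names, and the helper functions are assumptions derived from the description above, not a DBMS interface.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    TERMINATED = "terminated"


# (current state, event) -> next state
TRANSITIONS = {
    (State.ACTIVE, "read"): State.ACTIVE,
    (State.ACTIVE, "write"): State.ACTIVE,
    (State.ACTIVE, "end_transaction"): State.PARTIALLY_COMMITTED,
    (State.ACTIVE, "abort"): State.FAILED,
    (State.PARTIALLY_COMMITTED, "commit"): State.COMMITTED,
    (State.PARTIALLY_COMMITTED, "abort"): State.FAILED,
    (State.COMMITTED, "terminate"): State.TERMINATED,
    (State.FAILED, "terminate"): State.TERMINATED,
}


def step(state, event):
    """Return the next state, or raise ValueError for a move the diagram does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.value!r}") from None


def run(events):
    """BEGIN_TRANSACTION puts the transaction in the active state; then apply events in order."""
    state = State.ACTIVE
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["read", "write", "end_transaction", "commit", "terminate"])` ends in the terminated state, whereas issuing "commit" directly from the active state is rejected because the transaction must first pass through the partially committed state.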
10) Explain different types of locks used in concurrency control
Soln:
Two types of locks are used in concurrency control. They are:
i) Binary Locks:
• A binary lock has two states or values: locked (1) and unlocked (0).
• A distinct lock is associated with each database item X.
• If the value of the lock on X is 1, item X cannot be accessed by a database operation that requests it.
• If the value of the lock on X is 0, item X can be accessed when requested, and the lock value is
changed to 1.
• It includes two operations, lock_item and unlock_item.
• If the binary locking scheme is used, then every transaction must obey the following rules:
a) A transaction T must issue the operation lock_item(X) before any read_item(X) or write_item(X)
operations are performed in T.
b) A transaction T must issue the operation unlock_item(X) after all read_item(X) and write_item(X)
operations are completed in T.
c) A transaction T will not issue a lock_item(X) operation if it already holds the lock on item X.
d) A transaction T will not issue an unlock_item(X) operation unless it already holds the lock on
item X.

ii) Shared / exclusive Locks:

• A shared/exclusive lock has three locking operations: read_lock(X), write_lock(X) and unlock(X).
• A read-locked item is also called share-locked because other transactions are allowed to read the
item.
• A write-locked item is called exclusive-locked because a single transaction exclusively holds the lock
on the item.
• If the shared/exclusive locking scheme is used, then the system must enforce the following rules:
a) A transaction T must issue the operation read_lock(X) or write_lock(X) before any read_item(X)
operation is performed in T.
b) A transaction T must issue the operation write_lock(X) before any write_item(X) operation is
performed in T.
c) A transaction T must issue the operation unlock(X) after all read_item(X) and write_item(X)
operations are completed in T.
d) A transaction T will not issue a read_lock(X) if it already holds a read_lock or a write_lock on the
item X.
e) A transaction T will not issue a write_lock(X) operation if it already holds a read_lock or write_lock
on the item X.
f) A transaction T will not issue an unlock(X) operation unless it already holds a read_lock or
write_lock on the item X.
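
A minimal lock table makes the shared/exclusive rules concrete. The Python sketch below is an assumption-laden illustration, not a real lock manager: the class name `LockTable` is invented, a request that conflicts simply returns False instead of putting the transaction in a waiting queue, and lock conversion is not supported.

```python
class LockTable:
    """Tracks read (shared) and write (exclusive) locks per database item."""

    def __init__(self):
        self.readers = {}  # item -> set of transactions holding a read lock
        self.writer = {}   # item -> the transaction holding the write lock

    def _holds(self, txn, item):
        return txn in self.readers.get(item, set()) or self.writer.get(item) == txn

    def read_lock(self, txn, item):
        if self._holds(txn, item):
            raise ValueError("rule d: transaction already holds a lock on this item")
        if item in self.writer:
            return False  # another transaction holds the exclusive lock
        self.readers.setdefault(item, set()).add(txn)
        return True

    def write_lock(self, txn, item):
        if self._holds(txn, item):
            raise ValueError("rule e: transaction already holds a lock on this item")
        if item in self.writer or self.readers.get(item):
            return False  # any existing lock blocks an exclusive request
        self.writer[item] = txn
        return True

    def unlock(self, txn, item):
        if not self._holds(txn, item):
            raise ValueError("rule f: transaction holds no lock on this item")
        if self.writer.get(item) == txn:
            del self.writer[item]
        else:
            self.readers[item].discard(txn)
```

Two transactions can read-lock the same item at the same time, but a write lock is granted only when no other lock exists on the item, which is the sense in which the lock is shared or exclusive.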
11) What is 2-phase locking protocol? How does it guarantee serializability?
Soln:
 Several types of locks are used in concurrency control.
 We mainly use two locking systems. A Binary Lock and Shared/Exclusive Lock.
 Binary locks are simple but also too restrictive for database concurrency control purposes.
 Shared/Exclusive locks have more general locking capabilities and are used in practical database locking.

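Guaranteed Serializability by 2-phase Locking

• A transaction is said to follow the two-phase locking protocol if all locking operations (read_lock,
write_lock) precede the first unlock operation in the transaction. Such a transaction can be divided
into two phases:
i) Growing (First) Phase: This is the phase during which new locks on items can be acquired but no
locks can be released.
ii) Shrinking (Second) Phase: This is the phase during which the existing locks can be released but no
new locks can be acquired.
• If lock conversion is allowed, then upgrading of locks (from read-locked to write-locked) must be done
during the growing phase and downgrading of locks (from write-locked to read-locked) must be done
during the shrinking phase.
• Hence, a read_lock(X) operation that downgrades an already held write_lock on X can appear only in
the shrinking phase.
• It can be proved that if every transaction in a schedule follows the two-phase locking protocol, the
schedule is guaranteed to be serializable, so schedules no longer need to be tested for serializability.
Intuitively, a transaction releases no lock until it has acquired all of its locks, so the conflicting
operations of any two transactions are ordered consistently by the points at which they obtained their
last lock, and the precedence graph cannot contain a cycle.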
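
Whether a single transaction obeys the protocol can be checked by scanning its operations once. The Python sketch below is a simplified illustration: the function name `follows_2pl` and the two sample transactions are assumptions, and lock conversion (upgrading or downgrading) is deliberately left out.

```python
LOCK_OPS = {"read_lock", "write_lock"}


def follows_2pl(ops):
    """ops: ordered list of (operation, item) pairs for ONE transaction."""
    shrinking = False
    for op, _item in ops:
        if op == "unlock":
            # The first unlock ends the growing phase.
            shrinking = True
        elif op in LOCK_OPS and shrinking:
            # Acquiring a lock after any unlock violates the protocol.
            return False
    return True


# Hypothetical transaction that releases Y before locking X: not two-phase.
T_BAD = [
    ("read_lock", "Y"),
    ("read_item", "Y"),
    ("unlock", "Y"),
    ("write_lock", "X"),
    ("read_item", "X"),
    ("write_item", "X"),
    ("unlock", "X"),
]

# The same work with both locks acquired before the first unlock: two-phase.
T_GOOD = [
    ("read_lock", "Y"),
    ("read_item", "Y"),
    ("write_lock", "X"),
    ("unlock", "Y"),
    ("read_item", "X"),
    ("write_item", "X"),
    ("unlock", "X"),
]

print(follows_2pl(T_BAD), follows_2pl(T_GOOD))
```

The check looks only at the order of lock and unlock operations, which is all the protocol constrains. The guarantee of serializability applies when every transaction in the schedule passes this check.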
12) Consider the schema for Company Database: EMPLOYEE(SSN, Name, Address, Sex, Salary,
SuperSSN, DNo) DEPARTMENT(DNo, DName, MgrSSN, MgrStartDate) DLOCATION(DNo,DLoc)
PROJECT(PNo, PName, PLocation, DNo) WORKS_ON(SSN, PNo, Hours)
a. Make a list of all project numbers for projects that involve an employee whose last name is
‘Scott’, either as a worker or as a manager of the department that controls the project.
b. Show the resulting salaries if every employee working on the ‘IoT’ project is given a 10 percent
raise.
c. Find the sum of the salaries of all employees of the ‘Accounts’ department, as well as the
maximum salary, the minimum salary, and the average salary in this department.
d. Retrieve the name of each employee who works on all the projects controlled by department
number 5 (use NOT EXISTS operator).
Soln:

a. SELECT DISTINCT P.PNo FROM PROJECT P
JOIN WORKS_ON W ON P.PNo = W.PNo
JOIN EMPLOYEE E ON W.SSN = E.SSN
WHERE E.Name = 'Scott'
UNION
SELECT DISTINCT P.PNo FROM PROJECT P
JOIN DEPARTMENT D ON P.DNo = D.DNo
JOIN EMPLOYEE E ON D.MgrSSN = E.SSN
WHERE E.Name = 'Scott';

b. SELECT E.Name, E.Salary * 1.10 AS IncreasedSalary
FROM EMPLOYEE E
JOIN WORKS_ON W ON E.SSN = W.SSN
JOIN PROJECT P ON W.PNo = P.PNo
WHERE P.PName = 'IoT';

c. SELECT
SUM(E.Salary) AS TotalSalary,
MAX(E.Salary) AS MaxSalary,
MIN(E.Salary) AS MinSalary,
AVG(E.Salary) AS AvgSalary
FROM EMPLOYEE E
JOIN DEPARTMENT D ON E.DNo = D.DNo
WHERE D.DName = 'Accounts';

d. SELECT E.Name FROM EMPLOYEE E
WHERE NOT EXISTS (SELECT P.PNo FROM PROJECT P WHERE P.DNo = 5
AND NOT EXISTS (SELECT W.SSN FROM WORKS_ON W
WHERE W.PNo = P.PNo AND W.SSN = E.SSN));
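
The double NOT EXISTS in query (d) reads as "there is no project of department 5 that this employee does not work on". The Python sketch below runs the query on a reduced version of the schema in an in-memory SQLite database. The rows and the trimmed column lists are assumptions made for illustration, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (SSN INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PROJECT (PNo INTEGER PRIMARY KEY, DNo INTEGER);
    CREATE TABLE WORKS_ON (SSN INTEGER, PNo INTEGER);

    -- Hypothetical data: projects 10 and 20 belong to department 5.
    INSERT INTO EMPLOYEE VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
    INSERT INTO PROJECT VALUES (10, 5), (20, 5), (30, 4);
    INSERT INTO WORKS_ON VALUES
        (1, 10), (1, 20),
        (2, 10), (2, 30),
        (3, 10), (3, 20), (3, 30);
    """
)

rows = conn.execute(
    """
    SELECT E.Name FROM EMPLOYEE E
    WHERE NOT EXISTS (SELECT P.PNo FROM PROJECT P WHERE P.DNo = 5
                      AND NOT EXISTS (SELECT W.SSN FROM WORKS_ON W
                                      WHERE W.PNo = P.PNo AND W.SSN = E.SSN))
    ORDER BY E.Name
    """
).fetchall()
names = [r[0] for r in rows]
print(names)
```

Ravi is excluded because project 20 of department 5 has no WORKS_ON row for him, so the inner NOT EXISTS is true for that project and the outer NOT EXISTS fails.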
13) Consider the schema for College Database: STUDENT(USN, SName, Address, Phone, Gender)
SEMSEC(SSID, Sem, Sec) CLASS(USN, SSID) COURSE(Subcode, Title, Sem, Credits)
IAMARKS(USN, Subcode, SSID, Test1, Test2, Test3, FinalIA)
Write SQL queries to
a. List all the student details studying in fourth semester ‘C’ section.
b. Compute the total number of male and female students in each semester and in each section.
c. Create a view of Test1 marks of student USN ‘1BI15CS101’ in all Courses.
Soln:

a. SELECT S.* FROM STUDENT S
JOIN CLASS C ON S.USN = C.USN
JOIN SEMSEC SS ON C.SSID = SS.SSID
WHERE SS.Sem = 4 AND SS.Sec = 'C';

b. SELECT SS.Sem, SS.Sec, S.Gender,
COUNT(*) AS Total
FROM STUDENT S
JOIN CLASS C ON S.USN = C.USN
JOIN SEMSEC SS ON C.SSID = SS.SSID
GROUP BY SS.Sem, SS.Sec, S.Gender;

c. CREATE VIEW Test1Marks AS
SELECT I.Subcode, I.Test1
FROM IAMARKS I
WHERE I.USN = '1BI15CS101';
14) List and explain MongoDB Distributed Systems Characteristics
Soln:
MongoDB is designed to work efficiently in distributed systems, exhibiting several key characteristics:
i) Scalability:
 MongoDB supports horizontal scaling through sharding. A large dataset can be partitioned across many servers,
enabling the database to handle increased load by adding more nodes.
 Each shard contains a subset of the data, improving performance and capacity.

ii) High Availability:


 MongoDB uses replica sets to ensure high availability. A replica set is a group of MongoDB servers that
maintain the same data set, providing redundancy and automatic failover.
 If the primary server fails, an automatic election process selects a new primary to minimize downtime.

iii) Consistency:
 MongoDB provides tunable consistency levels. By default, it offers strong consistency for reads and writes with
the option to configure read preferences for eventual consistency in specific scenarios.

iv) Fault Tolerance:


 MongoDB's replica sets provide fault tolerance. Data replication across multiple nodes ensures that even if some
nodes fail, the system can continue to operate without data loss.

v) Distributed Data:
 Data is distributed across multiple servers or data centers, allowing for better load balancing, improved read and
write performance, and geographic distribution.

vi) Operational Simplicity:


 MongoDB simplifies operations with automated sharding and replica set management, reducing the
administrative burden on developers and DBAs.

vii) Elasticity:
 MongoDB's architecture allows for easy scaling up or down. Additional nodes can be added to the cluster
without significant downtime, and data is rebalanced automatically.

viii) Global Distribution:


 MongoDB supports global distribution of data with features like zone sharding and geo-partitioning, allowing
data to be placed close to users for lower latency and compliance with data residency requirements.
15) Explain different Categories of NOSQL systems.
Soln:
NoSQL databases can be categorized into several types based on their data models and design principles:
i) Document Stores:
 Store and retrieve semi-structured data as JSON, BSON, or XML documents. Each document can have a
flexible schema.
 Ex: MongoDB
 Used in Content management systems, user profiles, and applications needing flexible schemas.
ii) Key-Value Stores:
 Store data as key-value pairs, where each key is unique. The value can be a simple string or a more complex
data structure.
 Ex: DynamoDB
 Used in Caching, session management, and real-time data processing.
iii) Column-Family Stores:
 Store data in columns rather than rows. Each column family contains rows identified by a unique key, and each
row can have a different number of columns.
 Ex: Apache Cassandra
 Used in Time-series data, large-scale analytics, and real-time data applications.
iv) Graph Databases:
 Store data as nodes, edges, and properties. Designed to represent and traverse relationships between entities
efficiently.
 Ex: Amazon Neptune
 Used in Social networks, recommendation engines, and network/IT operations.
v) Wide-Column Stores:
 Similar to column-family stores, but with more flexibility in terms of column families and schema design.
 Ex: Google Bigtable
 Used in Large-scale data storage, real-time analytics, and IoT applications.
vi) Object Stores:
 Store data as objects, which include the data itself, metadata, and a unique identifier.
 Ex: Amazon S3
 Used in Media storage, backups, and unstructured data management.
vii) Search Engines:
 Specialized databases designed for searching text and performing complex queries on large datasets.
 Ex: Elasticsearch
 Used in Full-text search, log analysis, and big data exploration.
