
DATABASE MANAGEMENT SYSTEMS

Two Mark Questions


Unit I- Introduction
1. Define database
A database is a collection of information that is organized so that it can easily be accessed, managed,
and updated.

2. Define DBMS
A database-management system (DBMS) is a collection of interrelated data and a set of programs to
access those data. The collection of data, usually referred to as the database, contains information
relevant to an enterprise. The primary goal of a DBMS is to provide a way to store and retrieve
database information that is both convenient and efficient.

3. What are the applications of database systems?


• Banking
• Airlines
• Universities
• Telecommunication
• Finance
• Sales
• Manufacturing
• Human resources

4. What is the difference between file systems and database systems? (or)
Compare file systems and database systems (or) What are the advantages of database
systems?
Keeping data in ordinary files leads to the following problems, all of which database systems are
designed to avoid:
 Data redundancy and inconsistency
 Difficulty in accessing data
 Data isolation
 Integrity problems
 Atomicity problems
 Concurrent-access anomalies
 Security problems

5. Define atomicity
Atomicity states that database modifications must follow an “all or nothing” rule. Each transaction is
said to be “atomic.” If one part of the transaction fails, the entire transaction fails.

6. What are the three levels of abstraction?


 Physical level
 Logical level
 View level

7. Define the two levels of data independence.


 Physical data independence
 Logical data independence

8. Define instance.
The collection of information stored in the database at a particular moment is called an instance of
the database.
9. Define schema
The overall design of the database is called the database schema. Schemas are changed infrequently,
if at all.

10. Define data model


Data model is a collection of conceptual tools for describing data, data relationships, data semantics,
and consistency constraints.

11. Name the types of data model


 The Entity-Relationship Model
 Relational model
 Object oriented data model
 Semi structured model
 Hierarchical data model
 Network Model

12. What are the two types of database languages?


 Data definition language
 Data manipulation language

13. Define data dictionary


A data dictionary contains metadata—that is, data about data. The schema of a table is an example
of metadata. A database system consults the data dictionary before reading or modifying actual
data.

14. Define DDL


A database schema is specified by a set of definitions expressed in a special language called a
data-definition language (DDL).

15. Define DML


A data-manipulation language (DML) is a language that enables users to access or manipulate data
as organized by the appropriate data model. There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
(Eg: Relational Algebra)
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what
data are needed without specifying how to get those data.(Eg: Tuple Relational Calculus & Domain
Relational Calculus)

16. Define query language


A query is a statement requesting the retrieval of information. The portion of a DML that involves
information retrieval is called a query language.

17. What are the different types of users work with the database system?
 Naive users
 Application programmers
 Sophisticated users
 Specialized users

18. Who are naïve users?
Naive users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously. Naive users may also simply read reports
generated from the database.

19. Define Sophisticated users


Sophisticated users interact with the system without writing programs. Instead, they form their
requests in a database query language. They submit each such query to a query processor, whose
function is to break down DML statements into instructions that the storage manager understands.

20. Compare 2-tier and 3-tier architecture

Definition
  2-tier: The application is partitioned into a component that resides at the client
  machine, which invokes database-system functionality at the server machine through
  query-language statements.
  3-tier: The client machine acts merely as a front end and does not contain any direct
  database calls. Instead, the client end communicates with an application server,
  usually through a forms interface. The application server in turn communicates with
  a database system to access data.

Example
  2-tier: ODBC and JDBC
  3-tier: Business logic of the application, WWW

21. Define Specialized users


Specialized users are sophisticated users who write specialized database applications that do not fit
into the traditional data-processing framework.

22. What is the role of DBA(Database Administrator)?


 Schema definition
 Storage structure and access-method definition
 Schema and physical-organization modification
 Granting of authorization for data access
 Routine maintenance

23. Define E-R Model
The Entity-Relationship (ER) model is a high-level data model that is useful in developing a
conceptual design for a database. Creation of an ER diagram, which is one of the first steps in designing a
database, helps the designer(s) to understand and to specify the desired components of the
database and the relationships among those components. An ER model is a diagram containing
entities or "items", relationships among them, and attributes of the entities and the relationships.

24. Define entity


An entity is a “thing” or “object” in the real world that is distinguishable from all other objects. For
example, each person in an enterprise is an entity.

25. Define entity set


An entity set is a set of entities of the same type that share the same properties, or attributes. The
set of all persons who are customers at a given bank, for example, can be defined as the entity set
customer.

26. What are the attributes characterized in E-R model?

Simple and composite attributes
  Simple: attributes that are not divided into subparts. Example: Register_number.
  Composite: attributes that can be divided into subparts. Example: name (first-name,
  middle-initial, last-name).

Single-valued and multivalued attributes
  Single-valued: the attribute has a single value for a particular entity.
  Example: Register_number.
  Multivalued: the attribute has multiple values for a particular entity.
  Example: phone number, account number.

Derived attribute
  The value of this type of attribute can be derived from the values of other related
  attributes or entities. Example: age calculated from date_of_birth.

27. Define relationship


A relationship is an association among several entities.

28. Define relationship set


A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on
n ≥ 2 (possibly nondistinct) entity sets. If E1, E2, . . ., En are entity sets, then a relationship set R is a
subset of

{(e1, e2, . . . , en) | e1 ∈ E1, e2 ∈ E2, . . . , en ∈ En}

where (e1, e2, . . . , en) is a relationship.

29. Define Mapping cardinalities. What are its types?


Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity
can be associated via a relationship set. For a binary relationship set R between entity sets A and B,
the mapping cardinality must be one of the following:
• One to one.
• One to many.
• Many to one.
• Many to many.

30. Give example for one to one and one to many relationships.
One to one
 One employee belongs to one organization.
 One dog belongs to one person (or one family).
 One person has one passport.
 A car model is made by one company.
One to many
 A car and its parts. Each part belongs to one car and one car has multiple parts.
 A movie theater and screens. One theatre usually has multiple screens and each screen
belongs to one theatre.
 An ERD and its tables. An entity-relationship diagram has one or more tables and each of
those tables belongs to one diagram.
 Houses in a street. One street has multiple houses and a house belongs to one street.

31. Define Keys


A key allows us to identify a set of attributes that suffice to distinguish entities from each other. Keys
also help uniquely identify relationships, and thus distinguish relationships from each other.

32. Define super key


A super key is defined in the relational model as a set of attributes of a relation variable for which it
holds that in all relations assigned to that variable there are no two distinct tuples (rows) that have
the same values for the attributes in this set.

33. Define candidate key


A candidate key of a relation is a set of attributes of that relation such that there are no two
distinct tuples with the same values for these attributes. In other words, a candidate key is a
minimal super key, i.e. a super key of which no proper subset is also a super key.

34. Define primary key


The primary key of a relational table uniquely identifies each record in the table. It can either be a
normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no
more than one record per person) or it can be generated by the DBMS (such as a globally unique
identifier, or GUID, in Microsoft SQL Server). Primary keys may consist of a single attribute or
multiple attributes in combination.

35. What is the use of E-R diagram?


An entity-relationship (ER) diagram is a specialized graphic that illustrates the interrelationships
between entities in a database. ER diagrams often use symbols to represent three different types of
information. Boxes are commonly used to represent entities. Diamonds are normally used to
represent relationships and ovals are used to represent attributes.

36. Define weak entity sets
An entity set may not have sufficient attributes to form a primary key. Such an entity set is termed a
weak entity set.

37. Define strong entity sets


An entity set that has a primary key is termed a strong entity set.

38. Define Specialization


The process of designating sub groupings within an entity set is called specialization.

39. Define Generalization


Generalization is a containment relationship that exists between a higher-level entity set and
one or more lower-level entity sets.

40. Define Attribute Inheritance


A crucial property of the higher- and lower-level entities created by specialization and
generalization is attribute inheritance. The attributes of the higher-level entity sets are said to be
inherited by the lower-level entity sets.

41. Define aggregation


Aggregation is an abstraction through which relationships are treated as higher level entities.

42. What are the disadvantages of E-R model?


 Limited constraint representation
 Limited relationship representation
 No representation of data manipulation
 Loss of information

43. What are the advantages of E-R model?


 It is easy and simple to understand with minimal training. Therefore the model can be used
by the database designer to communicate design to the end user.
 It has explicit linkages between entities
 It is possible to find a connection from one node to all the other nodes.

44. What are the features of E-R model?
 It has a high degree of data independence and seeks to remove redundancy in data
representation, grounded in mathematical theory.
 The ER model is a top-down approach to system design.
 It can be used as a basis for unifying different views of data, such as the network,
relational, and entity-relationship models.
 It was developed after the relational model, when the industry shifted its attention to
transaction processing.

45. Define cardinality.


Cardinality: the cardinality of a relationship is the actual number of related occurrences for each of
the related entities. The connectivity of a relationship describes the mapping of associated entity
instances in the relationship. The values of connectivity are "one" or "many". The basic types of
connectivity for relations are: one-to-one, one-to-many, and many-to-many. Many-to-many
relationships cannot be represented directly as relational tables; they must be translated into two
or more one-to-many relationships, typically by introducing an intersection table.

46. Why is it better to use the n-ary relationship over the binary relationship?
An n-ary relationship can relate more than two entity sets, whereas a binary relationship relates
only two. An n-ary relationship set shows more clearly that several entities participate in a single
relationship.

47. What is the difference between a single inheritance and multiple inheritances?
Single inheritance is when a given entity set is involved in a lower entity set in only one ISA
relationship. A multiple inheritance is when it is involved in more than one ISA relationship.

48. Define UML.


Unified Modeling Language (UML), is a proposed standard for creating specifications of various
components of a software system. Some of the parts of UML are:
• Class diagram.
• Use case diagram.
• Activity diagram.
• Implementation diagram.

Unit II - Relational Model

1. Define relational model.


The relational model is the most widely used data model for commercial data processing, because it
is simple and easy to maintain. The model is based on a collection of tables. Users of the database
can create new tables, insert data into tables, or modify existing tables.

2. Define relational databases.


A relational database consists of a collection of tables, each of which is assigned a unique name. A
row in a table represents a relationship among a set of values. Since a table is a collection of such
relationships, there is a close correspondence between the concept of table and the mathematical
concept of relation, from which the relational data model takes its name.

3. Define foreign key.
A foreign key is a field in a relational table that matches the primary key column of another table.
The foreign key can be used to cross-reference tables.

4. What is the schema diagram?


A database schema, along with primary key and foreign key dependencies, can be depicted
pictorially by schema diagrams.

5. Define relational algebra


The relational algebra is a procedural query language. It consists of a set of operations that take one
or two relations as input and produce a new relation as their result.

6. What are the basic operations of relational algebra?


 Select
 Project
 Union
 set difference
 Cartesian product
 Rename
Note: Select, project, and rename are unary operations; union, set difference, and Cartesian product are binary operations.
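These operations can be sketched with plain Python sets of tuples, treating a relation as a set of rows (the relations and attribute values below are illustrative, not from any fixed schema):

```python
# Two illustrative relations, each a set of (name, dept) tuples.
employee = {("Ann", "Sales"), ("Bob", "HR"), ("Carol", "Sales")}
manager = {("Ann", "Sales"), ("Dave", "IT")}

# Select (unary): keep only the rows satisfying a predicate.
selected = {t for t in employee if t[1] == "Sales"}

# Project (unary): keep only one column; duplicates collapse, as in set semantics.
projected = {t[1] for t in employee}

# Union, set difference, Cartesian product (binary).
union = employee | manager
difference = employee - manager
cartesian = {(e, m) for e in employee for m in manager}
```

Rename has no computational effect on the rows themselves; it only changes the relation or attribute names used to refer to them.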

7. Define schema diagram


Schema diagram is defined as the pictorial representation of a database schema, along with primary
key and foreign key dependencies.
8. Define the relational algebra operations (selection, projection, union, set difference, Cartesian
product) with an example.
Explain each of the operations with an example.

9. Define Views.
Any relation that is not part of the logical model, but is made visible to a user as a virtual relation, is
called a view. The syntax for creating a view is,
CREATE OR REPLACE VIEW <view_name> AS
SELECT <column_name>
FROM <table_name>;
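The idea can be tried out with Python's built-in sqlite3 module (the account table and branch_total view below are illustrative names; note that SQLite accepts plain CREATE VIEW rather than CREATE OR REPLACE VIEW):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER, branch TEXT, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?, ?)",
                 [(1, "Downtown", 500.0), (2, "Uptown", 900.0)])

# The view stores a query, not data: it is evaluated when it is referenced.
conn.execute("CREATE VIEW branch_total AS "
             "SELECT branch, SUM(balance) AS total FROM account GROUP BY branch")
rows = conn.execute("SELECT branch, total FROM branch_total ORDER BY branch").fetchall()
```

Because the view is virtual, a later INSERT into account would change what the same SELECT on branch_total returns.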

10. Define Materialized Views
Database systems allow view relations to be stored, but they make sure that, if the actual relations
used in the view definition change, the view is kept up to date. Such views are called materialized
views.

11. What are the conditions for updating a view?


 The from clause of the view definition has only one table.
 The select clause contains only attribute names, with no expressions or aggregate functions.
 Any attribute not listed in the select clause can be set to NULL.
 The query does not have a group by or having clause.

12. What are the problems occurred when updating a view?


1. View Maintenance describes the problem of maintaining a materialized view while updating
the source database(s). Updates to the source database are either immediately propagated to
the view or are accumulated over time, with the view updated at regular intervals (for
instance, during the night).
2. View Update is the problem of propagating updates to the view to the source database. An
updatable view must have certain characteristics -- i.e. the view cannot be defined by an
arbitrary query on the source database.

13. Define Tuple Relational Calculus (TRC).


The tuple relational calculus is a nonprocedural query language. It describes the desired
information without giving a specific procedure for obtaining that information. A query in the tuple
relational calculus is expressed as
{t | P(t)}
that is, it is the set of all tuples t such that predicate P is true for t.

14. Define Domain Relational calculus(DRC)


Domain relational calculus uses domain variables that take on values from an attribute's domain,
rather than values for an entire tuple. An expression in the domain relational calculus is of the form
{< x1, x2, . . . , xn > | P(x1, x2, . . . , xn)}
where x1, x2, . . . , xn represent domain variables. P represents a formula composed of atoms, as was
the case in the tuple relational calculus.

15. Define referential integrity.


Referential integrity is a database concept that ensures that relationships between tables remain
consistent. When one table has a foreign key to another table, the concept of referential integrity
states that you may not add a record to the table that contains the foreign key unless there is a
corresponding record in the linked table.
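A minimal sqlite3 sketch of this rule (the branch and account tables are illustrative; SQLite enforces foreign keys only after the PRAGMA shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.execute("CREATE TABLE branch (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
             "branch TEXT REFERENCES branch(name))")

conn.execute("INSERT INTO branch VALUES ('Downtown')")
conn.execute("INSERT INTO account VALUES (1, 'Downtown')")  # parent row exists: OK

try:
    # No 'Nowhere' row in branch, so referential integrity rejects this insert.
    conn.execute("INSERT INTO account VALUES (2, 'Nowhere')")
    violated = False
except sqlite3.IntegrityError:
    violated = True
```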

16. Define Integrity constraints


Integrity constraints provide a mechanism for ensuring that data conforms to guidelines specified
by the database administrator. The constraints available in SQL are Foreign Key, Not Null, Unique,
Check.
Constraints can be defined in two ways
1) The constraints can be specified immediately after the column definition. This is called
column-level definition.
2) The constraints can be specified after all the columns are defined. This is called table-level
definition.
17. Define Domain Integrity.
The domain integrity constraint states that every element of a relation should respect the type and
restrictions of its corresponding attribute. A type can have a variable length which needs to be
respected. Restrictions could be the range of values the element can have, the default value if
none is provided, and whether the element can be NULL.

18. What are the four broad categories of constraints?


 Domain Constraints
 Referential Integrity
 Assertions
 Triggers

19. What are primary key constraints?


 The PRIMARY KEY constraint uniquely identifies each record in a database table.
 Primary keys must contain unique values.
 A primary key column cannot contain NULL values.
 Each table should have a primary key, and each table can have only ONE primary key.
Example:
CREATE TABLE Persons (
P_Id int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
PRIMARY KEY (P_Id)
)

20. What are the categories of SQL command?


SQL commands are divided in to the following categories:
 Data definition language
 Data manipulation language
 Data query language
 Data control language
 Data administration statements
 Transaction control statements

21. What are aggregate functions?


The SQL Aggregate Functions are functions that provide mathematical operations. If you need to
add, count or perform basic statistics, these functions will be of great help. The functions include:
 count() - counts a number of rows
 sum() - compute sum
 avg() - compute average
 min() - compute minimum
 max() - compute maximum
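All five can be seen in one query using sqlite3 (the marks table and its rows are illustrative assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (student TEXT, score INTEGER)")
conn.executemany("INSERT INTO marks VALUES (?, ?)",
                 [("a", 70), ("b", 90), ("c", 80)])

# Each aggregate collapses the whole column to a single value.
row = conn.execute(
    "SELECT COUNT(*), SUM(score), AVG(score), MIN(score), MAX(score) FROM marks"
).fetchone()
```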

22. What do you mean by triggers?


A trigger defines the actions to be executed automatically when certain events occur and the
corresponding conditions are satisfied. Triggers can be written for the following purposes:
 Generating some derived column values automatically
 Enforcing referential integrity
 Event logging and storing information on table access
 Auditing
 Synchronous replication of tables
 Imposing security authorizations
 Preventing invalid transactions
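Event logging, one of the purposes above, can be sketched with a SQLite trigger (the account and audit_log tables are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER, balance REAL)")
conn.execute("CREATE TABLE audit_log (acc_no INTEGER, old_bal REAL, new_bal REAL)")

# Fires automatically after every UPDATE on account; OLD/NEW are the row images.
conn.execute("""
    CREATE TRIGGER log_update AFTER UPDATE ON account
    BEGIN
        INSERT INTO audit_log VALUES (OLD.acc_no, OLD.balance, NEW.balance);
    END
""")

conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("UPDATE account SET balance = 150.0 WHERE acc_no = 1")
log = conn.execute("SELECT * FROM audit_log").fetchall()
```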

23. Define authentication.


Authentication is any process by which you verify that someone is who they claim they are.

24. Define authorization.


Authorization is the process of giving individuals access to system objects based on their identity.

25. Define embedded SQL


Embedded SQL is defined as the process of embedding SQL within procedural programming
languages. These languages (sometimes referred to as 3GLs) include C/C++, Cobol, Fortran, and Ada.
Thus embedded SQL provides the 3GL with a way to manipulate a database, supporting:
 highly customized applications
 background applications running without user intervention
 database manipulation which exceeds the abilities of simple SQL
 applications linking to Oracle packages, e.g. forms and reports
 applications which need customized window interfaces

26. What are the advantages and disadvantages of embedded SQL?


Advantages:
 Small footprint database
 High performance
 Extensive SQL support
Disadvantages:
 Knowledge of C or C++ required
 Complex development model
 SQL must be specified at design time

27. Define dynamic SQL.


It allows programs to construct and submit SQL queries at run time. Dynamic SQL statements are
stored as strings of characters that are entered when the program runs. They can be entered by the
programmer or generated by the program itself, but unlike static SQL statements, they are not
embedded in the source program. Also in contrast to static SQL statements, dynamic SQL statements
can change from one execution to the next.
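A small sketch of the idea with sqlite3: the statement text is assembled as a string at run time, while the comparison value travels as a bound parameter (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [("Ann", "Physics"), ("Bob", "Music")])

# Pretend these were chosen by the user at run time, not written in the program.
table, column = "instructor", "dept"
query = f"SELECT name FROM {table} WHERE {column} = ?"  # built as a string

names = [r[0] for r in conn.execute(query, ("Physics",))]
```

Binding the value with ? rather than splicing it into the string is the standard precaution against SQL injection when constructing dynamic SQL.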

28. Define cursors


A cursor is a mechanism for retrieving rows from the database one at a time.
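The one-row-at-a-time behavior is visible in sqlite3's cursor API (the table is an illustrative assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

cur = conn.cursor()
cur.execute("SELECT n FROM t ORDER BY n")
first = cur.fetchone()   # advances the cursor by one row
second = cur.fetchone()
rest = cur.fetchall()    # drains whatever remains
```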

29. Define distributed databases.


A distributed database consists of a collection of sites, connected together via some kind of
communication network, in which
 each site is a full database system in its own right, and
 the sites have agreed to work together so that a user at any site can access data anywhere
in the network exactly as if the data were all stored at the user's own site.

30. What are the advantages of distributed databases?
1. Local autonomy
2. No reliance on a central site (bottleneck, vulnerability)
3. Continuous operation (reliability, availability)
4. Location independence
5. Fragmentation independence
6. Replication independence
7. Distributed Query Processing (optimization)
8. Distributed Transaction Management (concurrency, recovery)
9. Hardware independence
10. OS independence
11. Network independence
12. DBMS independence

31. Define client server databases.


A client server database consists of three primary software components (aside from the network
software and operating systems of the computers in question): the client application (also called the
front end), the data access layer (also called middleware), and the database server (also called a
database engine, DBMS, data source, or back end).

32. What are the advantages of client server databases?


 Data sharing
 Integrity services
 Data interchangeability
 Location Independence of Data and processing

33. What are the disadvantages of client server databases?


 Traffic Congestion
 Robustness

34. Define Encryption.


Encryption is the process of transforming information (referred to as plaintext) using an algorithm
(called cipher) to make it unreadable to anyone except those possessing special knowledge, usually
referred to as a key.

UNIT III
1. Define normalization
Database normalization is the process of removing redundant data from the database to improve
storage efficiency, data integrity, and scalability. Normalization generally involves splitting existing
tables into multiple ones, which must be re-joined or linked each time a query is issued.

2. List out the drawbacks of Redundant Information


 Wastage of storage
 Update anomalies, namely:
 Insertion anomalies
 Deletion anomalies
 Modification anomalies

3. List out the advantages of normalization.


 Less storage space
 Quicker updates
 Less data inconsistency
 Clearer data relationships
 Easier to add data
 Flexible structure

4. Define functional dependency.


An attribute Y is said to have a functional dependency on a set of attributes X (written X →Y) if and
only if each X value is associated with precisely one Y value.
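The definition can be checked mechanically against a relation instance: X -> Y fails exactly when two rows agree on X but differ on Y. A sketch (the attribute names and rows are illustrative):

```python
def fd_holds(rows, X, Y):
    """Return True iff the functional dependency X -> Y holds in rows.

    rows is a list of dicts; X and Y are lists of attribute names.
    """
    seen = {}
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if x_val in seen and seen[x_val] != y_val:
            return False  # same X value, different Y value: FD violated
        seen[x_val] = y_val
    return True

rows = [
    {"ssn": 1, "ename": "Ann", "dept": "CS"},
    {"ssn": 2, "ename": "Bob", "dept": "CS"},
]
```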

5. Define Trivial functional dependency


A trivial functional dependency is a functional dependency of an attribute on a superset of itself,
i.e. X -> Y where Y is a subset of X.
{Ssn, Pnumber} -> {Ssn} Trivial
{Ssn} -> {Ename} Non trivial

6. Define Full functional dependency


An attribute is fully functionally dependent on a set of attributes X if it is
 functionally dependent on X, and
 not functionally dependent on any proper subset of X.
{Ssn,Pnumber} -> {Hours}

7. Define Transitive dependency


A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of
X→Y and Y→Z.

8. Define Multi valued dependency


A multi valued dependency is a constraint according to which the presence of certain rows in a table
implies the presence of certain other rows.

9. Define Join dependency


A table T is subject to a join dependency if T can always be recreated by joining multiple tables each
having a subset of the attributes of T.

10. What are the Inference Rules for FDs?


(Reflexivity) If Y is a subset of X, then X -> Y
(Augmentation) If X -> Y, then XZ -> YZ
(Transitivity) If X -> Y and Y -> Z, then X -> Z
(Decomposition) If X -> YZ, then X -> Y and X -> Z
(Union) If X -> Y and X -> Z, then X -> YZ
(Pseudotransitivity) If X -> Y and WY -> Z, then WX -> Z
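These rules are typically applied through the attribute-closure algorithm, which repeatedly fires any FD whose left side is already covered. A sketch (the FDs over attributes A-E are illustrative):

```python
def closure(attrs, fds):
    """Compute the closure X+ of an attribute set under a list of FDs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already in the closure, pull in the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]
```

Here closure({"A"}, fds) yields {A, B, C}, and adding D also brings in E via CD -> E. X is a superkey exactly when its closure contains every attribute of the relation.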

11. Define First Normal Form.
A relation is said to be in First Normal Form (1NF) if and only if each attribute of the relation is
atomic.
It does not allow composite or multivalued attributes.

12. Define second normal form.


A relation schema R is in second normal form (2NF) if it is in 1NF and every non-key attribute
A in R is fully functionally dependent on the primary key.

13. Define Third Normal Form.


A relation schema R is in third normal form (3NF) if a table is in second normal form (2NF) and
there are no transitive dependencies.

14. Define BCNF.


A relation is in BCNF, if and only if, every determinant is a candidate key.

15. Compare 3NF & BCNF.

3NF: A relation schema R is in 3NF if, for every nontrivial FD X -> Y in R, either X is a
superkey or every attribute of Y is part of some candidate key. 3NF may retain some
redundancy, but a 3NF decomposition can always be made dependency-preserving.
BCNF: A relation schema R is in Boyce-Codd Normal Form (BCNF) if, for every nontrivial FD
X -> Y in R, X is a superkey. BCNF removes all redundancies caused by FDs, but a BCNF
decomposition may fail to preserve some dependencies.

16. Define multivalued dependency.


A multivalued dependency on R, X ->>Y, says that if two tuples of R agree on all the attributes of X,
then their components in Y may be swapped, and the result will be two tuples that are also in the
relation.

17. Define fourth normal form.


A relation R is in 4NF if and only if, for every one of its nontrivial multivalued dependencies X ->> Y,
X is a superkey; that is, X is either a candidate key or a superset thereof.

18. Define fifth normal form.


An entity is in Fifth Normal Form (5NF) if, and only if, it is in 4NF and every join dependency for the
entity is a consequence of its candidate keys.

19. Define domain key normal form (DKNF).


A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of
keys and domains.

20. Define join dependency.


A join dependency is a constraint on the set of legal relations over a database scheme. A table T is
subject to a join dependency if T can always be recreated by joining multiple tables each having a
subset of the attributes of T. If one of the tables in the join has all the attributes of the table T, the
join dependency is called trivial. The join dependency plays an important role in the fifth normal
form, also known as project-join normal form.

21. Define Lossless join decomposition.


 A decomposition of R into (X, Y) and (X, R-Y) is a lossless-join decomposition if and only if
the multivalued dependency X ->> Y holds in R.

UNIT IV
1. Define transaction.
A transaction is a unit of program execution that accesses and possibly updates various data items.
A transaction must see a consistent database.

2. List the SQL statements used for transaction control.


 COMMIT: to save the changes.
 ROLLBACK: to roll back the changes.
 SAVEPOINT: creates points within groups of transactions in which to ROLLBACK
 SET TRANSACTION: Places a name on a transaction.
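COMMIT and ROLLBACK can be demonstrated with sqlite3, which exposes them as connection methods (the account table is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()

# ROLLBACK: the update is undone, so the stored balance is unchanged.
conn.execute("UPDATE account SET balance = balance - 50 WHERE acc_no = 1")
conn.rollback()
after_rollback = conn.execute("SELECT balance FROM account").fetchone()[0]

# COMMIT: the same update, saved this time.
conn.execute("UPDATE account SET balance = balance - 50 WHERE acc_no = 1")
conn.commit()
after_commit = conn.execute("SELECT balance FROM account").fetchone()[0]
```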

3. What are the transaction states?


 Active, the initial state; the transaction stays in this state while it is executing.
 Partially committed, after the final statement has been executed
 Failed, after the discovery that normal execution can no longer proceed.
 Aborted, after the transaction has been rolled backed and the database has been restored to
its state prior to the start of transaction.
 Committed, after successful completion

4. Define ACID properties.


 Atomicity. Either all operations of the transaction are properly reflected in the database or
none are.
 Consistency. Execution of a transaction in isolation preserves the consistency of the
database.
 Isolation. Although multiple transactions may execute concurrently, each transaction must
be unaware of other concurrently executing transactions. That is, for every pair of
transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj
started execution after Ti finished.
 Durability. After a transaction completes successfully, the changes it has made to the
database persist, even if there are system failures.

5. List the advantages of concurrency.


 increased processor and disk utilization
 reduced waiting time

6. List the commonly used Concurrency Control techniques.


 Lock-Based Protocols
 Timestamp-Based Protocols
 Validation-Based Protocols
 Multiple Granularity

7. Define schedule.
Schedules are sequences that indicate the chronological order in which instructions of concurrent
transactions are executed.
 A schedule for a set of transactions must consist of all instructions of those
transactions.
 It must preserve the order in which the instructions appear in each individual
transaction.

8. Define serializability.
Serializability is the classical concurrency scheme. It ensures that a schedule for executing
concurrent transactions is equivalent to one that executes the transactions serially in some order. It
assumes that all accesses to the database are done using read and write operations.

9. Define serial schedule.


A schedule that considers all the actions of a transaction T1, followed by all the actions of another
transaction T2 and so on is called serial schedule.

10. What are the types of serializability?


 conflict serializability
 view serializability

11. Define conflict serializability.


If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting
instructions, we say that S and S´ are conflict equivalent. A schedule S is conflict serializable if it is
conflict equivalent to a serial schedule.
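The standard test builds a precedence graph: add an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj, then check the graph for cycles. A sketch (the schedules at the bottom are illustrative):

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) triples, op in {"r", "w"}.

    The schedule is conflict serializable iff its precedence graph is acyclic.
    """
    edges = set()
    for i, (ti, oi, xi) in enumerate(schedule):
        for tj, oj, xj in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and xi == xj and "w" in (oi, oj):
                edges.add((ti, tj))

    # Detect a cycle by repeatedly removing nodes with no incoming edges.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        removable = {n for n in nodes
                     if not any(dst == n and src in nodes for src, dst in edges)}
        if not removable:
            return False  # every remaining node has an incoming edge: cycle
        nodes -= removable
    return True

serial_like = [("T1", "r", "A"), ("T1", "w", "A"),
               ("T2", "r", "A"), ("T2", "w", "A")]
interleaved = [("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]
```

The first schedule yields only the edge T1 -> T2 and is serializable; the second also yields T2 -> T1, closing a cycle.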

12. Define view serializability


A schedule S is view serializable if it is view equivalent to a serial schedule. Every conflict
serializable schedule is also view serializable.
13. Define recoverable schedule.
Recoverable schedule is the one where for each pair of transactions Ti and Tj such that Tj reads a
data item previously written by Ti, the commit operation of Ti appears before the commit operation
of Tj.

14. Define cascading rollback.


An uncommitted transaction must be rolled back because of the failure of the transaction from which
it read a data item. This phenomenon, in which a single transaction failure triggers a chain of
rollbacks and wastes otherwise completed work, is called cascading rollback.

15. What is blind write?


If a transaction writes a data item without having read it, the write is called a blind write.
Blind writes sometimes cause inconsistency.

16. Define lock. What is the use of locking?


A lock is a mechanism to control concurrent access to a data item. It is used to prevent concurrent
transactions from interfering with one another and enforcing an additional condition that
guarantees serializability.

17. What is shared lock and Exclusive lock?


A shared (S) lock permits the holding transaction to read the data item; other transactions may also
hold shared locks on it, but no transaction may write it. An exclusive (X) lock permits both read and
write of the data item, and only a single transaction can hold it at a time.

18. What are the pitfalls (problems) of lock based protocols?


 Deadlock
 Starvation

19. What are the three kinds of intent locks?


 intention-shared (IS): indicates explicit locking at a lower level of the tree but only
with shared locks.
 intention-exclusive (IX): indicates explicit locking at a lower level with exclusive or
shared locks
 shared and intention-exclusive (SIX): the subtree rooted by that node is locked
explicitly in shared mode and explicit locking is being done at a lower level with
exclusive-mode locks.

20. What is called as a time stamp?


A time stamp is a unique identifier for each transaction generated by the system. Concurrency
control protocols use this time stamp to ensure serializability.

21. When does a deadlock occur?

Deadlock occurs when one transaction T in a set of two or more transactions is waiting for some
item that is locked by some other transaction in the set.

22. What is meant by transaction rollback?


If a transaction fails for reasons like power failure, hardware failure or logical error in the
transaction after updating the database, it is rolled back to restore the previous value.

23. What are the objectives of concurrency control?


 To be resistant to site and communication failure.
 To permit parallelism to satisfy performance requirements.
 To place few constraints on the structure of atomic actions.

24. What is replication?


The process of generating and reproducing multiple copies of data at one or more
sites is called replication.

25. What are the two phases available in two phase locking protocol?
Phase 1: Growing Phase
 transaction may obtain locks
 transaction may not release locks
Phase 2: Shrinking Phase
 transaction may release locks
 transaction may not obtain locks
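The two phases can be illustrated with a minimal sketch (the class and method names are assumptions for illustration, not a real lock-manager API): once a transaction releases any lock, it enters the shrinking phase and may not acquire more.

```python
# Enforce the two-phase rule: all lock acquisitions must precede
# the first lock release.

class TwoPhaseTxn:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True   # the growing phase is over
        self.locks.discard(item)

t = TwoPhaseTxn("T1")
t.lock("A")
t.lock("B")      # still growing: allowed
t.unlock("A")    # first unlock starts the shrinking phase
try:
    t.lock("C")  # violates two-phase locking
except RuntimeError as e:
    print(e)     # T1: cannot lock in shrinking phase
```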

26. What is strict & rigorous two phase locking protocol?


 Strict two-phase locking. A transaction must hold all its exclusive locks till it commits/aborts.
 Rigorous two-phase locking is even stricter: here all locks are held till commit/abort. In this
protocol transactions can be serialized in the order in which they commit.

27. What benefit does strict two phase locking provides? What are its disadvantages?
Cascading rollbacks can be avoided by a modification of two-phase locking called the strict two-
phase locking protocol. This protocol requires not only that locking be two phase, but also that all
exclusive-mode locks taken by a transaction be held until that transaction commits. This
requirement ensures that any data written by an uncommitted transaction are locked in exclusive
mode until the transaction commits, preventing any other transaction from reading the data.
ADVANTAGE: It produces only cascade less schedules, recovery is very easy.
DISADVANTAGE: Concurrency is reduced, since the set of schedules obtainable is a subset of those
obtainable under ordinary two-phase locking.

28. Define upgrading & downgrading.


Upgrading -> converting a shared lock to an exclusive lock; allowed only in the growing phase.
Downgrading -> converting an exclusive lock to a shared lock; allowed only in the shrinking phase.

29. What is the role of lock manager?


A Lock manager can be implemented as a separate process to which transactions send lock and
unlock requests. The lock manager replies to a lock request by sending a lock grant messages. The
requesting transaction waits until its request is answered. The lock manager maintains a data
structure called a lock table to record granted locks and pending requests.

30. Define graph based protocol.


 Graph-based protocols are an alternative to two-phase locking
 Impose a partial ordering → on the set D = {d1, d2, ..., dh} of all data items.
o If di → dj, then any transaction accessing both di and dj must access di before
accessing dj.
o This implies that the set D may now be viewed as a directed acyclic graph, called a
database graph.
 The tree-protocol is a simple kind of graph protocol.

31. List down the SQL facilities for concurrency.


 READ UNCOMMITTED: permits dirty reads, non-repeatable reads, and phantoms
 READ COMMITTED: permits non-repeatable reads and phantoms, but prohibits dirty
reads
 REPEATABLE READ: permits phantoms, but prohibits dirty reads and non-repeatable reads
 SERIALIZABLE: prohibits dirty reads, non-repeatable reads, and phantoms; transactions
behave as if they were executed serially

32. Define Thomas's write rule.


A transaction Ti issues write(Q).
 If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously,
and the system assumed that that value would never be produced. Hence, the write
operation is rejected, and Ti is rolled back.
 If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this
write operation can be ignored
 Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
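The three cases above can be sketched directly as a decision function (the function and table names here are assumptions for illustration, not from these notes):

```python
# Timestamp-ordering write test with Thomas's write rule: obsolete
# writes are silently ignored instead of rolling the transaction back.

def write(item, txn_ts, ts_table):
    """ts_table maps item -> {'R': read-timestamp, 'W': write-timestamp}."""
    t = ts_table[item]
    if txn_ts < t['R']:
        return 'rollback'  # a younger txn already read the value Ti would overwrite
    if txn_ts < t['W']:
        return 'ignore'    # obsolete write: skipped under Thomas's write rule
    t['W'] = txn_ts        # accept the write and advance W-timestamp(Q)
    return 'write'

ts = {'Q': {'R': 5, 'W': 3}}
print(write('Q', 2, ts))   # rollback: TS(Ti)=2 < R-timestamp(Q)=5
ts = {'Q': {'R': 1, 'W': 4}}
print(write('Q', 2, ts))   # ignore: 2 >= R-timestamp but < W-timestamp(Q)=4
print(write('Q', 6, ts))   # write: W-timestamp(Q) becomes 6
```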

33. What are the phases in validation based protocol?


1. Read phase. During this phase, the system executes transaction Ti. It reads the values of
the various data items and stores them in variables local to Ti. It performs all write operations on
temporary local variables, without updates of the actual database.
2. Validation phase. Transaction Ti performs a validation test to determine whether it can
copy to the database the temporary local variables that hold the results of write operations without
causing a violation of serializability.
3. Write phase. If transaction Ti succeeds in validation (step 2), then the system applies the
actual updates to the database. Otherwise, the system rolls back Ti.
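The three phases can be sketched as follows. This is a deliberately simplified, version-based form of validation; the class and field names (`OptimisticTxn`, `start_version`) are assumptions for illustration, not a real DBMS API.

```python
# Optimistic (validation-based) concurrency: read into local copies,
# validate that no one committed since we started, then write.

class OptimisticTxn:
    def __init__(self, db, version):
        self.db = db
        self.start_version = version   # db version when the read phase began
        self.local = {}                # temporary local variables

    def read(self, key):               # read phase: copy into local storage
        self.local.setdefault(key, self.db['data'][key])
        return self.local[key]

    def write(self, key, value):       # writes are deferred to local copies
        self.local[key] = value

    def commit(self):
        # Validation phase: fail if any transaction committed since we began.
        if self.db['version'] != self.start_version:
            return False               # caller must roll back and retry
        # Write phase: apply the local updates and bump the version.
        self.db['data'].update(self.local)
        self.db['version'] += 1
        return True

db = {'data': {'x': 10}, 'version': 0}
t1 = OptimisticTxn(db, db['version'])
t2 = OptimisticTxn(db, db['version'])
t1.write('x', t1.read('x') + 1)
assert t1.commit()          # first committer wins
t2.write('x', t2.read('x') + 1)
assert not t2.commit()      # t2 fails validation and must restart
```

Real validation tests compare read and write sets of overlapping transactions; the single version counter here is the coarsest possible approximation of that check.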

34. Define multiple granularities.


The multiple-granularity locking protocol, which ensures serializability, is this: Each transaction Ti
can lock a node Q by following these rules:
 It must observe the lock-compatibility function of Figure 16.17.
 It must lock the root of the tree first, and can lock it in any mode.
 It can lock a node Q in S or IS mode only if it currently has the parent of Q locked in either IX
or IS mode.
 It can lock a node Q in X, SIX, or IX mode only if it currently has the parent of Q locked in
either IX or SIX mode.
 It can lock a node only if it has not previously unlocked any node (that is, Ti is two phase).
 It can unlock a node Q only if it currently has none of the children of Q locked.

35. What are the various deadlock prevention technique?


 wait-die scheme — non-preemptive
 wound-wait scheme — preemptive
 Timeout-Based Schemes
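The two timestamp-based schemes can be sketched as simple decision functions for the moment a transaction requests a lock that another holds (older transaction = smaller timestamp; the names are illustrative):

```python
# Deadlock-prevention decisions when `req` requests a lock held by `holder`.

def wait_die(req_ts, holder_ts):
    # Non-preemptive: an older requester waits; a younger one dies (aborts).
    return 'wait' if req_ts < holder_ts else 'die'

def wound_wait(req_ts, holder_ts):
    # Preemptive: an older requester wounds (aborts) the holder;
    # a younger one waits.
    return 'wound holder' if req_ts < holder_ts else 'wait'

print(wait_die(1, 2))     # wait  (older T1 waits for younger T2)
print(wait_die(2, 1))     # die   (younger T2 aborts and restarts)
print(wound_wait(1, 2))   # wound holder
print(wound_wait(2, 1))   # wait
```

In both schemes an aborted transaction restarts with its original timestamp, so it eventually becomes the oldest and cannot starve.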

36. Define log.
A log is kept on stable storage. The log is a sequence of log records, and maintains a record of
update activities on the database.

37. What is a checkpoint?


Check-pointing is saving enough state of a process so that the process can be restarted at the
point in the computation where the checkpoint was taken. At recovery time, transactions whose
updates were already output to disk before the checkpoint can be ignored, transactions that
committed after the checkpoint are redone, and transactions still active at the time of failure
are undone.

38. What are the concurrency problems are there?


 Lost updates
 Access to uncommitted data
 Non repeatable reads
 Phantom Read Phenomenon

39. List the SQL facilities for recovery.


 Backup
 RESTORE HEADERONLY - returns a list of all the backups in a backup file
 RESTORE LABELONLY - returns the backup media information
 RESTORE FILELISTONLY - returns a list of all of the files that were backed up for a given
backup
 RESTORE DATABASE - restores a full, differential, file, or filegroup backup
 RESTORE LOG - restores a transaction log backup
 RESTORE VERIFYONLY - verifies that the backup is readable by the RESTORE process

UNIT V
1. What are the various physical storage media?

 Cache
 Main memory
 Flash memory
 Magnetic disk
 Optical storage
 Magnetic tape

2. Describe flash memory.

 Flash memory is also known as electrically erasable programmable read-only memory
(EEPROM).
 Reading data from flash memory takes less than 100 nanoseconds (a nanosecond is 1/1000
of a microsecond), which is roughly as fast as reading data from main memory.
 Writing data to flash memory is more complicated—data can be written once, which takes
about 4 to 10 microseconds, but cannot be overwritten directly.

3. What are the drawbacks of flash memory?


o Writing data to flash memory is more complicated—data can be written once, which
takes about 4 to 10 microseconds, but cannot be overwritten directly. To overwrite
memory that has been written already, we have to erase an entire bank of memory at
once; it is then ready to be written again.
o It can support only a limited number of erase cycles, ranging from 10,000 to 1 million.

4. Define access time, seek time, rotational latency, data transfer rate & mean time to failure
Access time – the time it takes from when a read or write request is issued to when data transfer
begins. Consists of:
Seek time – time it takes to reposition the arm over the correct track.
Rotational latency – time it takes for the sector to be accessed to appear under the head.
Data-transfer rate – the rate at which data can be retrieved from or stored to the disk.
Mean time to failure (MTTF) – the average time the disk is expected to run continuously without
any failure.
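As a worked example with assumed, typical figures (4 ms average seek, 7200 rpm, 100 MB/s transfer rate, 4 KB block; none of these numbers are from the notes):

```python
# Expected access time = average seek + average rotational latency + transfer.

seek_ms = 4.0                      # average seek time (assumed)
rpm = 7200
rotation_ms = 60_000 / rpm         # one full rotation: ~8.33 ms
latency_ms = rotation_ms / 2       # on average, half a rotation: ~4.17 ms
transfer_rate_mb_s = 100
block_kb = 4
transfer_ms = block_kb / 1024 / transfer_rate_mb_s * 1000  # ~0.04 ms

access_ms = seek_ms + latency_ms + transfer_ms
print(f"{access_ms:.2f} ms")       # roughly 8.2 ms, dominated by seek + latency
```

The point of the arithmetic: for small blocks, transfer time is negligible, so minimizing seeks and rotational delay (e.g. by sequential layout) dominates disk performance.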

5. Define mirroring or shadowing.


Duplicate every disk. Logical disk consists of two physical disks. Every write is carried out on both
disks. Reads can take place from either disk. If one disk in a pair fails, data still available in the other.
Data loss would occur only if a disk fails, and its mirror disk also fails before the system is repaired.

6. How to measure the performance of RAID levels?


 Monetary cost
 Performance: Number of I/O operations per second, and bandwidth during normal operation
 Performance during failure
 Performance during rebuild

7. What is RAID?
RAID stands for Redundant Array of Inexpensive (or Independent) Disks. RAID is the organization of multiple disks
into a large, high performance logical disk. Disk arrays stripe data across multiple disks and access
them in parallel to achieve:
 Higher data transfer rates on large data accesses and
 Higher I/O rates on small data accesses.
Data striping also results in uniform load balancing across all of the disks, eliminating hot spots that
otherwise saturate a small number of disks, while the majority of disks sit idle.

8. What is the need for RAID?


 Monetary cost of extra disk storage requirements
 Performance requirements in terms of number of I/O operations
 Performance when a disk has failed
 Performance during rebuild (that is, while the data in a failed disk is being rebuilt on a new
disk)

9. What are the various file organizations?
 Heap – a record can be placed anywhere in the file where there is space
 Sequential – store records in sequential order, based on the value of the search key of each
record
 Hashing – a hash function computed on some attribute of each record; the result specifies in
which block of the file the record should be placed

10. Define data dictionary.


Data dictionary (also called system catalog) stores metadata: that is, data about data, such as
 Information about relations
 User and accounting information, including passwords
 Statistical and descriptive data
 Physical file organization information
 Information about indices

11. Define index.


An index is a data structure that provides an alternate, fast method of accessing records or
portions of records in a database or file.

12. What are the types of indexing?


• Ordered index – keeps the search-key values in sorted order and is used to access data by
order of values.
• Hash index – uses a hash function to distribute the search-key values uniformly across a
range of buckets.

13. What are Ordered Indices?


1. In order to allow fast random access, an index structure may be used.
2. A file may have several indices on different search keys.
3. If the file containing the records is sequentially ordered, the index whose search key specifies
the sequential order of the file is the primary index, or clustering index. Note: The search
key of a primary index is usually the primary key, but it is not necessarily so.
4. Indices whose search key specifies an order different from the sequential order of the file are
called the secondary indices, or nonclustering indices.

14. What are the factors that should be considered while choosing indexing methods?
• access type
• access time
• insertion time
• deletion time
• space overhead

15. What are the types of ordered indices?


• Dense index - an index record appears for every search-key value in the file.
• Sparse index - an index record that appears for only some of the values in the file.

16. Compare dense index and sparse index.

Dense index:
 An index record appears for every search-key value in the file.
 Each index record contains the search-key value and a pointer to the actual record.
 Lookup is generally faster, but the index needs more space.

Sparse index:
 Index records are created only for some of the search-key values.
 To locate a record, we find the index record with the largest search-key value less than or
equal to the search-key value we are looking for, then scan sequentially from there.
 Lookup is slower, but the index needs less space.
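A sparse-index lookup can be sketched as follows: binary-search the index for the largest key less than or equal to the search key, then scan the file sequentially from that point. All names and data here are illustrative.

```python
# Sparse index over a sorted file: only every 3rd key is indexed.
import bisect

data_file = [(k, f"rec{k}") for k in (10, 20, 30, 40, 50, 60)]
index = [(data_file[i][0], i) for i in range(0, len(data_file), 3)]
# index == [(10, 0), (40, 3)]

def lookup(key):
    keys = [k for k, _ in index]
    i = bisect.bisect_right(keys, key) - 1   # largest indexed key <= key
    if i < 0:
        return None                          # key precedes all indexed keys
    start = index[i][1]
    for k, rec in data_file[start:]:         # sequential scan from there
        if k == key:
            return rec
        if k > key:
            break
    return None

print(lookup(50))   # rec50 (reached via the index entry for 40)
print(lookup(35))   # None
```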

17. Define primary and secondary indices.


The primary index determines the physical organization of the records in the database. Each database
has one and only one primary index; in addition, it can have any number of secondary indices.

The secondary indices provide alternate access paths to the data by allowing different fields in the
record to be used as index keys. Each secondary index is stored in a separate area with its own
storage allocation, and any number of secondary indices can be dynamically created and deleted.
Index names for a file must be unique.

18. What are the advantages of indexed sequential file?


Advantages of Indexed Sequential Files
1. Allows records to be accessed directly or sequentially.
2. Direct access ability provides vastly superior (average) access times.
Disadvantages of Indexed Sequential Files
1. The fact that several tables must be stored for the index makes for a considerable storage
overhead.
2. As the items are stored in a sequential fashion this adds complexity to the addition/deletion
of records. Because frequent updating can be very inefficient, especially for large files, batch
updates are often performed.

19. When is it preferable to use dense index rather than sparse index? Explain your answer.
Dense index: An index record appears for every search-key value in the file. In a dense primary
index, the index record contains the search-key value and a pointer to the first data record with that
search-key value. The rest of the records with the same search key-value would be stored
sequentially after the first record, since, because the index is a primary one, records are sorted on
the same search key.
Advantage: a dense index is preferable when lookups are frequent and speed matters more than
space, since every search-key value has its own index entry and a record can generally be located
faster, without scanning forward from a preceding entry as a sparse index requires.

20. Define multi level indexing.


Multilevel indexing is used when the primary index does not fit in memory, making access expensive.
To reduce the number of disk accesses to index records, we treat the primary index kept on disk as
a sequential file and construct a sparse index on it.
 outer index – a sparse index of primary index
 inner index – the primary index file

21. Define hashing.


Hashing is a method to store data in an array so that storing, searching, inserting, and deleting
data is fast. For this, every record needs a unique key. The basic idea is not to search for the
correct position of a record with comparisons, but to compute the position within the array. The
function that returns the position is called the 'hash function' and the array is called a 'hash table'.
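A minimal sketch of the idea, using static hashing with chaining (the bucket count and records are made-up examples):

```python
# Static hashing with chaining: the hash function maps a key to one of
# a fixed number of buckets; collisions go on the bucket's chain.

NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]

def h(key):                            # the hash function
    return hash(key) % NUM_BUCKETS

def insert(key, record):
    buckets[h(key)].append((key, record))

def search(key):
    for k, rec in buckets[h(key)]:     # only one bucket is examined
        if k == key:
            return rec
    return None

insert(1001, "Alice")
insert(1005, "Bob")   # hashes to the same bucket; the chain absorbs it
print(search(1005))   # Bob
print(search(9999))   # None
```

Note that no comparisons against other buckets' keys are ever made: the position is computed, which is what makes hashed access fast.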

22. What are the types of hashing?


 Static hashing
 Dynamic hashing (Extendable hashing)

23. Compare static hashing & dynamic hashing.


Static Hashing has the number of primary pages in the directory fixed. Thus, when a bucket is full,
we need an overflow bucket to store any additional records that hash to the full bucket. This can be
done with a link to an overflow page, or a linked list of overflow pages. The linked list can be
separate for each bucket, or the same for all buckets that overflow.

In dynamic hashing, the size of the directory grows with the number of collisions, to accommodate
new records and avoid long overflow page chains.

24. What are the steps to be performed in query processing?


1) The scanning, parsing, and validating module produces an internal representation of the query.
2) The query optimizer module devises an execution plan which is the execution strategy to retrieve
the result of the query from the database files. A query typically has many possible execution
strategies differing in performance, and the process of choosing a reasonably efficient one is known
as query optimization. Query optimization is beyond this course
3) The code generator generates the code to execute the plan.
4) The runtime database processor runs the generated code to produce the query result.

25. Define database tuning.


Database Tuning is the process of continuing to revise/adjust the physical database design by
monitoring resource utilization as well as internal DBMS processing to reveal bottlenecks such as
contention for the same data or devices.

--------------- ALL THE BEST -------------
