DBMS
Introduction
o What is a database management system?
o Why study databases? Why not use file systems?
o The three-level architecture
o Schemas and instances
Overview
o Data models, E-R model, Relational model
o Data Definition Language, Data Manipulation Language
o SQL
o Transaction Management, Storage Management
o User types, database administrator
o System Structure
Data
Data is a raw fact or figure about an entity.
When activities take place in an organization, their effects need to be recorded; these recorded
facts are known as data.
Information
Processed data is called information
The purpose of data processing is to generate the information required for carrying out the
business activities.
Database
In simple terms, a database is a collection of related data.
Database Management System
A Database Management System (DBMS) is a collection of programs that enables users to create
and maintain a database.
The DBMS is hence a general-purpose software system that facilitates the process of defining,
constructing and manipulating databases for various applications.
History
o 1950s-60s: magnetic tape and punched cards
o 1960s-70s: hard disks, random access, file systems
o 1970s-80s: relational model becoming competitive
o 1980s-90s: relational model dominant, object-oriented databases
o 1990s-00s: web databases and XML
Advantages of DBMS.
Because data is managed centrally, a database system can overcome the disadvantages of a
file system-based approach.
1. Data independence: Application programs should not be exposed to details of data representation
and storage. The DBMS provides an abstract view that hides these details.
2. Efficient data access: The DBMS uses a variety of sophisticated techniques to store and retrieve data
efficiently.
3. Data integrity and security: Because data is always accessed through the DBMS, it can enforce
integrity constraints and access controls.
E.g.: Before inserting salary information for an employee, the DBMS can check that the new value
does not violate a declared constraint.
4. Data administration: When several users share data, centralizing its administration is an important
task. Experienced professionals can minimize data redundancy and fine-tune storage so that retrieval
time is reduced.
5. Concurrent access and crash recovery: The DBMS schedules concurrent accesses to the data and
protects users from the effects of system failures.
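The salary example in point 3 can be sketched with Python's built-in sqlite3 module standing in for the DBMS. The employee table, its columns and the rule that a salary must not be negative are hypothetical, chosen only to show a constraint being enforced at insertion time.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL NOT NULL CHECK (salary >= 0))"
)

def try_insert(emp_id, name, salary):
    """Return True if the DBMS accepted the row, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_id, name, salary))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, "Asha", 52000.0)   # satisfies the CHECK constraint
rejected = try_insert(2, "Ravi", -100.0)    # violates the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The application code never inspects the salary itself; the check lives in the schema, so every program that reaches the data through the DBMS is subject to it.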
1.3 The Levels of Abstraction
Entity-Relationship Model
An entity is a thing or object in the real world that is distinguishable from other objects.
The Entity – Relationship Model is based on a collection of basic objects, called entities, and
the relationship among these objects.
Example of schema in the entity-relationship model
In the relational model, data is described in terms of tables, which are called relations. In the
Customer and Account relations above, customer details are maintained in the Customer table
and their deposit details are maintained in the Account table.
Storage Manager
A storage manager is a program module that provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the system.
The storage manager is responsible for storing, retrieving and updating data in the database.
The storage manager components include:
o Authorization and integrity Manager: This tests for the satisfaction of integrity constraints and
checks the authority of users to access data.
o Transaction Manager: This ensures that the database remains in a consistent state despite system
failures, and that concurrent transaction executions proceed without conflicting.
o File Manager: This manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
o Buffer Manager: This is responsible for fetching data from disk storage into main memory, and
deciding what data to cache in main memory.
The storage manager implements several data structures as part of the physical system implementation:
data files, the data dictionary and indices.
Database Users
Users are differentiated by the way they expect to interact with the system
Application programmers – interact with system through DML calls
Sophisticated users – form requests in a database query language
Specialized users – write specialized database applications that do not fit into the traditional
data processing framework
Naïve users – invoke one of the permanent application programs that have been written
previously
E.g. people accessing database over the web, bank tellers, clerical staff
Database Administrator
Coordinates all the activities of the database system; the database administrator has a good
understanding of the enterprise’s information resources and needs.
Database administrator's duties include:
o Schema definition
o Storage structure and access method definition
o Schema and physical organization modification
o Granting user authority to access the database
o Specifying integrity constraints
o Acting as liaison with users
o Monitoring performance and responding to changes in requirements
Transaction Management
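The transaction-management component ensures that the database remains in a consistent state despite
system failures.
The concurrency-control manager controls the interaction among concurrent transactions to ensure
the consistency of the database.
o E.g. simultaneous withdrawals
ACID Properties:
A - Atomicity: either all operations of a transaction take effect or none of them do
C - Consistency: a transaction executed on its own takes the database from one consistent state to another
I - Isolation: concurrently executing transactions do not see each other's intermediate results
D - Durability: once a transaction has committed, its changes survive system failures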
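Atomicity can be illustrated with a funds transfer. The following is a minimal sketch, again assuming sqlite3 as the DBMS; the account table, the account numbers and the non-negative balance rule are hypothetical. The credit is deliberately applied before the debit so that a failed debit has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500.0), ("A-215", 700.0)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE account_number = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE account_number = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT account_number, balance FROM account"))

ok = transfer("A-101", "A-215", 200.0)       # both updates succeed
failed = transfer("A-101", "A-215", 1000.0)  # debit would go negative
final = balances()
```

In the second call the credit to A-215 has already been executed when the debit fails, yet the rollback removes it, so the database never shows a half-completed transfer.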
UNIT - I
ENTITY-RELATIONSHIP MODEL (Part -II)
Topics :
Entity Sets
Relationship Sets
Mapping Constraints
Keys
E-R Diagram
Extended E-R Features
Design of an E-R Database Schema
Reduction of an E-R Schema to Tables
Entity Sets
Attributes
An entity is represented by a set of attributes, that is, descriptive properties possessed by all members
of an entity set.
Domain – the set of permitted values for each attribute
Attribute types:
Simple and composite attributes.
Simple Attribute
Single-valued and multi-valued attributes
E.g. multi valued attribute: phone-numbers
Derived attributes
Key Attribute
Represents the primary key (the main characteristic of an entity). It is an attribute that has a distinct
value for each entity in an entity set. For example, Roll number in a Student entity type.
Composite Attributes
Relationship Sets
A relationship set is a set of relationships of the same type. It is a mathematical relation on n >= 2
entity sets. If E1, E2, …, En are entity sets, then a relationship set R is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where each tuple (e1, e2, …, en) in R is called a relationship.
The way in which two or more entity types are related is called a relationship type.
For example, consider a relationship type WORKS_FOR between the two entity types EMPLOYEE and
DEPARTMENT, which associates or links each employee with the department the employee works for.
The WORKS_FOR relation type is shown as –
In the above figure, each instance of relation type WORKS_FOR i.e.(r1, r2,…,r5) is connected to instances
of employee and department entities. Employee e1, e2 and e5 work for department d2 and employee e3 and
e4 work for department d1.
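The WORKS_FOR instance just described can be written down directly as a set of pairs. This is a small sketch assuming only the five employees and two departments named in the text; any further entities in the figure are not modelled.

```python
from itertools import product

employees = {"e1", "e2", "e3", "e4", "e5"}
departments = {"d1", "d2"}

# Each pair (employee, department) is one relationship instance r1..r5.
works_for = {("e1", "d2"), ("e2", "d2"), ("e5", "d2"),
             ("e3", "d1"), ("e4", "d1")}

# Every possible pairing of an employee with a department.
all_pairs = set(product(employees, departments))

# A relationship set must be a subset of that Cartesian product.
is_relationship_set = works_for <= all_pairs
staff_of_d2 = {e for (e, d) in works_for if d == "d2"}
```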
Notation to Represent Relation Type in ER Diagram-
Relation types are represented as diamond shaped boxes.
The above diagram can be read as – a supplier supplies the parts to projects
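N-ary Relationship Set – A relationship type of degree n is called an n-ary relationship. For
example, the relationship read above, in which a supplier supplies parts to projects, is a ternary
(n = 3) relationship.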
Role Names-
A role name signifies the role that a participating entity plays in a relationship instance. Role names
help to explain what the relationship means.
In the WORKS_FOR relationship type from the first example, employee plays the role of worker and
department plays the role of employer (because a department consists of a number of employees).
Recursive Relationship
If the same entity type participates more than once in a relationship type in different roles, the
relationship type is called a recursive relationship. For example, in the figure below REPORTS_TO is
a recursive relationship, as the Employee entity type plays two roles: 1) Supervisor and 2) Subordinate.
Mapping Cardinalities
Express the number of entities to which another entity can be associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be one of the following types:
o One to one
o One to many
o Many to one
o Many to many
• Access-date can be made an attribute of account, instead of a relationship attribute, if each account can
have only one customer
o I.e., the relationship from account to customer is many to one, or equivalently, the relationship
from customer to account is one to many
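The four mapping cardinalities can be told apart by counting partners on each side of a binary relationship set. The sketch below is an illustration under an assumption: it classifies one given instance (a set of pairs), whereas a mapping cardinality is a constraint on every legal instance, so the result only shows which cardinality the sample data is consistent with. The function name and the sample pairs are hypothetical.

```python
from collections import Counter

def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs, read from A to B."""
    pairs = set(pairs)
    partners_of_a = Counter(a for a, _ in pairs)  # how many b each a is linked to
    partners_of_b = Counter(b for _, b in pairs)  # how many a each b is linked to
    a_to_one = all(n <= 1 for n in partners_of_a.values())
    b_to_one = all(n <= 1 for n in partners_of_b.values())
    if a_to_one and b_to_one:
        return "one-to-one"
    if b_to_one:
        return "one-to-many"   # an a may have many b; each b has at most one a
    if a_to_one:
        return "many-to-one"   # each a has at most one b; a b may have many a
    return "many-to-many"

# (customer, loan) pairs for a borrower relationship
kinds = [
    mapping_cardinality({("c1", "L1"), ("c2", "L2")}),
    mapping_cardinality({("c1", "L1"), ("c1", "L2"), ("c2", "L3")}),
    mapping_cardinality({("c1", "L1"), ("c2", "L1")}),
    mapping_cardinality({("c1", "L1"), ("c2", "L1"), ("c1", "L2")}),
]
```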
E-R Diagrams
o Composite attributes: The attributes that can be divided into subparts are known as
composite attributes. Ex: name can be divided into first name, middle name and last name.
o Multivalued attributes: The attributes that have many values for a particular entity. Ex:
phone number. A customer can have more than one phone number.
o Derived attribute: The value for this type of attribute can be derived from the values of
other related attributes or entities.
Cardinality Constraints
We express cardinality constraints by drawing either a directed line (→), signifying “one,” or an
undirected line (—), signifying “many,” between the relationship set and the entity set.
E.g.: One-to-one relationship:
o A customer is associated with at most one loan via the relationship borrower
o A loan is associated with at most one customer via borrower
One-To-Many Relationship
In a one-to-many relationship from customer to loan, a loan is associated with at most one customer
via borrower, while a customer is associated with several (including 0) loans via borrower.
Many-To-One Relationship
In a many-to-one relationship from customer to loan, a loan is associated with several (including 0)
customers via borrower, while a customer is associated with at most one loan via borrower.
Many to Many Relationship
Total participation (indicated by double line): every entity in the entity set participates in at least one
relationship in the relationship set
o E.g. participation of loan in borrower is total
Every loan must have a customer associated to it via borrower.
Partial participation: Some entities may not participate in any relationship in the relationship set.
o E.g. participation of customer in borrower is partial
Keys
A super key of an entity set is a set of one or more attributes whose values uniquely determine each
entity.
A candidate key of an entity set is a minimal super key
o Customer-id is candidate key of customer
o account-number is candidate key of account
Although several candidate keys may exist, one of the candidate keys is selected to be the primary
key.
Keys for Relationship Sets
The combination of primary keys of the participating entity sets forms a super key of a relationship
set.
o (customer-id, account-number) is a super key of depositor
o NOTE: this means a pair of entities can have at most one relationship in a particular
relationship set.
E.g. if we wish to track all access-dates to each account by each customer, we cannot
create a separate relationship for each access. We can use a multivalued attribute instead.
The mapping cardinality of the relationship set must be considered when deciding what the
candidate keys are.
The semantics of the relationship set must be considered when selecting the primary key if there is
more than one candidate key.
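The difference between a super key and a candidate key (a minimal super key) can be made concrete with a brute-force search. This is a sketch under a clear assumption: keys are constraints on all legal instances of an entity set, so testing one sample instance can only rule keys out or show what is consistent with the data. The function names and customer rows are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys of this instance, smallest attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

customers = [
    {"customer_id": "C1", "customer_name": "Hayes", "customer_street": "Main"},
    {"customer_id": "C2", "customer_name": "Hayes", "customer_street": "North"},
    {"customer_id": "C3", "customer_name": "Jones", "customer_street": "Main"},
]
attributes = ["customer_id", "customer_name", "customer_street"]
keys = candidate_keys(customers, attributes)
```

For this sample, customer_id alone is a candidate key, and so is the pair (customer_name, customer_street); the designer would normally pick customer_id as the primary key because the name and street pair is unique here only by coincidence.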
We allow at most one arrow out of a ternary (or greater degree) relationship to indicate a cardinality
constraint
E.g. an arrow from works-on to job indicates each employee works on at most one job at any branch.
If there is more than one arrow, there are two ways of defining the meaning.
o E.g a ternary relationship R between A, B and C with arrows to B and C could mean
o 1. each A entity is associated with a unique entity from B and C or
o 2. each pair of entities from (A, B) is associated with a unique C entity, and each pair
(A, C) is associated with a unique B
o Each alternative has been used in different formalisms
o To avoid confusion we outlaw more than one arrow
Design Issues
Use of entity sets vs. attributes : Choice mainly depends on the structure of the enterprise being
modeled, and on the semantics associated with the attribute in question.
Use of entity sets vs. relationship sets: Possible guideline is to designate a relationship set to describe
an action that occurs between entities.
Binary versus n-ary relationship sets: Although it is possible to replace any nonbinary (n-ary, for n
> 2) relationship set by a number of distinct binary relationship sets, an n-ary relationship set shows
more clearly that several entities participate in a single relationship.
Placement of relationship attributes
Summary of Symbols Used in E-R Notation
Specialization
We identify distinctive sub-groupings within an entity set. These sub-groupings become lower-level
entity sets.
They have attributes or participate in relationships that do not apply to the higher-level entity
set.
Depicted by a triangle component labeled ISA
E.g. customer “is a” person
Inheritance
A lower-level entity set inherits all the attributes and relationship participation of the higher-level
entity set to which it is linked.
Specialization Example
Generalization
Constraint on whether or not entities may belong to more than one lower-level entity set within a
single generalization.
o Disjoint
an entity can belong to only one lower-level entity set
write disjoint next to the ISA triangle
o Overlapping
an entity can belong to more than one lower-level entity set
Completeness constraint
o Does an entity in the higher-level entity set have to belong to at least one of the lower-level
entity sets?
Total
o an entity must belong to one of the lower-level entity sets
Partial
o an entity need not belong to one of the lower-level entity sets
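The disjointness and completeness constraints can be checked mechanically on sample data. The sketch below assumes entities are represented by identifiers and each lower-level entity set by a Python set; the function name and the sample sets are hypothetical.

```python
def check_generalization(higher, lower_sets):
    """Report whether the lower-level sets are disjoint and whether they cover higher."""
    members = [set(s) for s in lower_sets.values()]
    covered = set().union(*members)
    # Disjoint: no entity is counted in more than one lower-level set.
    disjoint = sum(len(s) for s in members) == len(covered)
    # Total: every higher-level entity appears in some lower-level set.
    total = set(higher) <= covered
    return {"disjoint": disjoint, "total": total}

accounts = {"a1", "a2"}
account_kinds = {"savings": {"a1"}, "checking": {"a2"}}

persons = {"p1", "p2", "p3"}
person_kinds = {"customer": {"p1", "p2"}, "employee": {"p2"}}

account_result = check_generalization(accounts, account_kinds)
person_result = check_generalization(persons, person_kinds)
```

The account example is disjoint and total, while the person example is overlapping (p2 is both customer and employee) and partial (p3 belongs to neither lower-level set).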
Aggregation
Faithfulness
o Entities, attributes and relationships should reflect reality
o Sometimes the correct approach is not obvious
E.g. course and instructor entities and teaching relationship
What are the cardinality constraints? It depends…
Avoiding Redundancy
o No information should be repeated
Wastes space, leads to consistency problems
Simplicity
o Some relationships may be unnecessary
E.g. student member-of student-body attends course vs student attends course
Choosing the right kind of element
o The use of an attribute or entity set to represent an object
o Whether a real-world concept is best expressed by an entity set or a relationship set
Choosing the right relationships
o The use of a ternary relationship versus a pair of binary relationships
o The use of a strong or weak entity set.
o The use of specialization/generalization – contributes to modularity in the design.
o The use of aggregation – can treat the aggregate entity set as a single unit without concern for
the details of its internal structure.
E-R Diagram for a Banking Enterprise
Reduction of an E-R Schema to Tables
Primary keys allow entity sets and relationship sets to be expressed uniformly as tables which
represent the contents of the database.
A database which conforms to an E-R diagram can be represented by a collection of tables.
For each entity set and relationship set there is a unique table which is assigned the name of the
corresponding entity set or relationship set.
Each table has a number of columns (generally corresponding to attributes), which have unique
names.
Converting an E-R diagram to a table format is the basis for deriving a relational database design
from an E-R diagram.
Composite attributes are flattened out by creating a separate attribute for each component attribute
E.g. given entity set customer with composite attribute name with component attributes
first-name and last-name the table corresponding to the entity set has two attributes
name.first-name and name.last-name
A weak entity set becomes a table that includes a column for the primary key of the identifying
strong entity set
Representing Relationship Sets as Tables
A many-to-many relationship set is represented as a table with columns for the primary keys of the
two participating entity sets, and any descriptive attributes of the relationship set.
E.g.: table for relationship set borrower
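The reduction rules above can be sketched with sqlite3. The column names and sample rows are assumptions made for illustration: the composite attribute name is flattened into first_name and last_name, and the many-to-many relationship set borrower becomes a table holding the primary keys of customer and loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY,
    first_name  TEXT,   -- flattened from the composite attribute name
    last_name   TEXT
);
CREATE TABLE loan (
    loan_number TEXT PRIMARY KEY,
    amount      REAL
);
-- Many-to-many relationship set: one column per participating primary key.
CREATE TABLE borrower (
    customer_id TEXT REFERENCES customer(customer_id),
    loan_number TEXT REFERENCES loan(loan_number),
    PRIMARY KEY (customer_id, loan_number)
);
INSERT INTO customer VALUES ('C1', 'Ann', 'Hayes'), ('C2', 'Bob', 'Jones');
INSERT INTO loan VALUES ('L-17', 1000.0);
INSERT INTO borrower VALUES ('C1', 'L-17'), ('C2', 'L-17');
""")

holders = conn.execute(
    "SELECT c.last_name, b.loan_number "
    "FROM borrower AS b JOIN customer AS c ON c.customer_id = b.customer_id "
    "ORDER BY c.last_name"
).fetchall()
```

Because borrower is its own table, a loan held jointly by two customers needs two borrower rows but only one loan row, so the amount is stored once.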
While identifying the attributes of an entity set, it is sometimes not clear, whether a property should be
modeled as an attribute or as an entity set.
Example: consider the entity set employee with attributes employee-name and telephone-number. It can
easily be argued that a telephone is an entity in its own right, with attributes telephone-number and location.
If we take this point of view, the employee entity set must be redefined as follows: employee keeps the
attribute employee-name, a separate telephone entity set has the attributes telephone-number and location,
and a relationship set associates employees with the telephones that they have.
o In the first case, the definition implies that every employee has exactly one telephone number
associated with him.
o In the second case, the definition implies that an employee may have several telephone
numbers associated with him.
Thus, the second definition is more general than the first one, and may more accurately reflect the
real-world situation. Even if we are given that each employee has only one telephone number associated
with him, the second definition may still be more appropriate because a telephone may be shared among
several employees.
However, it is appropriate to keep employee-name as an attribute of the employee entity set rather than
as an entity, because most employees have a single name.
Example: assume that a bank loan is modeled as an entity. An alternative is to model a loan not as an
entity, but rather as a relationship between customers and branches, with loan-number and amount as
descriptive attributes. Each loan is represented by a relationship between a customer and a branch.
If every loan is held by exactly one customer and is associated with exactly one branch, we
may find the design in which a loan is represented as a relationship satisfactory. With this design,
however, we cannot conveniently represent a situation in which several customers hold a loan jointly.
We must define a separate relationship for each holder of the joint loan, and we must replicate the values
of the descriptive attributes loan-number and amount in each such relationship. Each such relationship
must, of course, have the same values for loan-number and amount.
One possible guideline in determining whether to use an entity set or a relationship set is to designate a
relationship set to describe an action that occurs between entities. This approach can also be useful in
deciding whether certain attributes may be more appropriately expressed as relationships.
It is always possible to replace a non-binary (n-ary, for n>2) relationship set by a number of distinct
binary relationship sets.
Example: for simplicity, consider the abstract ternary (n = 3) relationship set R, relating entity sets A, B
and C. We replace the relationship set R by an entity set E, and create three relationship sets:
* RA, relating E and A
* RB, relating E and B
* RC, relating E and C
If the relationship set R has any attributes, these are assigned to entity set E; otherwise, a special
identifying attribute is created for E. For each relationship (ai, bi, ci) in the relationship set R, we create a
new entity ei in the entity set E.
Then, in each of the three new relationship sets, we insert a relationship as follows:
* (ei, ai) in RA
* (ei, bi) in RB
* (ei, ci) in RC
We can generalize this process in a straightforward manner to n-ary relationship sets. Thus,
conceptually, we can restrict the E-R model to include only binary relationship sets.
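The replacement procedure can be sketched as two small functions. The names binarize and rebuild, and the works-on sample data, are hypothetical; the identifiers e1, e2, … play the role of the special identifying attribute created for E.

```python
def binarize(r):
    """Replace a ternary relationship set r by an entity set E and binary sets RA, RB, RC."""
    e, ra, rb, rc = [], set(), set(), set()
    for i, (a, b, c) in enumerate(sorted(r), start=1):
        ei = f"e{i}"          # new entity standing for the relationship (a, b, c)
        e.append(ei)
        ra.add((ei, a))
        rb.add((ei, b))
        rc.add((ei, c))
    return e, ra, rb, rc

def rebuild(e, ra, rb, rc):
    """Recover the ternary relationship set from its binary replacement."""
    a_of, b_of, c_of = dict(ra), dict(rb), dict(rc)
    return {(a_of[x], b_of[x], c_of[x]) for x in e}

# (employee, branch, job) relationships
works_on = {("Jones", "Downtown", "teller"),
            ("Jones", "Uptown", "auditor"),
            ("Hayes", "Downtown", "teller")}

e, ra, rb, rc = binarize(works_on)
round_trip = rebuild(e, ra, rb, rc)
```

Each new entity participates in exactly one relationship in each of RA, RB and RC, which is why the original set can be rebuilt without loss.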
The choice between using aggregation or a ternary relationship is mainly determined by the existence of
a relationship that relates a relationship set to an entity set. The choice may also be guided by certain
integrity constraints that we want to express.
Example: consider the constraint that each sponsorship be monitored by at most one employee. We cannot
express this constraint in terms of a ternary sponsors relationship set. On the other hand, we can easily
express the constraint by drawing an arrow from the aggregated relationship Sponsors to the relationship
Monitors. Thus, the presence of such a constraint serves as another reason for using aggregation rather
than a ternary relationship set.
Designing a database for a large organization takes the effort of more than a single designer. The
conceptual design diagrammatically represents the complete database and enables the users who provide
inputs to the database to understand its complete functionality. Two approaches are common:
o In the first method, the requirements of all the users are collected. Conflicting requirements are
resolved and a final conceptual view is generated to satisfy the requirements of all users.
o In the other method, each user provides his requirements and the designer generates a conceptual
view for them. Conceptual views are generated in this way for all user requirements, and a
comprehensive conceptual view that satisfies all of them is then produced.
CASE STUDY
We have covered all the basic terms of the E-R diagram. Now, let us see how to draw one. In the ER
model, objects of similar structure are collected into an entity set. The relationship between entity sets
is represented by a named E-R relationship, which may be one-to-one, one-to-many, many-to-one or
many-to-many, and which maps one entity set to another entity set. A general ER diagram is shown as-
In the figure, there are two entities, ENTITY-1 and ENTITY-2, having attributes (Atr11, Atr12, ... Atr1m) and
(Atr21, Atr22, ... Atr2n) respectively, connected via a many-to-many relationship (M:N). The attributes of
RELATIONSHIP are (AtrR1, AtrR2, ... AtrRO).
Need of ER Diagram -
ER diagrams are useful for representing the relationships among entities. They show basic data
structures in a way that different people can understand. Many types of people are involved in the database
environment, including programmers, designers, managers and end users. Not all of them work with
databases directly, and some are less familiar than others with how software is built, so a conceptual
model like the ERD helps present the design in a way they can all understand.
Example of drawing ER Diagram -
Question: Make an ER diagram for a company database with the following description:
1. The company is organized into departments. Each department has a unique name and a unique
number. A department may have several locations.
2. A department controls a number of projects, each of which has a unique name, a unique number and a
single location.
3. We store each employee's name, social security number, address and salary. An employee is assigned to
one department but may work on several projects, which are not necessarily controlled by the same
departments.
4. We want to keep track of the dependents of each employee for insurance purposes. We keep each
dependent's name, age and relationship to the employee.
Answer :
Entities :
Attributes :
Primary Keys :
Attributes Types :
Cardinality Constraints :
Database Management Systems, R. Ramakrishnan and J. Gehrke
Formal Relational Query Languages
Two mathematical query languages form the basis for “real” languages (e.g. SQL), and for
implementation:
o Relational Algebra: More operational, very useful for representing execution plans.
o Relational Calculus: Lets users describe what they want, rather than how to compute it.
(Non-operational, declarative.)
Understanding Algebra & Calculus is key to understanding SQL, query processing!
Preliminaries
A query is applied to relation instances, and the result of a query is also a relation instance.
o Schemas of input relations for a query are fixed (but the query will run regardless of instance!)
o The schema for the result of a given query is also fixed! It is determined by the definition of the
query language constructs.
Positional vs. named-field notation:
o Positional notation is easier for formal definitions, named-field notation is more readable.
o Both are used in SQL
Example Instances
“Sailors” and “Reserves” relations for our examples.
R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96
S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
Relational Algebra
Basic operations:
o Selection (σ): Selects a subset of rows from a relation.
o Projection (π): Deletes unwanted columns from a relation.
o Cross-product (×): Allows us to combine two relations.
o Set-difference (−): Tuples in relation 1, but not in relation 2.
o Union (∪): Tuples in relation 1 or in relation 2.
Union, Intersection, Set-Difference
All of these operations take two input relations, which must be union-compatible:
o Same number of fields.
o ‘Corresponding’ fields have the same type.
What is the schema of the result?
S1 ∪ S2
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0
S1 − S2
sid  sname   rating  age
22   dustin  7       45.0
S1 ∩ S2
sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0
Cross-Product
Each row of S1 is paired with each row of R1.
The result schema has one field per field of S1 and R1, with field names ‘inherited’ if possible.
o Conflict: Both S1 and R1 have a field called sid.
S1 × R1
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96
Renaming operator: ρ(C(1 → sid1, 5 → sid2), S1 × R1)
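The basic operators can be tried out on the example instances. This is a minimal sketch in which a relation instance is a Python set of tuples and fields are addressed by position (0-based here, whereas the renaming operator above counts from 1); the helper names select, project and cross are hypothetical, and S1, S2 and R1 are assumed to hold the rows written out in the code.

```python
# Relation instances as sets of tuples: (sid, sname, rating, age) and (sid, bid, day).
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def select(rel, pred):
    """Selection: keep the rows that satisfy pred."""
    return {t for t in rel if pred(t)}

def project(rel, positions):
    """Projection: keep the listed columns; duplicates vanish because the result is a set."""
    return {tuple(t[i] for i in positions) for t in rel}

def cross(r, s):
    """Cross-product: pair every row of r with every row of s."""
    return {a + b for a in r for b in s}

union = S1 | S2
intersection = S1 & S2
difference = S1 - S2
pairs = cross(S1, R1)
high_rated = select(S2, lambda t: t[2] > 8)   # rating > 8
ages = project(S2, [3])                       # age column only
```

Because every helper returns another set of tuples, the calls can be nested freely, which mirrors the fact that each algebra operation returns a relation.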
Joins
Condition Join: R ⋈_c S = σ_c(R × S)
S1 ⋈_(S1.sid < R1.sid) R1
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96
o Result schema same as that of cross-product.
o Fewer tuples than cross-product, might be able to compute more efficiently.
o Sometimes called a theta-join.
Equi-Join: A special case of condition join where the condition c contains only equalities.
S1 ⋈_sid R1
sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96
o Result schema similar to cross-product, but only one copy of fields for which equality is specified.
Natural Join: Equijoin on all common fields.
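A condition join is a selection applied to a cross-product, and an equi-join additionally drops the duplicated field. The sketch below assumes relation instances are sets of tuples addressed by 0-based position; condition_join and equijoin are hypothetical helper names, and S1 and R1 are the Sailors and Reserves instances with the rows written out in the code.

```python
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def condition_join(r, s, cond):
    """Selection over the cross-product: keep the pairs for which cond(a, b) holds."""
    return {a + b for a in r for b in s if cond(a, b)}

def equijoin(r, s, r_pos, s_pos):
    """Join on r[r_pos] == s[s_pos], keeping only one copy of the equated field."""
    return {a + b[:s_pos] + b[s_pos + 1:]
            for a in r for b in s if a[r_pos] == b[s_pos]}

# S1.sid < R1.sid
theta = condition_join(S1, R1, lambda a, b: a[0] < b[0])
# S1.sid = R1.sid
equi = equijoin(S1, R1, 0, 0)
```

This version still enumerates every pair, so it shows the meaning of the operators rather than an efficient way to evaluate them.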
Division
Let A have two fields, x and y, and let B have only the field y. A/B contains all x values such that
for every y value in B there is an xy tuple in A.
Examples of Division A/B
Expressing A/B Using Basic Operators
Division is not an essential op; just a useful shorthand.
o (Also true of joins, but joins are so common that systems implement joins specially.)
Idea: For A/B, compute all x values that are not ‘disqualified’ by some y value in B.
o An x value is disqualified if by attaching a y value from B, we obtain an xy tuple that is not in A.
Disqualified x values: π_x((π_x(A) × B) − A)
A/B: π_x(A) − all disqualified tuples
Find names of sailors who've reserved boat #103
Solution 1: π_sname((σ_(bid=103) Reserves) ⋈ Sailors)
Solution 2: ρ(Temp1, σ_(bid=103) Reserves)
            ρ(Temp2, Temp1 ⋈ Sailors)
            π_sname(Temp2)
Solution 3: π_sname(σ_(bid=103)(Reserves ⋈ Sailors))
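The expression of A/B through basic operators can be followed step by step in code. In this sketch A is a set of (x, y) pairs and B a set of one-field (y,) tuples; the function name divide and the sample (sid, bid) data are hypothetical.

```python
def divide(a, b):
    """A/B computed as pi_x(A) minus the disqualified x values."""
    pi_x = {(x,) for (x, _) in a}                          # pi_x(A)
    candidates = {(x, y) for (x,) in pi_x for (y,) in b}   # pi_x(A) x B
    missing = candidates - a                               # xy tuples that are not in A
    disqualified = {(x,) for (x, _) in missing}            # pi_x of the missing tuples
    return pi_x - disqualified

# Hypothetical reservations as (sid, bid) pairs.
A = {(22, 101), (22, 103), (58, 103), (31, 101)}
all_boats = {(101,), (103,)}
only_103 = {(103,)}

reserved_all = divide(A, all_boats)
reserved_103 = divide(A, only_103)
```

Sailor 22 is the only one paired with every boat in all_boats, while dividing by the single boat 103 also admits sailor 58.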
Find names of sailors who've reserved a red boat
Information about boat color is only available in Boats, so an extra join is needed:
π_sname((σ_(color='red') Boats) ⋈ Reserves ⋈ Sailors)
A more efficient solution:
π_sname(π_sid((π_bid σ_(color='red') Boats) ⋈ Reserves) ⋈ Sailors)
A query optimizer can find this given the first solution!
Find sailors who've reserved a red or a green boat
Can identify all red or green boats, then find sailors who've reserved one of these boats:
ρ(Tempboats, σ_(color='red' ∨ color='green') Boats)
π_sname(Tempboats ⋈ Reserves ⋈ Sailors)
Can also define Tempboats using union! (How?)
What happens if ∨ is replaced by ∧ in this query?
Find sailors who've reserved a red and a green boat
Find the names of sailors who've reserved all boats
Summary
The relational model has rigorously defined query languages that are simple and powerful.
Relational algebra is more operational; useful as internal representation for query evaluation plans.
Several ways of expressing a given query; a query optimizer should choose the most efficient version.
Relational Calculus
Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC).
Calculus has variables, constants, comparison ops, logical connectives and quantifiers.
o TRC: Variables range over (i.e., get bound to) tuples.
o DRC: Variables range over domain elements (= field values).
o Both TRC and DRC are simple subsets of first-order logic.
Expressions in the calculus are called formulas. An answer tuple is essentially an assignment of constants
to variables that make the formula evaluate to true.
Domain Relational Calculus
A query has the form {<x1, x2, ..., xn> | p(<x1, x2, ..., xn>)}. The answer includes all tuples
<x1, x2, ..., xn> that make the formula p(<x1, x2, ..., xn>) true.
DRC Formulas
An atomic formula is <x1, x2, ..., xn> ∈ Rname, X op Y, or X op constant, where op is one of
<, >, =, ≤, ≥, ≠.
A formula is an atomic formula, or ¬p, p ∧ q, p ∨ q, ∃X (p(X)) or ∀X (p(X)), where p and q are formulas.
Free and Bound Variables
The use of quantifiers ∃X and ∀X in a formula is said to bind X.
o A variable that is not bound is free.
Let us revisit the definition of a query: {<x1, x2, ..., xn> | p(<x1, x2, ..., xn>)}
There is an important restriction: the variables x1, ..., xn that appear to the left of ‘|’ must be
the only free variables in the formula p(...).
Find all sailors with a rating above 7
{<I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7}
The condition <I, N, T, A> ∈ Sailors ensures that the domain variables I, N, T and A are bound to
fields of the same Sailors tuple.
The term <I, N, T, A> to the left of ‘|’ (which should be read as such that) says that every tuple
<I, N, T, A> that satisfies T > 7 is in the answer.
Modify this query to answer:
o Find sailors who are older than 18 or have a rating under 9, and are called ‘Joe’.
✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ ❐ ✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ Ø
ñ◗ò ó▼ô❣õ✌ö✦ò ÷ ø✦ù õ❍ù ö✦ú û✌ôýüýþ❣ÿ ❁ø ✁ ✄✂ ☎✏û❣ù û✏õ✌û❁ù✆☎✏û✌ô✞✝❴ø▼ö✦ú✠✟☛✡✌☞✌✍ ñ◗ò ó▼ô❣õ✌ö✦ò ÷ ø✦ù õ❍ù ö✦ú û✌ôýüýþ❣ÿ ❁ø ✁ ✄✂ ☎✏û❣ù û✏õ✌û❁✆ù ✏☎ û✌ô❣ö❒ù û✏ô✞❴✝ ø▼ö✦ú
I, N, T, A | I, N,T, A ∈ Sailors ∧ T > 7 ∧
I, N, T, A | I, N,T, A ∈ Sailors ∧ T > 7 ∧
î î
✎■ ❃❅✺
❖ ∃ Ir, Br , D (. . .)
∃ Ir ( ∃ Br ( ∃ D (. . .) ))
❖ ✓✹✠✴✶➦✣♠✌❇r✲❋✹✠✺✂➷✣▲✣◆✸✹➓✴✶❊✯✲✯✷❃❅❈ ●Ü ✷ ✲✯✹➓✵✸✴✶✺✂✹✠❊✯✷ ✲✯✹✠❇r✹✠❇❉♠✌❃❅❊✸✷ ✺✂❃❅❑❅✷ ✲✯✹➓❇r♠✌❃❅✵✸✹➓❃❅■
■ ❈ ✹✠✺✂× ❇❉➦✣❈ ❊✯❡✿❈ ❊✯➥❉❧
❖ ✏Õ ✑ ❃❅❃❅❈ ✷❊✯✹➓❇❉✷ ●❋
✲✯✹➓◆✸❇r✹➓❃❅■
∃
✷ ❃❏■ ❈ ❊✯❡❣✴◗✷ ◆✸✵✸❑ ✹◗❈ ❊❋❝➐✹✠❇r✹✠✺✂▲✣✹✠❇❉✷ ✲✯✴✶✷
❜✿✲✯❈ ❇❉❄❆✴✶➤✪❑ ❃❅❃❅➨❥♠✌◆✸❄❆➦✣✹✠✺✂❇r❃✣❄❆✹✠✽✶➦✣◆✸✷✯●❋❈ ✷ ✲❋✴◗➥✣❃❅❃❅❡❣◆✸❇r✹✠✺
❈ ✷ ✲✯×❫✷ ✲✯✹➓ã❫✴✶❈ ❑ ❃❅✺✂❇❉✷ ◆✸✵✸❑ ✹➓◆✸❊✯❡✿✹✠✺✼♠✌❃❅❊✯❇r❈ ❡✿✹✠✺✂✴✶✷ ❈ ❃❅❊✳❧ ❖
❈ ❊✯✷ ✹✠✺✂■ ✴✶♠✌✹✠✽✶❈ ✷❁❈ ❇❉▲✣✹✠✺✂➤✪❈ ❊✯✷ ◆✸❈ ✷ ❈ ▲✣✹✸❧◗❛ ✕✔
✴▼❈ ✷❁■ ❃❅✺ ❤❀ ❞
✎ ✗✖
➠
✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ ð ✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ ✒
î
I, N, T, A | I, N,T, A ∈ Sailors ∧
î
I, N, T, A | I, N,T, A ∈ Sailors ∧
∀ B, BN,C ¬ B, BN,C ∈ Boats ∨
∀ B, BN, C ∈ Boats
∃ Ir, Br, D ∈ Re serves I = Ir ∧ Br = B
∃ Ir, Br, D Ir, Br, D ∈ Re serves ∧ I = Ir ∧ Br = B
❈ ❊✯❡❣✴✶❑ ❑❅❇r✴✶❈ ❑ ❃❅✺✂❇❉á✯❇r◆✸♠✌✲❋✷ ✲✯✴✶✷❁■ ❃❅✺✼✹✠✴✶♠✏✲ ✢✜✤✣ ✷ ◆✸✵✸❑ ✹ ã❫❈ ❄❆✵✸❑ ✹✠✺✼❊✯❃❅✷ ✴✶✷ ❈ ❃❅❊✯✽✶❇r✴▼❄❆✹➓➷✣◆✸✹✠✺✂➤❉❧◗❛ ◆✸♠✌✲❋♠✌❑ ✹✠✴✶✺✂✹✠✺ ❞
❖ ✛
✹✠❈ ✷ ✲✯✹✠✺✼❈ ✷❁❈ ❇❉❊✯❃❅✷❁✴◗✷ ◆✸✵✸❑ ✹➓❈ ❊❋❀✿❃❅✴✶✷ ❇♦❃❅✺✼✷ ✲✯✹✠✺✂✹➓❈ ❇❉✴◗✷ ◆✸✵✸❑ ✹➓❈ ❊
B, BN,C ❖ ç ✖
❜✿❃❏■ ❈ ❊✯❡❣❇r✴✶❈ ❑ ❃❅✺✂❇❉●❋✲✯❃❅× ▲✣✹➓✺✂✹✠❇r✹✠✺✂▲✣✹✠❡❣✴▼❑ ❑❅✺✂✹✠❡❣➦✣❃❅✴✶✷ ❇❅❖
❝➐✹✠❇r✹✠✺✂▲✣✹✠❇❉❇r✲✯❃❅●❋❈ ❊✯➥✪✷ ✲✯✴✶✷✯❇r✴✶❈ ❑ ❃❅✺✼á✯✲✯✴▼❇❉✺✂✹✠❇r✹✠✺✂▲✣✹✠❡❣❈ ✷✌❧ ❖
❆❆❆❆❆
C ≠ ’red ’ ∨ ∃ Ir, Br, D ∈ Re serves I = Ir ∧ Br = B
✘ ✥
✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ ✂✁ ✄ ✁ ☎ ✁ ✆ ✝✟✞✠✁ ✡ ✁ ☛ ✝ ☞✌✝ ✡ ✄ ✍ ✎ ✆ ✄ ✝ ☞✏✆ ✑ ✒✔✓ ✒✟✁ ☞✌✁ ✕ ✖ ✗ ✆ ✘ ✡ ✁ ✡ ✙
❖ ◆ ✷❁❈ ❇❉✵✸❃❅❇r❇r❈ ➦✣❑ ✹➓✷ ❃❏●❋✺✂❈ ✷ ✹➓❇r➤✣❊✯✷ ✴▼♠✌✷ ❈ ♠✌✴✶❑ ❑ ➤❥♠✌❃❅✺✂✺✂✹✠♠✌✷❁♠✌✴✶❑ ♠✏◆✸❑ ◆✸❇
➷✣◆✸✹✠✺✂❈ ✹✠❇❉✷ ✲✯✴✶✷❁✲✯✴✶▲✣✹➓✴✶❊❋❈ ❊✯■ ❈ ❊✯❈ ✷ ✹➓❊✯◆✸❄➧➦✣✹✠✺✼❃❅■✯✴✶❊✯❇r●❋✹✠✺✂❇ ❝➐✹✠❑ ✴✶✷ ❈ ❃❅❊✯✴✶❑❅♠✌✴✶❑ ♠✌◆✸❑ ◆✸❇❉❈ ❇❉❊✯❃❅❊ ❲✣ ❃❅✵✸✹✠✺✂✴✶✷ ❈ ❃❅❊✯✴✶❑ ✽✶✴✶❊✯❡
ã❫◆✸♠✌✲❋➷✣◆✸✹✠✺✂❈ ✹✠❇❉✴✶✺✂✹➓♠✌✴✶❑ ❑ ✹✠❡❣❘✣❭✣❵❲❳ ✉❴❯ ❧
✖ ❖
◆✸❇r✹✠✺✂❇❉❡✿✹✠■ ❈ ❊✯✹➓➷✣◆✸✹✠✺✂❈ ✹✠❇❉❈ ❊❋✷ ✹✠✺✂❄❆❇❉❃❅■✯●❋✲✯✴✶✷❁✷ ✲✯✹✠➤
●❋✴✶❊✯✷ ✽✶❊✯❃❅✷❁❈ ❊❋✷ ✹✠✺✂❄❆❇♦❃❅■✯✲✯❃❅●Ü✷ ❃✪♠✌❃✣❄➧✵✸◆✸✷ ✹➓❈ ✷✌❧
✈❤⑩❴❽ ❸✠❽ ❾
S | ¬ S ∈ Sailors
❛ ❦❤✹✠♠✌❑ ✴✶✺✂✴✶✷ ❈ ▲✣✹✠❊✯✹✠❇r❇❅❧ ❞
î
✷❁❈ ❇❉➨✣❊✯❃❅●❋❊❋✷ ✲✯✴✶✷❁✹✶▲✣✹✠✺✂➤✪➷✣◆✸✹✠✺✂➤✪✷ ✲✯✴✶✷✸♠✌✴✶❊❋➦✣✹➓✹ ✵✸✺✂✹✠❇r❇r✹✠❡ ➢❤❑ ➥✣✹✠➦✣✺✂✴◗✴✶❊✯❡❣❇r✴✶■ ✹➓♠✌✴✶❑ ♠✌◆✸❑ ◆✸❇❉✲✯✴▼▲✣✹➓❇r✴▼❄❆✹
❖ ◆❈ ❊❋✺✂✹✠❑ ✴✶✷ ❈ ❃❅❊✯✴✶❑❅✴✶❑ ➥✣✹✠➦✣✺✂✴◗♠✌✴✶❊❋➦✣✹➓✹ ✵✸✺✂✹✠❇r❇r✹✠❡❣✴✶❇❉✴◗➡ ❇r✴✶■ ✹ ❖
✹ ✵✸✺✂✹✠❇r❇r❈ ▲✣✹➓✵✸❃❅●❋✹✠✺✂✽✶❑ ✹✠✴✶❡✿❈ ❊✯➥✪✷ ❃❏✷ ✲✯✹◗❊✯❃❅✷ ❈ ❃❅❊❋❃❅■
➡
➷✣◆✸✹✠✺✂➤✪❈ ❊❋❦❤❝➐■ ✱ ❖➓❜✦❝➐✁ ✱ P▼✷ ✲✯✹➓♠✌❃❅❊✯▲✣✹✠✺✂❇✌✹➓❈ ❇❉✴✶❑ ❇r❃❏✷ ✺✂◆✸✹✸❧
➡ ✺✂✹✠❑ ✶
✴ ✷ ❈ ❃❅❊✯✴✶❑❅♠✌❃❅❄❆✵✸❑ ✹✠✷ ✹✠❊✯✹✠❇r❇❅❧
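A domain relational calculus query such as {<I, N, T, A> | <I, N, T, A> ∈ Sailors ∧ T > 7} reads almost directly as a set comprehension. The sketch below is only an illustration: the sample tuples and the names `sailors`, `reserves`, `rated_above_7` and `reserved_103` are assumptions, not part of the notes.

```python
# Hypothetical sample instances (assumed for illustration).
# Sailors tuples are (sid, sname, rating, age); Reserves tuples are (sid, bid, day).
sailors = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)}
reserves = {(22, 101, "10/10"), (58, 103, "11/12")}

# {<I,N,T,A> | <I,N,T,A> in Sailors and T > 7}
rated_above_7 = {(i, n, t, a) for (i, n, t, a) in sailors if t > 7}

# The existential quantifier over <Ir,Br,D> in Reserves becomes any(...):
# {<I,N,T,A> | ... and exists Ir,Br,D (<Ir,Br,D> in Reserves and Ir = I and Br = 103)}
reserved_103 = {
    (i, n, t, a)
    for (i, n, t, a) in sailors
    if t > 7 and any(ir == i and br == 103 for (ir, br, d) in reserves)
}
```

The comprehension states what tuples are wanted, not how to find them, which is the declarative flavour of the calculus.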
3. Find the names of sailors who have reserved boat number 103.
SELECT R.sid FROM Boats B, Reserves R WHERE B.bid = R.bid AND B.color
= 'red'
7. Find the names of sailors who have reserved at least one boat.
8. Find the names of sailors who have reserved a red or a green boat.
(Or)
9. Find the names of sailors who have reserved both a red and a green boat.
SELECT S.sname FROM Sailors S, Reserves R1, Boats B1, Reserves R2, Boats
B2 WHERE S.sid = R1.sid AND R1.bid = B1.bid AND S.sid = R2.sid AND
R2.bid = B2.bid AND B1.color = 'red' AND B2.color = 'green'
(or)
SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid
AND R.bid = B.bid AND B.color = 'red'
INTERSECT
SELECT S2.sname FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green'
.
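(Q 19) Find the sids of all sailors who have reserved red boats but not green
boats.
SELECT S.sid FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid AND
R.bid = B.bid AND B.color = 'red'
EXCEPT
SELECT S2.sid FROM Sailors S2, Reserves R2, Boats B2 WHERE S2.sid = R2.sid AND
R2.bid = B2.bid AND B2.color = 'green'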
Nested Queries
A nested query is a query that has another query embedded within it; the embedded
query is called a subquery. The embedded query can of course be a nested query
itself; thus queries that have very deeply nested structures are possible.
3. Find the names of sailors who have not reserved a red boat.
SELECT S.sname FROM Sailors S WHERE S.sid NOT IN
( SELECT R.sid FROM Reserves R WHERE R.bid IN
( SELECT B.bid FROM Boats B WHERE B.color = 'red' ))
Correlated Nested Queries
In a simple nested query the subquery does not depend on the outer query and is evaluated
only once. In a correlated nested query the subquery refers to columns of the outer query,
so it is conceptually evaluated once for each row of the relation in the main query.
Q. Find the names of sailors who have reserved boat number 103.
The EXISTS operator is another set comparison operator, like IN. It allows us to test whether
a set is nonempty, an implicit comparison with the empty set. Thus, for each Sailors row S, we
test whether the set of Reserves rows R such that R.bid = 103 AND S.sid = R.sid is nonempty.
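The EXISTS test described above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, assuming a cut-down Sailors/Reserves schema and made-up rows; the table contents and the variable `names` are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
INSERT INTO Reserves VALUES (22, 101), (22, 103), (58, 103);
""")

# Correlated subquery: the inner SELECT mentions S.sid from the outer query,
# so conceptually it is re-evaluated for every Sailors row S.
rows = conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R
                  WHERE R.bid = 103 AND R.sid = S.sid)
    ORDER BY S.sname
""").fetchall()
names = [r[0] for r in rows]
```

Replacing EXISTS with NOT EXISTS would return the sailors who have not reserved boat 103.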
Set-Comparison Operators
The set-comparison operators are EXISTS, IN, and UNIQUE, along with their negated versions.
SQL also supports op ANY and op ALL, where op is one of the arithmetic comparison operators
{<, <=, =, <>, >=, >}.
AGGREGATE OPERATORS
SQL supports five aggregate operations, which can be applied on any column, say A, of a
relation:
1. COUNT ([DISTINCT] A): The number of (unique) values in the A column.
2. SUM ([DISTINCT] A): The sum of all (unique) values in the A column.
3. AVG ([DISTINCT] A): The average of all (unique) values in the A column.
4. MAX (A): The maximum value in the A column.
5. MIN (A): The minimum value in the A column.
NULL VALUES
SQL provides a special column value called null to use in situations when the column value is
either unknown or inapplicable.
E.g.: Suppose the Sailors table definition was modified to include a maiden_name column.
However, only married women who take their husband's last name have a maiden name. For
women who do not take their husband's name and for men, the maiden_name column is
inapplicable.
SQL provides a special comparison operator IS NULL to test whether a column value is null.
We can disallow null values by specifying NOT NULL as part of the field definition; for
example, sname CHAR(20) NOT NULL. In addition, the fields in a primary key are not allowed
to take on null values. Thus, there is an implicit NOT NULL constraint for every field listed in a
PRIMARY KEY constraint.
JOINS
SID  Sname  CID  Cname  FEE
S1   A      C1   C      5k
S2   A      C1   C      5k
S1   A      C2   C++    10k
S3   B      C2   C++    10k
S3   B      C3   JAVA   15k
Primary Key(SID,CID)
Here all the data is stored in a single table, which causes redundancy and therefore anomalies:
Sname is repeated for every row with the same SID, and Cname and FEE are repeated for every
row with the same CID. Let us discuss the anomalies one by one.
1. Updation (modification) anomaly: If the fee of a course changes from 5000 to 7000,
the FEE column must be updated in every row for that course; otherwise the data
becomes inconsistent.
2. Insertion anomaly: Suppose a new course C4 is introduced but no student has taken it
yet. Because SID is part of the primary key, the course cannot be inserted without also
inserting some dummy student data.
3. Deletion anomaly: Deleting student S3 also deletes the only record of course C3.
Deleting some data forces the deletion of other useful data.
Insertion and deletion anomalies exist only because of redundancy; without it they do
not arise.
For example: Suppose we have a student table with attributes: Stu_Id, Stu_Name,
Stu_Age. Here Stu_Id attribute uniquely identifies the Stu_Name attribute of student
table because if we know the student id we can tell the student name associated with
it. This is known as functional dependency and can be written as Stu_Id->Stu_Name or
in words we can say Stu_Name is functionally dependent on Stu_Id.
Formally:
If column A of a table uniquely identifies column B of the same table, this can be
represented as A->B (attribute B is functionally dependent on attribute A).
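The definition can be checked mechanically on a table instance: A->B holds if no two rows agree on A but differ on B. The sketch below assumes rows stored as Python dicts; the helper `holds` and the sample student rows are hypothetical. Note that an instance can only refute a dependency, it cannot prove that the dependency holds for every future state of the table.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Made-up sample data for illustration.
students = [
    {"Stu_Id": 1, "Stu_Name": "Ann", "Stu_Age": 20},
    {"Stu_Id": 2, "Stu_Name": "Bob", "Stu_Age": 20},
    {"Stu_Id": 3, "Stu_Name": "Ann", "Stu_Age": 21},
]
id_determines_name = holds(students, ["Stu_Id"], ["Stu_Name"])
name_determines_id = holds(students, ["Stu_Name"], ["Stu_Id"])
```

Here Stu_Id->Stu_Name holds, while Stu_Name->Stu_Id fails because two students share the name Ann.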
For example: Consider a table with two columns Student_id and Student_Name.
Also, Student_Id -> Student_Id & Student_Name -> Student_Name are trivial dependencies too.
For example:
An employee table with three attributes: emp_id, emp_name, emp_address.
The following functional dependencies are non-trivial:
emp_id -> emp_name (emp_name is not a subset of emp_id)
emp_id -> emp_address (emp_address is not a subset of emp_id)
On the other hand, the following dependencies are trivial:
{emp_id, emp_name} -> emp_name [emp_name is a subset of {emp_id, emp_name}]
Multivalued dependency
Multivalued dependency occurs when there are more than one independent multivalued
attributes in a table.
For example: Consider a bike manufacture company, which produces two colors (Black and
white) in each model every year.
Here columns manuf_year and color are independent of each other and dependent on
bike_model. In this case these two columns are said to be multivalued dependent on bike_model.
These dependencies can be represented like this:
bike_model ->> manuf_year
bike_model ->> color
Transitive dependency
X -> Z is a transitive dependency if the following three functional dependencies hold true:
X->Y
Y does not -> X (Y does not determine X)
Y->Z
Note: A transitive dependency can only occur in a relation of three or more attributes. This
dependency helps us normalize the database into 3NF (Third Normal Form).
Inference Rules
Armstrong’s axioms are a set of axioms (or, more precisely, inference rules) used to
infer all the functional dependencies on a relational database. They were developed by
William W. Armstrong.
Let R(U) be a relation scheme over the set of attributes U. We will use the letters X, Y, Z
to represent any subset of U and, for short, write the union of two sets of attributes X and Y
as XY instead of the usual X ∪ Y.
Employee-Department
If:
{SSN} → {DNO}
{DNO} → {DName}
Then also:
{SSN} → {DName}
Union rule:
o if X → Y and X → Z then: X → YZ
ACD, ABD, ADE, ABDE, ACDB, ACDE, ACDBE. {From Previous Post Eg.}
Neglecting the last four keys as they can be trimmed down, so, checking
Hence no proper subset of these superkeys is able to determine all attributes of
R, so ACD, ABD and ADE are all minimal superkeys, i.e. candidate keys.
So, Super Keys will be B, AB, BC, BD, BE, BAC, BAD, BAE, BCD, BCE, BDE,
Taking the first one key, as all other keys can be trimmed down -
(Φ)+ = {Φ}
⇒ Φ → Φ
⇒ 1 FD
(A)+ = {ABC}
⇒ A → Φ, A → A, A → B, A → C,
A → AB, A → BC, A → AC, A → ABC
⇒ 8 FDs = 2^3
(B)+ = {BC}
⇒ B → Φ, B → B, B → C, B → BC
⇒ 4 FDs = 2^2
(C)+ = {C}
⇒ C → Φ, C → C
⇒ 2 FDs = 2^1
(AB)+ = {ABC}
⇒ AB → Φ, AB → A, AB → B, AB → C,
AB → AB, AB → BC, AB → AC, AB → ABC
⇒ 8 FDs = 2^3
(BC)+ = {BC}
⇒ BC → Φ, BC → B, BC → C, BC → BC
⇒ 4 FDs = 2^2
(AC)+ = {ABC}
⇒ AC → Φ, AC → A, AC → B, AC → C,
AC → AB, AC → BC, AC → AC, AC → ABC
⇒ 8 FDs = 2^3
(ABC)+ = {ABC}
⇒ 8 FDs = 2^3
F+ = {
B → Φ, B → B, B → C, B → BC, C → Φ, C → C, AB → Φ, AB → A, AB → B,
(Φ)+ = {Φ} ⇒ 1
Total = 13
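The closures used in these counts can be computed with the standard attribute-closure loop. The sketch below is an illustration under an assumption: the FD set {A → B, B → C} is not stated in the notes but is consistent with the closures (A)+ = {ABC}, (B)+ = {BC} and (C)+ = {C} listed above. The names `closure` and `fds` are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything in lhs is already known, rhs follows as well.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Assumed FD set, chosen to match the closures listed in the notes.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
a_plus = closure({"A"}, fds)
bc_plus = closure({"B", "C"}, fds)
# X -> Y holds for every subset Y of X+, so X contributes 2**len(X+) FDs.
fd_count_for_a = 2 ** len(a_plus)
```

A set of attributes X is a superkey exactly when `closure(X, fds)` equals the full attribute set of the relation.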
First normal form
First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal
form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value
of each attribute contains only a single value from that domain.
Designs that Violate 1NF-Below is a table that stores the names and telephone numbers of
customers. One requirement though is to retain multiple telephone numbers for some customers.
The simplest way of satisfying this requirement is to allow the "Telephone Number" column in any
given row to contain more than one value:
Customer ID | First Name | Surname | Telephone Number
Designs that Comply with 1NF-To bring the model into the first normal form, we split the
strings we used to hold our telephone number information into "atomic" (i.e. indivisible)
entities: single phone numbers. And we ensure no row contains more than one phone
number.
Customer
What is a transaction
Desirable Properties of ACID Transactions
Recovery
System failures, either hardware or software, must not result
in an inconsistent database
Transaction as a Recovery Unit
If an error or hardware/software crash occurs between the begin and
end, the database will be inconsistent
Computer Failure (system crash)
A transaction or system error
Local errors or exception conditions detected by the transaction
Concurrency control enforcement
Disk failure
Physical problems and catastrophes
The database is restored to some state from the past so that a correct
state—close to the time of failure—can be reconstructed from the past
state.
A DBMS ensures that if a transaction executes some updates and then a
failure occurs before the transaction reaches normal termination, then
those updates are undone.
The statements COMMIT and ROLLBACK (or their equivalent) ensure transaction atomicity.
Recovery
Mirroring
keep two copies of the database and maintain them simultaneously
Backup
periodically dump the complete state of the database to some form of
tertiary storage
System Logging
the log keeps track of all transaction operations affecting the values of
database items. The log is kept on disk so that it is not affected by
failures except for disk and catastrophic failures.
Recovery from Transaction Failures
Catastrophic failure
Restore a previous copy of the database from archival backup
Apply transaction log to copy to reconstruct more current state
by redoing committed transaction operations up to failure point
Incremental dump + log each transaction
Non-catastrophic failure
Reverse the changes that caused the inconsistency by undoing
the operations and possibly redoing legitimate changes which
were lost
The entries kept in the system log are consulted during
recovery.
No need to use the complete archival copy of the database.
Transaction States
For recovery purposes the system needs to keep track of when a
transaction starts, terminates and commits.
Begin_Transaction: marks the beginning of a transaction execution;
End_Transaction: specifies that the read and write operations have ended and
marks the end limit of transaction execution (but may be aborted because of
concurrency control);
Commit_Transaction: signals a successful end of the transaction. Any updates
executed by the transaction can be safely committed to the database and will not
be undone;
Rollback (or Abort): signals that the transaction has ended unsuccessfully. Any
changes that the transaction may have applied to the database must be undone;
Undo: similar to ROLLBACK but it applies to a single operation rather than to a
whole transaction;
Redo: specifies that certain transaction operations must be redone to ensure
that all the operations of a committed transaction have been applied successfully
to the database;
Entries in the System Log
For every transaction a unique transaction-id is generated by the system.
[start_transaction, transaction-id]: the start of execution of the transaction
identified by transaction-id
Credit_labmark (sno NUMBER, cno CHAR, credit NUMBER)
old_mark NUMBER;
new_mark NUMBER;
END credit_labmark;
Transaction execution
A transaction reaches its commit point when all
operations accessing the database are completed
and the result has been recorded in the log. It then
writes a [commit, transaction-id].
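[Figure: transaction state diagram. BEGIN TRANSACTION enters the active state (READ, WRITE);
END TRANSACTION leads to partially committed; COMMIT leads to committed; ROLLBACK from active
or partially committed leads to failed; committed and failed transactions end as terminated.]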
If a system failure occurs, search the log and roll back the transactions that
have written into the log a
[start_transaction, transaction-id]
[write_item, transaction-id, X, old_value, new_value]
but have not recorded into the log a [commit, transaction-id]
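The rule above can be sketched in a few lines: scan the log for transactions with a start entry but no commit entry, then restore the old values of their writes in reverse log order. The tuple layout of the log entries follows the notes; the function names, the sample log and the dict standing in for the database are assumptions made for illustration.

```python
def transactions_to_undo(log):
    """Return ids of transactions that started but never committed."""
    started, committed = [], set()
    for entry in log:
        kind, tid = entry[0], entry[1]
        if kind == "start_transaction" and tid not in started:
            started.append(tid)
        elif kind == "commit":
            committed.add(tid)
    return [tid for tid in started if tid not in committed]

def undo(log, db):
    """Restore old values of uncommitted writes, scanning the log backwards."""
    losers = set(transactions_to_undo(log))
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] in losers:
            _, _, item, old_value, _ = entry
            db[item] = old_value
    return db

# Hypothetical log: T1 committed, T2 was still active at the crash.
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 80),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 70),
]
db = undo(log, {"X": 80, "Y": 70})
```

T1's update to X survives because T1 committed, while T2's update to Y is rolled back to its old value.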
Read and Write Operations of a Transaction
Specify read or write operations on the database items that are executed
as part of a transaction
read_item(X):
reads a database item named X into a program variable also named X.
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the buffer to the program variable named X
write_item(X):
writes the value of program variable X into the database item named X.
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the program variable named X into its current location
in the buffer
4. store the updated block in the buffer back to disk (this step
updates the database on disk)
Write Ahead Logging
“In place” updating protocols: Overwriting data in situ
Deferred Update:
no actual update of the database until after a transaction reaches its commit point
1. Updates recorded in log
2. Transaction commit point
3. Force log to the disk
4. Update the database
FAILURE! REDO database from log entries
No UNDO necessary because database never altered
Immediate Update:
the database may be updated by some operations of a transaction before it reaches its
commit point.
1. Update X recorded in log
2. Update X in database
3. Update Y recorded in log
4. Transaction commit point
5. Force log to the disk
6. Update Y in database
FAILURE! before the commit point: UNDO X
FAILURE! after the commit point: REDO Y
• Undo in reverse order in log
• Redo in committed log order
• uses the write_item log entry
Net result
Account A 800
Account B 500
Account C 400
Transaction scheduling algorithms
Transaction Serialisability
The effect on a database of any number of transactions
executing in parallel must be the same as if they were
executed one after another
item X has incorrect value because its update from T1 is “lost” (overwritten)
T2 reads the value of X before T1 changes it in the database and hence the
updated database value resulting from T1 is lost
The Incorrect Summary or Unrepeatable Read Problem
One transaction is calculating an aggregate summary function on a
number of records while other transactions are updating some of these
records.
The aggregate function may calculate some values before they are
updated and others after.
transaction T1 fails and must change the value of X back to its old value
meanwhile T2 has read the “temporary” incorrect value of X
Schedules of Transactions
Example of Serial Schedules
Schedule A (T1 followed by T2)
T1:               T2:
read_item(X);
X:= X - N;
write_item(X);
read_item(Y);
Y:=Y + N;
write_item(Y);
                  read_item(X);
                  X:= X + M;
                  write_item(X);
Schedule B (T2 followed by T1)
T1:               T2:
                  read_item(X);
                  X:= X + M;
                  write_item(X);
read_item(X);
X:= X - N;
write_item(X);
read_item(Y);
Y:=Y + N;
write_item(Y);
Schedule C •Schedule D
Precedence graphs (assuming read X before write X)
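A precedence graph can be built mechanically: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj (same item, different transactions, at least one write). The schedule is conflict serialisable if the graph has no cycle. The sketch below is a minimal illustration; the tuple encoding of operations and the names `precedence_edges`, `has_cycle`, `serial_a` and `interleaved` are assumptions.

```python
def precedence_edges(schedule):
    """schedule: list of (txn, op, item) with op in {'r', 'w'}."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Operations conflict if they come from different transactions,
            # touch the same item, and at least one of them is a write.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reachable(src, dst, seen):
        for nxt in graph.get(src, ()):
            if nxt == dst or (nxt not in seen and reachable(nxt, dst, seen | {nxt})):
                return True
        return False

    return any(reachable(n, n, {n}) for n in graph)

# Serial schedule: all of T1, then all of T2.
serial_a = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"), ("T1", "w", "Y"),
            ("T2", "r", "X"), ("T2", "w", "X")]
# Interleaving that produces the lost update on X.
interleaved = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"), ("T2", "w", "X")]
```

The serial schedule yields the single edge T1 → T2, while the interleaved one yields both T1 → T2 and T2 → T1, so it is not conflict serialisable.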
Semantic Serialisability
Some applications can produce schedules that are
correct but aren’t conflict or view serialisable.
e.g. Debit/Credit transactions (Addition and
subtraction are commutative)
T1                T2
read_item(X);     read_item(Y);
X:=X-10;          Y:=Y-20;
write_item(X);    write_item(Y);
read_item(Y);     read_item(Z);
Y:=Y+10;          Z:=Z+20;
write_item(Y);    write_item(Z);
Schedule
T1                T2
read_item(X);
X:=X-10;
write_item(X);
                  read_item(Y);
                  Y:=Y-20;
                  write_item(Y);
read_item(Y);
Y:=Y+10;
write_item(Y);
Locking Techniques for Concurrency Control
The concept of locking data items is one of the main
techniques used for controlling the concurrent
execution of transactions.
A lock is a variable associated with a data item in the
database. Generally there is a lock for each data item
in the database.
A lock describes the status of the data item with
respect to possible operations that can be applied to
that item. It is used for synchronising the access by
concurrent transactions to the database items.
A transaction locks an object before using it
When an object is locked by another transaction, the
requesting transaction must wait
Types of Locks
Binary locks have two possible states:
1. locked (lock_item(X) operation) and
2. unlocked (unlock_item(X) operation)
Multiple-mode locks allow concurrent access to the
same item by several transactions. Three possible
states:
1. read locked or shared locked (other transactions are allowed
to read the item)
2. write locked or exclusive locked (a single transaction
exclusively holds the lock on the item) and
3. unlocked.
Locks are held in a lock table.
upgrade lock: read lock to write lock
downgrade lock: write lock to read lock
Locks don’t guarantee serialisability: Lost Update
T1: (joe)         T2: (fred)        X     Y
write_lock(X)
read_item(X);                       4
X:= X - N;                          2
unlock(X)
                  write_lock(X)
                  read_item(X);     4
                  X:= X + M;        7
                  unlock(X)
write_lock(X)
write_item(X);                      2
unlock(X)
write_lock(Y)
read_item(Y);                             8
                  write_lock(X)
                  write_item(X);    7
                  unlock(X)
Y:= Y + N;                                10
write_item(Y);                            10
unlock(Y)
X=20, Y=30
T1 T2
read_lock(Y); read_lock(X);
read_item(Y); read_item(X);
unlock(Y); unlock(X);
write_lock(X); write_lock(Y);
read_item(X); read_item(Y);
X:=X+Y; Y:=X+Y;
write_item(X); write_item(Y);
unlock(X); unlock(Y);
X is unlocked too early
Y is unlocked too early
Non-serialisable schedule S that uses locks
Initial values: X=20, Y=30
T1                T2
read_lock(Y);
read_item(Y);
unlock(Y);
                  read_lock(X);
                  read_item(X);
                  unlock(X);
                  write_lock(Y);
                  read_item(Y);
                  Y:=X+Y;
                  write_item(Y);
                  unlock(Y);
write_lock(X);
read_item(X);
X:=X+Y;
write_item(X);
unlock(X);
T1 T2
read_lock(Y); read_lock(X);
read_item(Y); read_item(X);
write_lock(X); write_lock(Y);
unlock(Y); unlock(X);
read_item(X); read_item(Y);
X:=X+Y; Y:=X+Y;
write_item(X); write_item(Y);
unlock(X); unlock(Y);
Two-Phase Locking
Basic 2PL
When a transaction releases a lock, it may not request another lock
[Figure: number of locks held against time, from BEGIN to END. Locks are obtained during
Phase 1 up to the lock point and released during Phase 2.]
Strict 2PL
a transaction does not release any of its locks until after it commits or aborts
leads to a strict schedule for recovery
[Figure: locks are obtained between BEGIN and END during the period of data item use and
are held for the duration of the transaction.]
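The basic 2PL rule (no lock request after the first release) is easy to check for a single transaction. The sketch below is illustrative only: the function `is_two_phase` is hypothetical and operations are encoded as plain strings, using the two versions of T1 that appear in the locking examples of these notes.

```python
def is_two_phase(ops):
    """ops: one transaction's operations in order, e.g. 'read_lock(Y)'.

    Returns True if no lock is requested after the first unlock.
    """
    unlocked = False
    for op in ops:
        if op.startswith("unlock"):
            unlocked = True  # the shrinking phase has begun
        elif op.startswith(("read_lock", "write_lock")) and unlocked:
            return False     # acquiring a lock after a release violates 2PL
    return True

# T1 as written in the non-serialisable schedule: Y is unlocked too early.
t1_early_unlock = ["read_lock(Y)", "read_item(Y)", "unlock(Y)",
                   "write_lock(X)", "read_item(X)", "write_item(X)", "unlock(X)"]
# T1 rewritten so that all locks are obtained before any is released.
t1_two_phase = ["read_lock(Y)", "read_item(Y)", "write_lock(X)", "unlock(Y)",
                "read_item(X)", "write_item(X)", "unlock(X)"]
```

If every transaction in a schedule passes this check, the schedule is guaranteed to be conflict serialisable, although deadlock is still possible.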
Locking Problems: Deadlock
Each of two or more transactions is waiting for the other to
release an item. Also called a deadly embrace
T1                T2
read_lock(Y);
read_item(Y);
                  read_lock(X);
                  read_item(X);
write_lock(X);
                  write_lock(Y);
cautious waiting
time outs
Locking Granularity
A database item could be
a database record
a field value of a database record
a disk block
the whole database
Trade-offs
coarse granularity
the larger the data item size, the lower the degree of
concurrency
fine granularity
the smaller the data item size, the more locks to be
managed and stored, and the more lock/unlock
operations needed.
Recovery: Shadow Paging Technique
To recover from a failure
the state of the database before transaction execution is available
through the shadow page table
free modified pages
discard current page table
that state is recovered by reinstating the shadow page table
to become the current page table once more
Committing a transaction
Garbage collection
[Figure: database data pages (blocks) page1 to page6, with old copies page2(old) and
page5(old); the current page table (after updating pages 2, 6) and the shadow page table
(not updated), each with entries 1 to 6.]
1. read phase: read from the database, but updates are applied only to
local copies
2. validation phase: check to ensure serialisability will not be violated if
the transaction updates are actually applied to the database
3. write phase: if validation is successful, transaction updates applied to
database; otherwise updates are discarded and transaction is aborted
and restarted.
Validation Phase
Use transaction timestamps
write_sets and read_sets maintained
Transaction B is committed or in its validation phase
Validation Phase for Transaction A
To check that TransA does not interfere with TransB, one of the
following must hold:
1. TransB completes its write phase before TransA starts its read
phase
2. TransA starts its write phase after TransB completes its write phase,
and the read set of TransA has no items in common with the write
set of TransB
3. Both the read set and the write set of TransA have no items in
common with the write set of TransB, and TransB completes its read
phase before TransA completes its read phase.
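The validation test can be written as a small predicate. This sketch treats the three conditions as alternatives, as in the usual formulation of optimistic validation, and it assumes each transaction records integer timestamps for the start and end of its read and write phases. The class `Txn`, its field names and the sample transactions are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)
    read_start: int = 0   # assumed timestamps marking phase boundaries
    read_end: int = 0
    write_start: int = 0
    write_end: int = 0

def validates(a, b):
    """True if transaction a passes validation against transaction b."""
    # 1. b finished writing before a started reading.
    if b.write_end < a.read_start:
        return True
    # 2. a writes only after b has written, and a read nothing that b wrote.
    if b.write_end < a.write_start and not (a.read_set & b.write_set):
        return True
    # 3. a neither read nor wrote anything b wrote, and b finished reading first.
    if (not (a.read_set & b.write_set) and not (a.write_set & b.write_set)
            and b.read_end < a.read_end):
        return True
    return False

trans_b = Txn(write_set={"X"}, read_start=1, read_end=2, write_start=3, write_end=4)
disjoint_a = Txn(read_set={"Y"}, write_set={"Y"},
                 read_start=2, read_end=5, write_start=6, write_end=7)
overlapping_a = Txn(read_set={"X"}, read_start=2, read_end=5, write_start=6, write_end=7)
```

`disjoint_a` passes through the second condition; `overlapping_a` read an item that `trans_b` wrote while the two overlapped, so it fails and would be aborted and restarted.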
Conclusions
Transaction management deals with two key
requirements of any database system:
Resilience
the ability of data to survive hardware crashes and
software errors without sustaining loss or becoming
inconsistent
Access Control
the ability to permit simultaneous access to data by multiple
users in a consistent manner while ensuring only authorised
access
Storage Structure
Related data and information is stored collectively in file formats. A file is a
sequence of records stored in binary format. A disk drive is formatted into
several blocks that can store records. File records are mapped onto those
disk blocks.
File Organization
File Organization defines how file records are mapped onto disk blocks. We
have four types of File Organization to organize file records −
Indexing
We know that data is stored in the form of records. Every record has a key
field, which helps it to be recognized uniquely.
Primary Index − Primary index is defined on an ordered data file. The data file
is ordered on a key field. The key field is generally the primary key of the
relation.
Dense Index
Sparse Index
Dense Index
In dense index, there is an index record for every search key value in the
database. This makes searching faster but requires more space to store
index records itself. Index records contain search key value and a pointer to
the actual record on the disk.
Sparse Index
In sparse index, index records are not created for every search key. An
index record here contains a search key and an actual pointer to the data
on the disk. To search a record, we first proceed by index record and reach
at the actual location of the data. If the data we are looking for is not where
we directly reach by following the index, then the system starts sequential
search until the desired data is found.
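A sparse-index lookup can be sketched as a binary search over the index followed by a sequential scan of one block. The example assumes a sorted data file split into blocks with one index entry per block (the first key of the block); the data, `sparse_index` and `lookup` are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical data file: sorted blocks of (key, record) pairs.
blocks = [
    [(5, "a"), (8, "b"), (12, "c")],
    [(15, "d"), (21, "e"), (27, "f")],
    [(30, "g"), (34, "h")],
]
# Sparse index: one entry per block, holding that block's first key.
sparse_index = [block[0][0] for block in blocks]

def lookup(key):
    # Find the last index entry whose key is <= the search key.
    pos = bisect_right(sparse_index, key) - 1
    if pos < 0:
        return None
    # Scan sequentially inside the block the index entry points to.
    for k, record in blocks[pos]:
        if k == key:
            return record
    return None
```

A dense index would instead hold one entry for every key, removing the sequential scan at the cost of a larger index.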
B+ Tree
A B+ tree is a balanced multiway search tree (not a binary tree: each node can hold
many keys and pointers) that follows a multi-level index format. The leaf nodes of a
B+ tree hold the actual data pointers. A B+ tree ensures that all leaf nodes remain at
the same height, and is thus balanced. Additionally, the leaf nodes are linked using a
linked list; therefore, a B+ tree can support random access as well as sequential access.
Structure of B+ Tree
Every leaf node is at equal distance from the root node. A B + tree is of the
order n where n is fixed for every B+ tree.
Internal nodes −
Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
Leaf nodes −
Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
At most, a leaf node can contain n record pointers and n key values.
Every leaf node contains one block pointer P to point to next leaf node and forms
a linked list.
B+ Tree Insertion
B+ trees are filled from bottom and each entry is done at the leaf node.
o Partition at i = ⌊(m+1)/2⌋.
B+ Tree Deletion
B+ tree entries are deleted at the leaf nodes.
o If it is an internal node, delete and replace with the entry from the left
position.
o If underflow occurs, distribute the entries from the nodes left to it.
Hash Function − A hash function, h, is a mapping function that maps all the
set of search-keys K to the address where actual records are placed. It is a
function from search keys to bucket addresses.
Static Hashing
In static hashing, when a search-key value is provided, the hash function
always computes the same address. For example, if a mod-4 hash function is
used, then it generates only 4 values (0 to 3). The output address is always
the same for a given key. The number of buckets provided remains
unchanged at all times.
Operation
Insertion − When a record is required to be entered using static hash, the hash
function h computes the bucket address for search key K, where the record will
be stored.
Search − When a record needs to be retrieved, the same hash function can be
used to retrieve the address of the bucket where the data is stored.
Bucket Overflow
The condition of bucket-overflow is known as collision. This is a fatal state
for any static hash function. In this case, overflow chaining can be used.
Overflow Chaining − When buckets are full, a new bucket is allocated for the
same hash result and is linked after the previous one. This mechanism is
called Closed Hashing.
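Static hashing with overflow chaining can be sketched as a fixed array of buckets, each a chain of fixed-capacity pages. The class below is an illustration, not a storage engine: `StaticHashFile`, the bucket capacity of 2 and the integer keys are assumptions, and the hash function is the mod-N function used as the example in the text.

```python
class StaticHashFile:
    def __init__(self, n_buckets=4, bucket_capacity=2):
        self.n = n_buckets              # fixed for the lifetime of the file
        self.capacity = bucket_capacity
        # Each bucket is a chain of pages; page 0 is the primary page.
        self.buckets = [[[]] for _ in range(n_buckets)]

    def _address(self, key):
        return key % self.n             # the mod-N hash function

    def insert(self, key, record):
        chain = self.buckets[self._address(key)]
        if len(chain[-1]) >= self.capacity:
            chain.append([])            # bucket overflow: link a new overflow page
        chain[-1].append((key, record))

    def search(self, key):
        # The same hash function locates the bucket; then walk its chain.
        for page in self.buckets[self._address(key)]:
            for k, record in page:
                if k == key:
                    return record
        return None

f = StaticHashFile()
for k in (1, 5, 9, 13, 2):
    f.insert(k, f"rec{k}")
```

Keys 1, 5, 9 and 13 all hash to bucket 1, so that bucket grows an overflow page; a search for one of them has to walk the chain, which is why long chains degrade performance.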
Operation
Querying − Look at the depth value of the hash index and use those bits to
compute the bucket address.
Deletion − Perform a query to locate the desired data and delete the same.
o Else
o If all the buckets are full, perform the remedies of static hashing.
Hashing is not favorable when the data is organized in some ordering and
the queries require a range of data. When data is discrete and random,
hash performs the best.
Hashing algorithms are more complex than indexing, but hash operations
are done in constant time on average.
B - Trees
In a binary search tree, AVL Tree, Red-Black tree etc., every node can have
only one value (key) and maximum of two children but there is another type
of search tree called B-Tree in which a node can store more than one value
(key) and it can have more than two children. B-Tree was developed in the
year of 1972 by Bayer and McCreight with the name Height Balanced m-
way Search Tree. Later it was named as B-Tree.
Here, the number of keys in a node and the number of children of a node depend
on the order of the B-Tree. Every B-Tree has an order.
Example
Operations on a B-Tree
The following operations are performed on a B-Tree...
1. Search
2. Insertion
3. Deletion
Example
Construct a B-Tree of Order 3 by inserting numbers from 1 to 10.
Overview of Storage and Indexing
Data on External Storage
Disks: Can retrieve random page at fixed cost
But reading several consecutive pages is much cheaper than
reading them in random order
Tapes: Can only read pages in sequence
Cheaper than disks; used for archival storage
File organization: Method of arranging a file of records
on external storage.
Record id (rid) is sufficient to physically locate record
Indexes are data structures that allow us to find the record ids of
records with given values in index search key fields
Architecture: Buffer manager stages pages from external
storage to main memory buffer pool. File and index
layers make calls to the buffer manager. Page: typically
4 Kbytes.
Alternative File Organizations
Many alternatives exist, each ideal for some
situations, and not so good in others:
Heap (random order) files: Suitable when typical
access is a file scan retrieving all records.
Sorted Files: Best if records must be retrieved in
some order, or only a 'range' of records is needed.
Indexes: Data structures to organize records via
trees or hashing.
• Like sorted files, they speed up searches for a subset of
records, based on values in certain (“search key”) fields
• Updates are much faster than in sorted files.
Indexes
Index Classification
Primary vs. secondary: If search key contains
primary key, then called primary index.
Unique index: Search key contains a candidate key.
Clustered vs. unclustered: If order of data records
is the same as order of data entries, then called
clustered index.
A file can be clustered on at most one search key.
Cost of retrieving data records through index varies
greatly based on whether index is clustered or not!
Index Classification
Dense vs Sparse: If there is an entry in the index
for each key value -> dense index (unclustered
indices are dense). If there is an entry for each
page -> sparse index.
[Figure: index entries over a data file with records Brown, Chen, Peterson, Rhodes, Smith, Yu, White]
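The difference between a dense and a sparse index can be sketched as follows. The page layout (two records per page, names in sorted order) and the function name are assumptions made for illustration.

```python
import bisect

# Hypothetical sorted data file: each page holds up to two records (names).
pages = [["Brown", "Chen"], ["Peterson", "Rhodes"], ["Smith", "White"], ["Yu"]]

# Dense index: one entry per key value -> (page number, slot within the page).
dense = {name: (p, s) for p, page in enumerate(pages) for s, name in enumerate(page)}

# Sparse index: one entry per page, holding the first key on that page.
sparse_keys = [page[0] for page in pages]

def sparse_lookup(name):
    """Find the last page whose first key is <= name, then scan that page."""
    p = bisect.bisect_right(sparse_keys, name) - 1
    if p >= 0 and name in pages[p]:
        return p, pages[p].index(name)
    return None

print(len(dense), len(sparse_keys))   # 7 4
print(dense["Rhodes"])                # (1, 1)
print(sparse_lookup("Rhodes"))        # (1, 1)
```

The sparse index is smaller (one entry per page), but it only works because the data file itself is sorted on the search key.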
Clustered vs. Unclustered Index
To build clustered index, first sort the Heap file (with
some free space on each page for future inserts).
Overflow pages may be needed for inserts. (Thus, order of
data records is 'close to', but not identical to, the sort order.)
[Figure: clustered vs. unclustered index; index entries direct the search for data entries]
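Why the retrieval cost differs so much can be illustrated with a small simulation. Everything here is an assumption for the sake of the sketch: 100 records, 4 records per page, no buffering, and a random placement standing in for an unclustered file.

```python
import random

R = 4                                  # records per page (assumed)
keys = list(range(100))                # 100 records with keys 0..99

def page_fetches(layout, lo, hi):
    """Count page reads when fetching keys lo..hi in key order.

    layout[k] is the page that holds key k. A page is read again whenever
    the next record lives on a different page than the previous one,
    i.e. no buffering is assumed.
    """
    fetches, last = 0, None
    for k in range(lo, hi + 1):
        if layout[k] != last:
            fetches += 1
            last = layout[k]
    return fetches

# Clustered: data records are stored in key order, R per page.
clustered = {k: k // R for k in keys}

# Unclustered: the same records, placed on pages in random order.
rng = random.Random(0)
shuffled = keys[:]
rng.shuffle(shuffled)
unclustered = {k: pos // R for pos, k in enumerate(shuffled)}

print(page_fetches(clustered, 20, 39))    # 5: keys 20..39 fill pages 5..9
print(page_fetches(unclustered, 20, 39))  # up to 20: roughly one read per record
```

With clustering, the 20 matching records sit on 5 consecutive pages. Without it, almost every matching record can cost its own page read.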
Static Hashing
# primary pages fixed, allocated sequentially, never de-allocated;
overflow pages if needed.
h(k) mod N = bucket to which data entry with key k belongs. (N =
# of buckets)
Long overflow chains can develop and degrade performance.
Extendible and Linear Hashing: Dynamic techniques to fix this.
[Figure: h(key) mod N maps a key to one of the primary bucket pages 0 ... N-1; full primary pages chain to overflow pages]
Static Hashing (Contd.)
Buckets contain data entries.
Hash function works on the search key field of record r. Must
distribute values over the range 0 ... N-1 (N = # of buckets).
h(key) = (a * key + b) usually works well.
a and b are constants; lots known about how to tune h.
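A minimal sketch of static hashing with overflow pages follows. The constants A and B, the page capacity and the inserted keys are all hypothetical, chosen so that a skewed key set produces a long overflow chain.

```python
N = 4            # number of primary bucket pages (fixed)
CAPACITY = 2     # data entries per page (assumed, tiny for illustration)
A, B = 31, 7     # hypothetical constants for h(key) = A * key + B

def h(key):
    return A * key + B

# Each bucket is a chain of pages: the first page is the primary page,
# any further pages are overflow pages.
buckets = [[[]] for _ in range(N)]

def insert(key):
    chain = buckets[h(key) % N]
    if len(chain[-1]) == CAPACITY:     # last page is full: add an overflow page
        chain.append([])
    chain[-1].append(key)

def search(key):
    """Return the number of pages read to find key, or None if absent."""
    for pages_read, page in enumerate(buckets[h(key) % N], start=1):
        if key in page:
            return pages_read
    return None

# Multiples of 4 all land in the same bucket with these constants.
for k in [4, 8, 12, 16, 20, 1, 2, 3]:
    insert(k)

print(buckets[3])    # [[4, 8], [12, 16], [20]]
print(search(20))    # 3 (primary page plus two overflow pages)
print(search(1))     # 1
```

Because the number of primary pages never changes, the only way to absorb the skew is a growing overflow chain, which is the performance problem that extendible and linear hashing address.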
Cost Model for Our Analysis
We ignore CPU costs, for simplicity:
B: The number of data pages
R: Number of records per page
D: (Average) time to read or write disk page
Measuring number of page I/O’s ignores gains of
pre-fetching a sequence of pages; thus, even I/O
cost is only approximated.
Average-case analysis; based on several simplistic
assumptions.
Good enough to show the overall trends!
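As an illustration of how B and D are used, the sketch below computes three commonly quoted estimates: a full scan of a heap file, an equality search on a heap file, and an equality search on a sorted file. The formulas are standard textbook approximations under this cost model, not values taken from these notes, and the sample numbers are made up.

```python
import math

def heap_scan(B, D):
    """Full scan of a heap file: read every data page once."""
    return B * D

def heap_equality(B, D):
    """Equality search on a heap file, assuming exactly one match:
    on average half of the pages are read before it is found."""
    return 0.5 * B * D

def sorted_equality(B, D):
    """Equality search on a sorted file: binary search over the pages."""
    return D * math.log2(B)

# Hypothetical file: 1024 pages, 100 records per page, 15 ms per page I/O.
B, R, D = 1024, 100, 0.015
print(B * R, "records")
print(round(heap_scan(B, D), 2), "s for a full scan")
print(round(heap_equality(B, D), 2), "s for a heap equality search")
print(round(sorted_equality(B, D), 2), "s for a sorted-file equality search")
```

Even with these rough numbers, the trend is visible: the sorted file answers an equality search in about 10 page reads, the heap file in about 512.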
Comparing File Organizations
Choice of Indexes
What indexes should we create?
One approach: Consider the most important queries
in turn. Consider the best plan using the current
indexes, and see if a better plan is possible with an
additional index. If so, create it.
Obviously, this implies that we must understand how a
DBMS evaluates queries and creates query evaluation plans!
For now, we discuss simple 1-table queries.
Before creating an index, must also consider the
impact on updates in the workload!
Trade-off: Indexes can make queries go faster, updates
slower. Require disk space, too.
Index Selection Guidelines
Attributes in WHERE clause are candidates for index keys.
Exact match condition suggests hash index.
Range query suggests tree index.
• Clustering is especially useful for range queries; can also help on
equality queries if there are many duplicates.
Multi-attribute search keys should be considered when a
WHERE clause contains several conditions.
Try to choose indexes that benefit as many queries as
possible. Since only one index can be clustered per relation,
choose it based on important queries that would benefit the
most from clustering.
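The "consider a query, then see if an additional index gives a better plan" approach can be tried out with SQLite through Python's built-in sqlite3 module. This is only a sketch: the emp table, its columns and the index name are hypothetical, and SQLite offers B-tree indexes only, so it does not show the hash vs. tree choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (eid INTEGER PRIMARY KEY, dno INTEGER, age INTEGER, sal REAL)"
)
conn.executemany(
    "INSERT INTO emp (dno, age, sal) VALUES (?, ?, ?)",
    [(i % 10, 20 + i % 40, 1000.0 + i) for i in range(1000)],
)

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column is the detail text

# One important query: equality on dno, range on age.
query = "SELECT eid FROM emp WHERE dno = 5 AND age > 40"

before = plan(query)                             # plan with the current indexes
conn.execute("CREATE INDEX idx_emp_dno_age ON emp (dno, age)")
after = plan(query)                              # plan with the additional index

print(before)
print(after)
```

The exact wording of the plan depends on the SQLite version, but the second plan should mention idx_emp_dno_age while the first one describes a scan of the whole table.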
Examples of Clustered Indexes
Summary
Data entries can be actual data records, <key,
rid> pairs, or <key, rid-list> pairs.
Choice orthogonal to indexing technique used to
locate data entries with a given key value.
Can have several indexes on a given file of
data records, each with a different search key.
Indexes can be classified as clustered vs.
unclustered, primary vs. secondary, and
dense vs. sparse. Differences have important
consequences for utility/performance.
Understanding the nature of the workload for the
application, and the performance goals, is essential
to developing a good design.
What are the important queries and updates? What
attributes/relations are involved?
Indexes must be chosen to speed up important
queries (and perhaps some updates!).
Index maintenance overhead on updates to key fields.
Choose indexes that can help many queries, if possible.
Build indexes to support index-only strategies.
Clustering is an important decision; only one index on a
given relation can be clustered!
Order of fields in composite index key can be important.
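Why the order of fields in a composite key matters can be sketched with a sorted list of (age, sal) data entries. The field names and values are hypothetical, and ages are assumed to be integers.

```python
import bisect

# Hypothetical data entries of a composite index on (age, sal), kept sorted.
entries = sorted([(25, 3000), (30, 2000), (30, 4500), (35, 1500), (40, 3000)])

def age_equals(age):
    """age is the first field, so all matches form one contiguous run."""
    lo = bisect.bisect_left(entries, (age,))
    hi = bisect.bisect_left(entries, (age + 1,))
    return entries[lo:hi]

def sal_equals(sal):
    """sal is the second field, so matches are scattered: scan every entry."""
    return [e for e in entries if e[1] == sal]

print(age_equals(30))    # [(30, 2000), (30, 4500)]
print(sal_equals(3000))  # [(25, 3000), (40, 3000)]
```

A condition on age alone can use the index efficiently, while a condition on sal alone cannot. An index on (sal, age) would reverse the situation.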
Chapter 5 - Tree Indexes
Given a dynamic file (many insertions and deletions) on which
we would like to do frequent independent fetches, consider
• an unsorted file
• a sorted file
• having an index (look up table)
Inverted Files:
• The simplest index structure, in the form of an ordered list
where each entry is a (key, ptr) pair.
• difficult to maintain
– After insertions and deletions, the whole file needs to be shifted.
Most DBMSs use B+-trees and hash table utilities.
• we must learn how they work and what performance to expect.
ISAM (Indexed Sequential Access Method)
• the most extensively used indexing method in last decade.
• mostly promoted by IBM and INGRES DBMS, but obsolete today.
• ISAM is simple and efficient as long as no new records are added
It contains
– a memory-resident cylinder index that keeps the highest valued key
for each cylinder
– each cylinder contains an index that keeps the highest valued key
for each block
[Figure: memory-resident cylinder index with entries (cylinder 1, high value 1001), (cylinder 2, high value 2878), ...; index at cylinder 1 with entries (block 1, high value 100), (block 2, high value 170), ...]
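The two-level ISAM lookup (cylinder index first, then the block index on that cylinder) can be sketched as follows. Only the values 1001, 2878, 100 and 170 come from the figure; every other number and all names are made up for illustration.

```python
import bisect

# Memory-resident cylinder index: highest key stored on each cylinder.
cylinder_high = [1001, 2878]                     # cylinders 1 and 2

# Per-cylinder block index: highest key stored in each block.
# Apart from 100 and 170, these entries are hypothetical.
block_high = {
    1: [100, 170, 1001],
    2: [1500, 2878],
}

def locate(key):
    """Return (cylinder, block) that may hold key, or None if out of range."""
    c = bisect.bisect_left(cylinder_high, key)   # first cylinder with high >= key
    if c == len(cylinder_high):
        return None
    cylinder = c + 1                             # cylinders are numbered from 1
    b = bisect.bisect_left(block_high[cylinder], key)
    return cylinder, b + 1                       # blocks are numbered from 1

print(locate(150))    # (1, 2): 100 < 150 <= 170
print(locate(2000))   # (2, 2)
print(locate(3000))   # None
```

Both index levels are static, which is why the structure stays simple and efficient only as long as no new records are added.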
Definition of a B+-Tree of Order v
• The root has at least two children unless it is a leaf.
• No internal node has more than 2v keys.
– Root may have fewer keys
– Internal nodes contain only keys and addresses of nodes on
the next lower level.
• All leaves are on the same level.
– When B+-tree is used as a primary index, the leaves contain
the data records.
– When B+-tree is used as a secondary index, the leaves
contain the keys and record addresses.
• An internal node with k keys has k+1 children.
Bucket factor (Bkfr): the # records that can fit in a leaf node.
Fan-out: the average # children of an internal node.
• B+-trees are short and wide.
• The records take up more space than the keys and
addresses.
– Typically internal nodes carry 100-200 keys, leaves carry
15 records.
• A primary index determines the way the records are
actually stored.
• Clustering index: records are stored together in
buckets according to the values of the key.
– The records in a given bucket will have nearby key values.
– The index only notes the lowest or the highest key in a given
bucket.
• For this reason, a clustering index is often called a sparse
index (e.g., ISAM, a B+-tree with data in the leaves)
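Why B+-trees are "short and wide" can be seen from a rough height estimate based on the bucket factor and the fan-out. The sketch assumes every node is completely full, and the function name and sample figures are illustrative only.

```python
def bplus_height(num_records, bkfr, fan_out):
    """Estimate the number of levels, counting the leaf level as level 1.

    Assumes fully packed nodes: bkfr records per leaf and fan_out
    children per internal node.
    """
    nodes = -(-num_records // bkfr)          # ceiling division: number of leaves
    levels = 1
    while nodes > 1:
        nodes = -(-nodes // fan_out)         # nodes needed one level up
        levels += 1
    return levels

# One million records, 15 records per leaf, fan-out of 100.
print(bplus_height(1_000_000, bkfr=15, fan_out=100))   # 4
```

With these figures, 66,667 leaves need 667 parents, then 7, then a single root, so any record is reached in four page reads.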
K. Dincer, Chapter 5 - File Organization and Processing