Distributed Database Design

The document discusses distributed database design. It covers topics such as distributed database design concepts, data distribution objectives, data fragmentation, allocation of fragments, and transparencies in distributed database design. It describes issues in distributed database design like placement of data, programs and applications across computer network sites. It also discusses dimensions of the distributed design problem like access patterns, sharing levels and knowledge levels. Finally, it outlines the distributed design process and covers issues like why and how to fragment data, degree of fragmentation, allocation alternatives, and information requirements.

Uploaded by

sheenam_bhatia

Distributed Database Design

TOPICS
Distributed database design concepts
Objectives of data distribution
Data fragmentation
Allocation of fragments
Transparencies in distributed database design

Design Problem
In the general setting:
  Making decisions about the placement of data and programs across the sites of a computer network, as well as possibly designing the network itself.
In a distributed DBMS, the placement of applications entails
  placement of the distributed DBMS software; and
  placement of the applications that run on the database

Dimensions of the Problem
Access pattern behavior: static, dynamic
Level of sharing: data, data + program
Level of knowledge: partial information, complete information

Distribution Design
Top-down
mostly in designing systems from scratch
mostly in homogeneous systems

Bottom-up
when the databases already exist at a number of sites

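Top-Down Design
[Figure: top-down design flow. Requirements Analysis yields Objectives; Conceptual Design and View Design (linked by View Integration, with User Input) produce the GCS, Access Information, and ESs; Distribution Design (with User Input) produces the LCSs; Physical Design produces the LISs.]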

Distribution Design Issues

Why fragment at all?
How to fragment?
How much to fragment?
How to test correctness?
How to allocate?
Information requirements?

Fragmentation
Can't we just distribute relations?
What is a reasonable unit of distribution?
  relation
    views are subsets of relations, hence locality
    extra communication
  fragments of relations (sub-relations)
    concurrent execution of a number of transactions that access different portions of a relation
    views that cannot be defined on a single fragment will require extra processing
    semantic data control (especially integrity enforcement) more difficult

Fragmentation Alternatives: Horizontal
PROJ1: projects with budgets less than $200,000
PROJ2: projects with budgets greater than or equal to $200,000

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York

PROJ2
PNO  PNAME              BUDGET  LOC
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

Fragmentation Alternatives: Vertical
PROJ1: information about project budgets
PROJ2: information about project names and locations

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1
PNO  BUDGET
P1   150000
P2   135000
P3   250000
P4   310000
P5   500000

PROJ2
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston

Degree of Fragmentation
Finite number of alternatives, ranging from individual tuples or attributes at one extreme to whole relations at the other
Finding the suitable level of partitioning within this range

Correctness of Fragmentation
Completeness
  Decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri
Reconstruction
  If relation R is decomposed into fragments R1, R2, ..., Rn, then there should exist some relational operator $\nabla$ such that
  $R = \nabla_{1 \le i \le n} R_i$
Disjointness
  If relation R is decomposed into fragments R1, R2, ..., Rn, and data item di is in Rj, then di should not be in any other fragment Rk ($k \ne j$).
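The three rules can be checked mechanically for a horizontal fragmentation, where the reconstruction operator is set union. The following is a minimal sketch using the PROJ rows from the horizontal example; the helper names `is_complete`, `reconstructs`, and `is_disjoint` are illustrative, not part of the slides.

```python
# Rows of PROJ as (PNO, PNAME, BUDGET, LOC), taken from the horizontal example.
PROJ = {
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
    ("P5", "CAD/CAM", 500000, "Boston"),
}

def is_complete(relation, fragments):
    """Every tuple of the relation appears in at least one fragment."""
    return all(any(t in f for f in fragments) for t in relation)

def reconstructs(relation, fragments):
    """For horizontal fragments, the union of the fragments gives back R."""
    return set().union(*fragments) == relation

def is_disjoint(fragments):
    """No tuple appears in more than one fragment."""
    return sum(len(f) for f in fragments) == len(set().union(*fragments))

PROJ1 = {t for t in PROJ if t[2] < 200000}    # budgets below $200,000
PROJ2 = {t for t in PROJ if t[2] >= 200000}   # budgets of $200,000 or more
fragments = [PROJ1, PROJ2]
checks = (is_complete(PROJ, fragments),
          reconstructs(PROJ, fragments),
          is_disjoint(fragments))
print(checks)
```

For vertical fragments the same idea applies with a join on the key in place of the union.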

Allocation Alternatives
Non-replicated
  partitioned: each fragment resides at only one site
Replicated
  fully replicated: each fragment at each site
  partially replicated: each fragment at some of the sites
Rule of thumb:
  If $\frac{\text{read-only queries}}{\text{update queries}} \ge 1$, replication is advantageous; otherwise replication may cause problems

Comparison of Replication Alternatives

                      Full replication      Partial replication  Partitioning
QUERY PROCESSING      Easy                  Same difficulty      Same difficulty
DIRECTORY MANAGEMENT  Easy or nonexistent   Same difficulty      Same difficulty
CONCURRENCY CONTROL   Moderate              Difficult            Easy
RELIABILITY           Very high             High                 Low
REALITY               Possible application  Realistic            Possible application

Information Requirements
Four categories:

Database information

Application information

Communication network information

Computer system information

PHF Information Requirements

Database Information
relationships (links) between relations
[Figure: relations SKILL(TITLE, SAL), EMP(ENO, ENAME, TITLE), PROJ(PNO, PNAME, BUDGET, LOC), and ASG(ENO, PNO, RESP, DUR); link L1 connects SKILL to EMP]
cardinality of each relation: card(R)

Application Information
1. Qualitative Information
  The fundamental qualitative information consists of the predicates used in user queries.
  Analyze user queries based on the 80/20 rule: 20% of user queries account for 80% of the total data access. One should investigate the more important queries.
2. Quantitative Information
  Minterm selectivity sel(mi): number of tuples that would be accessed by a query specified according to a given minterm predicate.
  Access frequency acc(mi): the access frequency of a given minterm predicate in a given period.

Fragmentation
Horizontal Fragmentation (HF)
  Primary Horizontal Fragmentation (PHF)
  Derived Horizontal Fragmentation (DHF)
Vertical Fragmentation (VF)
Hybrid Fragmentation (HF)

Primary Horizontal Fragmentation
EMP table
Three branch offices, with each employee working at only one office

Create table MPLS_EMPS as
  Select * From EMP Where Loc = 'Minneapolis';
Create table LA_EMPS as
  Select * From EMP Where Loc = 'LA';
Create table NY_EMPS as
  Select * From EMP Where Loc = 'New York';

After fragmentation, EMP is reconstructed as
Select * from MPLS_EMPS
Union
Select * from LA_EMPS
Union
Select * from NY_EMPS;
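The fragmentation and its reconstruction can be tried end to end on an in-memory SQLite database. This is a sketch under assumptions: the EMP schema (ENO, ENAME, Loc) and the five sample rows are hypothetical, chosen only so that each employee works at exactly one office.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical EMP schema and rows; only the Loc column drives the fragmentation.
cur.execute("CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, Loc TEXT)")
cur.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    ("E1", "J. Doe", "Minneapolis"),
    ("E2", "M. Smith", "LA"),
    ("E3", "A. Lee", "New York"),
    ("E4", "J. Miller", "LA"),
    ("E5", "B. Casey", "Minneapolis"),
])

# One fragment per branch office. Names and values come from this fixed
# dictionary, so building the statement with an f-string is safe here.
offices = {"MPLS_EMPS": "Minneapolis", "LA_EMPS": "LA", "NY_EMPS": "New York"}
for table, loc in offices.items():
    cur.execute(f"CREATE TABLE {table} AS SELECT * FROM EMP WHERE Loc = '{loc}'")

# Reconstruction: the union of the fragments should give back EMP.
rebuilt = cur.execute(
    "SELECT * FROM MPLS_EMPS UNION SELECT * FROM LA_EMPS UNION SELECT * FROM NY_EMPS"
).fetchall()
original = cur.execute("SELECT * FROM EMP").fetchall()
same = sorted(rebuilt) == sorted(original)
print(same)
```

Because UNION removes duplicates, comparing row counts as well as contents also confirms that the fragments do not overlap.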

Example 1
AP1: looking for those employees who work in Los Angeles (LA).
Pr = {p1: Loc = "LA"}
M = {m1: Loc = "LA", m2: Loc <> "LA"}
is a minimal and complete set of minterm predicates for AP1

Fragment F1: Create table LA_EMPS as
  Select * from EMP Where Loc = 'LA';
Fragment F2: Create table NON_LA_EMPS as
  Select * from EMP Where Loc <> 'LA';
Minimal: the rows are accessed differently by at least one application.
Complete: the rows have the same probability of being accessed by any application.

AP1: exclude any employee whose salary is less than or equal to 30000
Pr = {p1: Loc = "LA", p2: Sal > 30000}
M = {m1: Loc = "LA" $\land$ Sal > 30000,
     m2: Loc = "LA" $\land$ Sal <= 30000,
     m3: Loc <> "LA" $\land$ Sal > 30000,
     m4: Loc <> "LA" $\land$ Sal <= 30000}
Any fragmentation must satisfy the following rules:
Rule 1: Completeness. Decomposition of R into R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri.
Rule 2: Reconstruction. If R is decomposed into R1, R2, ..., Rn, then there should exist some relational operator $\nabla$ such that $R = \nabla_{1 \le i \le n} R_i$.
Rule 3: Disjointness. If R is decomposed into R1, R2, ..., Rn, and di is a tuple in Rj, then di should not be in any other fragment Rk, where $k \ne j$.

NOTE: With N simple predicates in Pr, M will have $2^N$ minterm predicates.
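The $2^N$ count follows from taking each simple predicate either as is or negated. A short sketch that enumerates the minterms as strings; the function name `minterms` and the textual `NOT(...)` / `AND` rendering are illustrative choices.

```python
from itertools import product

def minterms(simple_predicates):
    """Return all 2**N conjunctions in which each predicate is plain or negated."""
    result = []
    for signs in product([True, False], repeat=len(simple_predicates)):
        parts = [p if keep else f"NOT({p})"
                 for p, keep in zip(simple_predicates, signs)]
        result.append(" AND ".join(parts))
    return result

Pr = ['Loc = "LA"', "Sal > 30000"]
M = minterms(Pr)
for i, m in enumerate(M, start=1):
    print(f"m{i}: {m}")
```

With the two predicates of Example 1 this yields four minterms, in the same order as m1 to m4 above (with `NOT(Sal > 30000)` standing for `Sal <= 30000`).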

PHF - Information Requirements
Application Information
simple predicates: Given R[A1, A2, ..., An], a simple predicate pj is
  $p_j : A_i \; \theta \; Value$
where $\theta \in \{=, <, \le, >, \ge, \ne\}$, $Value \in D_i$, and $D_i$ is the domain of $A_i$.
For relation R we define Pr = {p1, p2, ..., pm}
Example:
  PNAME = "Maintenance"
  BUDGET $\le$ 200000

minterm predicates: Given R and Pr = {p1, p2, ..., pm}, define M = {m1, m2, ..., mz} as
  $M = \{ m_i \mid m_i = \bigwedge_{p_j \in Pr} p_j^* \}, \quad 1 \le j \le m, \; 1 \le i \le z$
where $p_j^* = p_j$ or $p_j^* = \neg(p_j)$.

PHF Information Requirements

Application Information
minterm selectivities: sel(mi)
  The number of tuples of the relation that would be accessed by a user query which is specified according to a given minterm predicate mi.
access frequencies: acc(qi)
  The frequency with which a user application qi accesses data.
  Access frequency for a minterm predicate can also be defined.

Primary Horizontal Fragmentation

Definition:
  $R_j = \sigma_{F_j}(R), \quad 1 \le j \le w$
where Fj is a selection formula, which is (preferably) a minterm predicate.
Therefore,
  A horizontal fragment Ri of relation R consists of all the tuples of R which satisfy a minterm predicate mi.
  Given a set of minterm predicates M, there are as many horizontal fragments of relation R as there are minterm predicates.
  The set of horizontal fragments is also referred to as minterm fragments.

PHF Algorithm
Given: A relation R, the set of simple predicates Pr
Output: The set of fragments of R = {R1, R2, ..., Rw} which obey the fragmentation rules.
Preliminaries:
  Pr should be complete
  Pr should be minimal

Completeness of Simple Predicates
A set of simple predicates Pr is said to be complete if and only if the accesses to the tuples of the minterm fragments defined on Pr requires that two tuples of the same minterm fragment have the same probability of being accessed by any application.
Example 2:
Assume PROJ[PNO, PNAME, BUDGET, LOC] has two applications defined on it.
  Find the budgets of projects at each location. (1)
  Find projects with budgets less than $200000. (2)

Completeness of Simple Predicates

According to (1),
  Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris"}
which is not complete with respect to (2).

Modify
  Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris", BUDGET $\le$ 200000, BUDGET > 200000}
which is complete.

Minimality of Simple Predicates
If a predicate influences how fragmentation is performed (i.e., causes a fragment f to be further fragmented into, say, fi and fj), then there should be at least one application that accesses fi and fj differently.
In other words, the simple predicate should be relevant in determining a fragmentation.
If all the predicates of a set Pr are relevant, then Pr is minimal.
A predicate is relevant if
  $\frac{acc(m_i)}{card(f_i)} \ne \frac{acc(m_j)}{card(f_j)}$

Minimality of Simple Predicates
Example:
  Pr = {LOC = "Montreal", LOC = "New York", LOC = "Paris", BUDGET $\le$ 200000, BUDGET > 200000}
is minimal (in addition to being complete).

However, if we add
  PNAME = "Instrumentation"
then Pr is not minimal.

COM_MIN Algorithm
Given: a relation R and a set of simple predicates Pr
Output: a complete and minimal set of simple predicates Pr' for Pr
Rule 1: a relation or fragment is partitioned into at least two parts which are accessed differently by at least one application.

COM_MIN Algorithm
Initialization:
  find a $p_i \in Pr$ such that $p_i$ partitions R according to Rule 1
  set $Pr' = p_i$; $Pr \leftarrow Pr - \{p_i\}$; $F \leftarrow \{f_i\}$
Iteratively add predicates to Pr' until it is complete
  find a $p_j \in Pr$ such that $p_j$ partitions some $f_k$ defined according to a minterm predicate over Pr' according to Rule 1
  set $Pr' = Pr' \cup \{p_j\}$; $Pr \leftarrow Pr - \{p_j\}$; $F \leftarrow F \cup \{f_i\}$
  if $\exists p_k \in Pr'$ which is nonrelevant then
    $Pr' \leftarrow Pr' - \{p_k\}$
    $F \leftarrow F - \{f_k\}$

PHORIZONTAL Algorithm
Makes use of COM_MIN to perform fragmentation.
Input: a relation R and a set of simple predicates Pr
Output: a set of minterm predicates M according to which relation R is to be fragmented
  $Pr' \leftarrow$ COM_MIN(R, Pr)
  determine the set M of minterm predicates
  determine the set I of implications among $p_i \in Pr'$
  eliminate the contradictory minterms from M
PHF Example 3
Two candidate relations: PAY and PROJ.
Fragmentation of relation PAY
  Application: Check the salary info and determine raise.
  Employee records kept at two sites, so the application runs at two sites
  Simple predicates
    p1 : SAL $\le$ 30000
    p2 : SAL > 30000
    Pr = {p1, p2}, which is complete and minimal, so Pr' = Pr
  Minterm predicates
    m1 : (SAL $\le$ 30000)
    m2 : NOT(SAL $\le$ 30000) = (SAL > 30000)

PHF Example 3

PAY1
TITLE       SAL
Mech. Eng.  27000
Programmer  24000

PAY2
TITLE        SAL
Elect. Eng.  40000
Syst. Anal.  34000

PHF Example 3
Fragmentation of relation PROJ

Applications:
  (1) Find the name and budget of projects given their no.
      Issued at three sites
  (2) Access project information according to budget
      one site accesses $\le$ 200000, the other accesses > 200000

Simple predicates
  For application (1)
    p1 : LOC = "Montreal"
    p2 : LOC = "New York"
    p3 : LOC = "Paris"
  For application (2)
    p4 : BUDGET $\le$ 200000
    p5 : BUDGET > 200000
  Pr = Pr' = {p1, p2, p3, p4, p5}

PHF Example 3
Fragmentation of relation PROJ continued
Minterm fragments left after elimination
  m1 : (LOC = "Montreal") $\land$ (BUDGET $\le$ 200000)
  m2 : (LOC = "Montreal") $\land$ (BUDGET > 200000)
  m3 : (LOC = "New York") $\land$ (BUDGET $\le$ 200000)
  m4 : (LOC = "New York") $\land$ (BUDGET > 200000)
  m5 : (LOC = "Paris") $\land$ (BUDGET $\le$ 200000)
  m6 : (LOC = "Paris") $\land$ (BUDGET > 200000)
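Five simple predicates give $2^5 = 32$ candidate minterms, and only six survive because most combinations are contradictory (a project cannot be in two cities, and a budget cannot be on both sides of the threshold). The sketch below finds the survivors by brute force over a small domain. The domain is an assumption: LOC is restricted to the three cities named in the predicates, and one representative budget is taken on each side of 200000.

```python
from itertools import product

# Simple predicates p1..p5 from the example, as functions of (LOC, BUDGET).
predicates = {
    "p1": lambda loc, budget: loc == "Montreal",
    "p2": lambda loc, budget: loc == "New York",
    "p3": lambda loc, budget: loc == "Paris",
    "p4": lambda loc, budget: budget <= 200000,
    "p5": lambda loc, budget: budget > 200000,
}

# Assumed domain: three locations, one budget value on each side of the threshold.
domain = list(product(["Montreal", "New York", "Paris"], [200000, 200001]))

def satisfiable(signs):
    """A minterm is kept if some (LOC, BUDGET) pair satisfies every literal."""
    return any(
        all(pred(*row) == sign for pred, sign in zip(predicates.values(), signs))
        for row in domain
    )

# Each sign vector says, per predicate, whether it appears plain (True) or negated.
surviving = [s for s in product([True, False], repeat=len(predicates))
             if satisfiable(s)]
print(len(surviving))
```

A real implementation would reason over the implications in I rather than enumerate values; the enumeration here only works because the assumed domain is tiny.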

PHF Example

PROJ1
PNO  PNAME            BUDGET  LOC
P1   Instrumentation  150000  Montreal

PROJ2
PNO  PNAME              BUDGET  LOC
P2   Database Develop.  135000  New York

PROJ4
PNO  PNAME    BUDGET  LOC
P3   CAD/CAM  250000  New York

PROJ6
PNO  PNAME        BUDGET  LOC
P4   Maintenance  310000  Paris

PHF Correctness
Completeness
  Since Pr' is complete and minimal, the selection predicates are complete
Reconstruction
  If relation R is fragmented into FR = {R1, R2, ..., Rr}
  $R = \bigcup_{\forall R_i \in F_R} R_i$
Disjointness
  Minterm predicates that form the basis of fragmentation should be mutually exclusive.

Derived Horizontal Fragmentation
Defined on a member relation of a link according to a selection operation specified on its owner.
Each link is an equijoin.
Equijoin can be implemented by means of semijoins.
[Figure: relations SKILL(TITLE, SAL), EMP(ENO, ENAME, TITLE), PROJ(PNO, PNAME, BUDGET, LOC), and ASG(ENO, PNO, RESP, DUR); link L1 connects SKILL to EMP]

DHF Definition
Given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as
  $R_i = R \ltimes S_i, \quad 1 \le i \le w$
where w is the maximum number of fragments that will be defined on R and
  $S_i = \sigma_{F_i}(S)$
where Fi is the formula according to which the primary horizontal fragment Si is defined.

DHF Example
Given link L1 where owner(L1) = SKILL/PAY and member(L1) = EMP
Group engineers into two groups according to their salary: those making less than or equal to $30,000, and those making more than $30,000.
  EMP1 = EMP $\ltimes$ SKILL/PAY1
  EMP2 = EMP $\ltimes$ SKILL/PAY2
where
  SKILL/PAY1 = $\sigma_{SAL \le 30000}$(SKILL/PAY)
  SKILL/PAY2 = $\sigma_{SAL > 30000}$(SKILL/PAY)

EMP1
ENO  ENAME      TITLE
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E7   R. Davis   Mech. Eng.

EMP2
ENO  ENAME     TITLE
E1   J. Doe    Elect. Eng.
E2   M. Smith  Syst. Anal.
E5   B. Casey  Syst. Anal.
E6   L. Chu    Elect. Eng.
E8   J. Jones  Syst. Anal.
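The derivation of EMP1 and EMP2 can be reproduced with a few lines that implement the semijoin directly. This is a minimal sketch using the EMP and PAY rows shown in the examples; the `semijoin` helper and its positional attribute indexes are illustrative.

```python
# Owner relation SKILL/PAY as (TITLE, SAL), from the PAY1/PAY2 example.
PAY = [("Elect. Eng.", 40000), ("Syst. Anal.", 34000),
       ("Mech. Eng.", 27000), ("Programmer", 24000)]

# Member relation EMP as (ENO, ENAME, TITLE), from the EMP1/EMP2 tables.
EMP = [("E1", "J. Doe", "Elect. Eng."), ("E2", "M. Smith", "Syst. Anal."),
       ("E3", "A. Lee", "Mech. Eng."), ("E4", "J. Miller", "Programmer"),
       ("E5", "B. Casey", "Syst. Anal."), ("E6", "L. Chu", "Elect. Eng."),
       ("E7", "R. Davis", "Mech. Eng."), ("E8", "J. Jones", "Syst. Anal.")]

def semijoin(member, owner_fragment, member_attr, owner_attr):
    """Keep the member tuples whose join attribute matches some owner tuple."""
    keys = {row[owner_attr] for row in owner_fragment}
    return [row for row in member if row[member_attr] in keys]

# Primary horizontal fragments of the owner.
PAY1 = [row for row in PAY if row[1] <= 30000]
PAY2 = [row for row in PAY if row[1] > 30000]

# Derived horizontal fragments of the member, joined on TITLE.
EMP1 = semijoin(EMP, PAY1, member_attr=2, owner_attr=0)
EMP2 = semijoin(EMP, PAY2, member_attr=2, owner_attr=0)
print([row[0] for row in EMP1], [row[0] for row in EMP2])
```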

DHF Correctness

Completeness

Referential integrity
Let R be the member relation of a link whose owner is relation S, which is fragmented as FS = {S1, S2, ..., Sn}. Furthermore, let A be the join attribute between R and S. Then, for each tuple t of R, there should be a tuple t' of S such that
t[A] = t'[A]

Reconstruction
Same as primary horizontal fragmentation.

Disjointness
Simple join graphs between the owner and the member fragments.

Vertical Fragmentation
Group the columns of a table into fragments.
Because each fragment contains a subset of the total set of columns in the table, VF can be used to enforce security and/or privacy of data.
More difficult than horizontal, because more alternatives exist.
In the case of vertical partitioning, if a relation has m non-primary-key attributes, the number of possible fragments is equal to B(m), the mth Bell number.
Two approaches:
  grouping (attributes to fragments)
    the first step creates as many vertical fragments as there are non-key columns in the table; the grouping approach then uses joins across the primary key to group some of these fragments together, and continues as needed
    not usually considered a valid approach
  splitting (relation to fragments)
    placing each non-key column in one and only one fragment
Need to define a measure of affinity (closeness) between attributes.

If a table has 15 columns, the number of possible vertical fragments is about $10^9$; for a table with 30 columns it is about $10^{23}$.
Evaluating all of them is not practical.
Solution: find the closeness/affinity between the attributes to decide whether to group them into the same fragment or not.
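The orders of magnitude quoted above can be checked by computing Bell numbers directly. The sketch below uses the Bell triangle; the function name `bell` is illustrative.

```python
def bell(m):
    """m-th Bell number: the number of ways to partition a set of m elements."""
    row = [1]
    for _ in range(m):
        # Each new row of the Bell triangle starts with the last entry of the
        # previous row; every further entry adds the entry above-left of it.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

for m in (5, 15, 30):
    print(m, bell(m))
```

B(15) is 1,382,958,545, which is on the order of $10^9$, and B(30) has 24 digits, which is why exhaustive evaluation is ruled out.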

VF Information Requirements
Application Information
  Attribute affinities
    a measure that indicates how closely related the attributes are
    obtained from access frequency + usage pattern
    Access frequency: how many times an application/query runs in a given period of time at different sites
    Usage pattern: indicates whether a column is used by an application/query
  Attribute usage values
    Given a set of queries Q = {q1, q2, ..., qq} that will run on the relation R[A1, A2, ..., An],
    $use(q_i, A_j) = \begin{cases} 1 & \text{if attribute } A_j \text{ is referenced by query } q_i \\ 0 & \text{otherwise} \end{cases}$
    $use(q_i, \bullet)$ can be defined accordingly

VF Definition of use(qi,Aj)
Consider the following 4 queries for relation PROJ
  q1: SELECT BUDGET FROM PROJ WHERE PNO=Value
  q2: SELECT PNAME, BUDGET FROM PROJ
  q3: SELECT PNAME FROM PROJ WHERE LOC=Value
  q4: SELECT SUM(BUDGET) FROM PROJ WHERE LOC=Value

Let A1 = PNO, A2 = PNAME, A3 = BUDGET, A4 = LOC. The attribute usage matrix is then

     A1  A2  A3  A4
q1    1   0   1   0
q2    0   1   1   0
q3    0   1   0   1
q4    0   0   1   1

VF Affinity Measure af(Ai,Aj)
The attribute affinity measure between two attributes Ai and Aj of a relation R[A1, A2, ..., An] with respect to the set of applications Q = (q1, q2, ..., qq) is defined as follows:

  $af(A_i, A_j) = \sum_{\text{all queries that access } A_i \text{ and } A_j} (\text{query access})$

  $\text{query access} = \sum_{\text{all sites}} \left( \text{access frequency of a query} \times \frac{\text{access}}{\text{execution}} \right)$

VF Calculation of af(Ai, Aj)

Example: Assume each query in the previous example accesses the attributes once during each execution.
Also assume the access frequencies

     S1  S2  S3
q1   15  20  10
q2    5   0   0
q3   25  25  25
q4    3   0   0

Then
  af(A1, A3) = 15*1 + 20*1 + 10*1 = 45

and the attribute affinity matrix AA is

     A1  A2  A3  A4
A1   45   0  45   0
A2    0  80   5  75
A3   45   5  53   3
A4    0  75   3  78
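The affinity matrix can be recomputed from the attribute usage of the four PROJ queries and their access frequencies. In this sketch the per-query totals (summed over all sites) are taken as q1 = 45, q2 = 5, q3 = 75, q4 = 3; the q1 total comes from the worked af(A1, A3) line, and the other three are the values implied by the matrix entries, so treat them as an assumption.

```python
attributes = ["A1", "A2", "A3", "A4"]   # PNO, PNAME, BUDGET, LOC

# Attributes referenced by each query (use(qi, Aj) = 1 for the listed ones).
use = {
    "q1": {"A1", "A3"},   # SELECT BUDGET ... WHERE PNO = Value
    "q2": {"A2", "A3"},   # SELECT PNAME, BUDGET
    "q3": {"A2", "A4"},   # SELECT PNAME ... WHERE LOC = Value
    "q4": {"A3", "A4"},   # SELECT SUM(BUDGET) ... WHERE LOC = Value
}

# Access frequency of each query summed over all sites, one access per execution.
acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

def af(ai, aj):
    """Sum the access counts of every query that references both attributes."""
    return sum(acc[q] for q, used in use.items() if ai in used and aj in used)

AA = [[af(ai, aj) for aj in attributes] for ai in attributes]
for name, row in zip(attributes, AA):
    print(name, row)
```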

VF Clustering Algorithm
Take the attribute affinity matrix AA and reorganize the attribute orders to form clusters where the attributes in each cluster demonstrate high affinity to one another.
Bond Energy Algorithm (BEA) has been used for clustering of entities. BEA finds an ordering of entities (in our case attributes) such that the global affinity measure is maximized.

Bond Energy Algorithm

Input: The AA affinity matrix
Output: The clustered affinity matrix CA, which is a permutation of AA
Initialization: Place and fix one of the columns of AA in CA.
Iteration: Place the remaining n-i columns in the remaining i+1 positions in the CA matrix. For each column, choose the placement that makes the most contribution to the global affinity measure.
Row order: Order the rows according to the column ordering.

Bond Energy Algorithm

Best placement? Define contribution of a placement:
  $cont(A_i, A_k, A_j) = 2\,bond(A_i, A_k) + 2\,bond(A_k, A_j) - 2\,bond(A_i, A_j)$
where
  $bond(A_x, A_y) = \sum_{z=1}^{n} af(A_z, A_x)\, af(A_z, A_y)$

BEA Example
Consider the AA matrix above and the corresponding CA matrix where A1 and A2 have been placed. Place A3:

Ordering (0-3-1):
  cont(A0, A3, A1) = 2bond(A0, A3) + 2bond(A3, A1) − 2bond(A0, A1)
                   = 2*0 + 2*4410 − 2*0 = 8820

Ordering (1-3-2):
  cont(A1, A3, A2) = 2bond(A1, A3) + 2bond(A3, A2) − 2bond(A1, A2)
                   = 2*4410 + 2*890 − 2*225 = 10150

Ordering (2-3-4):
  cont(A2, A3, A4) = 1780

BEA Example
Therefore, the CA matrix has the form

     A1  A3  A2
     45  45   0
      0   5  80
     45  53   5
      0   3  75

When A4 is placed, the final form of the CA matrix (after row organization) is

     A1  A3  A2  A4
A1   45  45   0   0
A3   45  53   5   3
A2    0   5  80  75
A4    0   3  75  78
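The placement steps can be replayed in code. This is a sketch of the greedy column placement only, assuming (as in the example) that A1 and A2 are placed first and that a missing neighbour at either boundary contributes a bond of 0. Attributes are referred to by 0-based index, so index 2 is A3.

```python
# Attribute affinity matrix AA for A1..A4, as given in the example.
AA = [[45, 0, 45, 0],
      [0, 80, 5, 75],
      [45, 5, 53, 3],
      [0, 75, 3, 78]]
n = len(AA)

def bond(x, y):
    """Sum over z of af(Az, Ax) * af(Az, Ay); a boundary (None) contributes 0."""
    if x is None or y is None:
        return 0
    return sum(AA[z][x] * AA[z][y] for z in range(n))

def cont(i, k, j):
    """Contribution of placing attribute k between attributes i and j."""
    return 2 * bond(i, k) + 2 * bond(k, j) - 2 * bond(i, j)

def bea_order():
    order = [0, 1]                      # A1 and A2 are placed and fixed first
    for k in range(2, n):
        padded = [None] + order + [None]
        # Position p means "between padded[p] and padded[p + 1]".
        best = max(range(len(padded) - 1),
                   key=lambda p: cont(padded[p], k, padded[p + 1]))
        order.insert(best, k)
    return order

order = bea_order()
CA = [[AA[r][c] for c in order] for r in order]   # reorder columns, then rows
print([f"A{i + 1}" for i in order])
```

The three contributions computed for A3 match the worked values (8820, 10150, 1780), and the resulting order is A1, A3, A2, A4.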

VF Algorithm
How can you divide a set of clustered attributes {A1, A2, ..., An} into two (or more) sets {A1, A2, ..., Ai} and {Ai+1, ..., An} such that there are no (or minimal) applications that access both (or more than one) of the sets?
The function will produce fragments that are balanced.
For the best partitioning, first split the columns into a one-column BC and an (n − 1)-column TC, and then repeatedly move columns from TC to BC until TC is left with only one column.
Choose the splitting that has the highest Z value.
Z can be positive if total accesses to only one fragment are maximized while the total accesses to both fragments are minimized.
Disadvantage: not able to carve out an embedded or inner block of columns as a partition.
Solution: overcome by adding a shift operation (moves the topmost row of the matrix to the bottom and then moves the leftmost column of the matrix to the extreme right).
[Figure: SHIFT operation on the clustered matrix whose rows and columns are ordered A1, A2, ..., Ai, Ai+1, ..., Am]

VF Algorithm
Define
  TQ = set of applications that access only TA (top corner attributes)
  BQ = set of applications that access only BA (bottom corner attributes)
  OQ = set of applications that access both TA and BA
and
  CTQ = total number of accesses to attributes by applications that access only TA
  CBQ = total number of accesses to attributes by applications that access only BA
  COQ = total number of accesses to attributes by applications that access both TA and BA

Then find the point along the diagonal that maximizes

Goal Function: $Z = CTQ \times CBQ - COQ^2$
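As a sketch of how the goal function picks a split point, the code below evaluates Z at every position along the clustered order A1, A3, A2, A4 from the BEA example. The per-query access totals (45, 5, 75, 3) and the one-access-per-execution simplification are carried over from the affinity example and are assumptions here.

```python
order = ["A1", "A3", "A2", "A4"]        # clustered attribute order from BEA

# Attributes referenced by each PROJ query and their total access counts.
use = {"q1": {"A1", "A3"}, "q2": {"A2", "A3"},
       "q3": {"A2", "A4"}, "q4": {"A3", "A4"}}
acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

def z_value(split):
    """Z = CTQ * CBQ - COQ^2 for TA = order[:split] and BA = order[split:]."""
    top, bottom = set(order[:split]), set(order[split:])
    ctq = sum(acc[q] for q, used in use.items() if used <= top)
    cbq = sum(acc[q] for q, used in use.items() if used <= bottom)
    coq = sum(acc[q] for q, used in use.items()
              if used & top and used & bottom)
    return ctq * cbq - coq ** 2

scores = {split: z_value(split) for split in range(1, len(order))}
best_split = max(scores, key=scores.get)
print(scores, order[:best_split], order[best_split:])
```

The best split separates {A1, A3} from {A2, A4}. Adding the key A1 (PNO) to the second group gives (PNO, BUDGET) and (PNO, PNAME, LOC), the same two fragments as in the vertical fragmentation example.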

Example
1)
  TC: all the applications that access one of the TC columns (C4, C1, or C3) but do not access any BC column [i.e., AP1, AP2, and AP4]
  No application is BC-only.
  AP3 accesses both TC and BC columns.
  TCW = AFF(AP1) + AFF(AP2) + AFF(AP4) = 3 + 7 + 3 = 13
  BCW = none = 0
  BOCW = AFF(AP3) = 4
  Z = 13*0 − 4² = −16
2)
  TCW = AFF(AP1) + AFF(AP2) = 3 + 7 = 10
  BCW = AFF(AP3) = 4
  BOCW = AFF(AP4) = 3
  Z = 10*4 − 3² = 40 − 9 = 31
3)
  TCW = AFF(AP2) = 7
  BCW = AFF(AP3) + AFF(AP4) = 4 + 3 = 7
  BOCW = AFF(AP1) = 3
  Z = 7*7 − 3² = 49 − 9 = 40
4)
  TCW = AFF(AP4) = 3
  BCW = AFF(AP2) = 7
  BOCW = AFF(AP1) + AFF(AP3) = 3 + 4 = 7
  Z = 3*7 − 7² = 21 − 49 = −28

Result: split 3) has the highest Z value (40), so the two vertical fragments will be defined as VF1(C, C4) and VF2(C, C1, C2, C3).

VF Algorithm
Two problems:
  Cluster forming in the middle of the CA matrix
    Shift a row up and a column left and apply the algorithm to find the best partitioning point
    Do this for all possible shifts
    Cost $O(m^2)$
  More than two clusters
    m-way partitioning
    try 1, 2, ..., m−1 split points along the diagonal and try to find the best point for each of these
    Cost $O(2^m)$

VF Correctness
A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, ..., Rr}.
Completeness
  The following should be true for A:
  $A = \bigcup A_{R_i}$
Reconstruction
  Reconstruction can be achieved by
  $R = \bowtie_K R_i, \quad \forall R_i \in F_R$
Disjointness
  TIDs are not considered to be overlapping since they are maintained by the system
  Duplicated keys are not considered to be overlapping

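Hybrid Fragmentation
[Figure: hybrid fragmentation tree rooted at R, with a horizontal fragmentation (HF) step and leaf fragments R11, R12, R21, R22, R23]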

Fragment Allocation
Problem Statement
  Given
    F = {F1, F2, ..., Fn}   fragments
    S = {S1, S2, ..., Sm}   network sites
    Q = {q1, q2, ..., qq}   applications
  Find the "optimal" distribution of F to S.
Optimality
  Minimal cost
    Communication + storage + processing (read & update)
    Cost in terms of time (usually)
  Performance
    Response time and/or throughput
  Constraints
    Per site constraints (storage & processing)

Information Requirements
Database information
  selectivity of fragments
  size of a fragment
Application information
  access types and numbers
  access localities
Communication network information
  bandwidth
  latency
  communication overhead
Computer system information
  unit cost of storing data at a site
  unit cost of processing at a site

Allocation
File Allocation (FAP) vs Database Allocation (DAP):
  Fragments are not individual files
    relationships have to be maintained
  Access to databases is more complicated
    remote file access model not applicable
    relationship between allocation and query processing
  Cost of integrity enforcement should be considered
  Cost of concurrency control should be considered

Allocation Information
Requirements
Database Information
selectivity of fragments
size of a fragment

Application Information
number of read accesses of a query to a fragment
number of update accesses of query to a fragment
A matrix indicating which queries update which fragments
A similar matrix for retrievals
originating site of each query

Site Information
unit cost of storing data at a site
unit cost of processing at a site

Network Information
communication cost/frame between two sites
frame size

Allocation Model
General Form
  min(Total Cost)
  subject to
    response time constraint
    storage constraint
    processing constraint
Decision Variable
  $x_{ij} = \begin{cases} 1 & \text{if fragment } F_i \text{ is stored at site } S_j \\ 0 & \text{otherwise} \end{cases}$

Allocation Model
Total Cost
  $\sum_{\text{all queries}} \text{query processing cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{cost of storing a fragment at a site}$
Storage Cost (of fragment Fj at Sk)
  (unit storage cost at Sk) $\times$ (size of Fj) $\times$ $x_{jk}$
Query Processing Cost (for one query)
  processing component + transmission component

Allocation Model
Query Processing Cost
  Processing component
    access cost + integrity enforcement cost + concurrency control cost
  Access cost
    $\sum_{\text{all sites}} \sum_{\text{all fragments}} (\text{no. of update accesses} + \text{no. of read accesses}) \times x_{ij} \times \text{local processing cost at a site}$
  Integrity enforcement and concurrency control costs
    Can be similarly calculated

Allocation Model
Query Processing Cost
  Transmission component
    cost of processing updates + cost of processing retrievals
  Cost of updates
    $\sum_{\text{all sites}} \sum_{\text{all fragments}} \text{update message cost} + \sum_{\text{all sites}} \sum_{\text{all fragments}} \text{acknowledgment cost}$
  Retrieval Cost
    $\sum_{\text{all fragments}} \min_{\text{all sites}} (\text{cost of retrieval command} + \text{cost of sending back the result})$

Allocation Model
Constraints
  Response Time
    execution time of query $\le$ max. allowable response time for that query
  Storage Constraint (for a site)
    $\sum_{\text{all fragments}} \text{storage requirement of a fragment at that site} \le \text{storage capacity at that site}$
  Processing constraint (for a site)
    $\sum_{\text{all queries}} \text{processing load of a query at that site} \le \text{processing capacity of that site}$
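For a handful of fragments and sites, the model can be explored by brute force. The sketch below is a deliberately reduced version: non-replicated allocation only, a storage term plus a flat charge per remote access, and a storage capacity constraint. Every number and name in it (fragment sizes, unit costs, capacities, access counts, `remote_cost`) is hypothetical.

```python
from itertools import product

sizes = {"F1": 100, "F2": 300}             # hypothetical fragment sizes
unit_storage = {"S1": 1.0, "S2": 0.5}      # hypothetical unit storage cost per site
capacity = {"S1": 350, "S2": 350}          # hypothetical storage capacity per site

# accesses[(fragment, site)] = read + update accesses issued from that site.
accesses = {("F1", "S1"): 50, ("F1", "S2"): 5,
            ("F2", "S1"): 10, ("F2", "S2"): 40}
remote_cost = 2.0   # hypothetical cost per remote access; local access is free

def total_cost(alloc):
    """Storage cost plus transmission cost for a fragment -> site mapping."""
    storage = sum(unit_storage[alloc[f]] * sizes[f] for f in sizes)
    remote = sum(count * remote_cost
                 for (f, s), count in accesses.items() if alloc[f] != s)
    return storage + remote

def feasible(alloc):
    """Storage constraint: fragments placed at a site must fit its capacity."""
    return all(sum(sizes[f] for f in sizes if alloc[f] == s) <= capacity[s]
               for s in capacity)

# Enumerate every assignment of one site per fragment (x_ij has a single 1 per row).
candidates = [dict(zip(sizes, sites))
              for sites in product(unit_storage, repeat=len(sizes))]
best = min((a for a in candidates if feasible(a)), key=total_cost)
print(best, total_cost(best))
```

Enumeration grows as (number of sites) to the power (number of fragments), and far faster once replication is allowed, which is why heuristics are used in practice.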

Allocation Model
Solution Methods
  FAP is NP-complete
  DAP also NP-complete
Heuristics based on
  single commodity warehouse location (for FAP)
  knapsack problem
  branch and bound techniques
  network flow

Allocation Model
Attempts to reduce the solution space
  assume all candidate partitionings known; select the best partitioning
  ignore replication at first
  sliding window on fragments
