14/06/2022, 14:40 Distributed Systems - Design
Distributed Database Design
Data Fragmentation
Data Replication
Data Allocation
Client/Server Architecture
In this section, you will be introduced to distributed database design
issues. These include data fragmentation, data replication and data
allocation. We will also look briefly at how distributed database
capabilities are implemented within a client/server architecture.
Data Fragmentation
Data fragmentation is a technique used to break up database objects. In
designing a distributed database, you must decide which portion of the
database is to be stored where. Fragmentation breaks the database up into
logical units called fragments. Fragmentation information is stored in a
distributed data catalogue, which the processing computer uses to process a
user's request.
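As a rough illustration of how a catalogue might be consulted, the sketch below keeps one entry per fragment and looks up where the rows for a request live. The entry layout, the fragment and site names, and the `locate` function are all assumptions made for this example; a real DDBMS catalogue holds far more detail.

```python
# Hypothetical catalogue: one entry per fragment, recording where it is stored
# and which rows it holds (here, rows are split by an assumed "region" value).
CATALOGUE = [
    {"fragment": "customer_north", "site": "site_a", "region": "North"},
    {"fragment": "customer_south", "site": "site_b", "region": "South"},
]


def locate(region):
    """Return (fragment, site) pairs that hold rows for the given region."""
    return [
        (entry["fragment"], entry["site"])
        for entry in CATALOGUE
        if entry["region"] == region
    ]


north_location = locate("North")
unknown_location = locate("East")  # no fragment covers this region
```

A request for a region that no fragment covers simply returns an empty list in this sketch.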
As a point of discussion, we can look at data fragmentation in terms of
relations or tables. The following matrix describes the different types of
fragmentation that can be used.
Horizontal fragmentation
    This type of fragmentation refers to the division of a relation into
    fragments of rows. Each fragment is stored at a different computer or
    node, and each fragment contains unique rows. Each horizontal fragment
    may have a different number of rows, but each fragment must have the
    same attributes.
Vertical fragmentation
    This type of fragmentation refers to the division of a relation into
    fragments that comprise a collection of attributes. Each vertical
    fragment must have the same number of rows and must include the key,
    but the fragments otherwise hold different attributes.
Mixed fragmentation
    This type of fragmentation is a two-step process. First, horizontal
    fragmentation is done to obtain the necessary rows, then vertical
    fragmentation is done to divide the attributes of those rows among
    fragments.
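The three fragmentation types described above can be sketched on a small in-memory relation. This is only an illustration: the CUSTOMER relation, its column names and its rows are invented for the example, and each row is modelled as a plain dictionary.

```python
# Hypothetical CUSTOMER relation; columns and rows are made up for illustration.
CUSTOMER = [
    {"cust_id": 1, "name": "Ann", "region": "North", "balance": 120.0},
    {"cust_id": 2, "name": "Bob", "region": "South", "balance": 75.5},
    {"cust_id": 3, "name": "Cy", "region": "North", "balance": 0.0},
]


def horizontal_fragment(relation, predicate):
    """Keep whole rows that satisfy the predicate; all attributes are retained."""
    return [dict(row) for row in relation if predicate(row)]


def vertical_fragment(relation, key, attributes):
    """Keep every row, but only the key plus the chosen attributes."""
    columns = [key] + [a for a in attributes if a != key]
    return [{c: row[c] for c in columns} for row in relation]


# Horizontal: different rows per fragment, same attributes in each.
north = horizontal_fragment(CUSTOMER, lambda r: r["region"] == "North")
south = horizontal_fragment(CUSTOMER, lambda r: r["region"] == "South")

# Vertical: same rows per fragment, different attributes, key repeated in each.
names = vertical_fragment(CUSTOMER, "cust_id", ["name", "region"])
balances = vertical_fragment(CUSTOMER, "cust_id", ["balance"])

# Mixed: horizontal step first, then a vertical step on the result.
north_balances = vertical_fragment(north, "cust_id", ["balance"])
```

Repeating the key in every vertical fragment is what allows the original rows to be rebuilt by joining the fragments on `cust_id`.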
Data Replication
Data replication is the storage of data copies at multiple sites on the
network. Fragment copies can be stored at several sites, thus enhancing
data availability and response time. Replicated data is subject to a mutual
consistency rule. This rule requires that all copies of a data fragment be
identical, which ensures data consistency among all of the replicas.
Although data replication is beneficial in terms of availability and
response times, the maintenance of the replications can become complex. For
example, if data is replicated over multiple sites, the DDBMS must decide
which copy to access. For a query operation, the nearest copy is all that is
required to satisfy a transaction. However, if the operation is an update,
then all copies must be selected and updated to satisfy the mutual
consistency rule.
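The read and update behaviour described above (read the nearest copy, update every copy) can be sketched as follows. The class name, the site names and the "distance" figures are assumptions for this example, and the sketch deliberately ignores site failures, concurrent transactions and atomic commit, all of which a real DDBMS has to handle.

```python
class ReplicatedFragment:
    """One fragment copied to several sites (hypothetical in-memory model)."""

    def __init__(self, site_distances):
        # site_distances: site name -> assumed network cost from the requester
        self.site_distances = dict(site_distances)
        self.copies = {site: {} for site in self.site_distances}

    def read(self, key):
        """A query needs only one copy, so use the nearest site."""
        nearest = min(self.site_distances, key=self.site_distances.get)
        return nearest, self.copies[nearest].get(key)

    def write(self, key, value):
        """An update must reach every copy to keep the replicas identical."""
        for copy in self.copies.values():
            copy[key] = value

    def is_consistent(self):
        """Mutual consistency: all copies hold exactly the same data."""
        copies = list(self.copies.values())
        return all(copy == copies[0] for copy in copies)


frag = ReplicatedFragment({"london": 5, "paris": 20, "tokyo": 90})
frag.write("cust-1", "Ann")
site, value = frag.read("cust-1")
```

The cost asymmetry is visible in the sketch: `read` touches one site, while `write` touches all of them.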
A database can be either fully replicated, partially replicated or
unreplicated.
Full replication
    Stores multiple copies of each database fragment at multiple sites.
    Fully replicated databases can be impractical because of the amount of
    overhead imposed on the system.
Partial replication
    Stores multiple copies of some database fragments at multiple sites.
    Most DDBMSs can handle this type of replication very well.
No replication
    Stores each database fragment at a single site. No duplication occurs.
Data replication is particularly useful if usage frequency of remote data is
high and the database is fairly large. Another benefit of data replication
is the possibility of restoring lost data at a particular site.
Data Allocation
Data allocation is the process of deciding which data is stored at which
location. Data allocation can be centralised, partitioned or replicated.
Centralised
    The entire database is stored at one site. No distribution occurs.
Partitioned
    The database is divided into several fragments that are stored at
    several sites.
Replicated
    Copies of one or more database fragments are stored at several sites.
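One way to make the three allocation strategies concrete is to record, for each fragment, the set of sites that hold it and then classify the result. The sketch below does this; the `placement` mapping, the function name and the fragment and site names are assumptions made for the example.

```python
def classify_allocation(placement):
    """Classify a placement given as fragment name -> set of sites holding it."""
    if not placement:
        raise ValueError("placement must contain at least one fragment")
    all_sites = set().union(*placement.values())
    if len(all_sites) == 1:
        return "centralised"  # everything lives at a single site
    if any(len(sites) > 1 for sites in placement.values()):
        return "replicated"  # at least one fragment has several copies
    return "partitioned"  # several sites, one copy of each fragment


centralised = classify_allocation({"f1": {"site_a"}, "f2": {"site_a"}})
partitioned = classify_allocation({"f1": {"site_a"}, "f2": {"site_b"}})
replicated = classify_allocation({"f1": {"site_a", "site_b"}, "f2": {"site_b"}})
```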
Client/Server Architecture
Implementation of a distributed database system must be carefully managed
within a client/server architecture. Typically, the server provides the
resources for the client to use. The client receives the request from the
user and passes it to the server. The server receives, schedules and
executes the requests, selecting only what the client requires. The server
sends results only in response to a client request.
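The request flow described above can be sketched without any networking, with the client and server as two objects in one process. Everything here is an assumption made for illustration: the class names, the request format, the sample rows, and the first-in, first-out scheduling.

```python
from collections import deque


class Server:
    """Holds the data and executes queued requests (hypothetical model)."""

    def __init__(self, rows):
        self.rows = rows
        self.queue = deque()  # assumed FIFO scheduling

    def receive(self, request):
        self.queue.append(request)

    def run_next(self):
        """Execute the oldest request, returning only the rows and columns asked for."""
        request = self.queue.popleft()
        return [
            {col: row[col] for col in request["columns"]}
            for row in self.rows
            if request["where"](row)
        ]


class Client:
    """Takes the user's request and passes it to the server."""

    def __init__(self, server):
        self.server = server

    def ask(self, columns, where):
        self.server.receive({"columns": columns, "where": where})
        return self.server.run_next()


server = Server([
    {"name": "Ann", "region": "North"},
    {"name": "Bob", "region": "South"},
    {"name": "Cy", "region": "North"},
])
client = Client(server)
result = client.ask(["name"], lambda row: row["region"] == "North")
```

Only the selected rows and columns travel back to the client, which is the point of having the server do the selection.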
There are advantages to implementing a distributed database system using
client/server architecture:
Cost
    Client/server systems are less expensive than mainframes. There is also
    a considerable cost saving in off-loading applications development from
    the mainframe to PCs.
PC Functionality and Use
    Many users are more familiar and more skilled with PC technology than
    they are with mainframe technology. The use of PC technology is more
    widespread in the workplace.
Data Analysis and Query Tools
    These tools are readily available in the marketplace and can be used
    with many database systems.
There are some disadvantages, however: the client/server environment is more
complex and thus requires more management resources. Security is another
issue because of the number of users and sites.