Determine Suitability of Database Functionality
Database
Administration
Level IV
Learning Guide
Unit of Competence: Determine Suitability of
Database Functionality and Scalability
Module Title: Determine Suitability of Database
Functionality
and Scalability
LG Code : EIS DBA4 03 0812
TLM Code : EIS DBA4 03M0812
Date: September, 2018
TLM Development Manual
Compiled by: Derese Teshome/ICT Department Head
Introduction
This unit defines the competence required to identify current and future business
requirements for a database.
This course defines the following learning outcomes:
Determine database functionality
Prepare report
Identify scalability and functionality requirements
Managing information means taking care of it so that it works for us and is useful for
the tasks we perform. By using a DBMS, the information we collect and add to its
database is no longer subject to accidental disorganization. It becomes more accessible
and integrated with the rest of our work. Managing information using a database
allows us to become strategic users of the data we have.
We often need to access and re-sort data for various uses. These may include:
Sort
Match
Link
Aggregate
Skip fields
Calculate
Arrange
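A few of the operations listed above can be tried out with a small script. The following is a minimal sketch only, assuming Python's built-in sqlite3 module; the table names, column names, and sample rows are hypothetical and are not part of this guide.

```python
import sqlite3


def demo_operations():
    """Build a tiny in-memory database, then sort, link and aggregate its data."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data, used only for illustration.
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (2, 'Sara', 'Hawassa'), (1, 'Abebe', 'Adama');
        INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0);
    """)
    # Sort: customer names in alphabetical order.
    sorted_names = [row[0] for row in conn.execute(
        "SELECT name FROM customer ORDER BY name")]
    # Link (join) and aggregate: total order amount for each customer.
    totals = conn.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customer AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """).fetchall()
    conn.close()
    return sorted_names, totals


sorted_names, totals = demo_operations()
print(sorted_names)
print(totals)
```

The same stored rows serve both queries; only the question asked of the database changes.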
Because of the versatility of databases, we find them powering all sorts of projects. A
database can be linked to:
There are a number of characteristics that distinguish the database approach from the
file-based system or approach. This chapter describes the benefits (and features) of the
database system.
In the file-based system, the structure of the data files is defined in the application
programs so if a user wants to change the structure of a file, all the programs that
access that file might need to be changed as well.
On the other hand, in the database approach, the data structure is stored in the
system catalogue and not in the programs. Therefore, one change is all that is
needed to change the structure of a file. This insulation between the programs and
data is also called program-data independence.
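Program-data independence can be observed directly. The sketch below assumes SQLite through Python's sqlite3 module; the student table and its columns are hypothetical. The structure is read from the catalogue with PRAGMA table_info, then changed once, and a program that names only the columns it needs keeps working.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def column_names(connection, table):
    """Read the column names of a table from the database catalogue."""
    # The table name is trusted here; this is a demonstration, not production code.
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


before = column_names(conn, "student")

# One change to the stored structure; no application code is edited.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = column_names(conn, "student")

# An existing program that names only its own columns still runs unchanged.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Hana')")
names = conn.execute("SELECT name FROM student").fetchall()

print(before, after, names)
```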
A database supports multiple views of data. A view is a subset of the database, which
is defined and dedicated for particular users of the system. Multiple users in the
system might have different views of the system. Each view might contain only the data
of interest to a user or group of users.
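A view can be sketched as follows, assuming SQLite through Python's sqlite3 module. The employee table, the ict_directory view, and the sample rows are hypothetical. The view exposes only the columns and rows of interest to one group of users and hides the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL);
    INSERT INTO employee VALUES (1, 'Hana', 'ICT', 9000), (2, 'Dawit', 'Finance', 8000);

    -- The view is a named subset of the database: two columns, one department.
    CREATE VIEW ict_directory AS
        SELECT id, name FROM employee WHERE dept = 'ICT';
""")

cursor = conn.execute("SELECT * FROM ict_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns, view_rows)
```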
Current database systems are designed for multiple users. That is, they allow many
users to access the same database at the same time. This access is achieved through
features called concurrency control strategies. These strategies ensure that the data
accessed are always correct and that data integrity is maintained.
The design of modern multiuser database systems is a great improvement over those
of the past, which restricted usage to one person at a time.
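One simple concurrency control strategy is locking. The sketch below is an illustration under assumptions, not a description of any particular DBMS: it uses SQLite through Python's sqlite3 module, a hypothetical account table, and two connections standing in for two users. While the first user holds the write lock, the second user's attempt to start a write is refused; after the first user commits, the second user proceeds.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked():
    """Return (blocked, final_balance) for two users writing to one database."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "shared.db")
        # isolation_level=None means we issue BEGIN and COMMIT ourselves.
        first = sqlite3.connect(path, isolation_level=None)
        # timeout=0 makes the second user fail at once instead of waiting.
        second = sqlite3.connect(path, isolation_level=None, timeout=0)
        first.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
        first.execute("INSERT INTO account VALUES (1, 100.0)")

        first.execute("BEGIN IMMEDIATE")  # first user takes the write lock
        first.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
        try:
            second.execute("BEGIN IMMEDIATE")  # refused while the lock is held
            blocked = False
        except sqlite3.OperationalError:
            blocked = True
        first.execute("COMMIT")  # lock released

        second.execute("BEGIN IMMEDIATE")  # now succeeds
        second.execute("UPDATE account SET balance = balance + 5 WHERE id = 1")
        second.execute("COMMIT")
        balance = second.execute("SELECT balance FROM account").fetchone()[0]

        first.close()
        second.close()
        return blocked, balance


blocked, balance = second_writer_is_blocked()
print(blocked, balance)
```

Because the two updates are applied one after the other, neither is lost and the final balance reflects both.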
In the database approach, ideally, each data item is stored in only one place in the
database. In some cases, data redundancy still exists to improve system performance,
but such redundancy is controlled by application programming and kept to a minimum
by introducing as little redundancy as possible when designing the database.
6. Data sharing
The integration of all the data, for an organization, within a database system has many
advantages. First, it allows for data sharing among employees and others who have
access to the system. Second, it gives users the ability to generate more information
from a given amount of data than would be possible without the integration.
Database management systems must provide the ability to define and enforce certain
constraints to ensure that users enter valid information and maintain data
integrity. A database constraint is a restriction or rule that dictates what can be
entered or edited in a table such as a postal code using a certain format or adding a
valid city in the City field.
There are many types of database constraints. Data type, for example, determines the
sort of data permitted in a field, such as numbers only. Data uniqueness such as
the primary key ensures that no duplicates are entered. Constraints can be simple
(field based) or complex (programming).
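The constraint types described above can be sketched with SQLite through Python's sqlite3 module. The address table and the four-digit postal code format are assumptions made for illustration; a real system would use whatever format its postal authority defines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE address (
        id          INTEGER PRIMARY KEY,   -- uniqueness: no duplicate ids
        city        TEXT NOT NULL,         -- a city must be supplied
        postal_code TEXT NOT NULL
            CHECK (postal_code GLOB '[0-9][0-9][0-9][0-9]')  -- assumed 4-digit format
    )
""")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO address VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((1, "Adama", "1000")),    # valid row
    try_insert((1, "Hawassa", "1005")),  # duplicate primary key
    try_insert((2, "Hawassa", "10A5")),  # postal code breaks the format rule
]
row_count = conn.execute("SELECT COUNT(*) FROM address").fetchone()[0]
print(results, row_count)
```

The rules live in the table definition, so every program that writes to the table is held to them.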
Not all users of a database system will have the same accessing privileges. For
example, one user might have read-only access (i.e., the ability to read a file but not
make changes), while another might have read and write privileges, which is the ability
to both read and modify a file. For this reason, a database management system should
provide a security subsystem to create and control different types of user accounts and
restrict unauthorized access.
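Read-only access can be demonstrated in a small way. SQLite has no user accounts, so the sketch below only approximates the idea by opening a second connection in read-only mode through a URI; a server DBMS would normally grant or revoke privileges per user account instead. The salary table and its contents are hypothetical.

```python
import pathlib
import sqlite3
import tempfile


def read_only_demo():
    """Return the rows a read-only user can see and whether a write was allowed."""
    with tempfile.TemporaryDirectory() as folder:
        path = pathlib.Path(folder) / "payroll.db"

        # A user with read and write privileges creates and fills the table.
        writer = sqlite3.connect(str(path))
        writer.execute("CREATE TABLE salary (name TEXT, amount REAL)")
        writer.execute("INSERT INTO salary VALUES ('Hana', 9000)")
        writer.commit()

        # A user with read-only access: mode=ro forbids any modification.
        reader = sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)
        rows = reader.execute("SELECT name FROM salary").fetchall()
        try:
            reader.execute("UPDATE salary SET amount = 0")
            write_allowed = True
        except sqlite3.OperationalError:
            write_allowed = False

        reader.close()
        writer.close()
        return rows, write_allowed


rows, write_allowed = read_only_demo()
print(rows, write_allowed)
```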
9. Data independence
By its very nature, a DBMS permits many users to have access to its database either
individually or simultaneously. It is not important for users to be aware of how and
where the data they access is stored.
Backup and recovery are methods that allow you to protect your data from loss. The
database system provides a separate process, from that of a network backup, for
backing up and recovering data. If a hard drive fails and the database stored on the
hard drive is not accessible, the only way to recover the database is from a backup.
If a computer system fails in the middle of a complex update process, the recovery
subsystem is responsible for making sure that the database is restored to its original
state. These are two more benefits of a database management system.
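Both ideas can be sketched with SQLite through Python's sqlite3 module (Connection.backup needs Python 3.7 or later). The account table and the transfer function are hypothetical. The backup copies the whole database to a second connection, and the failed transfer shows an update that breaks halfway being rolled back to the original state.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))")
source.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
source.commit()

# Backup: copy every page of the database into a separate connection.
backup = sqlite3.connect(":memory:")
source.backup(backup)


def transfer(conn, from_id, to_id, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if both updates succeed, rolls back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, to_id))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, from_id))
        return True
    except sqlite3.IntegrityError:
        return False


# The second update would make the balance negative, so the whole transfer is undone.
ok = transfer(source, 1, 2, 500.0)
balances = source.execute("SELECT balance FROM account ORDER BY id").fetchall()
backup_balances = backup.execute("SELECT balance FROM account ORDER BY id").fetchall()
print(ok, balances, backup_balances)
```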
Key Terms
concurrency control strategies: features of a database that allow several users access
to the same data item at the same time
data type: determines the sort of data permitted in a field, for example numbers only
data uniqueness: ensures that no duplicates are entered
database constraint: a restriction that determines what is allowed to be entered or
edited in a table
metadata: defines and describes the data and relationships between tables in the
database
read and write privileges: the ability to both read and modify a file
read-only access: the ability to read a file but not make changes
self-describing: a database system is referred to as self-describing because it not only
contains the database itself, but also metadata which defines and describes the data
and relationships between tables in the database
view: a subset of the database
13. Exercises
INTRODUCTION
Scalability is a desirable attribute of a network, system, or process. The concept
connotes the ability of a system to accommodate an increasing number of elements
or objects, to process growing volumes of work gracefully, and/or to be susceptible
to enlargement. When procuring or designing a system, we often require that it be
scalable. The requirement may even be mentioned in a contract with a vendor.
When we say that a system is unscalable, we usually mean that
the additional cost of coping with a given increase in traffic or size is excessive, or
that it cannot cope at this increased level at all. Cost may be quantified in many
ways, including but not limited to response time, processing overhead, space,
memory, or even money. A system that does not scale well adds to labour costs or
harms the quality of service. It can delay or deprive the user of revenue opportunities.
Eventually, it must be replaced.
The scalability of a system subject to growing demand is crucial to its long-term
success. At the same time, the concept of scalability and our understanding of the
factors that improve or diminish it are vague and even subjective. Many
systems designers and performance analysts have an intuitive feel for scalability, but
the determining factors are not always clear. They may vary from one system to
another.
In this paper, we attempt to define attributes that make a system scalable. This is a
first step towards identifying those factors that typically impede scalability. Once that
has been done, the factors may be recognised early in the design phase of a project.
Then, bounds on the scalability of a proposed or existing system may be more easily
understood.
An unidentified author recently placed a definition of scalability at the URL
http://www.whatis.com/scalabil.htm. Two usages are cited: (1) the ability of a
computer application or product (hardware or software) to function well as it (or its
context) is changed in size or volume in order to meet a user need; and (2) the
ability not only to function well in the rescaled situation, but to actually take full
advantage of it, for example if it were moved from a smaller to a larger
operating system or from a uniprocessor to a multiprocessor environment.
Jogalekar and Woodside define a metric to evaluate the scalability of a
distributed system from one system to another, but do not attempt to classify
attributes of or impediments to scalability [13]. Without explicitly attempting
to define scalability, Hennessy cites an observation by Greg Papadopoulos of
SUN that the amount of online storage on servers is currently expanding
faster than Moore’s law [11]. Hennessy also suggests that scalability is an aspect of
research that should receive more emphasis in the future than performance. The
work in this paper and in [13] suggests that performance and scalability need not be
decoupled; indeed, they may be very closely intertwined.
The ability to scale up a system or system component may depend on the
types of data structures and algorithms used to implement it or the
mechanisms its components use to communicate with one another. The data
structures support particular functions of a system. The algorithms may be used to
search these structures, to schedule activities or access to resources, to
coordinate interactions between processes, or to update multiple copies of data at
different places. The data structures affect not only the amount of space required to
perform a particular function, but also the time. These observations give rise to
notions of space scalability and space-time scalability, which we shall formally
define later. In addition, the fixed size of some data structures, such as arrays or
address fields, may inherently limit the growth in the number of objects they track.
We call the absence of this type of limitation structural scalability.
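The effect of the choice of algorithm on time can be made concrete by counting steps. The sketch below is an illustration only and is not taken from the paper: it compares a linear search with a binary search over the same sorted data and reports how many elements each one inspects as the data grows.

```python
def linear_search_steps(items, target):
    """Count the elements inspected when scanning from the front."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count the elements inspected when halving the search range each time."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


def growth_table(sizes):
    """For each size n, search for the last element and record both step counts."""
    table = []
    for n in sizes:
        data = list(range(n))
        table.append(
            (n, linear_search_steps(data, n - 1), binary_search_steps(data, n - 1)))
    return table


for n, linear, binary in growth_table([10, 1000, 100000]):
    print(n, linear, binary)
```

The linear count grows in step with the data, while the binary count grows only with the number of times the data can be halved, which is why the second approach copes better with growth.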
The scalability of a system may be impaired by inherent wastefulness in frequently
repeated actions. It may also be impaired by the presence of access algorithms
that lead to deadlock or that result in suboptimal scheduling of resources. Such
systems may function well when the load is light, but suffer substantial performance
degradation as the load increases. We call systems that do not suffer from such
impairments load scalable. Classical examples of poor load scalability include
Ethernet bus contention and busy waiting on locks in multiprocessor systems. We
note in passing that systems with poor load scalability can be hard to model,
because they may migrate from a state of graceful function to overload, and perhaps
from there into deadlock.
The improvement of structural, space, and space-time scalability depends on the
judicious choice of algorithms and data structures, and synchronization
mechanisms. Since algorithmic analysis, programming techniques, and
synchronization are well documented elsewhere [1, 19, 10], we shall not dwell on
them further here. Nor shall we consider scalability aspects of parallel computation
or connectivity. In the present paper, our focus shall be on load scalability.
In the next section, we elaborate on scalability concepts we described above. We
then look at some examples. These examples will be used to illustrate the
notion of unproductive cycles as well as instances in which scalability is undermined
by one or more scheduling rules.
2. TYPES OF SCALABILITY
2.1 General Types of Scalability
We consider four types of scalability here: load scalability, space scalability, space-
time scalability, and structural scalability. A system or system component may have
more than one of these attributes. Moreover, two or more types of scalability may
mutually interact.
Load scalability. We say that a system has load scalability if it has the ability to
function gracefully, i.e., without undue delay and without unproductive resource
consumption or resource contention at light, moderate, or heavy loads while
making good use of available resources. Some of the factors that can
undermine load scalability include (1) the scheduling of a shared resource,
(2) the scheduling of a class of resources in a manner that increases its own
usage (self-expansion), and (3) inadequate exploitation of parallelism.
The Ethernet does not have load scalability, because the high collision rate at
heavy loads prevents bandwidth from being used effectively. The token ring with
nonexhaustive service does have load scalability, because every packet is served
within a bounded amount of time.
A scheduling rule may or may not have load scalability, depending on its
properties. For example, the Berkeley UNIX
4.2BSD operating system gives higher CPU priority to the first stage of processing
inbound packets than to either the second stage or to the first stage of processing
outbound packets. This in turn has higher priority than I/O, which in turn has
higher priority than user activity. This means that sustained intense inbound traffic
can starve the outbound traffic or prevent the processing of packets that have
already arrived. This scenario is quite likely at a web server [16]. This situation can
also lead to livelock, a form of blocking from which recovery is possible once the
intense packet traffic abates. Inbound packets cannot be processed and therefore are
unacknowledged. This eventually causes the TCP sliding window to shut, while
triggering retransmissions. Network goodput then drops to zero. Even if
acknowledgments could be generated for inbound packets, it would not be
possible to transmit them, because of the starvation of outbound transmission. It
is also worth noting that if I/O interrupts and interrupts triggered by
inbound packets are handled at the same level of CPU priority, heavy inbound
packet traffic will delay I/O handling as well. This delays information delivery from
web servers.
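The starvation effect of a strict priority rule can be seen in a toy simulation. This is a simplified sketch and not a model of 4.2BSD: it assumes a processor that can handle a fixed number of packets per time tick and always serves inbound packets before outbound ones. The function name, the parameters, and the traffic figures are all hypothetical.

```python
def simulate(inbound_per_tick, outbound_per_tick, ticks, capacity_per_tick=2):
    """Strict priority: inbound work is always served before outbound work."""
    inbound_queue = 0
    outbound_queue = 0
    outbound_sent = 0
    for _ in range(ticks):
        inbound_queue += inbound_per_tick
        outbound_queue += outbound_per_tick
        budget = capacity_per_tick

        # Inbound packets take as much of the processor as they need.
        served_in = min(inbound_queue, budget)
        inbound_queue -= served_in
        budget -= served_in

        # Outbound packets get only what is left over.
        served_out = min(outbound_queue, budget)
        outbound_queue -= served_out
        outbound_sent += served_out
    return outbound_sent, outbound_queue


light = simulate(inbound_per_tick=1, outbound_per_tick=1, ticks=100)
heavy = simulate(inbound_per_tick=3, outbound_per_tick=1, ticks=100)
print("light load:", light)
print("heavy inbound load:", heavy)
```

Under light load every outbound packet is sent. Once inbound traffic alone exceeds the processor's capacity, no outbound packet is sent at all and the outbound queue grows without limit.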
A system may also have poor load scalability because one of the resources it contains
has a performance measure that is self-expanding, i.e., its expectation is an
increasing function of itself. This may occur in queueing systems in which a common
FCFS work queue is used by processes wishing to acquire resources or wishing to
return them to a free pool. This is because the holding time of a resource is increased
by contention for a like resource, whose holding time is increased by the delay
incurred by the customer wishing to free it. Self-expansion diminishes scalability by
reducing the traffic volume at which saturation occurs. In some cases, it might be
detected when performance models of the system in question based on fixed-point
approximations predict that performance measures will increase without bound,
rather than converging. In some cases, the presence of self-expansion may make the
performance of the system unpredictable when the system is heavily loaded. Despite
this, the operating region in which self-expansion is likely to have the biggest impact
may be readily identifiable: it is likely to be close to the point at which the loading
of an active or passive resource begins to steeply increase delays.
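The behaviour of a fixed-point approximation under self-expansion can be illustrated with a toy model. The model below is an assumption made for this sketch and does not come from the paper: the holding time H of a resource is taken to be a base time plus a delay proportional to the arrival rate multiplied by H itself, so H appears on both sides of the equation. The function name and all parameter values are hypothetical.

```python
def iterate_holding_time(base_time, arrival_rate, coupling=1.0,
                         rounds=200, limit=1e6, tolerance=1e-9):
    """Iterate H = base_time + coupling * arrival_rate * H.

    Returns the converged holding time, or None if the estimate grows
    beyond `limit`, which is treated here as growth without bound.
    """
    holding = base_time
    for _ in range(rounds):
        updated = base_time + coupling * arrival_rate * holding
        if updated > limit:
            return None  # the estimates keep growing: no fixed point is reached
        if abs(updated - holding) < tolerance:
            return updated  # successive estimates agree: converged
        holding = updated
    return holding


moderate = iterate_holding_time(base_time=1.0, arrival_rate=0.5)
overloaded = iterate_holding_time(base_time=1.0, arrival_rate=1.2)
print(moderate, overloaded)
```

In this toy model the iteration settles when coupling times arrival rate is below 1 and runs away when it reaches 1 or more, which mirrors the way self-expansion lowers the traffic volume at which saturation occurs.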
Load scalability may be undermined by inadequate parallelism. A quantitative
method for describing parallelism is given in [15]. Parallelism may be regarded as
inadequate if system structure prevents the use of multiple processors for tasks that
could be executed asynchronously. For example, a transaction processing (TP)
monitor might handle multiple tasks that must all be executed within the
context of a single process. These tasks can only be executed on one processor in a
multiprocessor system, because the operating system only sees the registers for the
TP monitor, not for the individual tasks. Such a system is said to be single-threaded.
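The cost of being single-threaded can be estimated with simple arithmetic. The sketch below is an illustration under assumptions: it takes a hypothetical list of independent task durations and compares the total completion time on one processor with the time on several processors, using a simple greedy rule that gives each task to the processor that becomes free first.

```python
import heapq


def makespan(task_times, processors):
    """Completion time when each task goes to the processor that frees up first."""
    finish_times = [0] * processors
    heapq.heapify(finish_times)
    # Placing the longest tasks first gives a reasonable greedy schedule.
    for duration in sorted(task_times, reverse=True):
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)


tasks = [4, 3, 3, 2, 2, 2]  # hypothetical task durations in seconds

single_threaded = makespan(tasks, processors=1)  # all tasks share one processor
multi_threaded = makespan(tasks, processors=4)   # tasks spread over four processors
print(single_threaded, multi_threaded)
```

A single-threaded design is stuck with the first figure no matter how many processors the machine has.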
Steps
1. Be clear about the topic or company on which you are going to write your
report. Even before you start researching, decide which important topics are
to be included in your report so that you can prepare a solid, professional
report:
2. As you research, jot down your findings in one place. Make sure that you
have the technical information, such as power consumed, machine name,
maintenance, and capacity. However, do not lose sight of your main topic.
4. Think about the advantages and disadvantages of the process, and develop your
ideas around the topics to be included in the report.
6. Before you start writing the report, collect all previous reports about the topic
and other reference material to help you organize it well.
7. Once you have collected all the data, start sorting the material under the
prescribed headings as shown in the above picture.
8. Keep body text at font size 12 and headings at around 14-16. Check whether
there are specific requirements for spacing and font size.