
Date: September, 2018

TLM Development Manual


Compiled by: Derese Teshome/ICT Department Head

MISRAK POLY TECHNIC COLLEGE

Database Administration Level IV
Learning Guide

Unit of Competence: Determine Suitability of Database Functionality and Scalability
Module Title: Determine Suitability of Database Functionality and Scalability
LG Code : EIS DBA4 03 0812
TLM Code : EIS DBA4 03M0812

Introduction
This unit defines the competence required to identify current and future business
requirements for a database.
This course defines the following learning outcomes:

 Determine database functionality
 Identify scalability and functionality requirements
 Prepare report

LO1: Determine database functionality

Managing information means taking care of it so that it works for us and is useful for
the tasks we perform. By using a DBMS, the information we collect and add to its
database is no longer subject to accidental disorganization. It becomes more accessible
and integrated with the rest of our work. Managing information using a database
allows us to become strategic users of the data we have.

We often need to access and re-sort data for various uses. These may include:

 Creating mailing lists
 Writing management reports
 Generating lists of selected news stories
 Identifying various client needs

The processing power of a database allows it to manipulate the data it houses, so it can (see the example after this list):

 Sort
 Match
 Link
 Aggregate
 Skip fields
 Calculate
 Arrange
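
As a brief illustration, the following sketch uses Python's built-in sqlite3 module; the clients and orders tables and their columns are invented purely to show sorting, linking (joining), and aggregating in SQL.

    import sqlite3

    # An in-memory database with two hypothetical tables: clients and orders.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders  (id INTEGER PRIMARY KEY, client_id INTEGER, amount REAL);
        INSERT INTO clients VALUES (1, 'Abebe', 'Addis Ababa'), (2, 'Sara', 'Adama');
        INSERT INTO orders  VALUES (1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.5);
    """)

    # Sort: arrange clients alphabetically by name.
    for row in conn.execute("SELECT name, city FROM clients ORDER BY name"):
        print(row)

    # Link (join) the two tables and aggregate (calculate a total per client).
    query = """
        SELECT clients.name, SUM(orders.amount) AS total
        FROM clients JOIN orders ON orders.client_id = clients.id
        GROUP BY clients.name
    """
    for row in conn.execute(query):
        print(row)

    conn.close()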

Because of the versatility of databases, we find them powering all sorts of projects. A database can be linked to:

 A website that is capturing registered users
 A client-tracking application for social service organizations
 A medical record system for a health care facility
 Your personal address book in your email client
 A collection of word-processed documents
 A system that issues airline reservations

1. Characteristics and Benefits of a Database

There are a number of characteristics that distinguish the database approach from the file-based approach. This section describes the benefits (and features) of the database system.

1. Self-describing nature of a database system

A database system is referred to as self-describing because it not only contains the database itself, but also metadata which defines and describes the data and relationships between tables in the database. This information is used by the DBMS software or database users if needed. This separation of data and information about the data makes a database system totally different from the traditional file-based system, in which the data definition is part of the application programs.
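
As a small, hedged example (using SQLite; other DBMSs expose similar catalogues, such as INFORMATION_SCHEMA), the metadata describing a table can be read from the database itself rather than from application code:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

    # The sqlite_master catalogue holds the definition (metadata) of every object.
    for name, sql in conn.execute("SELECT name, sql FROM sqlite_master"):
        print(name, "->", sql)

    # PRAGMA table_info describes each column: its name, type, nullability, and so on.
    for column in conn.execute("PRAGMA table_info(staff)"):
        print(column)

    conn.close()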

2. Insulation between program and data

In the file-based system, the structure of the data files is defined in the application
programs so if a user wants to change the structure of a file, all the programs that
access that file might need to be changed as well.

On the other hand, in the database approach, the data structure is stored in the
system catalogue and not in the programs. Therefore, one change is all that is
needed to change the structure of a file. This insulation between the programs and
data is also called program-data independence.

3. Support for multiple views of data

A database supports multiple views of data. A view is a subset of the database that is defined and dedicated to particular users of the system. Different users of the system may have different views of it, and each view might contain only the data of interest to a user or group of users.
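
A small sketch of the idea, again with SQLite and made-up table and column names: the view below dedicates a subset of the employees table to one group of users, hiding the salary column and the rows that do not concern them.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT,
                                dept TEXT, salary REAL);
        INSERT INTO employees VALUES (1, 'Abebe', 'ICT', 9000),
                                     (2, 'Sara', 'Finance', 11000);

        -- A view dedicated to the ICT department: no salary column and
        -- only the rows of interest to that group of users.
        CREATE VIEW ict_staff AS
            SELECT id, name FROM employees WHERE dept = 'ICT';
    """)

    for row in conn.execute("SELECT * FROM ict_staff"):
        print(row)   # -> (1, 'Abebe')

    conn.close()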

4. Sharing of data and multiuser system

Current database systems are designed for multiple users. That is, they allow many
users to access the same database at the same time. This access is achieved through
features called concurrency control strategies. These strategies ensure that the data
accessed are always correct and that data integrity is maintained. 

The design of modern multiuser database systems is a great improvement over those of the past, which restricted usage to one person at a time.

5. Control of data redundancy

In the database approach, ideally, each data item is stored in only one place in the database. In some cases, data redundancy still exists to improve system performance, but such redundancy is controlled by application programming and kept to a minimum by introducing as little redundancy as possible when designing the database.

6. Data sharing

The integration of all of an organization's data within a database system has many
advantages. First, it allows for data sharing among employees and others who have
access to the system. Second, it gives users the ability to generate more information
from a given amount of data than would be possible without the integration.

7. Enforcement of integrity constraints

Database management systems must provide the ability to define and enforce certain
constraints to ensure that users enter valid information and maintain data
integrity. A database constraint is a restriction or rule that dictates what can be entered or edited in a table, such as requiring a postal code to follow a certain format or the City field to contain a valid city.

There are many types of database constraints. The data type, for example, determines the sort of data permitted in a field, such as numbers only. Data uniqueness, such as the primary key, ensures that no duplicates are entered. Constraints can be simple (field based) or complex (enforced through programming).
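
The sketch below (hypothetical column names; the constraint syntax is standard SQL supported by SQLite) shows a data type, a NOT NULL rule, a uniqueness rule, and a simple CHECK constraint rejecting an invalid row before it reaches the table.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE customers (
            id          INTEGER PRIMARY KEY,                      -- uniqueness
            name        TEXT NOT NULL,                            -- value required
            email       TEXT UNIQUE,                              -- no duplicates
            postal_code TEXT CHECK (length(postal_code) = 4)      -- simple format rule
        )
    """)

    conn.execute("INSERT INTO customers VALUES (1, 'Abebe', 'abebe@example.com', '1000')")

    try:
        # Violates the CHECK constraint on postal_code (only two digits).
        conn.execute("INSERT INTO customers VALUES (2, 'Sara', 'sara@example.com', '10')")
    except sqlite3.IntegrityError as err:
        print("Rejected:", err)

    conn.close()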

8. Restriction of unauthorized access

Not all users of a database system will have the same access privileges. For
example, one user might have read-only access (i.e., the ability to read a file but not
make changes), while another might have read and write privileges, which is the ability
to both read and modify a file. For this reason, a database management system should
provide a security subsystem to create and control different types of user accounts and
restrict unauthorized access.
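
SQLite has no user accounts, so the following is only a sketch of the standard SQL privilege statements that a multiuser, client/server DBMS (for example PostgreSQL or MySQL) would accept; the user, role, and table names are made up, and the exact syntax varies between products.

    # These statements are simply printed as an illustration; a server DBMS with a
    # security subsystem would execute them to create accounts and control privileges.
    statements = [
        "CREATE USER report_user;",                  # a new account, no privileges yet
        "GRANT SELECT ON clients TO report_user;",   # read-only access
        "GRANT SELECT, UPDATE ON orders TO clerk;",  # read and write access
        "REVOKE UPDATE ON orders FROM clerk;",       # withdraw a privilege
    ]
    for sql in statements:
        print(sql)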

9. Data independence

Another advantage of a database management system is that it allows for data independence. In other words, the system's data descriptions, or data describing data (metadata), are separated from the application programs. This is possible because changes to the data structure are handled by the database management system and are not embedded in the program itself.

10. Transaction processing

A database management system must include concurrency control subsystems. This feature ensures that data remains consistent and valid during transaction processing even if several users update the same information.
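
A minimal sketch with SQLite (the table and amounts are invented): the transaction groups the two updates of a balance transfer so that either both are applied or neither is, even while other users are reading or writing.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500.0), (2, 200.0)])

    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("UPDATE accounts SET balance = balance - 300 WHERE id = 1")
            conn.execute("UPDATE accounts SET balance = balance + 300 WHERE id = 2")
            # If anything fails before this point, neither update is kept.
    except sqlite3.Error as err:
        print("Transaction rolled back:", err)

    print(list(conn.execute("SELECT * FROM accounts")))
    conn.close()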

11. Provision for multiple views of data

By its very nature, a DBMS permits many users to have access to its database either individually or simultaneously. Users do not need to be aware of how or where the data they access is stored.

12. Backup and recovery facilities

Backup and recovery are methods that allow you to protect your data from loss.  The
database system provides a separate process, from that of a network backup, for
backing up and recovering data. If a hard drive fails and the database stored on the
hard drive is not accessible, the only way to recover the database is from a backup.
If a computer system fails in the middle of a complex update process, the recovery
subsystem is responsible for making sure that the database is restored to its original
state. These are two more benefits of a database management system.
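
A minimal sketch of both facilities using the sqlite3 module's built-in backup API (available in Python 3.7 and later); the file names are arbitrary.

    import sqlite3

    source = sqlite3.connect("college.db")       # the live database
    source.execute("CREATE TABLE IF NOT EXISTS students (id INTEGER PRIMARY KEY, name TEXT)")
    source.commit()

    # Backup: copy the live database to a separate file while it remains open.
    target = sqlite3.connect("college_backup.db")
    source.backup(target)
    target.close()
    source.close()

    # Recovery after a failure: reopen the backup copy and continue from it.
    restored = sqlite3.connect("college_backup.db")
    print(restored.execute("SELECT count(*) FROM students").fetchone())
    restored.close()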

Key Terms

concurrency control strategies: features of a database that allow several users access
to the same data item at the same time

data type: determines the sort of data permitted in a field, for example numbers only
data uniqueness: ensures that no duplicates are entered
database constraint: a restriction that determines what is allowed to be entered or
edited in a table
metadata: defines and describes the data and relationships between tables in the
database
read and write privileges: the ability to both read and modify a file
read-only access: the ability to read a file but not make changes
self-describing: a database system is referred to as self-describing because it not only
contains the database itself, but also metadata which defines and describes the data
and relationships between tables in the database
view: a subset of the database

13. Exercises

1. How is a DBMS distinguished from a file-based system?
2. What is data independence and why is it important?
3. What is the purpose of managing information?
4. Discuss the uses of databases in a business environment.
5. What is metadata?

LO2: Identify scalability and functionality requirements

INTRODUCTION
Scalability is a desirable attribute of a network, system, or process. The concept
connotes the ability of a system to accommodate an increasing number of elements
or objects, to process growing volumes of work gracefully, and/or to be susceptible
to enlargement. When procuring or designing a system, we often require that it be
scalable. The requirement may even be mentioned in a contract with a vendor.
When we say that a system is unscalable, we usually mean that
the additional cost of coping with a given increase in traffic or size is excessive, or
that it cannot cope at this increased level at all. Cost may be quantified in many
ways, including but not limited to response time, processing overhead, space,
memory, or even money. A system that does not scale well adds to labour costs or
harms the quality of service. It can delay or deprive the user of revenue opportunities.
Eventually, it must be replaced.
The scalability of a system subject to growing demand is crucial to its long-term
success. At the same time, the concept of scalability and our understanding of the
factors that improve or diminish it are vague and even subjective. Many
systems designers and performance analysts have an intuitive feel for scalability, but
the determining factors are not always clear. They may vary from one system to
another.
In this section, we attempt to define attributes that make a system scalable. This is a
first step towards identifying those factors that typically impede scalability. Once that
has been done, the factors may be recognised early in the design phase of a project.
Then, bounds on the scalability of a proposed or existing system may be more easily
understood.
An unidentified author placed a definition of scalability at the URL http://www.whatis.com/scalabil.htm. Two usages are cited: (1) the ability of a computer application or product (hardware or software) to function well as it (or its context) is changed in size or volume in order to meet a user need; and (2) the ability not only to function well in the rescaled situation, but to actually take full advantage of it, for example if it were moved from a smaller to a larger operating system or from a uniprocessor to a multiprocessor environment.
Jogalekar and Woodside define a metric to evaluate the scalability of a
distributed system from one system to another, but do not attempt to classify
attributes of or impediments to scalability [13]. Without explicitly attempting
to define scalability, Hennessy cites an observation by Greg Papadopoulos of
SUN that the amount of online storage on servers is currently expanding
faster than Moore’s law [11]. Hennessy also suggests that scalability is an aspect of
research that should receive more emphasis in the future than performance. The
work described here and in [13] suggests that performance and scalability need not be
decoupled; indeed, they may be very closely intertwined.
The ability to scale up a system or system component may depend on the
types of data structures and algorithms used to implement it or the
mechanisms its components use to communicate with one another. The data
structures support particular functions of a system. The algorithms may be used to
search these structures, to schedule activities or access to resources, to
coordinate interactions between processes, or to update multiple copies of data at
different places. The data structures affect not only the amount of space required to
perform a particular function, but also the time. These observations give rise to
notions of space scalability and space-time scalability, which we shall formally
define later. In addition, the fixed size of some data structures, such as arrays or
address fields, may inherently limit the growth in the number of objects they track.
We call the absence of this type of limitation structural scalability.
The scalability of a system may be impaired by inherent wastefulness in frequently
repeated actions. It may also be impaired by the presence of access algorithms
that lead to deadlock or that result in suboptimal scheduling of resources. Such
systems may function well when the load is light, but suffer substantial performance
degradation as the load increases. We call systems that do not suffer from such
impairments load scalable. Classical examples of poor load scalability include
Ethernet bus contention and busy waiting on locks in multiprocessor systems. We
note in passing that systems with poor load scalability can be hard to model,
because they may migrate from a state of graceful function to overload, and perhaps
from there into deadlock.
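
As a hedged illustration of unproductive cycles, the first worker below busy-waits on a lock-like flag, burning CPU while it polls, whereas the second blocks on an event and consumes nothing until it is woken; under heavy load, the polling style is what undermines load scalability.

    import threading
    import time

    flag_ready = False               # polled by the busy-waiting worker
    event_ready = threading.Event()  # waited on by the blocking worker

    def busy_wait_worker():
        # Unproductive cycles: the loop consumes CPU although no work can be done yet.
        while not flag_ready:
            pass
        print("busy-wait worker proceeding")

    def blocking_worker():
        # Productive waiting: the thread sleeps until it is explicitly woken.
        event_ready.wait()
        print("blocking worker proceeding")

    threading.Thread(target=busy_wait_worker).start()
    threading.Thread(target=blocking_worker).start()

    time.sleep(0.1)                  # simulate the resource becoming available
    flag_ready = True
    event_ready.set()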
The improvement of structural, space, and space-time scalability depends on the
judicious choice of algorithms and data structures, and synchronization
mechanisms. Since algorithmic analysis, programming techniques, and
synchronization are well documented elsewhere [1, 19, 10], we shall not dwell on
them further here. Nor shall we consider scalability aspects of parallel computation
or connectivity. Here, our focus shall be on load scalability.
In the next section, we elaborate on scalability concepts we described above. We
then look at some examples. These examples will be used to illustrate the
notion of unproductive cycles as well as instances in which scalability is undermined
by one or more scheduling rules.
2. TYPES OF SCALABILITY
2.1 General Types of Scalability
We consider four types of scalability here: load scalability, space scalability, space-
time scalability, and structural scalability. A system or system component may have
more than one of these attributes. Moreover, two or more types of scalability may
mutually interact.
Load scalability. We say that a system has load scalability if it has the ability to
function gracefully, i.e., without undue delay and without unproductive resource
consumption or resource contention at light, moderate, or heavy loads while
making good use of available resources. Some of the factors that can
undermine load scalability include (1) the scheduling of a shared resource,
(2) the scheduling of a class of resources in a manner that increases its own
usage (self-expansion), and (3) inadequate exploitation of parallelism.

The Ethernet does not have load scalability, because the high collision rate at
heavy loads prevents bandwidth from being used effectively. The token ring with
nonexhaustive service does have load scalability, because every packet is served
within a bounded amount of time.
A scheduling rule may or may not have load scalability, depending on its
properties. For example, the Berkeley UNIX
4.2BSD operating system gives higher CPU priority to the first stage of processing
inbound packets than to either the second stage or to the first stage of processing
outbound packets. This in turn has higher priority than I/O, which in turn has
higher priority than user activity. This means that sustained intense inbound traffic
can starve the outbound traffic or prevent the processing of packets that have
already arrived. This scenario is quite likely at a web server [16]. This situation can
also lead to livelock, a form of blocking from which recovery is possible once the
intense packet traffic abates. Inbound packets cannot be processed and therefore are
unacknowledged. This eventually causes the TCP sliding window to shut, while
triggering retransmissions. Network goodput then drops to zero. Even if
acknowledgments could be generated for inbound packets, it would not be
possible to transmit them, because of the starvation of outbound transmission. It
is also worth noting that if I/O interrupts and interrupts triggered by
inbound packets are handled at the same level of CPU priority, heavy inbound
packet traffic will delay I/O handling as well. This delays information delivery from
web servers.
A system may also have poor load scalability because one of the resources it contains
has a performance measure that is self-expanding, i.e., its expectation is an
increasing function of itself. This may occur in queueing systems in which a common
FCFS work queue is used by processes wishing to acquire resources or wishing to
return them to a free pool. This is because the holding time of a resource is increased
by contention for a like resource, whose holding time is increased by the delay
incurred by the customer wishing to free it. Self-expansion diminishes scalability by
reducing the traffic volume at which saturation occurs. In some cases, it might be
detected when performance models of the system in question based on fixed-point
approximations predict that performance measures will increase without bound,
rather than converging. In some cases, the presence self-expansion may make the
performance of the system unpredictable when the system is heavily loaded. Despite
this, the operating region in which self-expansion is likely to have the biggest impact
may be readily identifiable: it is likely to be close to the point at which the loading
of an active or passive resource begins to steeply increase delays.
Load scalability may be undermined by inadequate parallelism. A quantitative
method for describing parallelism is given in [15]. Parallelism may be regarded as
inadequate if system structure prevents the use of multiple processors for tasks that
could be executed asynchronously. For example, a transaction processing (TP)
monitor might handle multiple tasks that must all be executed within the
context of a single process. These tasks can only be executed on one processor in a
multiprocessor system, because the operating system only sees the registers for the
TP monitor, not for the individual tasks. Such a system is said to be single-threaded.

Space scalability. A system or application is regarded as having space scalability if its memory requirements do not grow to intolerable levels as the
number of items it supports increases. Of course, intolerable is a relative
term. We might say that a particular application or data structure is space
scalable if its memory requirements increase at most sublinearly with the number
of items in question. Various programming techniques might be used to
achieve space scalability, such as sparse matrix methods or compression. Because
compression takes time, it is possible that space scalability may only be
achieved at the expense of load scalability.
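
A rough sketch of the space idea in Python: a dense list reserves a cell for every possible entry, while a dictionary keyed by position stores only the entries that actually exist, so its memory grows with the data present rather than with the nominal size. (sys.getsizeof measures only the container itself, which is enough for the comparison.)

    import sys

    N = 1_000_000                                      # nominal number of cells
    nonzero = {12: 3.5, 40_000: 1.25, 999_999: 7.0}    # only three real values

    dense = [0.0] * N                  # space grows linearly with N
    for index, value in nonzero.items():
        dense[index] = value

    sparse = dict(nonzero)             # space grows with the number of stored items

    print("dense container bytes :", sys.getsizeof(dense))
    print("sparse container bytes:", sys.getsizeof(sparse))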
Space-time scalability. We regard a system as having space-time scalability if
it continues to function gracefully as the number of objects it encompasses
increases by orders of magnitude. A system may be space-time scalable if the
data structures and algorithms used to implement it are conducive to smooth
and speedy operation whether the system is of moderate size or large. For
example, a search engine that is based on a linear search would not be space-
time scalable, while one based on an indexed or sorted data structure such as a
hash table or balanced tree could be. Notice that this may be a driver of load
scalability for the following reasons:
1. The presence of a large number of objects may lead to the presence of a
heavier load.
2. The ability to perform a quick search may be affected by the size of a
data structure and how it is organised.
3. A system or application that occupies a large amount of memory may
incur considerable paging overhead.
Space scalability is a necessary condition for space-time scalability in most
systems, because excessive storage requirements could lead to memory
management problems and/or increased search times.
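
A rough sketch of why the data structure matters as the number of objects grows by orders of magnitude: a linear scan of an unindexed list slows down with its size, while a hash-table (dictionary) lookup stays roughly constant.

    import time

    N = 1_000_000
    records = [f"key{i}" for i in range(N)]                           # unindexed list
    index = {key: position for position, key in enumerate(records)}  # hash index

    target = f"key{N - 1}"             # worst case for the linear scan

    start = time.perf_counter()
    records.index(target)              # linear search
    linear_time = time.perf_counter() - start

    start = time.perf_counter()
    index[target]                      # indexed (hashed) lookup
    indexed_time = time.perf_counter() - start

    print(f"linear search : {linear_time:.6f} s")
    print(f"indexed lookup: {indexed_time:.6f} s")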
Structural scalability. We think of a system as being structurally scalable if its
implementation or standards do not impede the growth of the number of objects
it encompasses, or at least will not do so within a chosen time frame. This is a
relative term, because scalability depends on the number of objects of interest
now relative to the number of objects later. Any system with a finite address
space has limits on its scalability. The limits are inherent in the addressing
scheme. For instance, a packet header field typically contains a fixed number of
bits. If the field is an address field, the number of addressable nodes is limited. If
the field is a window size, the amount of unacknowledged data is limited. A
telephone numbering scheme with a fixed number of digits, such as the North
American Numbering Plan, is scalable only to the extent that the maximum
quantity of distinct numbers is significantly greater than the set of numbers to be
assigned before the number of digits is expanded.
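
The arithmetic behind such limits is simple. As a sketch, a field of n bits can distinguish at most 2**n objects, and a numbering plan of d decimal digits at most 10**d numbers, no matter how well the rest of the system scales.

    def addressable(bits: int) -> int:
        """Maximum number of distinct objects an n-bit field can identify."""
        return 2 ** bits

    def numbering_plan(digits: int) -> int:
        """Nominal upper bound on distinct numbers with a fixed number of digits."""
        return 10 ** digits

    print(addressable(8))      # 256, e.g. an 8-bit address field
    print(addressable(32))     # 4294967296, e.g. an IPv4-sized address space
    print(numbering_plan(10))  # 10-digit plans such as the North American Numbering Plan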
Load scalability may be improved by modifying scheduling rules, avoiding self-
expansion, or exploiting parallelism. By contrast, the other forms of scalability we have described are inherent in architectural characteristics (such as word length or the choice of data structures) or standards (such as the number of bits in certain fields) that may be difficult or impossible to change.
2.2 Scalability over Long Distances
Distance Scalability. An algorithm or protocol is distance scalable if it works well
over long distances as well as short distances.
Speed/Distance Scalability. An algorithm or protocol is speed/distance scalable if
it works well over long distances as well as short distances at high and low
speeds.
The motivation for these types of scalability is TCP/IP. Its sliding window protocol
shows poor speed/distance scalability in its original form. Protocols in which bit
status maps are periodically sent from the destination to the source, such as SSCOP
[8] are intended to overcome this shortcoming. This is the subject of future work
and will not be considered further here.
3. INDEPENDENCE AND OVERLAP BETWEEN SCALABILITY TYPES
When exploring a taxonomy of defining characteristics, it is
natural to ask whether they are independent of one another or whether they
overlap. The examples presented here show that there are cases where load
scalability is not undermined by poor space scalability or structural scalability.
Systems with poor space scalability or space-time scalability might have poor load
scalability because of the attendant memory management overhead or
search costs. Systems with good space-time scalability because their data
structures are well engineered might have poor load scalability because of poor
decisions about scheduling or parallelism that have nothing to do with memory
management.
Let us now consider the relationship between structural scalability and load
scalability. Clearly, the latter is not a driver of the former, though the reverse could
be true. For example, the inability to exploit parallelism and make use of such
resources as multiple processors undermines load scalability, but could be
attributed to a choice of implementation that is structurally unscalable.
The foregoing discussion shows that the types of scalability presented here are not
entirely independent of one another, although many aspects of each type are.
Therefore, though they provide a broad basis for a discussion of scalability, that basis
is not orthogonal in the sense that a suitable set of base vectors could be. Nor is it
clear that an attempt at orthogonalization, i.e., an attempt to provide a
characterisation of scalability consisting only of independent components, would be
useful to the software practitioner, because the areas of overlap between our aspects
of scalability are a reflection of the sorts of design choices a practitioner might face.

LO3: Prepare report

Steps
1. Be clear about the topic or company you are going to write your report on. Before you start researching, identify the important topics that must be included so that you can prepare a solid, professional report.

2. As you start researching, jot down your findings in one place. Be sure that you have the technical information, such as power consumed, machine name, maintenance, and capacity. However, don't lose sight of your main topic.

3. Use organizational trees, if any, to illustrate your findings.

4. Think about the advantages and disadvantages of the process and develop your ideas around the topics to be included in the report.

5. Don't use casual or flowery writing, as this is a professional report. Use precise technical wording rather than informal prose.

6. Before you start writing the report, collect all previous reports on the topic and other reference material to support solid organization.

7. Now that you have collected all the data, start categorizing the material under the prescribed headings.

8. Keep body text at font size 12 and headings at about 14-16, unless there are specific requirements for spacing and font size.

9. Include your own illustrative diagrams in the report.


10. When you finish typing the report, print a hard copy of the soft copy.
