Data and File Management
Information:
- consists of facts and items of knowledge: anything that is meaningful to people.
Information can be expressed in words, numbers, pictures, sound or measurements.
Types of Data
Analogue Data
Analogue data is data represented by a quantity that varies continuously. The value of the
data item at a given time is represented by the size of the quantity, measured on a fixed scale.
- Some watches have an analogue display, where the hands move continuously
round a dial. The time is represented by the position of the hands on the dial.
- Conversations travel on old telephone circuits as an analogue signal. The size
of the signal depends on the loudness of the speech. The words spoken show up
as changes in the frequency of the signal.
Digital Data
Data is digital if some quantity in it can be set to a number of different separate values or
states. The combinations of these values represent data. Digital devices are usually binary
and the data is represented as a succession of 0s and 1s.
- An electronic calculator. The display is digital with the numbers in decimal. Each
digit can have any of the ten separate states (that is the numbers 0, 1, 2, 3, 4, 5, 6,
7, 8, 9). The circuits inside are digital with binary digits represented by 0 volts
and 5 volts.
- Some watches have a digital display – digits shown on a little display screen
represent the time.
- Telephone conversations can be digital.
Analogue
- The quantity used to represent data gets bigger or smaller depending on the
size of the data itself.
- Any value can be represented because the quantity can take any value in
the range used.
Digital
- With a digital device, the quantity used can only take a few different values –
usually only two.
- The data is held as a code.
Data Conversion
Converting information to data (Encoding and Decoding)
Encode
To encode data means to convert data into a form ready for processing.
Information about an item is encoded into bar codes, which are then printed on item
labels. This data can be input into a computer via a laser scanner on a POS terminal at the
till.
Decode
On a school data file, the names of the teachers are stored as codes. For this, two (2)
letters of the surname are used. Thus, Mr. Gaongalelwe can be stored as GN, Miss
Mmepe as MP and Mr. Williams as WI. The computer has a reference file of
these codes. To print out a name, the computer uses the reference file to decode
the two letters. It can then print out the full name.
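The decoding step can be sketched in Python. This is only an illustration: the dictionary stands in for the reference file, and the names REFERENCE_FILE and decode are assumptions made for the example.

```python
# Hypothetical reference file: maps each two-letter code to a full name.
REFERENCE_FILE = {
    "GN": "Mr. Gaongalelwe",
    "MP": "Miss Mmepe",
    "WI": "Mr. Williams",
}


def decode(code):
    """Look up a two-letter teacher code and return the full name."""
    return REFERENCE_FILE[code]


print(decode("MP"))
```

Storing the short code in every record and the full name only once in the reference file saves space and typing.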
Computers work with data, which, to them, is no more than a string of symbols. These
symbols are letters, numbers and other characters, and can also be pictures and graphs.
Inside the computer, all this data is represented in a form the computer will understand
(in the computer’s own codes). These codes are based on two symbols only – the digits 0
and 1. The digits (0 and 1) are also used in the base two (binary) number system. They
are called bits, short for binary digits.
Computers use a common code called American Standard Code for Information
Interchange (ASCII). Standard ASCII is a 7-bit character code. A convenient grouping
of bits inside a computer is in sets of 8 bits. A set of eight bits is called a byte. A byte can
store an ASCII character with one bit left over.
ASCII characters and their binary equivalent are shown below.
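As a quick way to look up such codes, the short Python sketch below prints a few characters with their ASCII code numbers and the binary pattern padded to the eight bits of one byte. The function name to_binary is an assumption made for this example.

```python
def to_binary(character):
    """Return the ASCII code of a character as an 8-bit binary string."""
    return format(ord(character), "08b")


# Print each character, its decimal ASCII code and its binary pattern.
for character in "AB ab 01":
    print(repr(character), ord(character), to_binary(character))
```

Notice that the leading bit is 0 for every ASCII character, because only seven bits are needed.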
Exercise
2. One bit can have two possible values: 0 or 1. Two bits can have four values: 0 0, 0 1, 1
0, 1 1. How many possible values can the following have:
a) three bits,
b) four bits,
c) one byte?
Answer:
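Each extra bit doubles the number of combinations, so n bits give 2 to the power n values: 8 for three bits, 16 for four bits and 256 for one byte (eight bits). The sketch below checks this in Python; the function name possible_values is an assumption made for the example.

```python
def possible_values(bits):
    """Number of different patterns that the given number of bits can hold."""
    return 2 ** bits


for bits in (3, 4, 8):
    print(bits, "bits:", possible_values(bits), "values")
```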
Data Types
Data type is the term used to describe the kind of data used, e.g. whether it is a number or a letter.
Character: One of the symbols used to make up data, e.g. a letter (A…Z), a punctuation
mark or any digit of a number (0…9). This covers all the characters that can be typed on a keyboard.
Integers: complete numbers (whole numbers) either positive or negative.
Character set: this is a set of letters, digits and other symbols used for representing data.
These include numeric characters (digits), alphabetical characters (letters) and even
special characters (punctuation marks, mathematical symbols, etc).
Data Capture/Collection
Examples are:
Using a document reader,
Scanning pictures and text from documents,
Using sensors for data logging,
Scanning coded data such as bar codes and magnetic stripes.
Advantages
Disadvantages
Many automated data entry/capture systems are expensive to set up. Therefore a
small shop may decide not to use the system.
e.g.
A membership subscription form,
A questionnaire,
A turn around document.
Advantages
Data is standardised – all records are set out in the same way.
People collecting the data know exactly what data is required.
Disadvantages
It can be slow to enter data.
Transcription (data entry) errors can occur.
Handwriting recognition can be unreliable.
Turnaround Documents
A form which is produced by a computer, has more data added to it and is then input to
the computer again for processing.
E.g.
At Omang offices, ID renewal forms can have the ID number already
printed on them.
In a club membership form, the computer will print the person’s membership
number on the renewal form.
Advantages
Data, which is already known to the computer, does not have to be written or
keyed again.
The computer can recognise each individual document, using information it has
already printed on it.
Data Verification
It is the checking of data which has been copied from one place to another, to see
that it still represents the original data.
Example:
In a computer bureau, data is being encoded onto a disc. A keyboard operator
reads the data from a source document and keys it at a key station, the data being
recorded on disc.
A second operator then verifies this data by re-keying it all. The computer
controlling the key station checks the data stored against the data now being
typed and reports any discrepancies/differences, so that any errors can be
corrected.
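The comparison carried out during verification can be sketched in Python as follows. This is a simplified illustration that assumes both operators keyed the same number of records; the function name and the sample data are invented for the example.

```python
def find_discrepancies(first_keying, second_keying):
    """Compare two keyings of the same records; return the positions that differ."""
    return [
        index
        for index, (first, second) in enumerate(zip(first_keying, second_keying))
        if first != second
    ]


# Invented sample data: the second operator mistyped the middle record.
first_keying = ["Mmepe", "Williams", "Gaongalelwe"]
second_keying = ["Mmepe", "Wiliams", "Gaongalelwe"]
print(find_discrepancies(first_keying, second_keying))
```

Any position reported is checked against the source document to decide which keying was wrong.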
Data Validation
It is the checking of data at the time of input. The software carries out the checking. The
check is to ensure that the data is reasonable.
Note: Validation does not check that data is correct, only that it is reasonable. Verification
checks that data has been copied accurately. Validation catches many data entry errors, but not all.
Range Check
The software can be set to check that data falls within certain limits.
Examples
On a job application form, the date of birth can be validated to ensure that the age
of the applicant is greater than 17 and less than 61.
The readings taken from someone’s water meter can be validated to make sure
that they are within reasonable limits. This could prevent the customers getting
huge bills because of the operator’s error.
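A range check on the applicant's age can be sketched in Python as follows. The function names are assumptions made for this example, and the limits (older than 17, younger than 61) are the ones from the job application example.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Age in whole years on the given day."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def passes_range_check(date_of_birth, today):
    """Accept only applicants who are older than 17 and younger than 61."""
    return 17 < age_on(date_of_birth, today) < 61


print(passes_range_check(date(2000, 1, 1), date(2024, 6, 1)))
```

A date of birth that fails the check is rejected at the time of input, so the operator can correct it straight away.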
Length Check
A field may have been set up to hold only a certain number of characters. The software
can prevent the operator from entering too many.
Presence Check
Some fields must not be left blank. For example, an application to sign up to an Internet
chat room may require a user name and an e-mail address.
Type Check
This makes sure that the data type is as expected. If someone accidentally enters a
number in someone’s name, such as Nic9las Cage, the software can easily pick this up.
Also, if someone enters a letter ‘o’ where a digit is required, this can also be noticed by
a type check.
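The length, presence and type checks can each be written as a small test on the text that was entered. The Python sketch below is a simplified illustration; all four function names are assumptions made for the example, and real software would apply whichever rules suit each field.

```python
def length_check(value, max_length):
    """True if the entry fits within the number of characters allowed."""
    return len(value) <= max_length


def presence_check(value):
    """True if the field has not been left blank."""
    return value.strip() != ""


def name_type_check(value):
    """True if a name contains no digits."""
    return not any(character.isdigit() for character in value)


def number_type_check(value):
    """True if every character entered is a digit."""
    return value.isdigit()


print(name_type_check("Nic9las Cage"))   # a digit inside a name
print(number_type_check("2o19"))         # a letter 'o' where a digit is required
```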
Check digit
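A check digit is an extra digit added on to a reference number. It is calculated from the
other digits, so that the software can detect a number that has been mistyped or misread.
e.g.
Bank account numbers and the ISBN of a book contain check digits.
The number printed on an ATM card may contain a check digit.
Bar codes for items sold in a supermarket contain check digits.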
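As one concrete case, the 10-digit ISBN uses a weighted sum: the digits are multiplied by 10, 9, 8 and so on down to 1, and the total must divide exactly by 11. The Python sketch below checks this; the function name isbn10_is_valid is an assumption made for the example, and other numbering schemes use different rules.

```python
def isbn10_is_valid(isbn):
    """Check a 10-digit ISBN using its weighted sum modulo 11."""
    characters = isbn.replace("-", "")
    if len(characters) != 10:
        return False
    total = 0
    for position, character in enumerate(characters):
        if character == "X" and position == 9:
            value = 10  # 'X' stands for ten and is allowed only as the check digit
        elif character in "0123456789":
            value = int(character)
        else:
            return False
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


print(isbn10_is_valid("0-306-40615-2"))
print(isbn10_is_valid("0-306-46015-2"))  # two digits swapped
```

Because the weights differ from position to position, swapping two neighbouring digits changes the total, so this common keying error is caught.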
Presenting Information
Presenting information can be done using a word processor or desktop publishing.
Word Processing
Word processing means producing text such as letters and reports using IT. A piece of
text produced by a word processor is called a document.
To enter data into a word processor the user simply types it. The user can end a
paragraph by pressing the ‘ENTER’ key. At the end of a line the word processor goes
to the start of the next line automatically.
Word Wrapping is the process of moving the cursor on to the new line automatically
when the next word will not fit on the present one.
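Word wrapping can be tried out with Python's standard textwrap module. This is only an illustration of the idea; the line width of 30 characters is chosen arbitrarily for the example.

```python
import textwrap

text = "Word wrapping moves a word to the next line when it will not fit on the present one."

# Break the text into lines of at most 30 characters without splitting words.
lines = textwrap.wrap(text, width=30)
for line in lines:
    print(line)
```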
Editing text
When one wants to change the text, he or she can do it in two ways:
1. Overtype: as you type, each new character replaces the character under the cursor.
Overtype mode can be switched on or off by pressing the ‘insert’ key.
2. Insert: the letters you type are inserted and all the rest of the text moves along to
make room for them.
Appearance and Style
A word processor allows you to change the way the text is displayed and printed. Common
features are:
1. underlining text,
2. making bold,
3. centering,
4. italics,
5. double line spacing
Fonts
Examples
Spelling
A Spell Checker is a program which checks the spelling of words against those in a
dictionary.
Notes:
1. The dictionary is a file of words stored with the spell checker to which you can
add words that are not in the main dictionary.
2. Usually if the word is not found in the dictionary, you are given a choice. You can
ignore or skip the word.
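The basic idea of a spell checker can be sketched in Python. This is a much simplified illustration: the tiny dictionary and the function name unknown_words are invented for the example, and a real spell checker also suggests corrections.

```python
import re


def unknown_words(text, dictionary):
    """Return the words in the text that are not found in the dictionary."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in dictionary]


# Invented miniature dictionary for the example.
dictionary = {"the", "cat", "sat", "on", "mat"}
print(unknown_words("The cat sat on the matt", dictionary))
```

Adding a word to the dictionary set has the same effect as adding it to the spell checker's dictionary file: the word is no longer reported.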
Tabs
The TAB key is usually at the left-hand side of the keyboard and is marked with the word
‘TAB’ or with two arrows pointing in opposite directions. When the TAB key is pressed the
cursor jumps across the page several positions at a time, usually about 5 character spaces.
A word processor can change the way the text fits on to a page.
The margins are the limits which have been set for text near the edges of the page. They
can be changed so that the text is nearer the edge or further away from it.
Moving the left or right margin makes the text wider or narrower. Moving the top or
bottom margin makes the text longer or shorter.
The indent is the distance text is moved in from the margin. You can indent part of the
text without moving the actual margin.
To justify the text means to keep the letters in a straight line at the edge of the page.
The screen shows indented text that is set within margins and justified.
A word processor allows you to:
1. search for a word or words (sometimes called ‘find’). You simply key in the
word and the cursor moves to the first place it occurs in the document,
2. search and replace. The computer searches for a word and replaces it
with another one wherever it finds it. Usually you can choose
whether or not to replace each occurrence when it is found.
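Both operations have direct equivalents for text held in a Python string. The sketch below is an illustration only; the sample sentence is invented, and it replaces every occurrence without asking.

```python
document = "The club meets on Monday. Monday is also the closing date."

# Search: position of the first place the word occurs (-1 if it is not found).
first_position = document.find("Monday")

# Search and replace: every occurrence of the word is replaced.
replaced = document.replace("Monday", "Tuesday")

print(first_position)
print(replaced)
```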
Mail Merge
Many word processors can produce a set of letters by adding to them the name and
address of each person on a mailing list.
A mail merge is the operation of producing a set of personalized letters by merging the
personal details with the standard letter.
Example
[Diagram: merge of the standard letter with the personal details]
Other names for this technique include mail shot and mass mailing.
The following are needed to carry out the mail merge:
the standard letter,
the mailing list holding the personal details,
the instructions on how to merge them – these may be codes within the standard
letter.
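A mail merge can be imitated in a few lines of Python using the standard string.Template class. This is a sketch only: the letter text, names and membership numbers are invented sample data, and the $name and $number markers play the part of the merge codes within the standard letter.

```python
from string import Template

# The standard letter, with merge codes marking where personal details go.
standard_letter = Template("Dear $name,\nYour membership number is $number.")

# Invented mailing list: one entry of personal details per person.
mailing_list = [
    {"name": "Mr. Williams", "number": "1001"},
    {"name": "Miss Mmepe", "number": "1002"},
]

# Merge: produce one personalized letter for each person on the list.
letters = [standard_letter.substitute(person) for person in mailing_list]
for letter in letters:
    print(letter)
    print()
```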
Advantages:
Disadvantages:
A mail merge sometimes makes it too easy to produce letters which people do not
want. They may be regarded as ‘junk mail’.
Desktop publishing is the use of a computer system to produce page layouts of high
quality for printing or publication.
Characteristics
Options within the package to produce text and graphics.
Facilities for moving pictures and pieces of text on the page and adjusting their
size to fit spaces.
Application of DTP
File Organisation
File
The term file is used to describe any data or program stored on a backing store such as a
tape or a disc.
Examples of files:
When they have been saved any of the following can be regarded as a file.
A computer program.
A spreadsheet.
A computer drawing.
A student file. Each record in this file holds the data on one student.
A teacher file. Each record in this file contains all the data about one teacher.
A field is an area of a record reserved for one particular type of data item. Each field
contains one data item. An item of data here means the smallest piece of data that would
be dealt with separately – a single name or a single number etc.
Storage of Files
To create a file means to organize data into a file, e.g. when the fields are set up in a
database and the records are keyed into it.
To save a file means to copy all the records of the file from the main store to the backing
store.
To load the file means to read all the records of the file from the backing store into the
main store.
To open a file means to prepare it so that data can be read from it or written to it.
To close a file means to carry out the procedure which is necessary when the user has
finished using it.
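These operations can be seen together in a short Python sketch that saves two records to a file on disc and loads them back. The file name, field names and sample records are invented for the example, and the file is written to a temporary folder so nothing else is disturbed.

```python
import csv
import os
import tempfile

# Create: organise the data into records, each with the same fields.
records = [
    {"name": "Mmepe", "form": "4A"},
    {"name": "Williams", "form": "4B"},
]
path = os.path.join(tempfile.mkdtemp(), "students.csv")

# Save: open the file, copy every record to the backing store, then close it.
with open(path, "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["name", "form"])
    writer.writeheader()
    writer.writerows(records)
# Leaving the 'with' block closes the file.

# Load: open the file again and read every record back into main store.
with open(path, newline="") as handle:
    loaded = list(csv.DictReader(handle))

print(loaded)
```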
Directories
A directory is a small file on a disc, which is used by the operating system to locate the
other files on the disc.
The directory contains a list of names of files and the information needed to access the
files on the disc. The information given in a directory can include:
A directory can also mean the area of a disc where files are stored. The main directory on a
disc is called the root directory. A sub-directory is a part of the root directory.
The screen shot below shows part of the directory structure on a PC.
Under folders, there is a directory and sub-directories. In the ‘my document’
sub-directory, there are files created by database, word processing and spreadsheet
application packages.
The operating system provides a means of organizing files into directories or folders so
that they can be easily managed. Most operating systems use a hierarchical or tree
directory structure, where folders are stored in other folders. This is shown in the diagram
below.
[Diagram: tree of folders with the hard disk at the top]
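A tree structure of this kind can also be explored with a short Python sketch that uses the standard os.walk function. The folder and file names are invented for the example, and the small tree is built in a temporary folder first so that there is something to list.

```python
import os
import tempfile

# Build a small invented tree: documents/letters/renewal.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "documents", "letters"))
open(os.path.join(root, "documents", "letters", "renewal.txt"), "w").close()


def list_tree(top):
    """Return the folders and files under 'top', indented by their depth."""
    lines = []
    for folder, subfolders, files in os.walk(top):
        subfolders.sort()  # visit sub-folders in alphabetical order
        if folder == top:
            depth = 0
        else:
            depth = os.path.relpath(folder, top).count(os.sep) + 1
        lines.append("  " * depth + os.path.basename(folder) + "/")
        for name in sorted(files):
            lines.append("  " * (depth + 1) + name)
    return lines


for line in list_tree(root):
    print(line)
```

Each level of indentation in the output is one step further down the tree from the root.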
Types of Files
There are four different types of files. These are:
Serial files
Serial files can be used in batch processing systems to hold transaction data before
it is sorted out.
A sequential file
An example is a club membership file, where the order depends on the
membership number.
The way in which the records are arranged within the file;
The method of working out where each record is stored in the file.
The method of access to a file refers to the way in which a program reads data from a file
or writes data to it.
A serial access file has data stored on it in the order in which it was written. Each new
record goes to the end of the file. To read a record it is necessary to read through all the
preceding records first.
A sequential access file has data stored on it in the order of the data in a primary key.
A file of stolen cars can be a sequential access file.
A direct access file is one where any record can be accessed without having to access
other records first. This is also known as random access.
Notes:
Files stored on a tape cartridge are always serial or sequential access. Direct
access would involve too much movement of the tape forward and backward.
Direct access files can only be stored on a direct access medium (such as a
magnetic disc).
Selected records can be accessed far more quickly from direct access.
Records do not have to be put into any particular order before the file is created.
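The difference between serial and direct access can be demonstrated in Python with a file of fixed-length records. This is a sketch under stated assumptions: the record size of 16 bytes, the file name and the function names are all invented for the example.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumption: every record occupies exactly 16 bytes

names = ["Mmepe", "Williams", "Gaongalelwe"]
path = os.path.join(tempfile.mkdtemp(), "teachers.dat")
with open(path, "wb") as handle:
    for name in names:
        # Pad each name with spaces so that all records are the same length.
        handle.write(name.encode("ascii").ljust(RECORD_SIZE))


def read_serial(record_number):
    """Read through every preceding record to reach the one wanted."""
    with open(path, "rb") as handle:
        for _ in range(record_number):
            handle.read(RECORD_SIZE)
        return handle.read(RECORD_SIZE).decode("ascii").rstrip()


def read_direct(record_number):
    """Jump straight to the record, whose position is worked out from its number."""
    with open(path, "rb") as handle:
        handle.seek(record_number * RECORD_SIZE)
        return handle.read(RECORD_SIZE).decode("ascii").rstrip()


print(read_serial(2))
print(read_direct(2))
```

Both functions return the same record, but the direct version does the same small amount of work however far into the file the record lies.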
Updating files
To update a file means to alter it with new information.