Data Management

Uploaded by

henrrymolina011
Copyright
© All Rights Reserved

DATA MANAGEMENT

PRESENTATION OF DATA
This refers to the organization of data into tables,
graphs or charts, so that logical and statistical conclusions
can be derived from the collected measurements.
For example, a nationwide travel agency offers special
rates for package tours during summer. To economize on
advertising, brochures will be sent only to certain age
groups. The agency takes its previous passenger customers
from its files and groups them according to age. Only those
in the groups with the fewest people are sent brochures.
The following are the ages of the previous
customers:

1. An array from the smallest to the largest


Rules in the Construction of a Frequency Distribution
1. We seldom use fewer than 5 or more than 15 classes.
It is impractical to group a thousand measurements
into four classes or to group 10 observations into
7 classes.
2. Whenever possible we make the classes cover equal
ranges of values and make the ranges multiples of numbers
that are easy to work with. Open classes, such as
"less than" or "more than" classes, should be avoided.
3. We make sure that each item goes into only one
class. This means that the classes should not overlap.
4. In the final presentation of the table, the tally is
usually omitted.
In deciding the number of classes, the statisticians
Freund and Simon suggested the following:

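However, if we cannot decide on the number of classes
to be used, the suggested formula for the class interval is:
$i = \frac{R}{1 + 3.322 \log N}$
where $R$ denotes the range of the data and $N$ denotes
the number of observations.
For example, using the data in the given array of numbers,
with $N = 40$, the class interval is
$i = \frac{R}{1 + 3.322 \log 40} \approx 9.8 \approx 10$,
the approximate size of the class interval.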
Classes   Tally              Frequency   Relative Frequency   Percentage
10-19     I                  1           0.025                2.5%
20-29     I                  1           0.025                2.5%
30-39     III                3           0.075                7.5%
40-49     IIII               4           0.100                10.0%
50-59     IIIII-IIIII-III    13          0.325                32.5%
60-69     IIIII-IIIII-III    13          0.325                32.5%
70-79     IIII               4           0.100                10.0%
80-89     I                  1           0.025                2.5%
Sum                          40          1.000                100%

Frequency distribution of the ages of the customer passengers of
Nationwide Travel.
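The relative frequency and percentage columns follow directly from the class frequencies. A minimal Python sketch, assuming the frequencies in the table above; the function name `relative_frequencies` is only illustrative:

```python
# Class frequencies copied from the frequency distribution above.
frequencies = {
    "10-19": 1, "20-29": 1, "30-39": 3, "40-49": 4,
    "50-59": 13, "60-69": 13, "70-79": 4, "80-89": 1,
}

def relative_frequencies(freqs):
    """Map each class to (relative frequency, percentage)."""
    total = sum(freqs.values())
    return {cls: (f / total, 100 * f / total) for cls, f in freqs.items()}

table = relative_frequencies(frequencies)
for cls, (rel, pct) in table.items():
    print(f"{cls}: {rel:.3f}  {pct:.1f}%")
```

Each relative frequency is the class frequency divided by the total number of observations, so the column always sums to 1.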
Classes   Lower Limit   Upper Limit   Lower Boundary   Upper Boundary   Class Mark
10-19     10            19            9.5              19.5             14.5
20-29     20            29            19.5             29.5             24.5
30-39     30            39            29.5             39.5             34.5
40-49     40            49            39.5             49.5             44.5
50-59     50            59            49.5             59.5             54.5
60-69     60            69            59.5             69.5             64.5
70-79     70            79            69.5             79.5             74.5
80-89     80            89            79.5             89.5             84.5
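The boundaries and class marks in this table can be derived from the class limits alone. The sketch below assumes ages are recorded to the nearest whole year, so each boundary lies half a unit beyond its limit; `class_details` is a hypothetical helper name:

```python
def class_details(lower, upper, unit=1):
    """Return (lower boundary, upper boundary, class mark) for one class.

    Boundaries extend each limit by half the unit of measurement;
    the class mark is the midpoint of the two limits.
    """
    lower_boundary = lower - unit / 2
    upper_boundary = upper + unit / 2
    class_mark = (lower + upper) / 2
    return lower_boundary, upper_boundary, class_mark

# Classes 10-19, 20-29, ..., 80-89 as in the table.
classes = [(lo, lo + 9) for lo in range(10, 90, 10)]
details = [class_details(lo, hi) for lo, hi in classes]
```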
Measures of Central Tendency

A measure of central tendency is an important aspect of
quantitative data. It is an estimate of a "typical" value.

Three of the many ways to measure central tendency are
the mean, median and mode.
The Mean of ungrouped data
The mean of ungrouped data is defined as the
sum of all the scores or data divided by the
number of scores in the data.
In particular, the mean of $n$ scores, denoted by $\bar{x}$,
is given by the formula
$\bar{x} = \frac{\sum x}{n}$

For example, the mean of 5, 7, 11, 20 and 18 is
$\bar{x} = \frac{5 + 7 + 11 + 20 + 18}{5} = \frac{61}{5} = 12.2$

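The weighted arithmetic mean of a
group of numbers or scores
designated by $x_1, x_2, \ldots, x_k$, which occur with
respective frequencies $f_1, f_2, \ldots, f_k$, is
$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k}$
Example: The weighted arithmetic mean of the
numbers 12, 12, 12, 15, 15, 18, 18, 18, 16 and 20 is
$\bar{x} = \frac{3(12) + 2(15) + 3(18) + 1(16) + 1(20)}{10} = \frac{156}{10}$
= 15.6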
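As a quick check of both examples, here is a minimal Python sketch of the plain and weighted means. The function names are illustrative, and the weights are taken to be how many times each value occurs:

```python
from collections import Counter

def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

scores = [12, 12, 12, 15, 15, 18, 18, 18, 16, 20]
counts = Counter(scores)                 # value -> number of occurrences
values = list(counts)
weights = [counts[v] for v in values]

simple = mean([5, 7, 11, 20, 18])        # first example
weighted = weighted_mean(values, weights)  # second example
```

Weighting each distinct value by its count gives the same result as averaging the raw list.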
Note on Odd or Even Sample Sizes

If the sample size is an odd number then the location point will produce a median that is
an observed value. If the sample size is an even number, then the location will require
one to take the mean of two numbers to calculate the median. The result may or may
not be an observed value as the example below illustrates.
The mode is the value that occurs most
often in the data. It is important to note
that there may be more than one mode in
the data set.
Example 1-5: Test Scores
Consider the aptitude test scores of ten
children below:
95, 78, 69, 91, 82, 76, 76, 86, 88, 80
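The three measures for these ten scores can be checked with Python's standard `statistics` module. This is only a sketch; the source lists the data without worked results:

```python
import statistics

scores = [95, 78, 69, 91, 82, 76, 76, 86, 88, 80]

mean_score = statistics.mean(scores)      # sum 821 divided by 10
median_score = statistics.median(scores)  # n is even: mean of the 5th and 6th sorted values
modes = statistics.multimode(scores)      # every value tied for the highest count

print(sorted(scores))
print(mean_score, median_score, modes)
```

Because the sample size is even, the median is the mean of 80 and 82, which is 81. That value does not appear among the observed scores.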
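The mean for grouped data

The arithmetic mean for grouped data is
$\bar{x} = \frac{\sum fx}{n}$
where $x$ is the class midpoint, $f$ is the class frequency
and $n = \sum f$ is the number of observations.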
Example.
On arriving in Boracay Beach in Cebu a sample of 60 vacationers is asked about their
ages by the Tourist Bureau. The sample information is organized into the following
frequency distribution. Compute the mean age.

Age     Midpoint (x)   Number of vacationers (f)   fx
11-20   15.50          5                           77.50
21-30   25.50          7                           178.50
31-40   35.50          12                          426.00
41-50   45.50          22                          1001.00
51-60   55.50          8                           444.00
61-70   65.50          4                           262.00
71-80   75.50          2                           151.00
$\sum fx = 2540$
n = 60
$\bar{x} = \frac{\sum fx}{n} = \frac{2540}{60} = 42.33$
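The grouped mean can be reproduced with a short Python sketch. It takes the frequency of the first class as 5, which is the value implied by fx = 77.50 and n = 60; `grouped_mean` is an illustrative name:

```python
def grouped_mean(midpoints, freqs):
    """Mean of grouped data: sum of f * x divided by the sum of f."""
    return sum(f * x for x, f in zip(midpoints, freqs)) / sum(freqs)

midpoints = [15.5, 25.5, 35.5, 45.5, 55.5, 65.5, 75.5]
freqs = [5, 7, 12, 22, 8, 4, 2]   # first frequency assumed to be 5 (see above)

sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
mean_age = grouped_mean(midpoints, freqs)
print(sum_fx, sum(freqs), round(mean_age, 2))
```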
Coding or Deviation Method
A shorter way of computing the mean is the process
called the coding or deviation method.
Choose any class mark, denoted by $x_0$, and give it a
deviation of 0. The values of $d$ for the other classes are
then the integers $d = 0, \pm 1, \pm 2, \ldots$, counted in
class steps away from $x_0$.
The coding method is given by the formula

$\bar{x} = x_0 + \frac{\sum fd}{n} \cdot c$, where $c$ is the class size and $d$ are the deviations, $d = 0, \pm 1, \pm 2, \ldots$


Example.

Scores   f    x           d    fd
65-69    2    67          -3   -6
70-74    8    72          -2   -16
75-79    10   77          -1   -10
80-84    9    82 ($x_0$)  0    0
85-89    7    87          1    7
90-94    2    92          2    4
95-99    2    97          3    6
n = 40, $\sum fd = -15$
$\bar{x} = x_0 + \frac{\sum fd}{n} \cdot c = 82 + \frac{-15}{40} \cdot 5 = 80.125$
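A minimal sketch of the coding method in Python, assuming the class marks are evenly spaced so that every deviation is a whole number of class steps from the chosen mark. The names `coded_mean` and `assumed_mark` are illustrative:

```python
def coded_mean(class_marks, freqs, assumed_mark, class_size):
    """Mean by the coding method: x0 + (sum of f*d / n) * c.

    Assumes each class mark differs from assumed_mark by a whole
    multiple of class_size, so the deviation d is an integer.
    """
    n = sum(freqs)
    sum_fd = sum(f * ((x - assumed_mark) // class_size)
                 for x, f in zip(class_marks, freqs))
    return assumed_mark + (sum_fd / n) * class_size

class_marks = [67, 72, 77, 82, 87, 92, 97]
freqs = [2, 8, 10, 9, 7, 2, 2]

result = coded_mean(class_marks, freqs, assumed_mark=82, class_size=5)
print(result)
```

Choosing a different class mark as $x_0$ changes the deviations but not the resulting mean.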
The median for grouped data is
$\tilde{x} = L + \frac{\frac{n}{2} - F}{f} \cdot c$
where
$L$ denotes the lower boundary of the class
containing the median;
$F$, the sum of the frequencies in all classes immediately
preceding the class containing the median;
$f$, the frequency in the class containing the median; and
$c$, the width of the class.
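Age     Number of vacationers (f)   CF
11-20   5                           5
21-30   7                           12
31-40   12                          24
41-50   22                          46   (median class, $L = 40.50$)
51-60   8                           54
61-70   4                           58
71-80   2                           60
n = 60
Thus, with $L = 40.50$, $F = 24$, $f = 22$ and $c = 10$, the median is
$\tilde{x} = L + \frac{\frac{n}{2} - F}{f} \cdot c$
$= 40.50 + \frac{30 - 24}{22} \cdot 10$
$= 43.23$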
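The same interpolation can be sketched in Python. The function finds the first class whose cumulative frequency reaches n/2, then interpolates inside it; `grouped_median` is an illustrative name and the lower boundaries are taken half a unit below each lower limit:

```python
def grouped_median(lower_boundaries, freqs, class_width):
    """Median of grouped data by interpolation within the median class."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    cumulative = 0  # sum of frequencies before the current class
    for boundary, f in zip(lower_boundaries, freqs):
        if cumulative + f >= n / 2:
            return boundary + (n / 2 - cumulative) / f * class_width
        cumulative += f

lower_boundaries = [10.5, 20.5, 30.5, 40.5, 50.5, 60.5, 70.5]
freqs = [5, 7, 12, 22, 8, 4, 2]

median_age = grouped_median(lower_boundaries, freqs, class_width=10)
print(round(median_age, 2))
```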
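The Mode for grouped data
The mode is the most frequently
occurring score in the grouped data. It is
used with scores from a nominal
variable.

$\hat{x} = L + \frac{f_m - f_1}{2f_m - f_1 - f_2} \cdot c$
where $L$ is the lower boundary of the modal class, $f_m$ is
the frequency of the modal class, $f_1$ and $f_2$ are the
frequencies of the classes immediately before and after it,
and $c$ is the class width.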
Compute the mode of the daily sales of Aling Miling's
store in a given month with the following
frequency distribution.
Sales         Number of days (f)
1,000-1,999   2
2,000-2,999   3
3,000-3,999   6
4,000-4,999   7
5,000-5,999   8   (modal class)
6,000-6,999   2
7,000-7,999   2
n = 30
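Solution. The modal class is the interval
5,000-5,999. With $f_m = 8$ (the modal class), $f_1 = 7$ (the
class before it), $f_2 = 2$ (the class after it),
$L = 4{,}999.50$ and $c = 1{,}000$, the mode is

$\hat{x} = L + \frac{f_m - f_1}{2f_m - f_1 - f_2} \cdot c$
$= 4{,}999.50 + \frac{8 - 7}{2(8) - 7 - 2} \cdot 1{,}000$
$= 5{,}142.36$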
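A minimal Python sketch of this interpolation, assuming a single modal class whose frequency is strictly larger than at least one neighbour (otherwise the denominator is zero). The name `grouped_mode` is illustrative:

```python
def grouped_mode(lower_boundaries, freqs, class_width):
    """Mode of grouped data: L + (fm - f1) / (2*fm - f1 - f2) * c."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])   # index of the modal class
    f_m = freqs[m]
    f_1 = freqs[m - 1] if m > 0 else 0                   # class before, 0 if none
    f_2 = freqs[m + 1] if m < len(freqs) - 1 else 0      # class after, 0 if none
    return lower_boundaries[m] + (f_m - f_1) / (2 * f_m - f_1 - f_2) * class_width

lower_boundaries = [999.5, 1999.5, 2999.5, 3999.5, 4999.5, 5999.5, 6999.5]
freqs = [2, 3, 6, 7, 8, 2, 2]

modal_sales = grouped_mode(lower_boundaries, freqs, class_width=1000)
print(round(modal_sales, 2))
```

The mode lands close to the lower boundary of the modal class because the preceding class (7 days) is almost as frequent as the modal class (8 days).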
Variance & Standard Deviation for ungrouped
data

$s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n-1)}$, for the variance

$s = \sqrt{\frac{n\sum x^2 - (\sum x)^2}{n(n-1)}}$, for the standard deviation


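Example. Find the standard deviation of the scores 3, 14,
20, 16, 24 and 7. Tabulating the scores and computing the
squares, we have the following:

$x$    $x^2$
3      9
14     196
20     400
16     256
24     576
7      49
$\sum x = 84$, $\sum x^2 = 1486$

$s^2 = \frac{6(1486) - 84^2}{6(5)} = \frac{1860}{30} = 62$, for the variance

$s = \sqrt{62} = 7.87$, for the standard deviation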
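The values 62 and 7.87 are what the computational form of the sample variance (divisor n − 1) gives for these six scores. The sketch below assumes that form and cross-checks it against Python's `statistics` module:

```python
import math
import statistics

def sample_variance(scores):
    """Computational form: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

scores = [3, 14, 20, 16, 24, 7]

variance = sample_variance(scores)
std_dev = math.sqrt(variance)
print(variance, round(std_dev, 2))

# statistics.variance also uses the n - 1 divisor.
library_variance = statistics.variance(scores)
```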


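Short cut formulas of Standard Deviation
and Variance for Grouped data.
$s^2 = c^2 \cdot \frac{n\sum fd^2 - (\sum fd)^2}{n(n-1)}$, $s = \sqrt{s^2}$
where $c$ is the class size and $d$ are the values
of the deviations, i.e. $d = 0, \pm 1, \pm 2, \ldots$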
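Consider the data of the preceding problem and compute
the standard deviation using the short cut formula.
Scores   f    d    fd   $d^2$   $fd^2$
5-7      2    -2   -4   4       8
8-10     4    -1   -4   1       4
11-13    14   0    0    0       0
14-16    9    1    9    1       9
17-19    6    2    12   4       24
20-22    3    3    9    9       27
23-25    2    4    8    16      32
n = 40, $\sum fd = 30$, $\sum fd^2 = 104$
$s^2 = 3^2 \cdot \frac{40(104) - 30^2}{40(39)} = 9 \cdot \frac{3260}{1560}$
$= 18.8077$
$s = \sqrt{18.8077} = 4.34$, the standard deviation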
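The short cut computation can be sketched in Python as follows. It assumes the n − 1 form of the formula, which is the one that reproduces the variance of about 18.81 shown for this table; `coded_variance` is an illustrative name:

```python
import math

def coded_variance(freqs, deviations, class_size):
    """Short cut variance: c^2 * (n*sum(f*d^2) - (sum(f*d))^2) / (n*(n-1))."""
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(freqs, deviations))
    return class_size ** 2 * (n * sum_fd2 - sum_fd ** 2) / (n * (n - 1))

freqs = [2, 4, 14, 9, 6, 3, 2]
deviations = [-2, -1, 0, 1, 2, 3, 4]   # class 11-13 is the origin (d = 0)

variance = coded_variance(freqs, deviations, class_size=3)
std_dev = math.sqrt(variance)
print(round(variance, 4), round(std_dev, 2))
```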
