
Module 2 AgStat Revised

This document is a learning module focused on Agricultural Statistics, covering basic principles such as data types, measures of central tendency, and probability. It outlines learning outcomes for students, including understanding population and sample concepts, and various sampling techniques. The module also includes guidelines for rounding numbers and calculating confidence intervals for population means.

Uploaded by

Charvy Acot
Copyright
© All Rights Reserved

Learning Module 2

Week 4-5

Agricultural
Statistics
(Stat 1e)

CAMIGUIN POLYTECHNIC STATE COLLEGE
CATARMAN CAMPUS
Institute of Agriculture
Tangaro, Catarman, Camiguin

Table of Contents

Learning Module 2: REVIEW OF BASIC PRINCIPLES OF STATISTICS

Data
Rounding Numbers
Population and Sample
Measures of Central Tendency, of Position and of Variability of Ungrouped Data
Measures of Central Tendency, of Position and of Variability of Grouped Data
Probability

Mid Term
Submit your outputs on time.
Submission is on the schedule of
module retrieval. (see due date in
google classroom)

Learning Outcomes:
At the end of the unit, the students are expected to:
1. Review some basic principles of statistics
2. Discuss data, with a focus on population and sample
3. Discuss and calculate the measures of central tendency, of position and of variability of ungrouped data
4. Discuss and calculate the measures of central tendency, of position and of variability of grouped data
5. Discuss and calculate probability

Data
Data – collections of observations (such as measurements, genders, survey responses)

Constant – refers to a fundamental quantity that does not change in value.

Variables – quantities that may take any one of a specified set of values

• A function whose value is a real number determined by each element of the sample; the sample is referred to as the domain of the variable.

Types of variables:

1. Qualitative variable

• Non-measurable characteristic that cannot assume a numerical value but can be classified into categories.
• Non-numerical characteristics or labels

Ex. Favourite movie, eye colour, political party

2. Quantitative variable

• Quantities that can be counted, measured or calculated; numerical measurements or quantities

Ex. Height, weight, income, resting pulse rate, blood alcohol level

Qualitative variables

1. Dichotomous
Ex. Gender (male or female), emotional conditions (happy or sad)
2. Trichotomous
Ex. Opinion (yes, no or undecided)
Decisions (agree, disagree or undecided)
3. Multinomous
Ex. Mode (always, often, seldom, very seldom or never)
Size (extra small, small, medium, large, extra large)

Quantitative variables

Types based on way of obtaining its value

1. Continuous variable

A variable that can theoretically assume any value between two given values; obtained through measurement.

2. Discrete variable

➢ A variable that cannot assume every value between two given values; it takes only distinct, separate values
➢ Obtained through counting

In general, measurements give rise to continuous data while counting gives rise to discrete data.

Types based on cause-and-effect relationship:

1. Dependent variable

A variable that is observed and measured to determine the effect of another variable.

2. Independent variable

A variable that is not affected by the other variable.

Scales or Levels of Measurements

Another way to classify data is to use levels of measurement.

The four (4) levels of measurements are:

1. Nominal level
2. Ordinal level
3. Interval level
4. Ratio level

Nominal Level of Measurement

➢ Characterized by data that consist of names, labels or categories only; the data cannot be arranged in an ordering scheme (such as low to high)
➢ Objects or individuals' responses in a single category are equal with respect to some attribute, and each category is coded numerically when subjected to statistical analysis
➢ Qualitative data only

Examples:

Responses (yes-1; no-2; undecided-3)
Marital status (single-1; married-2; separated-3; widow/widower-4)
Blood type (A-1; B-2; AB-3; O-4)
Shape of the bacteria (spherical-1; rod-shaped-2; spiral-3)

Ordinal Level of Measurement

➢ Involves data that can be arranged in some order, but differences between data values either cannot be determined or are meaningless
➢ Classifies objects or individuals' responses according to degree or level; each level is then coded numerically when subjected to statistical analysis
➢ Can be qualitative or quantitative

Examples:

Course grades (A-1; B-2; C-3; D-4; E-5)
Customers' responses (excellent-1; very satisfactory-2; satisfactory-3; fair-4; poor or needs improvement-5)
Honor awards (Summa Cum Laude-1; Magna Cum Laude-2; Cum Laude-3)

Interval Level of Measurement

➢ Data values are numerical, so they have a natural, meaningful order and differences between data values are meaningful; zero is an arbitrary point on the scale rather than an indication of “nothing”.
➢ Always quantitative

Examples:

Temperature: 10⁰C, 15⁰C, 30⁰C
Year of birth: 1960, 1980, 2000, and 2010
Achievement scores: 15, 24, 35, 40, 45

Ratio level of Measurement

➢ The interval level with the additional property that there is also a
natural zero starting point (where zero indicates that none of the
quantity is present); for values at this level, differences and ratios
are meaningful.
➢ Always quantitative

Examples:

Weight
Volume
Number of Children

Rounding Numbers
Rounding off

➢ May be required for numbers with decimal places, which usually result from measurements

Rules:

1. When the digit to be dropped is less than 5, it is dropped and the preceding digit is retained.

Example: Round off the following to the nearest hundredths

0.664 → 0.66    4.752 → 4.75    0.7638 → 0.76

2. When the digit to be dropped is greater than 5, the preceding digit is increased by 1.

Example: Round off the following to the nearest tenths

0.86 → 0.9    8.58 → 8.6    32.487 → 32.5

Example: Round the ff. to the nearest tenths

0.86 = 0.9 8.58 = 8.6 32.487 = 32.5

3. When the digit to be dropped is exactly 5, rounding off is governed by the “even integer convention”: round to the nearest even digit.

Example: Round off the following to the nearest thousandths

2.4385 → 2.438    6.4755 → 6.476    0.46354 → 0.464

(In 0.46354 the dropped part, 54, is more than half a unit in the thousandths place, so the preceding digit is increased by 1 regardless of the convention.)

The above rule may be simplified as:

When 5 is preceded by an odd number, the preceding digit is increased by 1.

When 5 is preceded by an even number, the preceding digit is retained.
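
The even integer convention can be checked with Python's `decimal` module, whose `ROUND_HALF_EVEN` mode rounds an exact 5 to the nearest even digit. This is only a minimal sketch: the helper name `round_half_even` is an assumption made for illustration, and the values are passed as strings so that binary floating-point error does not disturb the tie.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(value: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimal places, ties to the even digit."""
    quantum = Decimal(10) ** -places  # places=3 gives Decimal('0.001')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

# Examples taken from the rounding rules in this section
examples = [("0.664", 2), ("0.86", 1), ("2.4385", 3), ("6.4755", 3), ("0.46354", 3)]
for text, places in examples:
    print(text, "->", round_half_even(text, places))
```

Note that Python's built-in `round()` on floats can behave differently for such ties because most decimal fractions are not stored exactly.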

Population and Sample


PARAMETER – a measurable quantity, e.g. temperature, that determines the result of a scientific experiment and can be altered to vary the result.

A general quantity that relates to an entire population, as distinct from an individual statistic that relates to a sample.

Population – a finite or infinite collection of objects, events or individuals with a specified class or characteristics under consideration

➢ The totality of individuals or units under study
➢ Consists of all possible values of a variable
➢ Its size is denoted by a capital letter “N”

Examples:

Students in CPSC
Taxi drivers in CDO
Cellular phone users
TV viewers

➢ When all values of a population are known, it can be described without vagueness.
➢ However, investigating the entire population is difficult due to material constraints such as time, money and effort.
➢ It is often impractical, if not impossible, to get hold of the total population; thus, we resort to drawing and studying a small part of the population, the sample, as the “representative” of the population.
➢ We compute sample estimates, or statistics, of population parameters.

Sampling is utilized when the mass of data is too great to be handled

➢ It refers to the method of getting a small but representative cross-section of the population.

Sample – a finite or limited collection of objects, events or individuals selected from a population

➢ Is expected to possess characteristics identical to those of the population; otherwise the validity and reliability of the information will be in question.
➢ Its size is denoted by a small letter “n”

TYPES OF DATA SETS

Population – a finite or infinite collection of objects, events or individuals with a specified class or characteristics under consideration

Sample – a finite or limited collection of objects, events or individuals selected from a population

Sampling Techniques

➢ Utilized to test the validity of conclusions or inferences from the sample to the population
➢ A representative sample of 100 is generally preferable to an unrepresentative sample of 1,000

Sampling Techniques
1. Simple Random Sampling
2. Stratified Random Sampling
3. Systematic Random Sampling
4. Cluster Sampling
5. Multistage Sampling

Simple Random Sampling

➢ A limited number of individuals chosen from a population
➢ Every individual has an equal chance of being selected in the sample before the selection is done
➢ Involves the use of randomization schemes such as:

▪ Drawing of lots
▪ Use of a table of random numbers
• Usually used if the population is large, hence a lottery is cumbersome
▪ Use of cards

Stratified Random Sampling

➢ Dividing the population into at least two different subgroups (strata) whose members share the same characteristics
➢ Getting the members at random, proportionate to each stratum or subgroup.

To obtain a sample proportionate to the number of members in each stratum (e.g. course):

➢ Determine the proportion of the sample to the population, that is, divide the sample size by the population size.
➢ Multiply the number of members in each stratum by the proportion or percentage obtained above.
➢ Draw the required random sample from each stratum by using a randomization scheme
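
The proportional allocation steps can be sketched in Python as follows. The course names, stratum sizes, sample size of 100 and the helper `proportional_allocation` are all hypothetical values chosen for illustration, not figures from the module.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Return the number of units to draw from each stratum."""
    population_size = sum(strata_sizes.values())
    proportion = sample_size / population_size  # sample size divided by population size
    # Rounded counts may not add up exactly to sample_size for awkward proportions.
    return {name: round(size * proportion) for name, size in strata_sizes.items()}

# Hypothetical number of students per course (the strata)
strata = {"Course A": 400, "Course B": 350, "Course C": 250}
allocation = proportional_allocation(strata, sample_size=100)

# Draw a simple random sample within each stratum (members numbered 1..size)
rng = random.Random(42)  # fixed seed only so the sketch is repeatable
samples = {name: rng.sample(range(1, strata[name] + 1), k)
           for name, k in allocation.items()}
print(allocation)
```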

Systematic Random Sampling

➢ Refers to the process of selecting every nth element in the population until the desired sample size is obtained.
➢ All elements of the population are arranged alphabetically or in any systematic fashion.

▪ Divide the population size by the sample size to get the sampling interval
▪ Say, for a population of 5,000 and a desired sample size of 500, divide 5,000 by 500; the result, 10, is the sampling interval, that is, every 10th element is included in the sample.
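
A minimal Python sketch of this procedure is shown below. The random starting point within the first interval is a common practice assumed here, not something stated in the module, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th element, where k = population size // sample size."""
    interval = len(population) // sample_size
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumed: random start within the first interval
    return population[start::interval][:sample_size]

units = list(range(1, 5001))      # population of 5,000 numbered units
sample = systematic_sample(units, 500, seed=1)
print(len(sample), sample[:3])    # 500 units, each 10 positions apart
```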

Cluster Sampling

➢ A cluster refers to an intact group which has a common characteristic.
➢ Advantageous procedure when the population is spread over a wide geographical area.
➢ A practical sampling technique used if the complete list of the members of the population is not available.
➢ Divide the population area into sections (or clusters)
➢ Randomly select some of those clusters
➢ Choose all members from the selected clusters

Multistage Sampling

➢ Involves a complex sampling technique

o Dividing the population into strata
o Dividing each stratum into clusters
o Drawing a sample from each cluster using the simple random sampling technique

For example, our country is divided into 13 regions, which form the strata. Each region consists of provinces and each province consists of municipalities, from which random samples may be collected.

The sample size n for a population of size N can be found using Slovin's formula:

n = N / (1 + Ne²)

Where:

n is the sample size
N is the population size
e is the margin of error

Example:

Find the sample size the researcher should include in his study if the population size of his respondents is 2000 and 95% accuracy is required.

Solution:

Since 95% accuracy is required, the corresponding percentage of error is 5%. Hence, applying the formula:

n = 2000 / (1 + 2000(0.05)²) = 2000 / (1 + 5) = 2000 / 6 ≈ 333.33, or 334 samples when rounded up
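
As a check on the arithmetic, the formula n = N / (1 + Ne²) can be evaluated in Python. For N = 2000 and e = 0.05 the denominator is 1 + 5 = 6, so the result is about 333.33. Rounding up to the next whole respondent is an assumption of this sketch, and `slovin_sample_size` is a hypothetical helper name.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """Sample size n = N / (1 + N * e**2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)

raw = slovin_sample_size(2000, 0.05)
n = math.ceil(raw)  # assumed: round up so the margin of error is not exceeded
print(round(raw, 2), n)
```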

Two general types of samples:

1. Purposive sample – consists of units picked deliberately from the population; the units chosen vary depending upon the sampler and his motives.

2. Probability sample – is drawn by following a definite set of rules which allows one to evaluate, even before actual selection, the chance of drawing any particular sample.

Random sample – a probability sample in which every unit has an equal probability of selection.

Two types of sample in terms of size:

1. Large Sample (n > 30)


2. Small Sample (n ≤ 30)

➢ Parameter – a measurable quantity, e.g. temperature that determines
the result of a scientific experiment and can be altered to vary the
result.
➢ A general quantity that relates to an entire population as distinct from
an individual statistic that relates to a sample.

Parameter/Statistic:

Parameter/Statistic        Population    Sample
1. Mean                    µ             x̄
2. Proportion              P             p̂
3. Variance                σ²            s²
4. Standard deviation      σ             s

➢ One of the primary uses of statistics is to estimate population parameters when the population is too large for a census to be practical

o To accomplish this, a random sample of values is drawn from the population data set and the sample statistic is calculated.

Hence, when a parameter is not known, an estimate of the said parameter may be calculated: the confidence interval of the parameter

➢ Confidence interval of a parameter – an estimate of a parameter that is not known

o A range of values, or an interval, estimated to contain the value of a population parameter.

Level of Confidence

The level of confidence for an interval estimate is the probability that the interval contains the population parameter. It is a probability for the experiment of drawing a sample and constructing an interval estimate.

➢ May be referred to as the degree of accuracy
➢ The level of confidence is denoted by L of C

Commonly used levels for the estimates:

Level of confidence, L of C    Level of significance, α = 100% – L of C
90%                            10%
95%                            5%
99%                            1%

Confidence Interval for a Population Mean

➢ For large sample (n > 30):

C.I. = x̄ ± Z (σ / √n)

Where:
x̄ = the sample mean
n = the sample size
Z = the critical value of z corresponding to a level of confidence
σ = population standard deviation
Z = 1.96 at 95%
Z = 2.58 at 99%

➢ For small sample (n ≤ 30):

C.I. = x̄ ± Z (σ / √n)    To be used if σ is known (that is, s is not to be substituted)

C.I. = x̄ ± t (s / √n)    To be used if σ is not known and therefore s is used

The t value is derived from the table of the t distribution with df = n – 1 and α = 100% – L of C

Margin of error, E – error allowed in the study

E = Z (σ / √n)    For large sample and also for small sample wherein σ is known

E = t (s / √n)    For small sample wherein σ is not known and therefore s is used
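
The interval and its margin of error can be computed with a few lines of Python. The figures below (sample mean 50, standard deviation 8, n = 64) are hypothetical and only illustrate the formula; the critical value is passed in by hand, Z = 1.96 for 95% as listed above, or a t value read from a t table when s is used for a small sample.

```python
import math

def confidence_interval(sample_mean, std_dev, n, critical_value):
    """Return (lower limit, upper limit, margin of error E)."""
    margin = critical_value * std_dev / math.sqrt(n)  # E = Z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Hypothetical large-sample example at the 95% level of confidence
low, high, E = confidence_interval(sample_mean=50.0, std_dev=8.0, n=64, critical_value=1.96)
print(round(low, 2), round(high, 2), round(E, 2))
```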

Measures of Central Tendency, of Position and of Variability of Ungrouped Data

➢ Tendency of typical values to lie centrally within a set of data arranged according to magnitude
➢ Descriptive measures that are used to indicate where the center, the middle property or the most typical value of a set of data lies.

1. Mean, x̄ -- the sum of all the given values or items in a distribution divided by the number of values, n

x̄ = (x₁ + x₂ + x₃ + … + xₙ) / n

Characteristics of mean:

➢ It is a reliable and more stable measure to use.
➢ It is a point that balances all the values on either side
➢ It is greatly affected by extremes.

Uses of the Mean:

➢ To obtain an average value of a series of values


➢ Useful measure for inferential statistics
➢ For interval and ratio measurements
➢ For approximately normal distribution

Limitations of the Mean

➢ It cannot be used if the clustering of values or items is not substantial
o Values that are far apart such as 10 and 100
➢ It is easily affected by extremely large or small values
o One small value can easily pull down the mean
➢ It is a poor measure of central tendency if the given values of a distribution do not cluster around a central value.
➢ It cannot be used to compare distributions since the means of 2 or more distributions may be the same but their other characteristics may be entirely different.

Distribution A: 90, 95, 100    mean: 95
Distribution B: 96, 95, 94     mean: 95

➢ Same mean but different pattern of variation
➢ Unbiased and preferred measure of central tendency but is affected by extreme values.

2. Median, Md – value of the middle observation in an ordered distribution

➢ The middle value, or the value that divides the ordered distribution in half.
➢ In a distribution with an odd number of values, the median is simply the middle value
➢ In a distribution with an even number of values, the median is the midpoint between the two middle values.

Position of Median = (n + 1)/2

Characteristics of Median:

➢ It is easy to understand
➢ It is easy to compute
➢ It is not affected by extremes

Limitations of the median:

➢ It cannot be determined if the given values are not arranged according to magnitude
➢ The value of a median is not as accurate as the mean because it is just an ordinal statistic
➢ Arranging values according to magnitude is a laborious task in distributions with several values or items.
➢ It is easily affected by the number of items in a distribution
➢ It is easy to compute but, generally, it is not the preferred measure of central tendency.

3. Mode, Mₒ -- item or value in a distribution with the highest frequency or greatest number of cases.

-- most frequently occurring value

Set A:

67 68 68 70 70 70 70 71 72 75 76 78
Mode : 70 – unimodal

Set B:

67 68 68 70 71 72 74 76 79 79 80
Modes: 68 and 79 – bimodal

Set C:

65 66 68 69 70 72 73 75 76 78 80
Mode: no mode – all have the same frequency
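
The three measures can be verified on the sets above with Python's standard library. The helper `modes` is a hypothetical function written for this sketch; it returns an empty list when every value occurs equally often, matching the "no mode" reading used for Set C.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # all values have the same frequency
    return sorted(v for v, c in counts.items() if c == highest)

set_a = [67, 68, 68, 70, 70, 70, 70, 71, 72, 75, 76, 78]
set_b = [67, 68, 68, 70, 71, 72, 74, 76, 79, 79, 80]
set_c = [65, 66, 68, 69, 70, 72, 73, 75, 76, 78, 80]

print(modes(set_a), modes(set_b), modes(set_c))   # unimodal, bimodal, no mode
print(mean([90, 95, 100]), mean([96, 95, 94]))    # Distributions A and B share a mean
print(median(set_c))                              # middle (6th) of 11 ordered values
```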

Characteristics of a mode:

➢ It is easy to determine
➢ It is not affected by extremes
➢ It is the simplest but unreliable

Uses of the mode:

➢ It is a quick estimate of the average
➢ It helps to spot a trend
➢ Provides information to businessmen and producers that would help them in planning and decision making

Example: size of dresses or shoes commonly purchased

Limitations of the mode:

➢ It is rarely or seldom used since it does not always exist
➢ It is very unstable
➢ It is just a rough estimate of the center of concentration of a distribution
➢ Mode is the poorest measure of central tendency

Measure of Position for Ungrouped Data

➢ A value that divides a distribution into certain specified portions when the terms are arranged in ascending or descending order; also called quantiles.
➢ The position of a particular value marks the demarcation line between portions

1. Median – divides a distribution into two equal parts. Hence, 50% of the terms are below and 50% of the terms are above it

--- a middle term when data are arranged in descending or ascending order.

2. Quartiles – divide the distribution into 4 equal parts

-- The quartiles: Q1 (first quartile)
Q2 (second quartile)
Q3 (third quartile)

The three quartile values divide a frequency distribution into four parts, each containing a quarter of the sample population

Position of each quartile, where n is the number of items in a distribution:

Q1 (first quartile): position of Q1 = ¼ (n + 1)

Q2 (second quartile): position of Q2 = (n + 1)/2

Q3 (third quartile): position of Q3 = ¾ (n + 1)

Determine the quartiles of the distribution below:

65 66 68 69 70 72 73 75 76 78 80
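
One way to carry out this exercise is sketched below in Python, using the (n + 1) position rule. For n = 11 the positions are whole numbers (3rd, 6th and 9th). Linear interpolation for fractional positions is an assumption of this sketch, since the module does not say how to treat them, and the sketch expects at least three values.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based position; interpolate when the position is fractional."""
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)  # assumed handling of fractional positions

def quartiles(values):
    """Return (Q1, Q2, Q3) using positions k(n + 1)/4 for k = 1, 2, 3."""
    data = sorted(values)
    n = len(data)
    return tuple(value_at_position(data, k * (n + 1) / 4) for k in (1, 2, 3))

data = [65, 66, 68, 69, 70, 72, 73, 75, 76, 78, 80]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```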

3. Deciles – divide the distribution into 10 equal parts

The deciles: D1 (first decile), D2 (second decile), …, D9 (ninth decile)

The nine decile values divide a frequency distribution into ten parts, each containing one-tenth of the sample population.

4. Percentiles or centiles

-- divide the distribution into 100 equal parts

-- The percentiles: P1 (first percentile), P2 (second percentile), …, P99 (99th percentile)

Equivalents of the Measures of Position

Median = Q2 = D5 = P50

Q1 = P25 D5 = P50

Q3 = P75 D6 = P60

D1 = P10 D7 = P70

D2 = P20 D8 = P80

D3 = P30 D9 = P90

D4 = P40 D10 = P100

Percentile of score = (number of scores below the given score / total number of scores) x 100
Example: Scores of 45 students

70 76 81 87 91 96 100 106 112

72 76 82 88 92 98 101 107 114

72 78 83 89 93 97 102 108 115

74 79 84 90 94 98 104 110 118

75 80 86 91 95 99 105 111 118

Calculate the percentile of the ff. scores:

1.) 90 = 40% or P40 2.) 102 = 71% or P71


3.) 115
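
The first two answers can be reproduced in Python if the percentile of a score is taken as the share of scores strictly below it. That reading is an assumption of this sketch, chosen because it gives 40% for a score of 90 and about 71% for a score of 102; `percentile_rank` is a hypothetical helper name.

```python
scores = [
    70, 76, 81, 87, 91, 96, 100, 106, 112,
    72, 76, 82, 88, 92, 98, 101, 107, 114,
    72, 78, 83, 89, 93, 97, 102, 108, 115,
    74, 79, 84, 90, 94, 98, 104, 110, 118,
    75, 80, 86, 91, 95, 99, 105, 111, 118,
]

def percentile_rank(values, score):
    """Percentage of values that fall strictly below `score`."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)

print(percentile_rank(scores, 90))          # 18 of 45 scores are below 90
print(round(percentile_rank(scores, 102)))  # 32 of 45 scores are below 102
```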

 To determine the score when given the percentile:

Locator, L = (P/100) x n

where: P is the given percentile and n is the # of scores

The score following the L position is the score of the given percentile

Determine the scores with the ff. percentiles:

1.) 75th percentile, P75

2.) 90th percentile, P90

3.) score of a student to belong to upper 20%

Measure of Variability for Ungrouped data

➢ Also referred to as measures of dispersion
➢ Refer to the spread of the values
➢ Describe the characteristics of a set of samples in a given population
➢ If the measure of variability is small, the group is more or less homogeneous
➢ If the measure of variability is large, the group is more or less heterogeneous.

1. Range – refers to the difference between the highest and the lowest values

-- the simplest and the easiest to determine among the measures of variability

-- the most unstable because its value easily fluctuates with the highest or lowest value.

Say, for the set of scores below:

84 88 93 99 105 107 112 118 125

Range = 125 – 84 = 41

Characteristics of range

➢ Easy to compute and understand


➢ Emphasizes the extreme values
➢ Most unstable or unreliable measure of variability
➢ Not preferably used

Uses of range:

-- to report the movement of a process over a period of time

2. Average deviation or mean deviation

➢ Refers to the sum of the absolute deviations from the arithmetic mean divided by the number of cases

M.D. = ∑|x – x̄| / n

Where: M.D. is the mean deviation, x̄ is the mean of the values and n is the number of values

➢ A measure of variability which is infrequently used.

3. Quartile deviation, Q.D.

➢ Measures half of the range of the middle 50% of the values in a distribution

Q.D. = (Q3 – Q1)/2

For the data below

65 66 68 69 70 72 73 75 76 78 80

Position of Q1 = ¼ (n + 1) = ¼ (11 + 1) = 3rd value, so Q1 = 68
Position of Q3 = ¾ (n + 1) = ¾ (11 + 1) = 9th value, so Q3 = 76

Q.D. = (76 – 68)/2 = 8/2 = 4
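
The range, mean deviation and quartile deviation can be computed together, as in the Python sketch below. The function names are hypothetical, and the quartile deviation helper deliberately handles only data sets where the (n + 1) quartile positions are whole numbers, as in the 11-value example.

```python
from statistics import mean

def value_range(values):
    """Highest value minus lowest value."""
    return max(values) - min(values)

def mean_deviation(values):
    """Sum of absolute deviations from the mean, divided by n."""
    centre = mean(values)
    return sum(abs(x - centre) for x in values) / len(values)

def quartile_deviation(values):
    """(Q3 - Q1) / 2 with quartiles located at positions (n + 1)/4 and 3(n + 1)/4."""
    data = sorted(values)
    n = len(data)
    if (n + 1) % 4 != 0:
        raise ValueError("this sketch only handles whole-number quartile positions")
    q1 = data[(n + 1) // 4 - 1]
    q3 = data[3 * (n + 1) // 4 - 1]
    return (q3 - q1) / 2

range_scores = [84, 88, 93, 99, 105, 107, 112, 118, 125]
data = [65, 66, 68, 69, 70, 72, 73, 75, 76, 78, 80]
print(value_range(range_scores), mean_deviation(data), quartile_deviation(data))
```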

4. Variance
It is the average squared deviation around the mean

For a population

σ² = ∑(x – µ)² / N

For a sample

s² = ∑(x – x̄)² / (n – 1)

Where: x is the observation value, µ is the population mean, x̄ is the sample mean, N is the total number of observations in the population and n is the total number of observations in the sample

5. Standard Deviation

It is the positive square root of the variance

For a population

σ = √[∑(x – µ)² / N]

For a sample

s = √[∑(x – x̄)² / (n – 1)]

For the example given earlier where s² = 5.0, the standard deviation, s, is

s = √s² = √5.0 = 2.2361

Characteristics of standard deviation:

➢ Most important and useful measure of dispersion

➢ Widely used in research and in drawing inferences from samples to populations.

Uses of the standard deviation:

➢ To determine how far the data are from the mean
➢ Indicator of how much variability there is in an entire distribution
➢ If the standard deviation is small, the values are clustered tightly about their mean
➢ If the standard deviation is large, the values become more and more scattered about their mean.

6. Coefficient of Variation, CV

A measure of relative variability that is used to compare two different sets of data

CV = (√s² / x̄) x 100 = (s / x̄) x 100

So, for the example given earlier where x̄ = 6 and s = 2.2361

CV = (s / x̄) x 100 = (2.2361 / 6) x 100 = 37.3%

7. Standard error of the mean, Sx̄

Sx̄ = s / √n = 2.2361 / √5 = 2.2361 / 2.2361 = 1.0
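
These figures can be reproduced with Python's `statistics` module. The data set below is hypothetical: the five values were chosen only because they give a mean of 6 and a sample variance of 5, matching the numbers quoted in this section. The original example data are not shown here.

```python
import math
from statistics import mean, stdev, variance

data = [3, 5, 6, 7, 9]          # hypothetical values with mean 6 and s² = 5

x_bar = mean(data)
s2 = variance(data)             # sample variance, divisor n - 1
s = stdev(data)                 # positive square root of the sample variance
cv = s / x_bar * 100            # coefficient of variation, in percent
se = s / math.sqrt(len(data))   # standard error of the mean

print(x_bar, s2, round(s, 4), round(cv, 1), round(se, 4))
```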

Measures of Central Tendency of Position
and of Variability of Grouped data
➢ When the researcher gathers all the needed data, the next
task is to organize and present them with the use of
appropriate tables and graphs.

Frequency distribution is one system used to facilitate the description of important features of data.

➢ A frequency distribution is a tabulation or grouping of data into appropriate categories showing the number of observations in each group or category.
➢ It has the following features: class intervals or class limits, class boundaries or exact limits of a class interval, class marks, class size and class frequency.

Features of a frequency distribution:

1. Class interval or class limit

➢ Is the grouping defined by a lower limit and an upper limit.

2. Class boundary or exact limits of a class interval

➢ Is obtained by subtracting one-half (0.5) from the lower limit and adding one-half (0.5) to the upper limit.

3. Class mark

➢ Is the midpoint or the middle value of the class interval
➢ Is obtained by finding the average of the lower class limit and the upper class limit

4. Class size

➢ Refers to the difference between the upper class boundary and the lower class boundary of a class interval

5. Class frequency

➢ Refers to the number of observations belonging to a class interval

Steps to be followed in constructing a grouped frequency distribution:

1. Find the range of the values (range = highest value – lowest value)

2. Determine the class width by dividing the range by the desired number of groupings or class intervals. The number of class intervals normally is not less than 10 and not more than 20.

C = range/desired number of class intervals

Where: # of class intervals, k = 1 + 3.3 log n

Scores of students in Stat 1

71 80 87 93 99 106 116
72 80 88 93 99 107 117
73 81 88 94 100 108 118
74 82 89 95 101 109 120
74 83 90 96 102 110 123
75 84 90 96 102 111
76 85 91 97 103 112
77 86 91 97 103 113
78 86 92 97 104 114
79 87 93 98 105 115

Range = highest value – lowest value
C = range/desired # of intervals
Lower limit = multiples of class size
Upper limit = lower limit + (C – 1)
Class marks = midpoint in each interval
Tally the frequencies for each interval and get the sum.
Calculate the cum. frequency and cum. percentage.

Class Interval   Class Boundary   Class Mark   Frequency   Cumulative Frequency   Cumulative %
120 – 124        119.5 – 124.5    122          2           65                     100
115 – 119        114.5 – 119.5    117          4           63                     96.9
110 – 114        109.5 – 114.5    112          5           59                     90.8
105 – 109        104.5 – 109.5    107          5           54                     83.1
100 – 104        99.5 – 104.5     102          7           49                     75.4
95 – 99          94.5 – 99.5      97           9           42                     64.6
90 – 94          89.5 – 94.5      92           9           33                     50.8
85 – 89          84.5 – 89.5      87           8           24                     36.9
80 – 84          79.5 – 84.5      82           6           16                     24.6
75 – 79          74.5 – 79.5      77           5           10                     15.4
70 – 74          69.5 – 74.5      72           5           5                      7.7
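
The table can be rebuilt from the 65 raw scores with a short Python sketch. The function `frequency_distribution` and its dictionary keys are hypothetical names; the lowest class limit (70) and class size (5) are read off the table. Classes that contain no scores would simply be missing from the output of this sketch.

```python
scores = [
    71, 80, 87, 93, 99, 106, 116, 72, 80, 88, 93, 99, 107, 117,
    73, 81, 88, 94, 100, 108, 118, 74, 82, 89, 95, 101, 109, 120,
    74, 83, 90, 96, 102, 110, 123, 75, 84, 90, 96, 102, 111,
    76, 85, 91, 97, 103, 112, 77, 86, 91, 97, 103, 113,
    78, 86, 92, 97, 104, 114, 79, 87, 93, 98, 105, 115,
]

def frequency_distribution(values, lowest_limit, class_size):
    """Tally values into classes and add class marks and cumulative figures."""
    counts = {}
    for v in values:
        lower = lowest_limit + ((v - lowest_limit) // class_size) * class_size
        counts[lower] = counts.get(lower, 0) + 1
    rows, cumulative = [], 0
    for lower in sorted(counts):                 # lowest class first
        upper = lower + class_size - 1           # upper limit = lower limit + (C - 1)
        cumulative += counts[lower]
        rows.append({
            "interval": (lower, upper),
            "boundary": (lower - 0.5, upper + 0.5),
            "mark": (lower + upper) / 2,
            "frequency": counts[lower],
            "cum_frequency": cumulative,
            "cum_percent": round(100 * cumulative / len(values), 1),
        })
    return rows

table = frequency_distribution(scores, lowest_limit=70, class_size=5)
for row in reversed(table):                      # print highest class first, as in the table
    print(row)
```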

Measure of Central Tendency

1. Mode (modal class) – is the class with the greatest frequency

2. Mean, x̄

x̄ = ∑fᵢXᵢ / n

Where: fᵢ = number of observations in a class
Xᵢ = midpoint or class mark of a class
n = total frequency in the sample distribution

Measure of Position

1. Median, Md

In the computation of the median, the position of the desired item is first determined by n/2, where n is the number of items

Md = LB + [(n/2 – cfp) / fmd] C

Where: LB = lower boundary of the median class (the median class is the class interval where n/2 is found)
cfp = cumulative frequency for the class interval preceding the median class
fmd = frequency of the median class
C = class size

2. Quartile, Q

Q1 = LB + [(n/4 – cfp) / fq] C

Q3 = LB + [(3n/4 – cfp) / fq] C

3. Deciles, D

D9 = LB + [(9n/10 – cfp) / fq] C

Here LB, cfp and fq refer to the class that contains the desired quartile or decile.
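
All of these grouped-data position formulas share one pattern, LB + [(target – cfp) / f] C, which the Python sketch below applies to the Stat 1 frequency table. The function `grouped_quantile` and the list layout are assumptions made for illustration. The results are estimates from the grouped table, not exact values from the raw scores.

```python
CLASS_SIZE = 5
# (lower class boundary, frequency) for each class, lowest class first
table = [(69.5, 5), (74.5, 5), (79.5, 6), (84.5, 8), (89.5, 9), (94.5, 9),
         (99.5, 7), (104.5, 5), (109.5, 5), (114.5, 4), (119.5, 2)]

def grouped_quantile(table, class_size, fraction):
    """Estimate the value below which `fraction` of the n observations fall."""
    n = sum(freq for _, freq in table)
    target = fraction * n                      # n/2, n/4, 3n/4, 9n/10, ...
    cumulative = 0                             # cfp: cumulative frequency before the class
    for lower_boundary, freq in table:
        if cumulative + freq >= target:        # this class contains the target position
            return lower_boundary + (target - cumulative) / freq * class_size
        cumulative += freq
    raise ValueError("fraction must be between 0 and 1")

median = grouped_quantile(table, CLASS_SIZE, 1 / 2)
q1 = grouped_quantile(table, CLASS_SIZE, 1 / 4)
q3 = grouped_quantile(table, CLASS_SIZE, 3 / 4)
d9 = grouped_quantile(table, CLASS_SIZE, 9 / 10)
print(round(median, 2), round(q1, 2), round(q3, 2), round(d9, 2))
```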

Measures of Variability

1. Mean Deviation, M.D.

M.D. = ∑f|x – x̄| / n

Where: x = midpoint or class mark of a class
x̄ = mean of the sample observations
f = frequency of the class

2. Variance, s²

s² = ∑f(x – x̄)² / (n – 1)

3. Standard Deviation

s = √[∑f(x – x̄)² / (n – 1)]
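
Applied to the Stat 1 frequency table, the grouped mean, variance and standard deviation can be computed as in the Python sketch below. The variable names are assumptions; each class mark stands in for every score in its class, so the results approximate the values that the raw scores would give.

```python
import math

class_marks = [72, 77, 82, 87, 92, 97, 102, 107, 112, 117, 122]
frequencies = [5, 5, 6, 8, 9, 9, 7, 5, 5, 4, 2]

n = sum(frequencies)                                   # total frequency
pairs = list(zip(frequencies, class_marks))

grouped_mean = sum(f * x for f, x in pairs) / n        # sum of f*X divided by n
grouped_variance = sum(f * (x - grouped_mean) ** 2 for f, x in pairs) / (n - 1)
grouped_sd = math.sqrt(grouped_variance)

print(n, round(grouped_mean, 2), round(grouped_variance, 2), round(grouped_sd, 2))
```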

Probability
➢ It is a mathematical measure of the likelihood of an event occurring

➢ Probabilities are always fractions or decimals indicating the portion or percent of the time that the event occurs

➢ Words associated with the notion of probability:

▪ Chance
▪ likelihood
Examples:

1. Forty percent (40%) chance of rain

Interpretation: A 40% chance of rain means that if we look at all
days with similar conditions, 40% of those days had rain

2. Batting average of 0.313

Interpretation: A batting average of 0.313 means that the player got a base hit in 31.3% (0.313) of his attempts at bat

➢ The smallest possible value of a probability is 0, that is, there is no chance for the event to occur.

➢ The highest possible value of a probability is 1 when expressed as a decimal, or 100% when expressed as a percentage, that is, the event will definitely occur.

Outcome – any possible result of a probability experiment.

Sample space – the collection of all possible outcomes for an experiment. The sample space is often denoted by S.

Examples:

1. The sample space of the experiment “flip a coin” has 2 outcomes, heads and tails, so we could write:

S = {heads, tails}

2. The experiment of rolling a die has 6 possible outcomes, 1 to 6, so:

S = {1, 2, 3, 4, 5, 6}

Events – any collection of outcomes from the sample space.

➢ Using the symbol P(A) for the probability of an event A to occur

0 ≤ P(A) ≤ 1

P(A) is greater than or equal to 0 but less than or equal to 1

The probability of the event A to occur ranges from 0 to 1

Considering P(Ă) for the probability of the non-occurrence of event A, the sum of the probabilities of occurrence and non-occurrence is equal to 1

P(A) + P(Ă) = 1    Complement Rule

Types of Probability

1. Classical Probability

✓ Result of performing an experiment

2. Empirical Probability

✓ Relative frequency

3. Subjective Probability

✓ Sometimes called an “educated guess”

Classical Probability

Result of performing an experiment that is usually in terms of:

➢ Tossing a coin

➢ Rolling a die

➢ Choosing a card from a full deck of cards

P = Number of outcomes favourable to the event / Number of possible outcomes

Empirical Probability

P = Frequency of occurrence favourable to the event / Total number of observations

Examples:

1. The statistics of a developing country show that out of 1000 live births, 150 babies die soon after birth. What is the probability of death for a child born in this country?

P (death soon after birth) = 150/1000 = 0.15 or 15%

2. Students in a class of 50 are classified according to grade range as follows:

Grade Range      Number of Students
1.00 – 1.25      3
1.26 – 1.50      14
1.51 – 1.75      25
1.76 – 2.00      8

Find the probability that a student has a grade of:

i.) 1.00 – 1.25
ii.) 1.26 – 1.50
iii.) 1.51 – 1.75
iv.) 1.76 – 2.00
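
Each answer is a relative frequency: the number of students in the grade range divided by the class size of 50. A short Python sketch, with dictionary keys that are just labels chosen for illustration:

```python
grade_counts = {
    "1.00-1.25": 3,
    "1.26-1.50": 14,
    "1.51-1.75": 25,
    "1.76-2.00": 8,
}

total = sum(grade_counts.values())                      # 50 students in the class
probabilities = {grade: count / total for grade, count in grade_counts.items()}

for grade, p in probabilities.items():
    print(grade, p)

# The same idea gives the live-birth example: 150 deaths out of 1000 births
p_death = 150 / 1000
```

The four probabilities add up to 1 because every student falls in exactly one grade range.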

Subjective Probability

➢ Result of a person’s evaluation of an event based on available evidence, experience, gut feel and even belief

➢ Will vary among persons who are evaluating the event

Examples:

1. What is the probability that a favoured athlete will win the


competition?

2. What is the probability of a candidate to win during the


election?

3. What is the probability that it will rain tomorrow?

Basic concepts in probability

1. Determine the number of possible ways that the outcome you are interested in can occur.

2. Determine the number of possible ways that all possible outcomes can occur.

3. Divide the first number by the second; the answer gives the probability that the event in question will occur.

Determining the total # of possible outcomes:

Based on 2 ways of arranging objects:

1. Combinations

2. Permutations

Combinations – are arrangements of objects with no consideration of sequence

nCk = n! / [k!(n – k)!]

Where: nCk is the combination of n taken k at a time; n is the total number of items; k is the number of items to be chosen

n! – n factorial

-- is the descending product beginning with n and ending with 1

n! = n x (n-1) x (n-2) x … x 2 x 1

Examples:

4! = 4x3x2x1 = 24

5! = 5x4x3x2x1 = 120

10! = 10x9x8x7x6x5x4x3x2x1 = 3,628,800

Permutations

-- are arrangements of objects with consideration of sequence.

nPk = n! / (n – k)!

Where: nPk is the permutation of n taken k at a time; n is the total number of items; k is the number of items to be chosen

Sample problems:

1. Given 3 letters (A, B and C) from which 2 are chosen, what is the probability of getting a combination of A and B? What is the probability of getting a BA arrangement?

Manually, combinations and permutations are determined as:

Combinations: AB, AC, BC (3 in total)

Permutations: AB, BA, AC, CA, BC, CB (6 in total)

Mathematically,

$$ {}_nC_k = \frac{n!}{k!\,(n-k)!} = \frac{3!}{2!\,(3-2)!} = \frac{3 \cdot 2 \cdot 1}{2 \cdot 1\,(1)} = \frac{6}{2} = 3 $$

$$ {}_nP_k = \frac{n!}{(n-k)!} = \frac{3!}{(3-2)!} = \frac{3 \cdot 2 \cdot 1}{1} = \frac{6}{1} = 6 $$

The probability of getting a combination of A and B

$$ P = \frac{\text{Number of outcomes favourable to the event}}{\text{Number of possible outcomes}} = \frac{1}{3} \approx 0.33 $$

The probability of getting a BA arrangement

$$ P = \frac{1}{6} \approx 0.17 $$
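The manual listing and the two probabilities can be reproduced by enumeration. The sketch below uses Python's `itertools`; the variable names are illustrative, and it assumes, as the sample problem does, that every combination (or arrangement) of 2 letters is equally likely.

```python
from itertools import combinations, permutations

letters = ["A", "B", "C"]

# All selections of 2 letters, ignoring order: AB, AC, BC.
all_combinations = ["".join(pair) for pair in combinations(letters, 2)]

# All arrangements of 2 letters, order matters: AB, AC, BA, BC, CA, CB.
all_permutations = ["".join(pair) for pair in permutations(letters, 2)]

# Probability = favourable outcomes / possible outcomes.
p_combination_ab = all_combinations.count("AB") / len(all_combinations)
p_arrangement_ba = all_permutations.count("BA") / len(all_permutations)

print(all_combinations, round(p_combination_ab, 2))  # 3 outcomes, 0.33
print(all_permutations, round(p_arrangement_ba, 2))  # 6 outcomes, 0.17
```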

Summary:

Data are collections of observations (such as measurements, genders, and survey responses). They can be obtained from various sources, depending on the purpose for which they are needed.
Numerical data should be rounded appropriately, based on the desired place value.
A population is the entirety of the objects, places, or events under study; its characteristics can be estimated through sampling techniques.
Measures of central tendency, together with measures of position and variability, describe a set of data; central tendency is the tendency of typical values to lie centrally within a set of data arranged according to magnitude.
For grouped data, the measures of central tendency, position, and variability require a frequency distribution table.
A probability is always a fraction or decimal between 0 and 1, indicating the portion or percent of the time that the event occurs.
Assessment:

Task 1. Answer each of the following briefly, in 2-3 sentences only.

a. Differentiate a quantitative variable from a qualitative variable. (5pts)
b. Give the difference between a continuous variable and a discrete variable. (5pts)
c. Compare and contrast dependent and independent variables. (5pts)
Task 2. Round the following number according to the required place value.
1237855678.9054
a. ones
b. tenths
c. ten thousands
d. hundred thousands
e. hundreds
f. millions
g. ten millions
h. thousands
i. hundredths
j. tens

Task 3. Problem Solving

1. In a 2-year ladderized degree program, not all students obtain their degree within the prescribed period. Some students take a little longer to finish the ladderized degree program. In a survey of 36 randomly selected, recently graduated students, the following results were obtained: the average number of years the students took to complete their course was 3.5, with a standard deviation of 1.5 years.

a. Find the 95% and 99% confidence intervals for the average number
of years that it takes the ladderized students to earn their degrees.

b. Calculate also the margin of error.

2. A statistics teacher would like to know the average length of time that her students spend studying their lessons every day. She interviewed 30 randomly selected students and obtained the following information: the daily average study time is 2 hours and 30 minutes, with a standard deviation of 45 minutes.

a. Find the 95% and 99% confidence intervals for the average daily study time of the students in the university.

b. Calculate also the margin of error.

Laboratory Activity No. 2
FREQUENCY DISTRIBUTION AND DATA PRESENTATION
Procedure:
1. Answer the problems below using the test scores of 50 students in Statistics 1:

37 27 19 68 63 33 19 38 49 77

25 48 73 63 54 25 23 47 52 68

70 53 65 24 78 47 63 50 66 39

38 50 36 62 32 68 43 64 48 53

45 67 40 42 57 75 72 27 72 25

a. Prepare a frequency distribution with an interval of 6.
b. Set up the less than cumulative frequency.
c. Construct a histogram and a frequency polygon.

2. Below are the test scores of 35 students:

45 37 59 62 84 87 67 78 94 78
90 54 68 73 37 83 48 75 39 66
82 58 62 60 43 78 46 50 43 35
73 67 40 74 53

a. Prepare a frequency distribution with an interval of 5.


b. Set up the greater than cumulative frequency.
c. Construct a histogram and a frequency polygon.

References:

CMU Statistics Manual, 2014.
APEC Agricultural and Technical Cooperation Working Group. (2013). Agricultural Statistics Best Practice Methodology Handbook.
DAVIS, BOB. (2000). Introduction to Agricultural Statistics. Delmar Cengage Learning; 1st Edition.
IDAIKKADAR, M. N. (2001). Agricultural Statistics: A Handbook for Developing Countries. 1st Edition.
RANGASWAMY, R. (2009). Agricultural Statistics. New Age International Publisher. ISBN 8122425925.

Prepared by:

JESSA D. PABILLORE
[email protected]
09179869017
