Chapter 1 Statistics
STATISTICS
Statistics and Probability
Qualitative data is an attribute whose value is indicated by a label. Data can be
classified into four levels of measurement: nominal, ordinal, interval and ratio.
Nominal scale data are not ordered and cannot be used in calculations. Examples of
nominal data include colors, names, labels and favorite foods, along with yes or no responses.
CHAPTER 1: INTRODUCTION TO STATISTICS
Example: Smartphone companies include Sony, Oppo, Asus, Samsung and Apple.
This is just a list and there is no agreed upon order. Some people may favor Apple but
that is a matter of opinion.
Ordinal scale data can be ordered but cannot be used in calculations. Example:
A cruise survey where the responses to questions about the cruise are “excellent,”
“good,” “satisfactory,” and “unsatisfactory.” These responses are ordered from the
most desired response to the least desired.
Interval data are measured on the interval scale. Interval level data can be used in
calculations such as differences, but ratios cannot be compared because the scale has
no true zero. For example, the highest daily temperature in Malaysia is between 30 °C
and 40 °C, and 80 °C is not four times as hot as 20 °C.
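The temperature remark can be checked numerically. The short Python sketch below is only an illustration: the helper name `celsius_to_kelvin` is our own, and it assumes the kelvin scale as the ratio-scale counterpart of Celsius. It shows that the ratio of two Celsius readings differs from the ratio of the same temperatures in kelvin, while the difference in degrees is unaffected.

```python
# Interval-scale illustration: Celsius has an arbitrary zero, so ratios of
# Celsius readings do not describe "how many times as hot" something is.

def celsius_to_kelvin(temp_c: float) -> float:
    """Convert Celsius to kelvin, a scale with a true zero (hypothetical helper)."""
    return temp_c + 273.15

hot_c, cool_c = 80.0, 20.0

celsius_ratio = hot_c / cool_c  # 4.0 on paper, but not physically meaningful
kelvin_ratio = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)
difference = hot_c - cool_c     # differences remain meaningful on an interval scale

print(f"Ratio of Celsius readings: {celsius_ratio:.2f}")
print(f"Ratio of kelvin readings:  {kelvin_ratio:.2f}")
print(f"Difference in degrees:     {difference:.1f}")
```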
Ratio data are measured on the ratio scale. Ratio data can be used in calculations,
and because the scale has a true zero point, ratios are meaningful. For example, four
multiple choice statistics final exam scores are 80, 68, 20 and 92 (out of a possible
100 points). The score of 80 is four times the score of 20.
Quantitative Data versus Qualitative Data
Definition
Quantitative data are the result of counting or measuring attributes of a population.
Qualitative data are the result of categorizing or describing attributes of a population.
Data that you will see
Quantitative data are always numbers.
Qualitative data are generally described by words or letters.
Examples
Quantitative data: amount of money you have; height; weight; number of people living in your town; number of students who take statistics.
Qualitative data: hair color; blood type; ethnic group; the car a person drives; the street a person lives on.
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$
Another measure of central tendency is the sample median. Given that the
Example
Suppose the data set is the following: 1.7, 2.2, 3.9, 3.11 and 14.7. Find the sample
mean and median.
Solution
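$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1.7 + 2.2 + 3.9 + 3.11 + 14.7}{5} = 5.12$
Arranged in increasing order, the data are 1.7, 2.2, 3.11, 3.9 and 14.7, so the median is the middle value, 3.11.
$\bar{x} = 5.12$ and $\tilde{x} = 3.11$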
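As a cross-check of this example, the following sketch uses Python's standard `statistics` module. The variable names are our own, and the code is offered only as one way to verify the arithmetic. Note that the median is taken from the data after sorting, not from the order in which the values are listed.

```python
from statistics import mean, median

# Data set from the example, in the order given in the text.
data = [1.7, 2.2, 3.9, 3.11, 14.7]

sample_mean = mean(data)
sample_median = median(data)  # median() sorts the data internally

print("Sorted data:  ", sorted(data))
print("Sample mean:  ", round(sample_mean, 2))
print("Sample median:", sample_median)
```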
where the population mean is $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$.
The symbol $s^2$ represents the sample variance; the sample standard deviation $s$ is the
square root of the sample variance. The sample variance and sample standard
deviation are
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ and $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Example
Suppose the data set is the following: 10, 15, 20, 25 and 30. Find the sample mean
and standard deviation.
Solution
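Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{10 + 15 + 20 + 25 + 30}{5} = 20$
Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{5-1}\left((10 - 20)^2 + (15 - 20)^2 + \cdots + (30 - 20)^2\right) = 62.5$
Sample standard deviation: $s = \sqrt{62.5} \approx 7.91$
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = 20$
Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = 50$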
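The difference between dividing by n − 1 (sample) and by N (population) can be seen with a short sketch using Python's standard `statistics` module. The variable names are our own and the code is only a check of the figures in this example.

```python
from statistics import mean, pvariance, stdev, variance

data = [10, 15, 20, 25, 30]

x_bar = mean(data)         # sample mean
s2 = variance(data)        # sample variance: divides by n - 1
s = stdev(data)            # sample standard deviation: square root of s2
sigma2 = pvariance(data)   # population variance: divides by N

print("mean =", x_bar)
print("sample variance =", s2)
print("sample standard deviation =", round(s, 2))
print("population variance =", sigma2)
```

The sample variance is larger than the population variance here because the same sum of squared deviations is divided by 4 instead of 5.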
FX570MS
FX570ES PLUS
The standard deviation provides a measure of the overall variation in a data set.
The standard deviation is always positive or zero.
The standard deviation is small when the data are all concentrated close to the
mean, exhibiting little variation or spread.
The standard deviation is larger when the data values are more spread out
from the mean, exhibiting more variation.
Example 1
A die is rolled. Find the probability of getting (a) a 3, (b) an even number and (c) at
least five.
Solution
(a) The event of interest is "getting a 3", so E = {3}.
The sample space S is given by S = {1,2,3,4,5,6}.
Hence
$P(E) = \frac{\text{Total number of outcomes in } E}{\text{Total number of outcomes in sample space}} = \frac{n(E)}{n(S)} = \frac{1}{6}$
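(b) The event of interest is "getting an even number", so E = {2, 4, 6}.
Hence
$P(E) = \frac{\text{Total number of outcomes in } E}{\text{Total number of outcomes in sample space}} = \frac{n(E)}{n(S)} = \frac{3}{6} = \frac{1}{2}$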
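The counting rule P(E) = n(E)/n(S) used in this example can be sketched in Python with exact fractions. The function name `probability` is our own, and the sketch assumes all six faces are equally likely. It evaluates all three events asked for in the question.

```python
from fractions import Fraction

# Sample space for one roll of a fair die.
sample_space = {1, 2, 3, 4, 5, 6}

def probability(event: set) -> Fraction:
    """P(E) = n(E) / n(S), valid only for equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

p_three = probability({3})
p_even = probability({x for x in sample_space if x % 2 == 0})
p_at_least_five = probability({x for x in sample_space if x >= 5})

print("P(3) =", p_three)
print("P(even) =", p_even)
print("P(at least five) =", p_at_least_five)
```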
P(B|A) = P(B)
P(A ∩ B) = P(A)P(B)
Example 2
Suppose the sample space S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {1, 2, 3, 4, 5}, B =
{4, 5, 6, 7, 8}, and C = {7, 9}. (a) Determine whether A and B are mutually
exclusive. (b) Determine whether A and C are mutually exclusive. (c) Find P(A ∪ B) and P(A ∪ C).
Solution
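(a) $A \cap B = \{4, 5\}$, so $P(A \cap B) = \frac{2}{10}$, which is not equal to zero. Hence A and B are not mutually exclusive.
(b) $A \cap C = \emptyset$, so $P(A \cap C) = 0$. Hence A and C are mutually exclusive.
(c) $P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{5}{10} + \frac{5}{10} - \frac{2}{10} = \frac{8}{10} = \frac{4}{5}$
$P(A \cup C) = P(A) + P(C) - P(A \cap C) = \frac{5}{10} + \frac{2}{10} - \frac{0}{10} = \frac{7}{10}$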
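Python's built-in sets map directly onto the events in this example. The sketch below assumes equally likely outcomes, and the helper names `prob` and `mutually_exclusive` are our own. It checks the addition rule against a direct count of the union.

```python
from fractions import Fraction

S = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {7, 9}

def prob(event: set) -> Fraction:
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event), len(S))

def mutually_exclusive(e1: set, e2: set) -> bool:
    """Two events are mutually exclusive when they share no outcomes."""
    return len(e1 & e2) == 0

# Addition rule: P(X or Y) = P(X) + P(Y) - P(X and Y)
p_a_or_b = prob(A) + prob(B) - prob(A & B)
p_a_or_c = prob(A) + prob(C) - prob(A & C)

print("A, B mutually exclusive?", mutually_exclusive(A, B))
print("A, C mutually exclusive?", mutually_exclusive(A, C))
print("P(A or B) =", p_a_or_b, "| direct count:", prob(A | B))
print("P(A or C) =", p_a_or_c, "| direct count:", prob(A | C))
```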
Example 3
Suppose we toss one fair, six-sided die. The sample space S = {1, 2, 3, 4, 5, 6}. Let
A be the event that the face is 2 or 3 and B the event that the face is even. Find
(a) $P(A \mid B)$ and (b) $P(A' \mid B)$.
Solution
(a) Event A = {2, 3}, Event B ={2, 4, 6} and A ∩ B = {2}.
Hence $P(A \cap B) = \frac{1}{6}$, $P(B) = \frac{3}{6}$ and $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}$
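The conditional probability formula can be checked by counting outcomes. In the sketch below the helper names `prob` and `conditional` are our own, and the die is assumed fair. The complement A′ is formed as the set difference S − A, which gives a way to evaluate part (b) of the question.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 3}      # face is 2 or 3
B = {2, 4, 6}   # face is even

def prob(event: set) -> Fraction:
    """Probability of an event under equally likely outcomes in S."""
    return Fraction(len(event), len(S))

def conditional(event: set, given: set) -> Fraction:
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

p_a_given_b = conditional(A, B)
p_not_a_given_b = conditional(S - A, B)  # complement of A, conditioned on B

print("P(A | B)  =", p_a_given_b)
print("P(A' | B) =", p_not_a_given_b)
```

The two results sum to 1, as expected for an event and its complement under the same condition.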
Example 4
(i) If 𝑃 (𝐴/𝐵) = 0.4, 𝑃 (𝐵) = 0.8 and 𝑃(𝐴) = 0.5, are the event A and B
Solution
(i) (a) 𝑃(𝐴/𝐵) = 0.4, 𝑃(𝐵) = 0.8 and 𝑃 (𝐴) = 0.5,
Two events are independent if P(A|B) = P(A),
𝑃 (𝐴/𝐵) = 0.4 ≠ 𝑃(𝐴) = 0.5
Hence A and B are not independent.
(b) $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, so $P(A \cap B) = P(A \mid B)P(B) = 0.4(0.8) = 0.32 \neq 0$.
Since $P(A \cap B) \neq 0$, A and B are not mutually exclusive.
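Both checks in this solution reduce to simple comparisons, sketched below in Python. The variable names are our own, and `math.isclose` is used instead of `==` because the probabilities are floating-point numbers.

```python
import math

# Given values from the example.
p_a_given_b = 0.4
p_b = 0.8
p_a = 0.5

# Independence test: A and B are independent if P(A | B) = P(A).
independent = math.isclose(p_a_given_b, p_a)

# Multiplication rule: P(A and B) = P(A | B) * P(B).
p_a_and_b = p_a_given_b * p_b

# Mutually exclusive events would have P(A and B) = 0.
mutually_exclusive = math.isclose(p_a_and_b, 0.0, abs_tol=1e-12)

print("Independent?", independent)
print("P(A and B) =", round(p_a_and_b, 2))
print("Mutually exclusive?", mutually_exclusive)
```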
The mean (also called the "expectation value" or "expected value") of a discrete
random variable X is
𝜇 = 𝐸 (𝑋) = ∑ 𝑥𝑃(𝑋 = 𝑥)
The mean of a random variable may be interpreted as the average of the values
assumed by the random variable in repeated trials of the experiment.
Example 1
A fair coin is tossed three times. Let X be the number of heads that are observed.
(a) Construct the probability distribution of X and show that X is a discrete random
variable.
(b) Find the probability that at least one head is observed.
(c) Find 𝐹(𝑥 ).
(d) Find mean and variance of X.
Solution
(a) Let X = the number of heads you get when you toss three fair coins.
𝑆 = {𝑇𝑇𝑇, 𝑇𝐻𝐻, 𝐻𝑇𝐻, 𝐻𝐻𝑇, 𝐻𝑇𝑇, 𝑇𝐻𝑇, 𝑇𝑇𝐻, 𝐻𝐻𝐻}
So x ∈ {0, 1, 2, 3}. Here X is described in words, while x is a numerical value that X can take.
𝑥 0 1 2 3
𝑃(𝑋 = 𝑥) 1/8 3/8 3/8 1/8
Since $\sum P(X = x) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1$, X is a discrete random variable.
(b) $P(X \ge 1) = \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = \frac{7}{8}$
(c) 𝐹 (𝑥 ) = 𝑃 (𝑋 ≤ 𝑥 )
𝐹(0) = 𝑃 (𝑋 ≤ 0) = 1/8; 𝐹(1) = 𝑃(𝑋 ≤ 1) = 4/8; 𝐹 (2) = 𝑃(𝑋 ≤ 2) = 7/8
and 𝐹 (3) = 𝑃(𝑋 ≤ 3) = 1
𝑥 0 1 2 3
𝑃(𝑋 = 𝑥) 1/8 3/8 3/8 1/8
𝐹 (𝑥 ) 1/8 4/8 7/8 1
Figure 3: Probability Distribution and Cumulative Distribution for Tossing a Fair Coin Thrice
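(d) $\mu = E(X) = \sum x P(X = x) = 0\left(\frac{1}{8}\right) + 1\left(\frac{3}{8}\right) + 2\left(\frac{3}{8}\right) + 3\left(\frac{1}{8}\right) = \frac{12}{8} = \frac{3}{2}$
$\sigma^2 = \sum (x - \mu)^2 P(X = x)$
$= \left(0 - \frac{3}{2}\right)^2\left(\frac{1}{8}\right) + \left(1 - \frac{3}{2}\right)^2\left(\frac{3}{8}\right) + \left(2 - \frac{3}{2}\right)^2\left(\frac{3}{8}\right) + \left(3 - \frac{3}{2}\right)^2\left(\frac{1}{8}\right) = 0.75$
$\sigma = \sqrt{\sigma^2} = 0.866$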
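The whole example can be reproduced by enumerating the eight equally likely outcomes. The sketch below is one possible approach using only the Python standard library. The names `pmf` and `cdf` are our own, and exact fractions are used so the results can be compared with the hand calculation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution P(X = x), ordered by x.
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

# Cumulative distribution F(x) = P(X <= x) as a running sum of the pmf.
cdf = {}
running = Fraction(0)
for x, p in pmf.items():
    running += p
    cdf[x] = running

p_at_least_one = sum(p for x, p in pmf.items() if x >= 1)
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print("pmf:", pmf)
print("cdf:", cdf)
print("P(X >= 1) =", p_at_least_one)
print("mean =", mean, "variance =", variance)
```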
Example 2
A pair of fair dice is rolled. Let X represent the sum of the dots on the top
faces of the two dice.
(a) Construct the probability distribution of X for a pair of fair dice and show that X
is a discrete random variable.
(b) Find 𝑃(𝑋 ≥ 9).
(c) Find the probability that X takes an even value.
(d) Find 𝐹(𝑥 ).
Solution
Let X represent the sum of two dice.
(a) The sample space is
𝑆 = {(1 + 1), (1 + 2), … , (5 + 6), (6 + 6)}, 𝑛(𝑆) = 36
Then the probability distribution of X is as follows:
x 2 3 4 5 6 7 8 9 10 11 12
𝑃 (𝑋 = 𝑥) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
Since $\sum P(X = x) = \frac{1}{36} + \frac{2}{36} + \cdots + \frac{2}{36} + \frac{1}{36} = 1$, X is a discrete random
variable.
(b) $P(X \ge 9) = P(X = 9) + P(X = 10) + P(X = 11) + P(X = 12) = \frac{10}{36}$
(d)
x 2 3 4 5 6 7 8 9 10 11 12
𝑃(𝑋 = 𝑥) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
𝐹(𝑥) 1/36 3/36 6/36 10/36 15/36 21/36 26/36 30/36 33/36 35/36 1
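The distribution of the sum of two dice can also be built by enumeration. The sketch below uses our own variable names and assumes both dice are fair. It reproduces the probability and cumulative tables and evaluates the probabilities asked for in parts (b) and (c). Python reduces fractions automatically, so 10/36 is printed as 5/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

# All 36 equally likely (first die, second die) pairs.
rolls = list(product(range(1, 7), repeat=2))

# Count how many pairs produce each sum.
counts = Counter(a + b for a, b in rolls)

# P(X = x) for x = 2, ..., 12, ordered by x.
pmf = {x: Fraction(counts[x], len(rolls)) for x in sorted(counts)}

# F(x) = P(X <= x): running totals of the pmf in the same order.
cdf = dict(zip(pmf, accumulate(pmf.values())))

p_at_least_9 = sum(p for x, p in pmf.items() if x >= 9)
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)

for x in pmf:
    print(x, pmf[x], cdf[x])
print("P(X >= 9) =", p_at_least_9)
print("P(X even) =", p_even)
```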
ii. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
iii. $P(a \le X \le b) = P(a < X < b) = \int_{a}^{b} f(x)\,dx$ = area under $f(x)$ from a to b,
as illustrated in Figure 1.
Example 1
Let the continuous variable X denote the current measured in a thin copper wire in
milliamperes. Assume that the range of X is [0, 20 mA], and assume that the
probability density function of X is
$f(x) = \begin{cases} 0.05, & 0 \le x \le 20 \\ 0, & \text{otherwise} \end{cases}$
Solution
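(a) X is a continuous random variable if $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_{0}^{20} 0.05\,dx + \int_{20}^{\infty} 0\,dx = 0 + \left[0.05x\right]_{0}^{20} + 0 = 1$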
iii. $P(X < 5) = \int_{-\infty}^{5} f(x)\,dx = \int_{0}^{5} 0.05\,dx = \left[0.05x\right]_{0}^{5} = 0.25$
(c) $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
For $x < 0$
$F(x) = P(X \le x) = \int_{-\infty}^{x} 0\,dt = 0$
For $0 \le x \le 20$
$F(x) = P(X \le x) = \int_{0}^{x} 0.05\,dt = 0.05x$
For $x > 20$
$F(x) = P(X \le x) = 1$
Then
$F(x) = \begin{cases} 0, & x < 0 \\ 0.05x, & 0 \le x \le 20 \\ 1, & x > 20 \end{cases}$
(d) $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{20} x(0.05)\,dx = \left[\frac{0.05x^2}{2}\right]_{0}^{20} = 10\ \text{mA}$
$\sigma^2 = V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \int_{0}^{20} x^2(0.05)\,dx - 10^2$
$= \left[\frac{0.05x^3}{3}\right]_{0}^{20} - 10^2$
$= \frac{400}{3} - 10^2 = 33.33\ \text{mA}^2$
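The integrals in this example can be checked numerically. The sketch below uses a simple midpoint rule written with the Python standard library only. The function names `pdf` and `integrate` and the step count are our own choices, so it should be read as an approximate check and not as an exact method.

```python
def pdf(x: float) -> float:
    """Density of the current X in mA: 0.05 on [0, 20], 0 elsewhere."""
    return 0.05 if 0.0 <= x <= 20.0 else 0.0

def integrate(func, lower: float, upper: float, steps: int = 10_000) -> float:
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width

# The density is zero outside [0, 20], so integrating over that interval suffices.
total_area = integrate(pdf, 0.0, 20.0)
p_less_than_5 = integrate(pdf, 0.0, 5.0)
mean = integrate(lambda x: x * pdf(x), 0.0, 20.0)
variance = integrate(lambda x: x * x * pdf(x), 0.0, 20.0) - mean ** 2

print(f"Total area under f(x): {total_area:.4f}")
print(f"P(X < 5):              {p_less_than_5:.4f}")
print(f"Mean (mA):             {mean:.4f}")
print(f"Variance (mA^2):       {variance:.4f}")
```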
Exercise
1. Determine the correct data type (quantitative or qualitative). Indicate whether
quantitative data are continuous or discrete.
i. The number of pairs of shoes you own
ii. The type of car you drive
iii. The place where you go on vacation
iv. The distance from your home to the nearest grocery store
v. The number of classes you take per school year.
vi. The tuition for your classes
vii. The type of calculator you use
viii. Movie ratings
ix. Political party preferences
x. Weights of sumo wrestlers
xi. Amount of money (in dollars) won playing poker
xii. Number of correct answers on a quiz
xiii. People's attitudes toward the government
xiv. IQ scores
2. A fair coin is tossed twice. Let X be the number of heads that are observed.
a) Construct the probability distribution of X.
b) Find the probability that at least one head is observed.
c) Find 𝐹 (𝑥 ).
d) Find mean and variance of X.
3. A random variable X has the uniform distribution on the interval [0,1]: the density
function is 𝑓 (𝑥 ) = 1 if x is between 0 and 1 and 𝑓 (𝑥 ) = 0 for all other values of x.
a) Show that X is a continuous random variable.
b) Find (i) 𝑃(0.5 < 𝑋 ≤ 1), (ii) 𝑃(𝑋 > 0.10) and (iii) 𝑃(𝑋 < 0.5) .
c) Find 𝐹 (𝑥 ).
d) Find mean and variance of X.