Chapter1 Statistics

This document provides an introduction to statistics, including key concepts and terminology. It discusses: 1. Descriptive statistics, which involves collecting and presenting data through tables and graphs like histograms, bar graphs, and box plots. 2. Inferential statistics, which allows drawing conclusions about a population based on a sample. It involves estimation and testing hypotheses. 3. Key terms like population, sample, variable, parameter, and qualitative vs. quantitative data. It also discusses different scales of measurement for qualitative data. 4. Measures of central tendency like the mean and median, and how skewness can impact these values. 5. Measures of spread, specifically the standard deviation as a measure

Uploaded by

arokia samy

CHAPTER 1

STATISTICS

1.1 INTRODUCTION TO STATISTICS

Statistics is a mathematical science that deals with the collection, organization,
presentation, analysis, and interpretation of data in such a way that meaningful
conclusions can be drawn from them. In general, its investigations and analyses fall
into two broad categories called descriptive and inferential statistics.

1.1.1 Descriptive Statistics

Once you have collected data, what will you do with it? Data can be described and
presented in many different formats, such as tables or graphs. This area of statistics
is called “Descriptive Statistics.”
A statistical graph is a tool that helps you learn about the shape or distribution of a
sample or a population. A graph can be a more effective way of presenting data than a
mass of numbers because we can see where data clusters and where there are only a
few data values. Newspapers and the Internet use graphs to show trends and to enable
readers to compare facts and figures quickly. Statisticians often graph data first to get
a picture of the data. Some of the types of graphs that are used to summarize and
organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the
frequency polygon (a type of broken line graph), the pie chart, and the box plot.

Figure 1: The types of graphs

Statistics and Probability

1.1.2 Inferential Statistics

Inferential statistics is the process of drawing conclusions about a population based
on certain statistics calculated from a sample of data drawn from that population.
Statistical inference is important in order to analyze data properly. Indeed, proper
data analysis is necessary to interpret research results and to draw appropriate
conclusions. In this course, two basic concepts of statistical inference are presented,
estimation and hypothesis testing, and they are applied to the comparison of means,
variances, and proportions.

1.1.3 Terms and Definitions

Population is the collection of all individuals, objects, or measurements whose
properties are being studied.
Sample is a subset of the population studied.
Statistic is a numerical characteristic of the sample; a statistic estimates the
corresponding population parameter.
Variable is a characteristic of interest for each person or object in a population.
Numerical variable is a variable whose values are indicated by numbers.
Parameter is a number that represents a population characteristic and that
generally cannot be determined easily.
Proportion is the number of successes divided by the total number in the sample.
Random variable notation: upper-case letters such as X denote a random variable;
lower-case letters such as x denote a value of a random variable.

1.1.4 Type of Data

Most data can be put into two groups: qualitative or quantitative. Quantitative data
can be separated into two subgroups: discrete and continuous. Quantitative data are
discrete if the data values are the result of counting (such as the number of
students, cars, books, and children). Quantitative data are continuous if they are the
result of measuring (such as distance, weight, speed, time, and pressure).

Qualitative data is an attribute whose value is indicated by a label. Data can also be
classified into four levels of measurement: nominal, ordinal, interval, and ratio. The
nominal and ordinal scales apply to qualitative data; the interval and ratio scales
apply to quantitative data.
Nominal scale data are not ordered and cannot be used in calculations. Examples of
nominal data include colors, names, labels, favorite foods, and yes-or-no responses.


Example: Smartphone companies include Sony, Oppo, Asus, Samsung, and Apple.
This is just a list and there is no agreed-upon order. Some people may favor Apple, but
that is a matter of opinion.
The ordinal scale data can be ordered but cannot be used in calculations. Example:
A cruise survey where the responses to questions about the cruise are “excellent,”
“good,” “satisfactory,” and “unsatisfactory.” These responses are ordered from the
most desired response to the least desired.
Interval data are measured on the interval scale. Differences between interval-level
values are meaningful, but ratios are not, because the scale has no true zero. For
example, the highest daily temperature in Malaysia is between 30 °C and 40 °C; 80 °C is
60 °C hotter than 20 °C, but it is not four times as hot as 20 °C.
Ratio data are measured on the ratio scale. The scale has a true zero, so both
differences and ratios are meaningful, and the values are typically not negative. For
example, four multiple-choice statistics final exam scores are 80, 68, 20, and 92 (out
of a possible 100 points); a score of 80 is four times a score of 20.
                        Quantitative Data                         Qualitative Data
Definition              Quantitative data are the result of       Qualitative data are the result of
                        counting or measuring attributes of       categorizing or describing attributes
                        a population.                             of a population.
Data that you will see  Quantitative data are always numbers.     Qualitative data are generally
                                                                  described by words or letters.
Examples                Amount of money you have                  Hair color
                        Height, weight                            Blood type
                        Number of people living in your town      Ethnic group
                        Number of students who take statistics    The car a person drives
                                                                  The street a person lives on

1.1.5 Measures of the Center of the Data

The “center” of a data set is also a way of describing location. The two most widely
used measures of the “center” of the data are the mean (average) and the median.
The mean is simply a numerical average. Suppose that the observations in a
sample are $x_1, x_2, \dots, x_n$. The sample mean, denoted by $\bar{x}$, is

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Another measure of central tendency is the sample median. Given that the
observations in a sample are $x_1, x_2, \dots, x_n$, arranged in increasing order of
magnitude, the sample median is

$$\tilde{x} = \begin{cases} x_{(n+1)/2}, & \text{if } n \text{ is odd} \\ \tfrac{1}{2}\left(x_{n/2} + x_{n/2+1}\right), & \text{if } n \text{ is even} \end{cases}$$

Example
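Suppose the data set is the following: 1.7, 2.2, 3.9, 3.11 and 14.7. Find the sample
mean and median.

Solution
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1.7 + 2.2 + 3.9 + 3.11 + 14.7}{5} = \frac{25.61}{5} \approx 5.12$$
Arranged in increasing order, the data are 1.7, 2.2, 3.11, 3.9, 14.7. Since $n = 5$ is
odd, the median is the third ordered value.
$\bar{x} = 5.12$ and $\tilde{x} = 3.11$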

The mean is influenced considerably by the presence of the extreme observation 14.7,
whereas the median places emphasis on the true “centre” of the data set.
The median is generally a better measure of the center when there are extreme
values or outliers because it is not affected by the precise numerical values of the
outliers. The mean is nevertheless the most common measure of the center.
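The effect of an extreme value on the two measures can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module on the data set from the example; the variable names are illustrative only.

```python
from statistics import mean, median

data = [1.7, 2.2, 3.9, 3.11, 14.7]

x_bar = mean(data)       # arithmetic mean of all five values
x_tilde = median(data)   # middle value after sorting: 1.7, 2.2, 3.11, 3.9, 14.7

# Remove the extreme observation to see how each measure reacts.
without_outlier = [x for x in data if x != 14.7]
x_bar_trimmed = mean(without_outlier)
x_tilde_trimmed = median(without_outlier)

print(f"with 14.7:    mean = {x_bar:.3f}, median = {x_tilde:.3f}")
print(f"without 14.7: mean = {x_bar_trimmed:.3f}, median = {x_tilde_trimmed:.3f}")
```

Dropping 14.7 moves the mean by more than two units, while the median moves by less than half a unit, which is the sense in which the median is less sensitive to outliers.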

Skewness and the Mean, Median, and Mode

The distribution is symmetrical if the shapes to the left and the right of a vertical
line through its center are mirror images of each other. In a perfectly symmetrical
distribution, the mean and the median are the same (Figure 2a). A distribution that is
skewed to the left (negative skew) typically has a mean that is less than the median,
which is often less than the mode (mean < median < mode) (Figure 2b). A distribution
that is skewed to the right (positive skew) typically has the mean as the largest of
the three and the mode as the smallest (mean > median > mode) (Figure 2c).

Figure 2: Skewed Distribution of the Data: (a) symmetrical, (b) skewed to the left, (c) skewed to the right


1.1.6 Measures of the Spread of Data

An important characteristic of any set of data is the variation in the data. In some data
sets, the data values are concentrated closely near the mean; in other data sets, the
data values are more widely spread out from the mean. The most common measure of
variation, or spread, is the standard deviation. The standard deviation is a number
that measures how far data values are from their mean.
The symbol $\sigma^2$ represents the population variance; the population standard
deviation $\sigma$ is the square root of the population variance. Suppose that the
observations in a population are $x_1, x_2, \dots, x_N$. The population variance and
standard deviation are

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 \quad\text{and}\quad \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

where the population mean is $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$.
The symbol $s^2$ represents the sample variance; the sample standard deviation $s$ is
the square root of the sample variance. The sample variance and sample standard
deviation are

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \quad\text{and}\quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

Example
Suppose the data set is the following: 10, 15, 20, 25 and 30. Find the sample mean
and standard deviation.

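Solution
Sample mean:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{10 + 15 + 20 + 25 + 30}{5} = 20$$

Sample variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{5-1}\left((10-20)^2 + (15-20)^2 + \cdots + (30-20)^2\right) = \frac{250}{4} = 62.5$$

Sample standard deviation: $s = \sqrt{s^2} = \sqrt{62.5} = 7.9057$

If the same five values are treated as an entire population instead:

Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = 20$

Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{250}{5} = 50$

Population standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{50} = 7.07107$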
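The difference between dividing by n − 1 and dividing by N can be reproduced with Python's standard `statistics` module, which provides separate functions for the sample and population versions. This is only a sketch for checking the worked values; the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [10, 15, 20, 25, 30]

x_bar = mean(data)          # 20

# Sample versions divide the sum of squared deviations by n - 1.
s2 = variance(data)         # 250 / 4
s = stdev(data)

# Population versions divide by N.
sigma2 = pvariance(data)    # 250 / 5
sigma = pstdev(data)

print(f"mean = {x_bar}")
print(f"sample:     s^2 = {s2}, s = {s:.4f}")
print(f"population: sigma^2 = {sigma2}, sigma = {sigma:.5f}")
```

The sample standard deviation is always at least as large as the population formula applied to the same values, because its divisor is smaller.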


Calculation of Mean and Standard Deviation Using Calculator

FX570MS

FX570ES PLUS


The standard deviation provides a measure of the overall variation in a data set.
The standard deviation is always positive or zero.
• The standard deviation is small when the data are all concentrated close to the
mean, exhibiting little variation or spread.
• The standard deviation is larger when the data values are more spread out
from the mean, exhibiting more variation.

1.2 INTRODUCTION TO PROBABILITY

Probability is used to quantify the chance that an outcome of a random experiment
will occur, or the degree of belief that the outcome will occur. For example, if you
toss a coin, will you obtain a head or a tail? If you roll a die, will you obtain 1, 2,
3, 4, 5 or 6?
The value of a probability is a number between 0 and 1 inclusive. An event that
cannot occur has a probability (of happening) equal to 0, and an event that is certain
to occur has a probability equal to 1.
In order to quantify probabilities, we need to define the sample space of an
experiment and the events that may be associated with that experiment. The sample
space (S) is the set of all possible outcomes of an experiment. An event (E) is a
subset of the sample space, that is, some specific outcome or collection of outcomes
of the experiment. When all outcomes in the sample space are equally likely, the
probability of E is defined as

$$P(E) = \frac{\text{total number of outcomes in } E}{\text{total number of outcomes in the sample space}} = \frac{n(E)}{n(S)}$$

Example 1
A die is rolled. Find the probability of getting (a) a 3, (b) an even number and (c) at
least five.

Solution
The sample space S is given by S = {1, 2, 3, 4, 5, 6}, so n(S) = 6.

(a) The event of interest is “getting a 3”, so E = {3}. Hence
$$P(E) = \frac{n(E)}{n(S)} = \frac{1}{6}$$

(b) The event of interest is “an even number”, so E = {2, 4, 6}. Hence
$$P(E) = \frac{n(E)}{n(S)} = \frac{3}{6} = \frac{1}{2}$$

(c) The event of interest is “at least five”, so E = {5, 6}. Hence
$$P(E) = \frac{n(E)}{n(S)} = \frac{2}{6} = \frac{1}{3}$$
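The counting definition P(E) = n(E)/n(S) maps directly onto set sizes. Below is a minimal sketch, assuming equally likely outcomes; the helper name `classical_probability` is an illustrative choice and not part of the chapter.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Return n(E) / n(S), assuming every outcome in S is equally likely."""
    favourable = event & sample_space   # ignore anything outside S
    return Fraction(len(favourable), len(sample_space))


S = {1, 2, 3, 4, 5, 6}   # one roll of a die

p_three = classical_probability({3}, S)
p_even = classical_probability({x for x in S if x % 2 == 0}, S)
p_at_least_five = classical_probability({x for x in S if x >= 5}, S)

print(p_three, p_even, p_at_least_five)
```

Using `Fraction` keeps the answers exact and reduces them automatically, so 3/6 is reported as 1/2.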

1.2.1 Terms and Definitions

i. “OR” Event, 𝑨 ∪ 𝑩 - An outcome is in the event A OR B = 𝑨 ∪ 𝑩 if the outcome
is in A or is in B or is in both A and B.
ii. “AND” Event, 𝑨 ∩ 𝑩 - An outcome is in the event A AND B = 𝑨 ∩ 𝑩 if the
outcome is in both A and B at the same time.
iii. Complementary Event, A′ - The complement of event A is denoted A′.
A′ consists of all outcomes that are NOT in A.
iv. Mutually Exclusive Events - A and B are mutually exclusive events if they
cannot occur at the same time. This means that A and B do not share any outcomes
and P(A ∩ B) = 0.
v. The Addition Rule - If A and B are defined on a sample space, then:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B).
If A and B are mutually exclusive, then P(A ∩ B) = 0. Then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B) becomes
P(A ∪ B) = P(A) + P(B).
vi. Conditional Probability of an Event
The conditional probability of A given B is written P(A|B).
P(A|B) is the probability that event A will occur given that the event B has already
occurred. Provided that P(B) > 0, the formula to calculate P(A|B) is
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
vii. Independent Events
Two events A and B are independent if the occurrence of one does not affect the
chance that the other occurs. Two events are independent if any one of the following
equivalent conditions holds (the first assumes P(B) > 0 and the second P(A) > 0):
• P(A|B) = P(A)
• P(B|A) = P(B)
• P(A ∩ B) = P(A)P(B)

Example 2
Suppose the sample space S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Let A = {1, 2, 3, 4, 5}, B =
{4, 5, 6, 7, 8}, and C = {7, 9}. Determine whether (a) A and B are mutually
exclusive and (b) A and C are mutually exclusive. (c) Find P(A ∪ B) and P(A ∪ C).

Solution
(a) A AND B = {4, 5}, so $P(A \cap B) = \frac{2}{10}$, which is not equal to zero.

Therefore, A and B are not mutually exclusive.

(b) A AND C = { }; the two events have no outcomes in common, so P(A ∩ C) = 0.

Therefore, A and C are mutually exclusive.

(c) $P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{5}{10} + \frac{5}{10} - \frac{2}{10} = \frac{8}{10} = \frac{4}{5}$

$P(A \cup C) = P(A) + P(C) - P(A \cap C) = \frac{5}{10} + \frac{2}{10} - \frac{0}{10} = \frac{7}{10}$
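The "OR", "AND" and mutually exclusive definitions correspond to set union, set intersection and an empty intersection. The sketch below applies them to the sets of Example 2 and compares the addition rule with a direct count of the union; the helper `prob` is an illustrative name and assumes equally likely outcomes.

```python
from fractions import Fraction

S = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
C = {7, 9}


def prob(event):
    """P(E) = n(E) / n(S) for equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))


# Mutually exclusive means the intersection has probability zero.
a_b_exclusive = prob(A & B) == 0
a_c_exclusive = prob(A & C) == 0

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a_or_b = prob(A) + prob(B) - prob(A & B)
p_a_or_c = prob(A) + prob(C) - prob(A & C)

# The rule should agree with counting the union directly.
rule_matches_count = (p_a_or_b == prob(A | B)) and (p_a_or_c == prob(A | C))

print(a_b_exclusive, a_c_exclusive, p_a_or_b, p_a_or_c, rule_matches_count)
```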

Example 3
Suppose we toss one fair, six-sided die. The sample space is S = {1, 2, 3, 4, 5, 6}. Let
A be the event that the face is 2 or 3 and B the event that the face is even. Find
(a) $P(A \mid B)$ and (b) $P(A' \mid B)$.

Solution
(a) Event A = {2, 3}, Event B = {2, 4, 6} and A ∩ B = {2}.

Hence $P(A \cap B) = \frac{1}{6}$, $P(B) = \frac{3}{6}$ and
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}$$

(b) Event A′ = {1, 4, 5, 6}, Event B = {2, 4, 6} and A′ ∩ B = {4, 6}.

Hence $P(A' \cap B) = \frac{2}{6}$, $P(B) = \frac{3}{6}$ and
$$P(A' \mid B) = \frac{P(A' \cap B)}{P(B)} = \frac{2/6}{1/2} = \frac{2}{3}$$
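The conditional probability formula can be evaluated directly from the sets in Example 3. The following sketch assumes a fair die, so all six outcomes are equally likely; the helper names `prob` and `conditional` are illustrative only.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 3}        # face is 2 or 3
B = {2, 4, 6}     # face is even


def prob(event):
    """P(E) = n(E) / n(S) for equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given), requires P(given) > 0."""
    if prob(given) == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(event & given) / prob(given)


A_complement = S - A

p_a_given_b = conditional(A, B)
p_not_a_given_b = conditional(A_complement, B)

print(p_a_given_b, p_not_a_given_b)
```

The two results add up to 1, as they must: given B, either A occurs or its complement does.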

Example 4
(i) If $P(A \mid B) = 0.4$, $P(B) = 0.8$ and $P(A) = 0.5$, are the events A and B
(a) independent, and (b) mutually exclusive?

(ii) If $P(A) = 0.5$, $P(B) = 0.3$, and A and B are mutually exclusive, are they
independent?

Solution
(i) (a) $P(A \mid B) = 0.4$, $P(B) = 0.8$ and $P(A) = 0.5$.
Two events are independent if $P(A \mid B) = P(A)$. Here
$P(A \mid B) = 0.4 \ne P(A) = 0.5$.
Hence A and B are not independent.

(b) $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, so $P(A \cap B) = P(A \mid B)\,P(B) = 0.4(0.8) = 0.32 \ne 0$.

Hence A and B are not mutually exclusive events.

(ii) A and B are mutually exclusive events, so $P(A \cap B) = 0$.

Two events are independent if $P(A \cap B) = P(A)P(B)$. Here
$P(A)P(B) = 0.5(0.3) = 0.15 \ne P(A \cap B) = 0$.
Hence, the two events are not independent.
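The two checks in Example 4 are simple comparisons once the probabilities are written down. The sketch below uses exact fractions rather than floating-point numbers so that the equality tests are reliable; the variable names are illustrative only.

```python
from fractions import Fraction

# Part (i): given values
p_a_given_b = Fraction("0.4")
p_b = Fraction("0.8")
p_a = Fraction("0.5")

independent_i = p_a_given_b == p_a      # independence test P(A|B) = P(A)
p_a_and_b = p_a_given_b * p_b           # multiplication rule, gives 0.32
exclusive_i = p_a_and_b == 0            # mutually exclusive test

# Part (ii): mutually exclusive events, so P(A and B) = 0
p_a2 = Fraction("0.5")
p_b2 = Fraction("0.3")
p_a2_and_b2 = Fraction(0)
independent_ii = p_a2_and_b2 == p_a2 * p_b2   # independence test P(A and B) = P(A)P(B)

print(independent_i, float(p_a_and_b), exclusive_i, independent_ii)
```

Part (ii) illustrates a general point: two mutually exclusive events that both have positive probability cannot be independent, because P(A)P(B) is then positive while P(A ∩ B) is zero.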

1.3 DISCRETE RANDOM VARIABLES

The probability distribution of a discrete random variable X is a list of each possible
value of X together with the probability that X takes that value in one trial of the
experiment.
The probability distribution of a discrete random variable X must satisfy the
following two conditions:
i. Each probability $P(X = x)$ must be between 0 and 1: $0 \le P(X = x) \le 1$.
ii. The sum of all the possible probabilities is 1: $\sum P(X = x) = 1$.

1.3.1 Cumulative Distribution Function

The cumulative distribution function of a discrete random variable X, denoted $F(x)$, is
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i)$$

1.3.2 The Mean and Variance of a Discrete Random Variable


The mean (also called the "expectation value" or "expected value") of a discrete
random variable X is

𝜇 = 𝐸 (𝑋) = ∑ 𝑥𝑃(𝑋 = 𝑥)

The mean of a random variable may be interpreted as the average of the values
assumed by the random variable in repeated trials of the experiment.

The variance $\sigma^2$ of a discrete random variable X is

$$\sigma^2 = V(X) = \sum (x - \mu)^2 P(X = x)$$

The standard deviation of X is $\sigma = \sqrt{\sigma^2}$.

The variance and standard deviation of a discrete random variable X may be
interpreted as measures of the variability of the values assumed by the random
variable in repeated trials of the experiment.

Rules for Means and Variances

For constants a and b:
i. $E(\pm a) = \pm a$
ii. $V(\pm a) = 0$
iii. $E(a \pm bX) = a \pm bE(X)$
iv. $V(a \pm bX) = b^2 V(X)$
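Rules iii and iv can be checked numerically by building the distribution of Y = a + bX and comparing its mean and variance with the shortcut formulas. The distribution and the constants used below are hypothetical values chosen only for illustration, and the helper names are not part of the chapter.

```python
from fractions import Fraction


def expectation(pmf):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """V(X) = sum of (x - mu)^2 * P(X = x)."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def linear_transform(pmf, a, b):
    """Distribution of Y = a + b*X, merging probabilities of equal y values."""
    out = {}
    for x, p in pmf.items():
        y = a + b * x
        out[y] = out.get(y, 0) + p
    return out


# Hypothetical distribution of X and hypothetical constants a, b.
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
a, b = 3, -2

pmf_y = linear_transform(pmf_x, a, b)

mean_direct = expectation(pmf_y)
mean_by_rule = a + b * expectation(pmf_x)      # rule iii
var_direct = variance(pmf_y)
var_by_rule = b ** 2 * variance(pmf_x)         # rule iv

print(mean_direct, mean_by_rule, var_direct, var_by_rule)
```

Note that the sign of b does not matter for the variance, since b enters only through b².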

Example 1
A fair coin is tossed three times. Let X be the number of heads that are observed.
(a) Construct the probability distribution of X and show that X is a discrete random
variable.
(b) Find the probability that at least one head is observed.
(c) Find 𝐹(𝑥 ).
(d) Find mean and variance of X.

Solution
(a) Let X = the number of heads obtained when a fair coin is tossed three times.
The sample space is
$$S = \{TTT, THH, HTH, HHT, HTT, THT, TTH, HHH\}$$
so X takes the values x = 0, 1, 2, 3. (X is described in words; x is a number.)

Then the probability distribution of X is as follows:

x          0     1     2     3
P(X = x)   1/8   3/8   3/8   1/8

Since each probability lies between 0 and 1 and
$\sum P(X = x) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1$, this is a valid
probability distribution, and X is a discrete random variable.

(b) $P(X \ge 1) = \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = \frac{7}{8}$

(c) $F(x) = P(X \le x)$:
$F(0) = P(X \le 0) = 1/8$; $F(1) = P(X \le 1) = 4/8$; $F(2) = P(X \le 2) = 7/8$
and $F(3) = P(X \le 3) = 1$

x          0     1     2     3
P(X = x)   1/8   3/8   3/8   1/8
F(x)       1/8   4/8   7/8   1

Figure 3: Probability Distribution and Cumulative Distribution for Tossing a Fair Coin Thrice

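(d) $$\mu = E(X) = \sum xP(X = x) = 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) = \frac{12}{8} = \frac{3}{2}$$

$$\sigma^2 = \sum (x - \mu)^2 P(X = x) = \left(0 - \tfrac{3}{2}\right)^2\left(\tfrac{1}{8}\right) + \left(1 - \tfrac{3}{2}\right)^2\left(\tfrac{3}{8}\right) + \left(2 - \tfrac{3}{2}\right)^2\left(\tfrac{3}{8}\right) + \left(3 - \tfrac{3}{2}\right)^2\left(\tfrac{1}{8}\right) = \frac{6}{8} = 0.75$$

$\sigma = \sqrt{\sigma^2} = \sqrt{0.75} \approx 0.866$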
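The whole example can be reproduced by enumerating the eight equally likely outcomes and counting heads. This is a sketch using only the Python standard library; evaluating the sums exactly gives a mean of 3/2 and a variance of 3/4.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product
from math import sqrt

# All 2^3 equally likely sequences of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many sequences give each number of heads.
heads = Counter(seq.count("H") for seq in outcomes)
pmf = {x: Fraction(heads[x], len(outcomes)) for x in sorted(heads)}

# Cumulative distribution F(x) = P(X <= x) as a running sum of the pmf.
cdf = dict(zip(pmf, accumulate(pmf.values())))

p_at_least_one = 1 - pmf[0]

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())
sigma = sqrt(var)

print(pmf)
print(cdf)
print(p_at_least_one, mu, var, round(sigma, 3))
```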

Example 2
A pair of fair dice is rolled. Let X represent the sum of the dots on the top faces of
the two dice.
(a) Construct the probability distribution of X for a pair of fair dice and show that X
is a discrete random variable.
(b) Find 𝑃(𝑋 ≥ 9).
(c) Find the probability that X takes an even value.
(d) Find 𝐹(𝑥 ).


Solution
Let X represent the sum of two dice.
(a) The sample space is
$$S = \{(1 + 1), (1 + 2), \dots, (5 + 6), (6 + 6)\}, \quad n(S) = 36$$
Then the probability distribution of X is as follows:

x          2     3     4     5     6     7     8     9     10    11    12
P(X = x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Since each probability lies between 0 and 1 and
$\sum P(X = x) = \frac{1}{36} + \frac{2}{36} + \cdots + \frac{2}{36} + \frac{1}{36} = 1$, this is a
valid probability distribution, and X is a discrete random variable.

Figure 4: Probability Distribution for Tossing Two Fair Dice

(b) $P(X \ge 9) = P(X = 9) + P(X = 10) + P(X = 11) + P(X = 12) = \frac{10}{36}$

(c) The probability that X takes an even value is

$P(X = 2) + P(X = 4) + \cdots + P(X = 12) = \frac{18}{36}$

(d)
x          2     3     4     5      6      7      8      9      10     11     12
P(X = x)   1/36  2/36  3/36  4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
F(x)       1/36  3/36  6/36  10/36  15/36  21/36  26/36  30/36  33/36  35/36  1
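The distribution of the sum of two dice can be built by enumerating all 36 equally likely pairs. A sketch using the Python standard library follows; note that `Fraction` prints reduced values, so 10/36 appears as 5/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import accumulate, product

faces = range(1, 7)

# Count how many of the 36 ordered pairs give each total.
totals = Counter(first + second for first, second in product(faces, repeat=2))
pmf = {x: Fraction(totals[x], 36) for x in sorted(totals)}

# F(x) = P(X <= x) as a running sum over the ordered values of x.
cdf = dict(zip(pmf, accumulate(pmf.values())))

p_at_least_9 = sum(p for x, p in pmf.items() if x >= 9)
p_even = sum(p for x, p in pmf.items() if x % 2 == 0)

for x in pmf:
    print(x, pmf[x], cdf[x])
print(p_at_least_9, p_even)
```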

1.4 CONTINUOUS RANDOM VARIABLES

The probability distribution of a continuous random variable X is an assignment of
probabilities to intervals of real numbers using a function $f(x)$, called a density
function. For a continuous random variable X, a probability density function is a
function such that
i. $f(x) \ge 0$
ii. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
iii. $P(a \le X \le b) = P(a < X < b) = \int_{a}^{b} f(x)\,dx$ = area under $f(x)$ from a to b,
as illustrated in Figure 5.

Figure 5: Probability Given as Area of a Region under a Curve

A continuous variable is a variable whose value is obtained by measuring.

Examples: height of students in class, weight of students in class, time it takes to
get to school, distance traveled between classes.

1.4.1 The cumulative distribution function

The cumulative distribution function of a continuous random variable X is
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{for } -\infty < x < \infty$$
where t is a dummy variable of integration.

1.4.2 Mean and Variance of a continuous random variable

Suppose X is a continuous random variable with probability density function $f(x)$. The
mean and variance of X are
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$$

The standard deviation of X is $\sigma = \sqrt{\sigma^2}$.

Example 1
Let the continuous variable X denote the current measured in a thin copper wire in
milliamperes. Assume that the range of X is [0, 20 mA], and assume that the
probability density function of X is

$$f(x) = \begin{cases} 0.05, & 0 \le x \le 20 \\ 0, & \text{otherwise} \end{cases}$$

(a) Show that X is a continuous random variable.

(b) Find (i) $P(5 < X \le 10)$, (ii) $P(X > 10)$ and (iii) $P(X < 5)$.
(c) Find $F(x)$.
(d) Find the mean and variance of X.

Solution
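(a) X is a continuous random variable if $f(x) \ge 0$ for all x and
$\int_{-\infty}^{\infty} f(x)\,dx = 1$. Here $f(x) \ge 0$ everywhere, and
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_{0}^{20} 0.05\,dx + \int_{20}^{\infty} 0\,dx = 0 + \Big[0.05x\Big]_{0}^{20} + 0 = 1$$

∴ X is a continuous random variable.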

(b) i. $P(5 < X \le 10) = \int_{5}^{10} 0.05\,dx = \Big[0.05x\Big]_{5}^{10} = 0.25$

ii. $P(X > 10) = \int_{10}^{20} 0.05\,dx = \Big[0.05x\Big]_{10}^{20} = 0.05(10) = 0.5$

iii. $P(X < 5) = \int_{-\infty}^{5} f(x)\,dx = \int_{0}^{5} 0.05\,dx = \Big[0.05x\Big]_{0}^{5} = 0.25$

(c) $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
For $x < 0$:
$F(x) = P(X \le x) = \int_{-\infty}^{x} 0\,dt = 0$
For $0 \le x \le 20$:
$F(x) = P(X \le x) = \int_{0}^{x} 0.05\,dt = 0.05x$
For $x > 20$:
$F(x) = P(X \le x) = 1$
Then
$$F(x) = \begin{cases} 0, & x < 0 \\ 0.05x, & 0 \le x \le 20 \\ 1, & x > 20 \end{cases}$$
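(d) $$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_{0}^{20} x(0.05)\,dx = \left[\frac{0.05x^2}{2}\right]_{0}^{20} = 10 \text{ mA}$$

$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2 = \int_{0}^{20} x^2(0.05)\,dx - 10^2 = \left[\frac{0.05x^3}{3}\right]_{0}^{20} - 10^2 = \frac{400}{3} - 10^2 = 33.33 \text{ mA}^2$$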
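The integrals in this example can be checked numerically. The sketch below uses a simple midpoint rule written in plain Python instead of a numerical library; the helper `integrate` and the number of subintervals are illustrative choices, and the results are approximations rather than exact values.

```python
def f(x):
    """Density of the current X: 0.05 on [0, 20] mA and 0 elsewhere."""
    return 0.05 if 0 <= x <= 20 else 0.0


def integrate(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(f, 0, 20)       # should be close to 1

p_5_to_10 = integrate(f, 5, 10)        # P(5 < X <= 10)
p_above_10 = integrate(f, 10, 20)      # P(X > 10)
p_below_5 = integrate(f, 0, 5)         # P(X < 5)

mu = integrate(lambda x: x * f(x), 0, 20)
var = integrate(lambda x: x ** 2 * f(x), 0, 20) - mu ** 2

print(round(total_area, 4), round(p_5_to_10, 4), round(p_above_10, 4), round(p_below_5, 4))
print(round(mu, 4), round(var, 4))
```

Because the density is constant on [0, 20], P(X < 5) and P(5 < X ≤ 10) cover intervals of equal length and therefore have the same probability.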

Exercise
1. Determine the correct data type (quantitative or qualitative). Indicate whether
quantitative data are continuous or discrete.
i. The number of pairs of shoes you own
ii. The type of car you drive
iii. The place where you go on vacation
iv. The distance it is from your home to the nearest grocery store
v. The number of classes you take per school year.
vi. The tuition for your classes
vii. The type of calculator you use
viii. Movie ratings
ix. Political party preferences
x. Weights of sumo wrestlers
xi. Amount of money (in dollars) won playing poker
xii. Number of correct answers on a quiz
xiii. People's attitudes toward the government
xiv. IQ scores

2. A fair coin is tossed twice. Let X be the number of heads that are observed.
a) Construct the probability distribution of X.
b) Find the probability that at least one head is observed.
c) Find 𝐹 (𝑥 ).
d) Find mean and variance of X.

3. A random variable X has the uniform distribution on the interval [0, 1]: the density
function is $f(x) = 1$ if x is between 0 and 1 and $f(x) = 0$ for all other values of x.
a) Show that X is a continuous random variable.
b) Find (i) $P(0.5 < X \le 1)$, (ii) $P(X > 0.10)$ and (iii) $P(X < 0.5)$.
c) Find 𝐹 (𝑥 ).
d) Find mean and variance of X.
