INTRODUCTION TO STATISTICS HARAMAYA UNIVERSITY,2023

CHAPTER ONE

1. Introduction
1.1. Definition and Classification of Statistics
Statistics has become an integral part of our daily lives. Every day we are confronted with some
form of statistical information through newspapers, magazines and other forms of
communications. Such statistical information has become highly influential in our lives. The
term 'statistics' is derived from the Latin word status, meaning state, and historically statistics
referred to the display of facts and figures relating to the demography of states or countries.
Statistics can be defined in two senses: plural (as Statistical Data) and singular (as Statistical
Methods).
i. Plural sense: Statistics are a collection of facts (figures). This meaning of the word is widely used
when reference is made to facts and figures on sales, employment or unemployment, accidents,
weather, deaths, education, etc. E.g.: Sales Statistics, Labor Statistics, Employment Statistics, etc.
In this sense the word statistics simply means data. But not all data are statistics. In order for
numerical data to be identified as statistics, they must possess certain identifiable
characteristics. Some of these characteristics are described as follows:
a. Statistics are aggregates of facts. Single or isolated facts or figures cannot be called statistics,
as these cannot be compared or related to other figures within the same framework.
Accordingly, there must be an aggregate of these figures. For example, if a person says
"I earn Birr 30,000 per year", it would not be considered statistics. On the other hand, if we
say that the average salary of a professor at our university is Birr 30,000 per year, then this
would be considered statistics, since the average has been computed from many related
figures, namely the yearly salaries of many professors.
b. Statistics, generally, are not the outcome of a single cause but are affected by multiple
causes. There are a number of forces working together that affect the facts and figures. For
example, when we say the crime rate in a certain city has increased by 15% over the last
year, a number of factors might have affected this change. These factors may be the general state of
the economy (such as an economic recession), the unemployment rate, the extent of drug use, the
effectiveness of law enforcement and so on. While these factors can be identified separately, their
effects cannot be isolated and measured individually. Similarly, a marked increase in
food grain production in a certain country may have been due to the combined effect of many
factors such as better seeds, more extensive use of fertilizers, governmental and banking
support, adequate rainfall and so on. It is generally not possible to segregate and study the
effect of each of these forces individually.
c. Statistics are numerically expressed. All statistics are stated in numerical figures only.
Qualitative statements cannot be called statistics. For example, such qualitative statements as

Prepared by:-MILLION W. Page 1



'Ethiopia is a developing country' or 'Jack is very tall' would not be considered statistical
statements. On the other hand, comparing the per capita income of Ethiopia with that of Kenya
would be considered statistical in nature. Similarly, Jack's height in numbers compared to the
average height in Ethiopia would also be considered statistics.
d. Statistical data are collected in a systematic manner for a predetermined purpose. The
purpose and objective of collecting pertinent data must be clearly defined, decided upon and
determined prior to data collection. The procedures for collecting data should also be
predetermined and well planned. This facilitates the collection of proper and relevant
data.
e. Statistics are enumerated or estimated according to a reasonable standard of accuracy.
There are basically two ways of collecting data. One is actual counting or measuring,
which is the most accurate way. The second way of collecting data is by estimation, which is
used in situations where actual counting or measuring is not feasible or where it involves
prohibitive costs. Estimates based on samples cannot be as precise and accurate as actual
counts or measurements, but they should be consistent with the degree of accuracy desired.
ii. Singular sense: Statistics is the science that deals with the methods of collection,
organization, presentation, analysis and interpretation of data. It refers to the subject area that is
concerned with extracting relevant information from available data with the aim of making
sound decisions. According to this meaning, statistics is concerned with the development and
application of methods and techniques for collecting, organizing, presenting, analyzing and
interpreting statistical data.
1.1.1. Classification of Statistics
Based on the scope of the decision, statistics can be classified into two: Descriptive and
Inferential Statistics.
Descriptive Statistics refers to the procedures used to organize and summarize masses of data.
It is concerned with describing or summarizing the most important features of the data. It deals
only with the characteristics of the collected data, without attempting to
infer (conclude) anything that goes beyond the data themselves.
The methodology of descriptive statistics includes the methods of organizing (classification,
tabulation, frequency distributions) and presenting (graphical and diagrammatic presentation)
data, and the calculation of certain indicators of data such as Measures of Central Tendency and
Measures of Dispersion (Variation).
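As a minimal sketch of such indicators, the following Python snippet computes two measures of central tendency and one measure of dispersion with the standard library. The ages are hypothetical values invented for illustration; they are not data from this material.

```python
import statistics

# Hypothetical ages (in years) of seven students; invented for illustration only.
ages = [19, 20, 21, 21, 22, 23, 21]

mean_age = statistics.mean(ages)      # measure of central tendency
median_age = statistics.median(ages)  # another measure of central tendency
spread = statistics.pstdev(ages)      # measure of dispersion (population standard deviation)

print(f"mean = {mean_age}, median = {median_age}, std. dev. = {spread:.2f}")
```

All three figures only describe the seven recorded ages; nothing is concluded about any larger group of students.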
Inferential Statistics includes the methods used to find out something about a population, based
on the sample. It is concerned with drawing statistically valid conclusions about the


characteristics of the population based on information obtained from a sample. In this form of
statistical analysis, inferential statistics is linked with probability theory in order to generalize the
results of the sample to the population. Hypothesis testing, determining relationships
between variables and making predictions are also part of inferential statistics.
Example: Classify the following statements as Descriptive or Inferential Statistics
a. The average age of the students in this class is 21 years.
b. There is a strong association between smoking and lung cancer.
c. Of the students enrolled in Haramaya University in this year 74% are male and 26% are
female.
d. The price of wheat will be increased by 5% in the coming year.
e. The chance of winning the Ethiopian National Lottery in any day is 1 out of 167000.
1.2. Stages in Statistical Investigation
According to the singular sense definition of statistics, a statistical study (statistical
investigation) involves five stages: Collection of Data, Organization of Data, Presentation of
Data, Analysis of Data and Interpretation of Data.
1. Collection of Data: This is the first stage in any statistical investigation and involves the
process of obtaining (gathering) a set of related measurements or counts to meet
predetermined objectives. The data collected may be primary data (data collected directly by
the investigator) or it may be secondary data (data obtained from intermediate sources such as
newspapers, journals, official records, etc.).
2. Organization of Data: It is usually not possible to derive any conclusion about the main
features of the data from direct inspection of the observations. The second purpose of
statistics is describing the properties of the data in a summary form. This stage of statistical
investigation helps to have a clear understanding of the information gathered and includes
editing (correcting), classifying and tabulating the collected data in a systematic manner. Thus
the first step in the organization of data is editing, which means correcting (adjusting) omissions,
inconsistencies, irrelevant answers and wrong computations in the collected data. The second
step is classification, that is, arranging the collected data according
to some common characteristics. The last step is tabulation, that is, presenting the
classified data in tabular form using rows and columns.
3. Presentation of Data: The purpose of data presentation is to give an overview of what the
data actually look like, and to facilitate statistical analysis. Data presentation can be done
using graphs and diagrams, which are easy to remember and facilitate comparison.


4. Analysis of Data: The analysis of data is the extraction of a summarized and comprehensive
numerical description in order to reach conclusions or provide answers to a problem. The
problem may require simple or sophisticated mathematical expressions.
5. Interpretation of Data: This is the last stage of statistical investigation. Interpretation
involves drawing conclusions from the data collected and analyzed in order to make a decision.
1.3. Application, Uses and Limitations of Statistics
1.3.1. Applications of Statistics
In this modern time, statistical information plays a very important role in a wide range of fields.
Today, statistics is applied in almost all fields of human endeavor.
• In Scientific Research: Statistics is used as a tool in scientific research. Statistical
formulas and concepts are applied to data that result from an experiment.
• In Quality Control: Statistical methods help to check whether a product satisfies a given
standard.
• For Decision Making: Statistics helps to enhance the power of decision making in the face
of uncertainty by providing sufficient information.
• In Agriculture: Experiments are designed and analyzed using statistical procedures.
• In Public Health and Medicine: Statistical methods are used for the computation and
interpretation of birth and death rates.
• In Economics: for modeling functional relationships between or among variables.
• In Education and Agricultural Extension: to study the effects of certain trainings.
• In Natural and Social Sciences, Business, Planning, Behavioral Sciences, etc.

1.3.2. Uses of Statistics
• Condensing and summarizing: Statistics condenses masses of data and presents facts in numerical
and definite form.
• Facilitating comparison: Statistical devices such as averages, percentages, ratios, etc. are used
for this purpose.
• Formulating and testing hypotheses: For instance, hypotheses such as whether a new medicine is
effective in curing a disease, or whether there is an association between variables, can be tested
using statistical tools.
• Forecasting: Statistical methods help in studying past data and predicting future trends.


1.3.3. Limitations of Statistics
• It cannot deal with a single observation; rather, it deals with aggregates of facts.
• Statistical methods are not applicable to qualitative characteristics, i.e. they deal with
quantitative characteristics.
• Statistical results are true on average, i.e. for the majority of cases. Laws of statistics are not
universally true like the laws of physics, chemistry and mathematics.
• Statistics are liable to be misused or misinterpreted. This may be due to incomplete
information, inadequate and faulty procedures during data collection and sample selection,
and mainly due to ignorance (lack of knowledge).
1.4. Types of Variables and Measurement Scales
1.4.1. Variable
A variable is any phenomenon, characteristic or attribute that can assume different values. The most
important single distinguishing feature of a variable is that it varies; that is, it can take on different values.

For example: Height, Family size, Gender, Consumption, Automobile color, etc.
Based on the values that variables assume, variables can be classified as
1. Qualitative variables: A qualitative variable has values that are intrinsically non-numerical
(categorical).
For example: Gender, marital status, religion, phone number, ID number, etc.
2. Quantitative variables: A quantitative variable has values that are intrinsically numerical.
For example: Height, Family size, Time, SAT score, etc.
The values of a quantitative variable can be expressed either as whole numbers or with decimal
points. On this basis, quantitative variables are classified into
two: discrete and continuous variables.
• A discrete variable takes whole number values and consists of distinct recognizable
individual elements that can be counted. It is a variable that assumes a finite or countable
number of possible values. These values are obtained by counting (0, 1, 2, ...).
For example: Family size, number of children in a family, number of cars at a traffic
light, number of goals per game, etc.
• A continuous variable can take any value, including decimals. Such a variable can
theoretically assume an infinite number of possible values within a range. These values are
obtained by measuring.


Example: Height, Weight, Time, Temperature, Age, etc


Generally the values of a variable can be obtained either by counting for discrete variables, by
measuring for continuous variables or by making categories for qualitative variables.
Exercise: Classify each of the following as Qualitative or Quantitative, and if it is quantitative
classify it as Discrete or Continuous.
a. Color of automobiles in a dealer's showroom.
b. Number of seats in a movie theater.
c. Classification of patients based on nursing care needed (complete, partial or seafarer)
d. Number of tomatoes on each plant on a field.
e. Weight of newly born babies.
1.4.2. Scales/levels of Measurements
The level of measurement is one way in which variables can be classified. Broadly, this relates to
the level of information content implicit in the set of values and how each value may be
interpreted (mathematically) relative to other values on the variable - an issue which dictates how
the variable can be used and interpreted in statistical analysis. Consider the following two cases.
• Mr A wears 5 when he plays football.
• Mr B wears 6 when he plays football.
Who plays better? What is the average shirt number?
• Mr A scored 5 in the Stat quiz.
• Mr B scored 6 in the Stat quiz.
Who did better? What is the average score?
Based on the numbers on the shirts, it is not possible to judge whether Mr B plays better or not.
But using the quiz scores, it is possible to judge that Mr B did better in the quiz. Also, it is not
possible to find a meaningful average shirt number, because the
numbers on the shirts are simply codes, but it is possible to obtain the average quiz score.
Therefore, scales of measurement
• Show the information contained in the value of a variable.
• Show which mathematical operations and statistical analyses are permissible
on the values of the variable.
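The two cases can be sketched in Python. The variable names are illustrative assumptions; the sketch only shows that shirt numbers, being codes, support nothing beyond counting, while quiz scores can be compared and averaged.

```python
from collections import Counter
from statistics import mean

# Shirt numbers are labels (codes), so they are stored as text here.
shirt_numbers = {"Mr A": "5", "Mr B": "6"}
# Quiz scores are genuine quantities.
quiz_scores = {"Mr A": 5, "Mr B": 6}

# Permissible for codes: counting how often each label occurs.
shirt_counts = Counter(shirt_numbers.values())

# Permissible for scores: comparison and averaging.
best = max(quiz_scores, key=quiz_scores.get)
average_score = mean(list(quiz_scores.values()))

print(shirt_counts)
print(best, average_score)
```

Storing the shirt numbers as text is a deliberate design choice in this sketch: it makes an accidental average of the codes fail instead of silently returning a meaningless number.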
Different measurement scales allow for different levels of exactness, depending upon the
characteristics of the variables being measured. The four types of scales available in statistical
analysis are:


1. Nominal variables: are qualitative variables that show the category to which individuals belong.
They reflect classification into categories (names of groups) where there is no particular order or
qualitative difference between the labels. Numbers may be assigned to the categories simply for coding
purposes. It is not possible to compare individuals based on the numbers assigned to them. The
only mathematical operation permissible on these variables is counting.
These variables
• Have mutually exclusive (non-overlapping) and exhaustive categories.
• Have no ranking or order between (among) their values.
Examples: Gender, Religion, ID No, Ethnicity, Color
2. Ordinal variables: are also qualitative variables, but their values can be ordered and ranked.
Ranking and counting are the only mathematical operations that can be done on the values of these
variables. There is no precise difference between the values (categories) of the variable.
Examples: Academic qualifications (B.Sc., M.Sc., Ph.D.), Grade Scores (A, B, C, D, F), Strength
(very weak, weak, strong, very strong), Wealth Index (very poor, poor, rich, very rich)
3. Interval variables: are quantitative variables for which a value of zero does not show
absence of the characteristic, i.e. there is no true zero. Zero is simply a low point on the scale,
not an empty one. There is a precise difference between the units of measurement (levels).
Example: Temperature: 0 °C does not mean that there is no temperature, only that it is very cold.
4. Ratio variables: are quantitative variables for which a value of zero
shows absence of the characteristic, i.e. there is a true zero.
Examples: Income, Amount of yield, Expenditure, Consumption.
All mathematical operations are allowed on the values of these variables.
Exercise
1. What is the difference and similarity between sample and population?
2. What are the merits of using sample over population?
3. Discuss the difference between the four levels of measurements.
4. What are the applications of Statistics in your field of study?


CHAPTER TWO

2. Methods of Data Collection And Presentation


2.1. Data Types
Based on the source, data can be classified into two: Primary Data and Secondary Data.
• Primary data are data collected for the first time, either through direct observation or by
enquiring of individuals. The term refers to data collected either by the researcher or under the
researcher's direct supervision and instruction.
• Secondary data are data obtained from published or unpublished sources such as newspapers,
journals, official records, etc.
Based on the role of time, data can be classified as Cross-sectional and Time series.
• Cross-sectional data are a set of observations taken at one point in time.
• Time series data are a set of observations collected over a sequence of time periods, usually at equal
intervals.
2.2. Methods of Data Collection
The first and foremost task in statistical investigation is data collection. Before data collection,
four important points should be considered. These are the purpose of data collection (why we
need to collect data), the data to be collected (what kind of data to collect), the source of
data (where we can get the data) and the method of data collection (how we can collect the
data). These points are called the why, what, where and how of data collection. Primary data
are collected from primary sources and secondary data from secondary sources. Primary data can
be collected through experimental methods in laboratories in the natural sciences and through survey
methods in the social sciences. The survey methods of data collection are personal interview,
telephone interview, mailed questionnaire and personal observation.
• Observational Method: This method involves monitoring an ongoing activity and directly
recording data. It avoids incompleteness of data. However, it is rarely used, as it is not
possible to plan when the events will happen.
• Personal Interview: A trained interviewer asks a series of questions and records the responses on
a specially designed form called a questionnaire. In this approach the enumerator is with the
respondent and can explain any points that are not clear to the respondent. The quality of the
data is affected by both the design of the questionnaire and the quality of the
interviewer. It has the advantage of obtaining information in depth from the person being


interviewed, since the interviewer can clarify the questions, and it avoids
incomplete and disordered responses.
Disadvantages:
• It is costlier than other methods, since it requires training of interviewers and transportation
costs.
• The respondent may not give true information in response to sensitive questions, since there is
face to face interaction. E.g.: when asked about salary, a respondent whose salary is very small might
report a wrong figure out of embarrassment.
• Telephone Interview: This method involves contacting the respondent by telephone and
collecting information. It is a faster way to collect information. The absence of telephone lines
makes this approach less usable, and it may not be usable for rural surveys.
Advantage: It is less costly, since it requires fewer interviewers and the cost of
calling is lower than the cost of transportation. The respondent may give his/her opinion candidly,
since there is no face to face interaction. Because of this, the data obtained through this
method may be more realistic than those obtained through personal interviews.
Disadvantage: This method may not be applicable where access to telephones is lacking, as in
some developing countries. The respondent might not be at home or may not respond to the
call, and in the meantime the interviewer might get bored. There is a high chance of getting
incomplete responses, since the connection can be interrupted.
• Mailed Questionnaire: The researcher sends the questionnaire to the respondents; the
respondents complete the form and send it back to the researcher. Costs are low. The
responses are free from the biases of the interviewer, and respondents have more time to
give well thought out answers. However, it is applicable only to educated persons, and it suffers
from non-response, partial response and low return rates.
Disadvantage: The respondents might give inappropriate answers, since
no one is there with them; they may misunderstand a question and respond to it
incorrectly.
2.3. Questionnaire
It is a form containing the cover letter that explains about the person conducting the survey and
the objectives of the survey, and a set of related questions which will be answered by the
respondents. It requires great care in preparing a questionnaire for data collection. One of the


most important points in preparing it is that all questions in it must be relevant to the
objectives of the survey.
Having decided which type of questionnaire to use, the following points should be kept in mind
while designing a questionnaire.
• The person conducting the survey should introduce himself/herself, state the objective of the
survey, promise anonymity and include any instructions necessary for giving correct
responses (in the cover letter).
• The number of questions should be as few as possible.
Once the objectives of the survey are clearly defined, only questions pertinent to the
objectives should be asked. The time of the respondent should not be wasted by asking irrelevant
questions. In general, 5 to 25 questions may be regarded as a fair number. If a lengthy questionnaire is
unavoidable, it should preferably be divided into two or more parts.
• Questions should be logically arranged. Put the questions in the appropriate sequence of
topics. Topics should not be mixed up.
The questions should be in a logical order so that a natural and spontaneous reply is
induced. They should not skip back and forth.
It is undesirable to ask a person how many children s/he has before asking whether s/he is
married or not. Questions related to identification and description of the respondent should
come first, followed by the major information questions. If opinions are requested, such
questions should usually be placed at the end of the list.
• Questions should be simple, short and easy to understand, and they should convey one and
only one idea. Technical terms should be avoided.
• Sensitive questions (questions of a personal and financial nature) should be avoided. Such
information should be obtained indirectly. If sensitive questions cannot be avoided, put them in the last
part and offer a set of ranges. E.g.: Age (0-25, 26-50, 51-75, >75)
Salary (Below 200, 200-500, 501-1000, >1000)
• Leading questions should be completely avoided. If you ask a person a question like "Don't you
smoke?", the person will almost automatically answer 'No, I do not'.
• Answers to the questions should not require any calculation.
• There should be instructions on how to fill in the form.
• Questions should be capable of objective answers.


Types of questions
The different types of questions that may form a questionnaire can be grouped into two categories.
1. Closed-ended questions (Dichotomous questions and Multiple-choice questions)
2. Open-ended questions
Dichotomous questions are questions which have two alternative responses. Such
questions can be answered with 'Yes' or 'No'.
Example: Do you intend to purchase a TV? A. Yes B. No
Do you drink coffee? A. Yes B. No
Multiple-choice questions: In such questions the respondent is asked to select one out of
a number of alternative responses. This not only facilitates tabulation of the data but also
takes very little of the respondent's time to fill in the questionnaire.
Example: Why did you purchase a Sony TV?
• Lower price
• Best quality
• Better picture
• Longer guarantee
• Any other
The problem with multiple-choice questions is that the respondent may like to tick more than one
alternative. To avoid this problem, we have to ask the respondent either to choose the
most important one or to rank his/her choices. The use of multiple-choice questions
is appropriate only when the investigator is confident that a limited group of
important alternatives exists.
Open-ended or free answer questions: In such questions, the
respondent has the chance to answer in his/her own words.
Example: What is your opinion on the teaching policy?
The difficulty with these types of questions is in classifying the answers during tabulation and
analysis.
2.4. Methods of Data Organization
In order to describe situations, draw conclusions or make inferences about the population, or even to
describe the sample, the collected data must be organized in some meaningful way. The most
convenient way of organizing data is to construct a frequency distribution.
Frequency distribution is the organization of raw data in table form, using classes and
frequencies.
Definition of some terms
Class: is a description of a group of similar values in a data set.


Frequency: is the number of times a value of a variable is repeated.
Class frequency: is the number of observations belonging to a certain class.
There are three types of frequency distributions: categorical, ungrouped (discrete or frequency
array) and grouped (continuous) frequency distributions.
Categorical FD:-A frequency distribution (FD) in which the data are qualitative, i.e. either nominal
or ordinal. Each category of the variable represents a single class, and the number of times each
category repeats represents the frequency of that class (category).
Example:-The blood types of 25 students are given below.
A   B   B   AB  O
A   O   O   B   AB
B   A   B   B   B
O   A   O   AB  A
O   O   O   AB  O

Class (Blood type)    Frequency (number of students)
A                     5
B                     7
AB                    4
O                     9
Total                 25
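The tally in this table can be reproduced with a short Python sketch. Using `collections.Counter` is one convenient choice for counting categories, not a method prescribed by this material.

```python
from collections import Counter

# The 25 blood types from the example above, in the order given.
blood_types = (
    "A B B AB O A O O B AB B A B "
    "B B O A O AB A O O O AB O"
).split()

# Each category is one class; its count is the class frequency.
frequency = Counter(blood_types)

# Print the classes in the same order as the table.
for blood_type in ["A", "B", "AB", "O"]:
    print(f"{blood_type:<6}{frequency[blood_type]:>3}")
print(f"{'Total':<6}{sum(frequency.values()):>3}")
```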

Exercise:-Construct FD for the following letter grade of 25 students


A B C C C
C B B A D
A C C A B
F C C A B
Ungrouped FD (Frequency Array):- A FD of numerical data (quantitative) in which each value
of a variable represents a single class (i.e. the values of the variable are not grouped) and the
number of times each value repeats represents the frequency of that class.
Example:-Number of children for 21 families.
2 3 5 4 3 3 2
3 1 0 4 3 2 2
1 1 1 4 2 2 2


Class (Number of children)    Frequency (Number of families)
0                             1
1                             4
2                             7
3                             5
4                             3
5                             1
Total                         21
Grouped (Continuous) FD: - A FD of numerical data in which several values of a variable are
grouped into one class. The number of observations belonging to the class is the frequency of the
class.
Example:-Consider age group and number of persons.
Class Limits      Class Boundaries    Frequency
(Age in years)    (Age in years)      (number of persons)
1-25              0.5-25.5            20
26-50             25.5-50.5           15
51-75             50.5-75.5           25
76-100            75.5-100.5          10
Total                                 70
Class Limits:-The lowest and highest values that can be included in a class are called Class
Limits. The lowest value is called the Lower Class Limit (LCL) and the highest value is called the Upper
Class Limit (UCL).
Class limits for the first class: 1-25
Lower class limit 1 and upper class limit 25
Class Boundaries:-are the values that separate adjacent classes so that there is no gap between the
upper end of one class and the lower end of the next. In the example above, each boundary lies halfway
between the UCL of one class and the LCL of the next. The lowest value is called the Lower Class
Boundary (LCB) and the highest value is called the Upper Class Boundary (UCB).
Class boundaries for the first class: 0.5-25.5
Lower class boundary 0.5 and upper class boundary 25.5

Class Width (Class Size):-the difference between the UCB and LCB of a class. It is also the
difference between the lower limits of two consecutive classes, or the difference between the
upper limits of two consecutive classes.


W = UCB - LCB or W = LCL_i - LCL_{i-1} or W = UCL_i - UCL_{i-1}
For the above example, W = 25.5 - 0.5 = 25 or W = 26 - 1 = 25 or W = 50 - 25 = 25.
Class Mark (Class Midpoint):-is the halfway point between the class limits or the class boundaries.
CM = (LCL + UCL) / 2 or CM = (LCB + UCB) / 2

Class Limits    Class Boundaries    Class Mark    Frequency
1-25            0.5-25.5            13            20
26-50           25.5-50.5           38            15
51-75           50.5-75.5           63            25
76-100          75.5-100.5          88            10
Total                                             70

Note that W = CM_i - CM_{i-1}.
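The quantities defined above can be checked with a short Python sketch. It assumes, consistently with the table, that each boundary is obtained by moving half of the gap between consecutive class limits (here 26 - 25 = 1) outward from the limits; the variable names are illustrative.

```python
# Class limits from the age example above: (lower limit, upper limit).
class_limits = [(1, 25), (26, 50), (51, 75), (76, 100)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = class_limits[1][0] - class_limits[0][1]  # 26 - 25 = 1

# Assumption: boundaries extend each limit by half the gap so that classes touch.
boundaries = [(lcl - gap / 2, ucl + gap / 2) for lcl, ucl in class_limits]

# Class width W = UCB - LCB and class mark CM = (LCL + UCL) / 2.
widths = [ucb - lcb for lcb, ucb in boundaries]
class_marks = [(lcl + ucl) / 2 for lcl, ucl in class_limits]

for limits, bounds, mark in zip(class_limits, boundaries, class_marks):
    print(limits, bounds, mark)
```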


Relative frequency: An absolute frequency distribution is a summary table in which the original
data are condensed into groups together with their frequencies. If a researcher would like to know
the proportion or percentage of cases in each group, instead of simply the number of cases, s/he can
do so by constructing a relative frequency distribution. The relative frequency distribution is
formed by dividing the frequency in each class by the total number of observations. It
can be converted into a percentage frequency distribution by multiplying each relative
frequency by 100.

The relative frequencies are particularly helpful when comparing two or more frequency
distributions in which the numbers of cases under investigation are not equal. The percentage
distributions make such a comparison more meaningful, since percentages are relative
frequencies and hence the total number in the sample or population under consideration becomes
irrelevant.
Percentage frequency = Relative frequency × 100
Class Limits    Class Boundaries    Class Mark    Frequency    Relative frequency    Percentage frequency
1-25            0.5-25.5            13            20           20/70                 28.57
26-50           25.5-50.5           38            15           15/70                 21.43
51-75           50.5-75.5           63            25           25/70                 35.71
76-100          75.5-100.5          88            10           10/70                 14.29
Total                                             70           70/70=1               100
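As a minimal sketch, the relative and percentage frequencies for this table can be computed in Python as follows. The dictionary keys are simply the class limits written as text, and exact fractions are used so that the relative frequencies sum to exactly 1.

```python
from fractions import Fraction

# Class frequencies from the age example (class limits as labels).
frequencies = {"1-25": 20, "26-50": 15, "51-75": 25, "76-100": 10}
total = sum(frequencies.values())

# Relative frequency = class frequency / total number of observations.
relative = {label: Fraction(f, total) for label, f in frequencies.items()}
# Percentage frequency = relative frequency x 100.
percentage = {label: float(r) * 100 for label, r in relative.items()}

for label, f in frequencies.items():
    print(f"{label:<8}{f}/{total}   {percentage[label]:.2f}%")
```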


The above frequency distributions tell us the actual number (or percentage) of units in each
class, but they do not tell us directly the total number (or percentage) of units that lie below
or above specified values of the classes.

Cumulative frequency: is the sum of frequencies (total number of observations) below or above
a certain value. A cumulative frequency distribution displays the total number of observations
above (or below) a certain value.
Less than Cumulative Frequency: is the total number of values of a variable below a certain
Upper Class Boundary. When the interest of the investigator focuses on the number of items
below a specified value, this specified value is taken as the upper boundary of a class, and the
resulting table is known as the less than cumulative frequency distribution.
More than Cumulative Frequency: is the total number of values of a variable above a certain
Lower Class Boundary. When the interest lies in finding the number of cases above a specified
value, this value is taken as the lower boundary of a class, and the resulting table is known as
the more than cumulative frequency distribution.

Class Limits   Class Boundaries   Class Mark   Frequency   Less than Cum. Freq.   More than Cum. Freq.
1-25           0.5-25.5           13           20          20                     10+25+15+20=70
26-50          25.5-50.5          38           15          20+15=35               10+25+15=50
51-75          50.5-75.5          63           25          20+15+25=60            10+25=35
76-100         75.5-100.5         88           10          20+15+25+10=70         10
Total                                          70
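
The two cumulative columns are running totals taken from opposite ends of the table. A minimal Python sketch of that idea follows; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies from the table above.
frequencies = [20, 15, 25, 10]

# Less than cumulative frequency: running total starting from the first class.
less_than = list(accumulate(frequencies))

# More than cumulative frequency: running total starting from the last class,
# then reversed so that it lines up with the original class order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)
print(more_than)
```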

Construction of Grouped Frequency Distribution


1. Arrange the data in an array form (increasing or decreasing order).
2. Find the Unit of Measurement (U). U is the smallest difference between any two distinct values
of the data.
3. Find the Range (R). R is the maximum numerical difference in the data set, i.e. the difference
between the largest and the smallest values of the variable.
4. Determine the number of classes (K) using Sturges' Rule: $K = 1 + 3.322\log_{10} N$, where N is
the total number of observations.
5. Specify the class width (W): $W = \frac{R}{K}$


6. Put the smallest value of the data set as the LCL of the first class. To obtain the LCL of the
second class add the class width W to the LCL of the first class. Continue adding until you get
K classes.
Let X be the smallest observation.
$LCL_1 = X$
$LCL_i = LCL_{i-1} + W$ for i=2, 3, …, K.
7. Obtain the UCLs of the FD by adding W-U to the corresponding LCLs: $UCL_i = LCL_i + (W - U)$ for
i=1, 2, …, K.
8. Generate the class boundaries: $LCB_i = LCL_i - \frac{1}{2}U$ and $UCB_i = UCL_i + \frac{1}{2}U$ for i=1, 2, …, K.
Example 1: Mark of 50 students out of 40
16 21 26 24 11 17 25 26 13 27 24 26 3 27 23 24 15 22 22 12 22 29 18 22 28 25 7
17 22 28 19 23 23 22 3 19 13 31 23 28 24 9 20 33 30 23 20 8 21 24
Construct grouped frequency distribution.
Solution:
1. The array form of the data (increasing order)
3 3 7 8 9 11 12 13 13 15 16 17 17 18 19 19 20 20 21 21 22 22 22 22 22 22
23 23 23 23 23 24 24 24 24 24 25 25 26 26 26 27 27 28 28 28 29 30 31 33
2. U=9-8=1
3. R=L-S=33-3=30
4. K=1+3.322logN=1+3.322log50=6.64≈7
5. W=R/K=30/6.64=4.5≈5
6. W-U=5-1=4
Class Limits   Class Boundaries   Class Mark   Frequency   Relative Frequency   Percentage Frequency   LCF   MCF
3-7 2.5-7.5 5 3 3/50=0.06 6 3 50
8-12 7.5-12.5 10 4 4/50=0.08 8 7 47
13-17 12.5-17.5 15 6 6/50=0.12 12 13 43
18-22 17.5-22.5 20 13 13/50=0.26 26 26 37
23-27 22.5-27.5 25 17 17/50=0.34 34 43 24
28-32 27.5-32.5 30 6 6/50=0.12 12 49 7
33-37 32.5-37.5 35 1 1/50=0.02 2 50 1
Total 50 1 100
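
The construction steps can be sketched in Python for the marks data. This is an illustration under stated assumptions rather than a definitive procedure: the function name `grouped_fd` is hypothetical, the unit of measurement is passed in as `unit` (1 for these integer marks), and both K and W are rounded up, which matches the rounding used in the worked solution.

```python
import math

marks = [16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15,
         22, 22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19, 23, 23, 22,
         3, 19, 13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24]

def grouped_fd(data, unit=1):
    """Return a list of (LCL, UCL, frequency) rows for the data."""
    n = len(data)
    data_range = max(data) - min(data)        # R = largest - smallest
    k = 1 + 3.322 * math.log10(n)             # Sturges' rule
    width = math.ceil(data_range / k)         # W = R/K, rounded up (assumption)
    rows = []
    lcl = min(data)                           # LCL of the first class
    for _ in range(math.ceil(k)):             # K rounded up (assumption)
        ucl = lcl + (width - unit)            # UCL = LCL + (W - U)
        freq = sum(lcl <= x <= ucl for x in data)
        rows.append((lcl, ucl, freq))
        lcl += width                          # next LCL = previous LCL + W
    return rows

for lcl, ucl, freq in grouped_fd(marks):
    print(f"{lcl}-{ucl}: {freq}")
```

The class boundaries and class marks in the table follow from each row by subtracting or adding U/2 and by averaging the two limits.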


Exercise: In a survey the age of 44 women at marriage was reported as follows. Construct the
appropriate FD for this data.
24 25 27 26 22 23 24 25 24 23 26 28 24 25 23 24 25 25 25 22 27 28
27 24 25 24 25 28 26 25 24 28 24 25 25 24 25 24 26 27 27 25 28 26
Properties of Classes (Class Boundaries)
Classes should be:
• Complete and non-overlapping
Complete: the classes should include every value in the data set. Non-overlapping: no value
should belong to two classes.
• Clear and properly set
The W and K should be calculated properly and W should be the same for all classes.
• Standardized
The classes should follow a logical and chronological (increasing) order.
• The number of classes should be between 5 and 20, i.e. 5≤K≤20. K depends on N: the
larger the N, the larger the K. But we need to condense the data set into easily manageable
classes with minimum loss of information.
• Continuous
Even if there are no values in a class, the class must be included in the frequency
distribution.
Advantages and disadvantages of frequency distributions
a. Advantages
• It condenses a large mass of data into a comparatively small table.
• It attracts the attention of even a layperson and gives an insight into the nature of the
distribution.
• It helps in further statistical analysis of the data, such as central tendency, scatter and
symmetry.
b. Disadvantages
• In grouped frequency distributions, the identity of the observations is lost. We know
only the number of observations in a class and do not know what the values are.
• Because the selection of the class width and the lower class limit of the first class is to a
certain extent arbitrary, different frequency distributions may be constructed for the same
data and hence may give contradictory impressions.


2.5. Data Presentation


2.5.1. Diagrammatical Presentation of Data
1. Bar Diagram: It is the simplest and most commonly used diagrammatic representation of a
frequency distribution. It is appropriate for presenting Qualitative Data (nominal/ordinal). It
uses a series of separated and equally spaced bars in which the width of the bars is constant and
the height of each bar corresponds to the frequency of the category.
a) Simple Bar Diagram: is a diagram in which categories of a variable are marked on the X
axis and the frequencies of the categories are marked on the Y axis. It is applicable for
discrete variables, that is, for data given according to some period, places and timings.
These periods and timings are represented on the base line (X-axis) at regular intervals and
the corresponding frequencies are represented on the Y-axis.
• The width of the rectangle represents nothing, but it should be equal for all rectangles.
• Each rectangle is separated by an equal space.
• It can also represent some magnitude (on the Y axis) over time, space or groups on the X axis.
Example 1:
Marital Status Number of individuals
Single 100
Married 70
Divorced 30
Total 200

Figure 1: Simple bar-chart of marital status of individuals.


b) Component Bar Diagram: is used when there is a desire to show how a total or aggregate is
divided into its component parts. The bars represent the total value of a variable, with each
total broken into its component parts, and different colors are used for identification. In such
diagrams, a bar is subdivided into parts in proportion to the size of each subdivision.
These subdivided rectangles are shaded differently by lines, dots and colors so that the
components are easy to compare.
Sometimes the volumes of different attributes may be greatly different. For making meaningful
comparisons, the components of the attributes are reduced to percentages. In that case each
attribute will have 100 as its maximum volume. This sort of component bar diagram is known as a
percentage bar-diagram.
Example:
Marital Status Male Female Total
Single 90 10 100
Married 30 40 70
Divorced 1 29 30

Figure 2: Component bar-chart of marital status with Gender of individuals.


c) Multiple Bars Diagram: used to display data on more than one variable. In the multiple
bars diagram two or more sets of inter-related data are interpreted.
Example:
Year Coffee Butter Sugar Total
1997 120 127 75 322
1998 25 98 87 210
1999 100 120 75 295
2000 198 98 60 356


Figure 3: Multiple bar-chart of commodity produced over time.


d) Deviation Bar Diagram: is used when the data contain both positive and negative values, such
as data on net profit, net expense, percent change, etc.
Example:

Commodity Net profit


Soap 80
Sugar -95
Coffee 125

2. Pie chart: A pie chart is popularly used in practice to show the percentage breakdown of
data. It is a circle representing the total, cut into sectors (slices) in proportion to the size
of the parts that make up the total. The angle of each sector is the category's share of the
total multiplied by 360 degrees.


Example:
Marital Status Number of individuals Percentage Degree
Single 100 50 180
Married 70 35 126
Divorced 30 15 54
Total 200 100 360
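
The Percentage and Degree columns can be reproduced with a short Python sketch. The dictionary names are illustrative only.

```python
# Number of individuals in each marital status category (from the table above).
counts = {"Single": 100, "Married": 70, "Divorced": 30}

total = sum(counts.values())

# Share of the total expressed as a percentage and as a sector angle.
percentages = {name: 100 * n / total for name, n in counts.items()}
angles = {name: 360 * n / total for name, n in counts.items()}

for name in counts:
    print(f"{name}: {percentages[name]:.0f}% -> {angles[name]:.0f} degrees")
```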

2.5.2. Graphical Presentation of Data


1. Histogram: A graph in which the classes are marked on the X axis (horizontal axis) and the
frequencies are marked along the Y axis (vertical axis).
• The height of each bar represents the class frequency and the width of the bar represents
the class width.
• The bars are drawn adjacent to each other.
Example: Construct a histogram to the following grouped data.

Class boundaries Frequency


99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1


2. Frequency Polygon: A graph that consists of line segments connecting the points formed by
the class marks and the frequencies.
• It can be constructed from a histogram by joining the mid-points of the tops of the bars.
Example: Construct a frequency polygon for the following grouped frequency distribution.


3. Cumulative Frequency (Ogive) curve: is a graph of the cumulative frequencies plotted against
the class boundaries, with the points joined by a smooth free hand curve.
Example: Construct an ogive curve for the following grouped frequency distribution.

Class boundaries Frequency


99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
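
The points of a less than ogive for this table can be computed before plotting. The sketch below assumes the usual convention of pairing each upper class boundary with its less than cumulative frequency and starting the curve at the first lower boundary with frequency 0. The variable names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from the table above.
upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Less than cumulative frequencies: running totals of the class frequencies.
less_than_cf = list(accumulate(frequencies))

# The curve starts at the first lower class boundary with cumulative frequency 0.
ogive_points = [(99.5, 0)] + list(zip(upper_boundaries, less_than_cf))

for boundary, cf in ogive_points:
    print(boundary, cf)
```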


CHAPTER THREE

3. MEASURES OF CENTRAL TENDENCY


Usually the collected data are not suitable for drawing conclusions about the mass from which
they have been taken. Even though the data are somewhat summarized after being organized into
frequency distributions and presented using graphs and diagrams, we still cannot make
inferences about the data since we have many groups. Hence, organizing data into a FD is not
sufficient and there is a need for further condensation. In particular, when we want to compare
two or more distributions, we may reduce each entire distribution to one number that represents
it. A single value which can be considered typical or representative of a set of observations,
and around which the observations can be considered as centered, is called an "Average" (or
average value or center of location). Since such typical values tend to lie centrally within a
set of observations when arranged according to magnitude, averages are called
Measures of Central Tendency.

3.1. Objectives of Measures of Central Tendency


1. To condense a mass of data into one single value, that is, to get a single value which best
represents the data (that describes the characteristics of the entire data). Measures of
central tendency, by condensing masses of data into one single value, enable us to get an idea
of the entire data. Thus one value can represent thousands of observations or even more.
2. To facilitate comparison. Statistical devices like averages, percentages and ratios are used
for this purpose. Measures of central tendency, by condensing masses of data into one single
value, facilitate comparison. For example, to compare two classes A and B, instead of comparing
each student's result, which is infeasible, we can compare the average marks of the two classes.
There are many types of measures of central tendency, each possessing particular properties and
each being typical in some unique way. The most frequently encountered ones are
• Computed averages: Mean (Arithmetic Mean, Geometric Mean and Harmonic Mean)
• Positional averages: Median and Quantiles (Quartiles, Deciles, Percentiles)
• Mode
3.2. Properties of Good Measures of Central Tendency
A measure of central tendency is good or satisfactory if it possesses the following characteristics.
1. It should be calculated based on all observations.


2. It should not be affected by extreme values. It should be as close to the maximum number of
observed values as possible.
3. It should be rigidly defined, which means it should have a definite value (it should be
unique).
4. It should always exist.
5. It should be easy to understand and calculate. It should not be subject to complicated and
tedious calculations, though the advent of electronic calculators and computers has made such
calculations feasible.
6. It should be capable of further algebraic treatment. By algebraic treatment, we mean that the
measure should be usable in the formulation of other formulae or for further statistical
analysis.
3.3. Summation Notation
The sum X1+X2+…+Xn is denoted using the Greek letter ∑ (sigma) as
$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$
and it is called the Summation Notation.

Properties of the summation notation:
• $\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i$
• $\sum_{i=1}^{n} X_i Y_i = X_1Y_1 + X_2Y_2 + \dots + X_nY_n$
• $\sum_{i=1}^{n} (X_i \pm c) = \sum_{i=1}^{n} X_i \pm nc$
• $\sum_{i=1}^{n} C X_i = C \sum_{i=1}^{n} X_i$, where C is a constant.
• $\sum_{i=1}^{n} a = na$, where a is a constant.
From now onwards we will use ∑X in place of $\sum_{i=1}^{n} X_i$ just for simplicity.

3.4. Mean and its Properties


3.4.1. Arithmetic Mean

Simple Arithmetic Mean: is the sum of all observations divided by the total number of
observations. For a sample of n observations X1, X2,…,Xn the sample mean is denoted by $\bar{X}$
(X-bar) and calculated as follows.
$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$
For a frequency array (ungrouped FD),
$\bar{X} = \frac{\sum fX}{\sum f} = \frac{f_1X_1 + f_2X_2 + \dots + f_KX_K}{f_1 + f_2 + \dots + f_K}$
For grouped FD, X represents class mark.

Example 1: The high temperatures for a 7-day week during December in Haramaya University
were 29°C, 31°C, 28°C, 32°C, 29°C, 27°C and 55°C. Find the mean high temperature for the
week.
Solution: $\bar{X} = \frac{\sum X}{n} = \frac{29+31+28+32+29+27+55}{7} = \frac{231}{7} = 33$°C.
The mean, or average, high temperature for the week was 33°C.
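
The same calculation in Python, as a minimal sketch. The second mean, computed without the value 55, is added here only to illustrate how strongly one extreme value pulls the arithmetic mean; it is not part of the original example.

```python
# High temperatures for the 7-day week in the example above.
temperatures = [29, 31, 28, 32, 29, 27, 55]

# Simple arithmetic mean: sum of observations / number of observations.
mean_all = sum(temperatures) / len(temperatures)

# For comparison only: the mean of the six values other than the extreme 55.
without_extreme = [t for t in temperatures if t != 55]
mean_without = sum(without_extreme) / len(without_extreme)

print(mean_all)
print(round(mean_without, 2))
```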
Example2: The amounts of drops of water in drip irrigation were registered from 43 sample drip
holes in one day and the data are as follows

Class Interval Frequency


3-7 3
8-12 4
13-17 6
18-22 13
23-27 17
Solution:
Class interval Frequency Class mark(Mi)
3-7 3 5
8-12 4 10
13-17 6 15
18-22 13 20
23-27 17 25

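The given table is a grouped FD, so we can apply $\bar{X} = \frac{\sum fM}{\sum f}$.
$\bar{X} = \frac{3(5)+4(10)+6(15)+13(20)+17(25)}{3+4+6+13+17} = \frac{830}{43} = 19.30$. The mean amount of water drops per drip hole is 19.30.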
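
A short Python sketch of the grouped mean for this table, with class marks computed from the class limits. The variable names are illustrative.

```python
# Class limits and frequencies from the drip irrigation table.
class_limits = [(3, 7), (8, 12), (13, 17), (18, 22), (23, 27)]
frequencies = [3, 4, 6, 13, 17]

# Class mark = (lower class limit + upper class limit) / 2.
class_marks = [(lower + upper) / 2 for lower, upper in class_limits]

# Grouped mean = sum(f * M) / sum(f).
sum_fm = sum(f * m for f, m in zip(frequencies, class_marks))
grouped_mean = sum_fm / sum(frequencies)

print(round(grouped_mean, 2))
```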

Properties of Arithmetic Mean

• The algebraic sum of the deviations of each value from the arithmetic mean is zero. That is,
$\sum (X - \bar{X}) = 0$.
• The sum of the squares of the deviations from the mean is less than the sum of the squares of
the deviations about any other value. That is, $\sum (X - \bar{X})^2 \le \sum (X - A)^2$, $A \ne \bar{X}$.
• If a constant C is added to or subtracted from each value in a distribution, then the new mean
will be $\bar{X}_{new} = \bar{X}_{old} \pm C$ respectively.
• If each value of a distribution is multiplied by a constant C, the new mean will be the original
mean multiplied by C.

Combined Mean: If there are p different groups (having the same unit of measurement) with
means $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_p$ and numbers of observations n1, n2,…, np respectively, then the
mean of all the groups, i.e. the combined mean $\bar{X}_C$, is given by
$\bar{X}_C = \frac{\sum n\bar{X}}{\sum n} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + \dots + n_p\bar{X}_p}{n_1 + n_2 + \dots + n_p}$

Example: The mean weight of 50 women workers in a factory is 48 kg. The mean weight of 75
men working in the same factory is 58 kg. Find the mean weight of all workers in the factory.
Solution: $\bar{X}_C = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2} = \frac{50(48) + 75(58)}{50 + 75} = \frac{6750}{125} = 54$. Therefore, the mean weight of the
factory workers is 54 kg.

Weighted Arithmetic Mean:

While calculating the simple arithmetic mean we gave equal importance to all values. But
there are cases where the relative importance is not the same for all items. When this is the
case, it is necessary to assign them weights (i.e. relative importance) and then calculate a
weighted arithmetic mean. Let X1, X2,…,Xn be the values and W1, W2,…,Wn be the corresponding
weights; then the weighted arithmetic mean, denoted by $\bar{X}_W$, is given by
$\bar{X}_W = \frac{\sum WX}{\sum W} = \frac{W_1X_1 + W_2X_2 + \dots + W_nX_n}{W_1 + W_2 + \dots + W_n}$
Example: If a final examination in a course is weighted three times as much as a quiz and a
student has a final examination grade of 85 and quiz grades of 70 & 90, find the mean grade of
the student.
Solution: let X1=1st quiz=70, X2=2nd quiz=90 and X3=final=85 with the corresponding weights
W1=1, W2=1 and W3=3.
$\bar{X}_W = \frac{\sum WX}{\sum W} = \frac{1(70) + 1(90) + 3(85)}{1 + 1 + 3} = \frac{415}{5} = 83$, so the average grade of the student is 83.
W


Arithmetic mean fulfills almost all characteristics of good measures of central tendency with the
exception that it is highly affected by extreme values. And it cannot be calculated for a FD with
open-ended classes (a FD with no lower class boundary of the first class or with no upper class
boundary of the last class or with both).
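
The weighted mean and the combined mean share the same form, a sum of weight times value divided by the sum of the weights. The sketch below uses one hypothetical helper, `weighted_mean`, for both worked examples above, with the group sizes acting as weights in the combined mean.

```python
def weighted_mean(values, weights):
    """Return sum(W * X) / sum(W) for paired values and weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Weighted mean: quizzes of 70 and 90 (weight 1 each) and a final of 85 (weight 3).
grade = weighted_mean([70, 90, 85], [1, 1, 3])

# Combined mean: 50 women with mean 48 kg and 75 men with mean 58 kg.
combined = weighted_mean([48, 58], [50, 75])

print(grade)
print(combined)
```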
3.4.2. Geometric Mean
Geometric mean is the nth root of the product of the n values.
$GM = \sqrt[n]{\prod X} = \sqrt[n]{X_1 X_2 \cdots X_n}$

This formula is used if n is small. If n is large, it is difficult to calculate the nth root.
Thus, to facilitate the computation, we make use of logarithms.
$GM = \text{antilog}\left(\frac{1}{n}\sum \log X\right)$
For ungrouped FD, $GM = \text{antilog}\left(\frac{1}{\sum f}\sum f\log X\right)$
For grouped FD, X represents class mark.
If the variable values are measured as ratios, proportions or percentages, and some values are
large in magnitude while others are small, then the geometric mean is a better representative of
the data than the simple average. In a "geometric series", the most meaningful average is the
geometric mean, because the arithmetic mean is strongly biased toward the large numbers in the
series.
The geometric mean is important in determining the average rate of growth, percentages, ratios
and proportions.
The disadvantage of GM is that it cannot be calculated if one or more observations are zero or
negative. It is also affected by extreme values but not to the extent of AM.
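
The logarithm form of the geometric mean can be sketched in Python as below. The function name `geometric_mean` and the sample values 2, 4 and 8 are hypothetical and chosen only because their product, 64, has an exact cube root.

```python
import math

def geometric_mean(values):
    """GM = antilog((1/n) * sum(log X)), using base-10 logarithms."""
    if any(x <= 0 for x in values):
        # GM is undefined when an observation is zero or negative.
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log

gm = geometric_mean([2, 4, 8])
print(round(gm, 4))
```

Working with logarithms avoids forming a very large product when n is large, which is the practical reason the notes give for the antilog form.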
Exercise:
1. Find the geometric mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great difference
between the GM of A and that of B?
2. The price of a commodity increased by 5% from 1989 to 1990, 8% from 1990 to 1991 and by
77% from 1991 to 1992. Find the average price increase.
3. A machine depreciated by 10% each in the first two years and by 40% in the third year. Find
out the average rate of depreciation.
4. Decadal percentage growth of population in country A is given below. Find the average rate
of growth.


Year 1921 1931 1941 1951 1961 1971 1981

% Increase 8.25 19.08 32.09 41.49 25.89 37.91 46.02

3.4.3. Harmonic Mean


Harmonic Mean is another specialized average which is useful in averaging variables expressed
as a rate per unit of time, such as speed or number of units produced per day. It is the
reciprocal of the arithmetic mean of the reciprocals of the numbers.
$HM = \frac{n}{\sum \frac{1}{X}} = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$

For ungrouped FD, $HM = \frac{\sum f}{\sum \frac{f}{X}} = \frac{f_1 + f_2 + \dots + f_K}{\frac{f_1}{X_1} + \frac{f_2}{X_2} + \dots + \frac{f_K}{X_K}}$
For grouped FD, X represents class mark.

Weighted harmonic mean, $HM = \frac{\sum W}{\sum \frac{W}{X}} = \frac{W_1 + W_2 + \dots + W_n}{\frac{W_1}{X_1} + \frac{W_2}{X_2} + \dots + \frac{W_n}{X_n}}$
Harmonic mean is less affected by extremely large values than the AM and GM. But it cannot be
calculated when one or more observations are zero.

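Relationships between AM, GM and HM

• For n positive observations, AM ≥ GM ≥ HM
• For two positive observations, $GM = \sqrt{AM \times HM}$
Example 1: Find the H.M of 2, 4 and 8.
Solution: $HM = \frac{n}{\sum \frac{1}{X}} = \frac{3}{\frac{1}{2} + \frac{1}{4} + \frac{1}{8}} = \frac{3}{0.875} = 3.43$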
Example 2: In a small company two typists are employed. Typist A types one page in 10 minutes
and typist B types one page in 20 minutes.
a) Both are asked to type 10 pages. What is the average time taken for typing one page?
b) Both are asked to type for one hour. What is the average time taken by them for typing one
page?
Solution: a) The number of pages is fixed, so the arithmetic mean applies:
$\bar{X} = \frac{10(10) + 10(20)}{10 + 10} = \frac{300}{20} = 15$ minutes.
b) The time is fixed, so the harmonic mean applies:
$HM = \frac{2}{\frac{1}{10} + \frac{1}{20}} = \frac{40}{3} = 13.33$ minutes, i.e. 13 min. & 20 sec.
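
Both harmonic mean examples can be checked with a short Python sketch. The function name `harmonic_mean` is illustrative, and the comparison of AM, GM and HM for the values 2, 4 and 8 is added only to illustrate the inequality stated above.

```python
import math

def harmonic_mean(values):
    """HM = n / sum(1/X); undefined if any observation is zero."""
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / x for x in values)

values = [2, 4, 8]
hm = harmonic_mean(values)
am = sum(values) / len(values)
gm = math.prod(values) ** (1 / len(values))
print(round(am, 2), round(gm, 2), round(hm, 2))

# Typists working for the same length of time: minutes per page of 10 and 20.
minutes_per_page = harmonic_mean([10, 20])
print(round(minutes_per_page, 2))
```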

Exercise:
1. Find the harmonic mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great difference
between the HM of A and that of B?
2. A driver traveled 400 km per day for three days at a speed of 60, 50 and 40 kilometers per
hour. Find the average speed of the driver.
3. A student reads the first 100 pages of a book at a rate of 5 pages per hour, the next 100 pages
at a rate of 8 pages per hour. What is the student's average reading speed?
4. Suppose a train moves 100 km with a speed of 40 km per hour, then 150 km with a speed of
50 km per hour and the next 135 km with a speed of 45 km per hour. Calculate the average
speed of the train.
5. In a factory a mechanic takes 15 days to fabricate a machine, the second mechanic takes 18
days, the third takes 30 days and the fourth takes 90 days. Find the average number of days
taken by the workers to fabricate the machine.
6. Suppose a train moves 5 hours at a speed of 40 km per hour, then 3 hours at a speed of 50 km
per hour and the next 5 hours with a speed of 45 km per hour. Calculate the average speed of
the train.
3.5. Median
Median is the half-way point in a data set. It divides a data set into two equal parts such that
half of the numbers have a value less than the median and half have values greater than the
median. Graphically, the median is the intersection of the less than and more than cumulative
frequency curves.
The median of a set of n observations X1, X2,…,Xn arranged in ascending order of magnitude is
the middle value if n is odd or the arithmetic mean of the two middle values if n is even. That is,
if n is odd, $\tilde{X} = \left(\frac{n+1}{2}\right)^{th}$ value, and if n is even, $\tilde{X} = \frac{\left(\frac{n}{2}\right)^{th} \text{value} + \left(\frac{n}{2}+1\right)^{th} \text{value}}{2}$
Median for continuous grouped data: for grouped frequency distributions the median is given by
the formula
$\tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w$
Where n=∑f= sum of frequencies
$L_{\tilde{X}}$ is the LCB of the median class.
$F_{\tilde{X}-1}$ is the less than cumulative frequency just before the median class.
$f_{\tilde{X}}$ is the frequency of the median class.
w is the class width.

First obtain the less than cumulative frequencies. From the cumulative frequencies select the
smallest one that is greater than or equal to $\frac{n}{2}$. The median class is the class
corresponding to this cumulative frequency.
Median is not influenced by extreme values. It can be calculated for a FD with open-ended
classes, and it can even be located if the data are incomplete.
Examples:
1. Find the median of the following data sets.
a) 180, 201, 220, 191, 219, 209 and 220.
Solution: In ascending order the data are 180, 191, 201, 209, 219, 220, 220, so the median is the
4th value=209
b) 62, 63, 64, 65, 66, 66, 68 and 78.
Solution: (4th value+5th value)/2= (65+66)/2=65.5
2. Find the median weight of the 40 male college students at a state university and
interpret the result.
Weight Frequency LCF
118-126 3 3
127-135 5 8
136-144 9 17
145-153 12 29
154-162 5 34
163-171 4 38
172-180 2 40
Total 40

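Solution: The median class is the class having the less than cumulative frequency containing the
value n/2=40/2=20. This implies that 145-153 is the median class.
$L_{\tilde{X}}$=144.5, n=40, $F_{\tilde{X}-1}$=17, $f_{\tilde{X}}$=12 and w=9

$\tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w = 144.5 + \left(\frac{20 - 17}{12}\right) 9 = 144.5 + 2.25 = 146.75 \approx 146.8$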
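
The grouped median calculation can be sketched in Python as follows. The function name `grouped_median` is hypothetical, and the lower class boundaries are obtained from the class limits in the table by subtracting half a unit.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median of a grouped FD: L + ((n/2 - F) / f) * w."""
    n = sum(frequencies)
    cumulative = 0  # less than cumulative frequency before the current class
    for lcb, f in zip(lower_boundaries, frequencies):
        # The median class is the first class whose cumulative frequency reaches n/2.
        if f > 0 and cumulative + f >= n / 2:
            return lcb + (n / 2 - cumulative) / f * width
        cumulative += f
    return None  # no observations

# Weights of the 40 male students (classes 118-126, 127-135, ..., 172-180).
lower_boundaries = [117.5, 126.5, 135.5, 144.5, 153.5, 162.5, 171.5]
frequencies = [3, 5, 9, 12, 5, 4, 2]

median = grouped_median(lower_boundaries, frequencies, width=9)
print(median)
```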
f X~
3.6. Mode
The mode, denoted by $\hat{X}$, is the most frequently occurring value in a set of observations, i.e.
the value with the highest frequency. A data set may have one mode (uni-modal), two modes
(bi-modal), more than two modes (multi-modal) or no mode at all (i.e. when all observations are
equally frequent).
Ungrouped (individual series): Arrange the data in ascending order and take the value
appearing most frequently (the most frequent value).
Grouped (continuous) series: In a frequency distribution, the mode is located in the class with
the highest frequency, and that class is the modal class.
Then the formula for the mode is
$\hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w$
where $L_{\hat{X}}$ is the LCB of the modal class, $f_{\hat{X}}$ is the frequency of the modal class,
$f_{\hat{X}-1}$ and $f_{\hat{X}+1}$ are the frequencies of the classes just before and just after the
modal class, and w is the class width.

Mode is not affected by extreme values and can be calculated for open-ended classes. But it
often does not exist and its value may not be unique.
Example 1: A study of the relationship between age and various visual functions (such as acuity
and depth perception) reported the following observations on the area of scleral lamina (mm²)
from human optic nerve heads (Experimental Eye Research, 1988): 2.75, 2.62, 2.74, 3.85, 2.34,
2.74, 3.93, 4.21, 3.88, 4.33, 3.46, 4.52, 2.43, 3.65, 2.78, 3.56, 3.01. Find the mean, median,
mode, Q1, D5 and P75.
Solution: Check the answer (mean=3.341, median=3.46, mode=2.74, Q1=2.74, D5=3.46 &
P75=3.905)
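Example 2: Find the mode of the weights of the 40 male college students at the state university
(the frequency distribution used in the median example) and interpret the result.
Solution: the highest frequency occurs in the class interval 145-153, so
$L_{\hat{X}}$=144.5, $f_{\hat{X}-1}$=9, $f_{\hat{X}+1}$=5, $f_{\hat{X}}$=12 and w=9

$\hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w = 144.5 + \left(\frac{12 - 9}{(12 - 9) + (12 - 5)}\right) 9 = 144.5 + 2.7 = 147.2$

Interpretation: The most common weight among the 40 male college students is about 147.2.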
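
A Python sketch of the grouped mode formula for the same weight distribution is given below. The function name `grouped_mode` is hypothetical. Treating a missing neighbouring class as having frequency 0, and taking the first class when several share the highest frequency, are assumptions made only for this sketch.

```python
def grouped_mode(lower_boundaries, frequencies, width):
    """Mode of a grouped FD: L + (d1 / (d1 + d2)) * w."""
    i = frequencies.index(max(frequencies))   # first class with the highest frequency
    f_mode = frequencies[i]
    # Assumption: a missing neighbour (first or last class) counts as frequency 0.
    f_before = frequencies[i - 1] if i > 0 else 0
    f_after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = f_mode - f_before
    d2 = f_mode - f_after
    return lower_boundaries[i] + d1 / (d1 + d2) * width

lower_boundaries = [117.5, 126.5, 135.5, 144.5, 153.5, 162.5, 171.5]
frequencies = [3, 5, 9, 12, 5, 4, 2]

mode = grouped_mode(lower_boundaries, frequencies, width=9)
print(round(mode, 1))
```

The sketch does not handle the case where the modal class and both neighbours have equal frequencies, since the formula's denominator is then zero.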


Properties of Mode
1. It is simple to calculate and easy to determine.
2. It is not based on all observations.
3. The mode can be used for both qualitative (such as religious preference, gender, political
affiliation, etc) and quantitative data types.
3.7. The Quantiles:
The median of a set of data divides a given data set into two equal parts; there are also measures
that divide a given data set into more than two equal parts. These measures are collectively
known as Quantiles. Quantiles include quartiles, deciles and percentiles.
Quartiles: are values that divide a data set into four equal parts. These values are denoted by
Q1, Q2 and Q3 such that 25% of the data fall below Q1, 50% below Q2 and 75% below Q3.
Deciles: are values that divide the data into ten equal parts. These values are denoted by D1, D2,
…, D9 such that 10% of the data fall below D1, 20% below D2, …, 90% below D9.
Percentiles: are values that divide a data set into 100 equal parts. These values are denoted by
P1, P2, …, P99.
Methods of calculation
a. Ungrouped (individual) series: Arrange the values in ascending order. Then
• Quartiles: Let Qi be the ith quartile (i=1,2,3), then
$Q_i = \left(\frac{i(n+1)}{4}\right)^{th}$ value
• Deciles: Let Di be the ith decile (i=1,2,…,9), then
$D_i = \left(\frac{i(n+1)}{10}\right)^{th}$ value
• Percentiles: Let Pi be the ith percentile (i=1,2,…,99), then
$P_i = \left(\frac{i(n+1)}{100}\right)^{th}$ value
If x1, x2, . . . , xn is the sorted data set and j and k are the integer and fractional parts of
the position of Qi respectively, then Qi lies between xj and xj+1 and is given by
$Q_i = x_j + k(x_{j+1} - x_j)$
The same interpolation is used for deciles and percentiles.

Example: Given the data 42, 43, 35, 38, 41, 49, 50, 51, 52 and 55. Find

A) All quartiles
B) The 2nd and 8th deciles
C) 35th and 75th percentiles

Solutions: In ascending order the data are 35, 38, 41, 42, 43, 49, 50, 51, 52, 55 (n=10).

a) $Q_1 = \left(\frac{1(10+1)}{4}\right)^{th}$ value = 2.75th value
=2nd value +0.75(3rd value-2nd value)
=38+0.75(41-38)=40.25

$Q_2 = \left(\frac{2(10+1)}{4}\right)^{th}$ value = 5.5th value
=5th value +0.5(6th value-5th value)
=43+0.5(49-43)=46 (Q3 left as exercise).

b) $D_2 = \left(\frac{2(10+1)}{10}\right)^{th}$ value = 2.2th value
=2nd value +0.2(3rd value-2nd value)
=38+0.2(41-38)=38.6 (D8 left as exercise)

c) $P_{35} = \left(\frac{35(10+1)}{100}\right)^{th}$ value = 3.85th value
=3rd value+0.85(4th value-3rd value)
=41+0.85(42-41)=41.85 (P75 left as exercise)
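
The position-and-interpolation rule used in these solutions can be sketched as one Python function. The name `quantile` and its parameters are hypothetical, with `parts` being 4 for quartiles, 10 for deciles and 100 for percentiles. Clamping positions that fall outside the data to the smallest or largest value is an assumption of this sketch, not something stated in the notes.

```python
def quantile(data, i, parts):
    """i-th quantile using the (i(n+1)/parts)-th value with linear interpolation."""
    values = sorted(data)
    n = len(values)
    position = i * (n + 1) / parts
    j = int(position)        # integer part of the position
    k = position - j         # fractional part of the position
    if j < 1:                # assumption: clamp to the smallest value
        return values[0]
    if j >= n:               # assumption: clamp to the largest value
        return values[-1]
    # x_j + k * (x_{j+1} - x_j), with 1-based positions mapped to 0-based indexes.
    return values[j - 1] + k * (values[j] - values[j - 1])

data = [42, 43, 35, 38, 41, 49, 50, 51, 52, 55]
print(quantile(data, 1, 4))                  # first quartile
print(quantile(data, 2, 4))                  # second quartile (median)
print(round(quantile(data, 2, 10), 2))       # second decile
print(round(quantile(data, 35, 100), 2))     # 35th percentile
```

Statistical software often uses other interpolation conventions, so results from a library may differ slightly from this (n+1) rule.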
b. Grouped (continuous) data:
• Quartiles: $Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - F_{Q_i-1}}{f_{Q_i}}\right) w$, i=1, 2, 3.
• Deciles: $D_i = L_{D_i} + \left(\frac{\frac{in}{10} - F_{D_i-1}}{f_{D_i}}\right) w$, i=1, 2,…, 9.
• Percentiles: $P_i = L_{P_i} + \left(\frac{\frac{in}{100} - F_{P_i-1}}{f_{P_i}}\right) w$, i=1, 2,…, 99.
Where n=∑f= sum of frequencies
L is the LCB of the ith (quartile, decile or percentile) class.
F is the less than cumulative frequency just before the ith (quartile, decile or percentile)
class.
f is the frequency of the ith (quartile, decile or percentile) class.
w is the class width.
Example 2: In a certain investigation, 460 persons were involved in the study, and based on an
enquiry on their age, the following frequency distribution shows the age composition of the
persons under study.
Age interval in years   Number of persons   LCF
10.5-15.5               24                  24
15.5-20.5               64                  88
20.5-25.5               90                  178
25.5-30.5               122                 300
30.5-35.5               51                  351
35.5-40.5               56                  407
40.5-45.5               20                  427
45.5-50.5               33                  460

a. Find the values of all quartiles.


b. Compute the 5th decile, 25th percentile, 50th percentile and the 75th percentile.
Solution
a) Quartiles: $Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - F_{Q_i-1}}{f_{Q_i}}\right) w$, i=1, 2, 3.

The first quartile class is the class containing the (1×460/4)th = 115th value, which is 20.5-25.5.
$Q_1 = 20.5 + \left(\frac{\frac{1 \times 460}{4} - 88}{90}\right) 5 = 22$

The second quartile class is the class containing the (2×460/4)th = 230th value, which is 25.5-30.5.
$Q_2 = 25.5 + \left(\frac{\frac{2 \times 460}{4} - 178}{122}\right) 5 = 27.63$

b) Deciles: $D_i = L_{D_i} + \left(\frac{\frac{in}{10} - F_{D_i-1}}{f_{D_i}}\right) w$, i=1, 2,…, 9.
The 5th decile class is the class containing the (5×460/10)th = 230th value, which is 25.5-30.5.
$D_5 = 25.5 + \left(\frac{\frac{5 \times 460}{10} - 178}{122}\right) 5 = 27.63$

The 25th percentile class is the class containing the (25×460/100)th = 115th value, which is 20.5-25.5.
$P_{25} = 20.5 + \left(\frac{\frac{25 \times 460}{100} - 88}{90}\right) 5 = 22$

Q3, P50 and P75 are left as an exercise.
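
The grouped formulas for quartiles, deciles and percentiles differ only in the divisor, so one function covers all three. The sketch below uses a hypothetical name, `grouped_quantile`, with `parts` equal to 4, 10 or 100.

```python
def grouped_quantile(lower_boundaries, frequencies, width, i, parts):
    """Grouped quantile: L + ((i*n/parts - F) / f) * w."""
    n = sum(frequencies)
    target = i * n / parts
    cumulative = 0  # less than cumulative frequency before the current class
    for lcb, f in zip(lower_boundaries, frequencies):
        # The quantile class is the first class whose cumulative frequency reaches the target.
        if f > 0 and cumulative + f >= target:
            return lcb + (target - cumulative) / f * width
        cumulative += f
    return None  # no observations

# Age distribution of the 460 persons in the example above.
lower_boundaries = [10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5, 45.5]
frequencies = [24, 64, 90, 122, 51, 56, 20, 33]

q1 = grouped_quantile(lower_boundaries, frequencies, 5, 1, 4)
q2 = grouped_quantile(lower_boundaries, frequencies, 5, 2, 4)
d5 = grouped_quantile(lower_boundaries, frequencies, 5, 5, 10)
p25 = grouped_quantile(lower_boundaries, frequencies, 5, 25, 100)
print(q1, round(q2, 2), round(d5, 2), p25)
```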


Relationship between median, quartiles, deciles and percentiles
• $\tilde{X} = Q_2 = D_5 = P_{50}$
• $Q_i = P_{25i}$, i=1, 2, 3
• $D_i = P_{10i}$, i=1, 2,…, 9


CHAPTER FOUR

4. MEASURES OF VARIATION (DISPERSION)


In the third chapter, we concentrated on a central value (measures of central tendency), which
gives an idea of the whole mass, that is, a complete set of values. However, the information so
obtained is neither exhaustive nor comprehensive, as the mean does not tell us whether
the observations are close to each other or far apart. The median is a positional average and has
nothing to do with the variability of the observations in a data set. The mode is the most
frequently occurring value, independent of the other values in the set. This leads us to conclude
that a measure of central tendency is not enough to give a clear idea about the data unless all
observations are the same. Moreover, two or more data sets may have the same mean and/or median
and still be quite different. So MCT alone do not provide enough information about the nature of
the data.

To illustrate this, let us consider the following four data sets: the price of a certain
commodity in four Maya cities in five different months.

City   January   February   March   April   May
A      30        30         30      30      30
B      28        29         31      30      32
C      15        5          55      45      30
D      3         5          37      30      75

Now if we calculate the mean and median for each city, we come up with the value 30 in every case.
This suggests that the price of the commodity in the four cities A, B, C and D is, on average, the
same. But by inspection it is apparent that the prices differ remarkably from one city to another.
For city A the value 30 describes the data exactly, and for city B it is a reasonable summary, but
for cities C and D it is not realistic to say that the price of the commodity is 30. This means that
by looking only at the average we cannot talk about the data set confidently. So, along with the
average values (measures of central tendency), we have to study the scatteredness or dispersion of
the data.
Dispersion or variation may be defined as the extent of scatteredness of the values around a
measure of central tendency. Thus a measure of dispersion tells us the extent to which the values
of a variable vary about the measure of central tendency.
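
As a rough numerical check of this illustration, the following Python sketch (not part of the original notes) computes the mean and median of each city's prices together with two simple measures of spread. The names `prices` and `summarize` are illustrative assumptions.

```python
from statistics import mean, median, stdev

# Monthly prices (January to May) for the four cities in the table above.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 31, 30, 32],
    "C": [15, 5, 55, 45, 30],
    "D": [3, 5, 37, 30, 75],
}

def summarize(values):
    """Return the centre and two spread measures for one data set."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "sd": stdev(values),  # sample standard deviation
    }

summary = {city: summarize(values) for city, values in prices.items()}

for city, s in summary.items():
    print(city, s["mean"], s["median"], s["range"], round(s["sd"], 2))
```

All four cities share the same mean and median, while the range and standard deviation grow from city A to city D, which is the point of the example.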

4.1. Objectives of Measures of Variation


1. To have an idea about the reliability of the measure of central tendency. If the degree
of scatteredness is large, an average is less reliable. If the value of the dispersion is small,
it indicates that a central value is a good representative of all the values in the data set.
2. To compare two or more sets of data with regard to their variability. Two or more
data sets can be compared by calculating the same measure of dispersion, expressed in the same
unit of measurement. A set with a smaller value possesses less variability, that is, it is more
uniform (or more consistent).
3. To provide information about the structure of the data. A value of a measure of
dispersion gives an idea about the spread of the observations. Further, one can surmise
the limits within which the values in the data set lie.
4. To pave the way for the use of other statistical measures. Measures of dispersion,
especially variance and standard deviation, lead to many statistical techniques such as
correlation, regression and analysis of variance.
4.2. Types of Measures of Variation
Absolute measures of variation: A measure of variation is said to be in absolute form when it
shows the actual amount of variation of the items from a measure of central tendency and is
expressed in the same concrete units in which the data have been expressed.
Relative measure of variation: It is the quotient obtained by dividing the absolute measure by the
quantity with respect to which the absolute deviation has been computed. A relative measure of
variation is a pure number and is used for making comparisons between different distributions.

Absolute Measures                  Relative Measures
Range                              Coefficient of Range
Quartile Deviation                 Coefficient of Quartile Deviation
Mean Deviation                     Coefficient of Mean Deviation
Variance and Standard Deviation    Coefficient of Variation
                                   Standard Scores


4.2.1. Range
It is the simplest and crudest measure of dispersion. Range is defined as the difference between
the largest value (L) and the smallest value (S) in the data.

Ungrouped Data: $R = L - S$

Grouped Data: $R = UCL_{last} - LCL_{first}$
Coefficient of Range (CR)
For raw data: $CR = \frac{L - S}{L + S}$
For grouped data: $CR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}}$

Range hardly satisfies any property of a good measure of dispersion, as it is based on the two
extreme values only, ignoring the others. It is not amenable to further algebraic treatment.
4.2.2. Quartile Deviation
• Sometimes known as the Semi-interquartile Range (SIR)
• Interquartile Range $= Q_3 - Q_1$
$QD = \frac{Q_3 - Q_1}{2}$
Coefficient of QD $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$
QD involves only the middle 50% of the observations by excluding the observations below the
lower quartile and the observations above the upper quartile. Note that QD does not take into
account all the individual values occurring between Q1 and Q3. This means that QD gives no
detailed picture of the variation even within the middle 50% of the values, although it provides
some idea of it if the values are uniformly distributed between Q1 and Q3. It can be calculated
for open-ended classes.
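
The range and quartile-deviation formulas above can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function names are hypothetical, the data are the city C prices from the introduction to this chapter, and the quartiles come from `statistics.quantiles` with its default "exclusive" method, which is assumed here to match the quartile rule used in Chapter Three.

```python
from statistics import quantiles

def range_measures(data):
    """Return (range, coefficient of range) for raw data."""
    largest, smallest = max(data), min(data)
    r = largest - smallest
    return r, r / (largest + smallest)

def quartile_measures(data):
    """Return (quartile deviation, coefficient of quartile deviation)."""
    q1, _, q3 = quantiles(data, n=4)  # default method="exclusive"
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)

city_c = [15, 5, 55, 45, 30]
r, cr = range_measures(city_c)
qd, cqd = quartile_measures(city_c)
print(r, round(cr, 3), qd, round(cqd, 3))
```

A different quartile convention would change Q1 and Q3, and therefore QD, so the printed values depend on that assumption.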

4.2.3. Mean Deviation


It is the arithmetic mean of the absolute values of the deviations from some measure of central
tendency, usually the mean or the median of a distribution. Hence we have the mean deviation
about the mean, $MD(\bar{X})$, and the mean deviation about the median, $MD(\tilde{X})$.
Ungrouped Data: $MD(\bar{X}) = \frac{\sum |X - \bar{X}|}{n}$ and $MD(\tilde{X}) = \frac{\sum |X - \tilde{X}|}{n}$
Grouped Data: $MD(\bar{X}) = \frac{\sum f|X - \bar{X}|}{\sum f}$ and $MD(\tilde{X}) = \frac{\sum f|X - \tilde{X}|}{\sum f}$
Coefficient of Mean Deviation (CMD)
$CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}}$ and $CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}}$
MD is less affected by extreme values than the variance and standard deviation, because the
deviations are not squared. Its main drawback is that the algebraic negative signs of the
deviations are ignored. MD is minimum when the deviations are taken from the median.
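
The ungrouped mean-deviation formulas can be illustrated with a short Python sketch. The data set below is hypothetical (it does not come from the notes) and was chosen so that the mean and the median differ; the function name `mean_deviation` is likewise an assumption.

```python
from statistics import mean, median

def mean_deviation(data, centre):
    """Average absolute deviation of the data from the given centre."""
    return sum(abs(x - centre) for x in data) / len(data)

data = [2, 4, 6, 8, 30]  # hypothetical values with one large observation

md_mean = mean_deviation(data, mean(data))        # about the mean (10)
md_median = mean_deviation(data, median(data))    # about the median (6)

cmd_mean = md_mean / mean(data)                   # coefficient of MD about the mean
cmd_median = md_median / median(data)             # coefficient of MD about the median

print(md_mean, md_median)
```

For these values the mean deviation about the median is smaller than the one about the mean, in line with the last remark above.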
4.2.4. Variance and Standard Deviation
The Variance and Standard Deviation are the most widely used measures of dispersion, and both
measure the average dispersion of the observations around the mean.
For a population containing N elements with mean $\mu$, the population variance ($\sigma^2$) is
calculated by using the formula
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$ for ungrouped data and $\sigma^2 = \frac{\sum f(X - \mu)^2}{\sum f}$ for grouped data.
For a sample of n elements, the sample variance ($S^2$) is calculated by using the formula
$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$ for ungrouped data and $S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f - 1}$ for grouped data.
The first main demerit of variance is that its unit is the square of the unit of measurement of the
variable values. For example, the sample variance of 2 m, 6 m and 4 m is 4 m^2. Interpreting this
as "on average, each value differs from the mean by 4 m^2" is wrong for two reasons. First, the
unit of measurement of the variance (m^2) is not the same as that of the data set (m). Second, the
variation of the data is exaggerated from two to four, since the deviations are squared.
Thus the other disadvantage of variance is that the variation of the data is exaggerated because
the deviation (difference) of each value from the mean is squared. It also gives more weight to
the extreme values than to those which are near the mean.
Standard Deviation: Standard deviation is the positive square root of variance.

Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$

Sample Standard Deviation: $S = \sqrt{S^2}$

Standard deviation is considered to be the best measure of dispersion because its unit of
measurement is the same as that of the data set and the exaggeration made by variance is
eliminated by taking the square root.
If the standard deviation of the data is small, the values are concentrated near the mean, and if
it is large, the values are scattered away from the mean.
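
The small example in the text (2 m, 6 m and 4 m) can be checked with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are assumptions.

```python
from statistics import pvariance, stdev, variance

lengths_m = [2, 6, 4]  # the three lengths from the text, in metres

sample_var = variance(lengths_m)       # divides by n - 1, unit is m^2
sample_sd = stdev(lengths_m)           # square root of the sample variance, unit is m
population_var = pvariance(lengths_m)  # divides by N, treating the data as a population

print(sample_var, sample_sd, round(population_var, 3))
```

The sample variance is 4 (in m^2) while the standard deviation is 2 (in m), which shows how taking the square root restores the original unit.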


Interpretation of the Standard Deviation


If the data are a sample and the distribution is normal (bell-shaped) or approximately normal,
then the following conclusions can be reached:
• Approximately 68% of the scores in the sample fall within one standard deviation of the mean,
i.e. $\bar{X} \pm S$ will include approximately 68% of the data.
• Approximately 95% of the scores in the sample fall within two standard deviations of the mean,
i.e. $\bar{X} \pm 2S$ will include approximately 95% of the data.
• Approximately 99.7% of the scores in the sample fall within three standard deviations of the
mean, i.e. $\bar{X} \pm 3S$ will include approximately 99.73% of the data.
Even though standard deviation is better than variance, there is one difficulty with it. If there
are two or more distributions of different variables (having different units of measurement), their
variability cannot be compared by comparing the values of the standard deviation.
Examples:
1) Compute the variance ($S^2$) and standard deviation (S) for the following data set: 11, 12, 13,
14, 15, 16, 17, 18, 19, 20 and 21.
Here n = 11, $\sum x_i = 176$ and $\sum x_i^2 = 2926$, so
$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - (\sum_{i=1}^{n} x_i)^2 / n}{n - 1} = \frac{2926 - (176)^2 / 11}{10} = 11$

So, $S = \sqrt{S^2} = \sqrt{11} = 3.317$
2) Compute the variance and standard deviation for the data given below.

Observation (Xi)   32     36     40      44     48     Total
Frequency (fi)     2      5      8       4      1      20
fiXi               64     180    320     176    48     788
fiXi^2             2048   6480   12800   7744   2304   31376

$S^2 = \frac{\sum f_i x_i^2 - (\sum f_i x_i)^2 / \sum f_i}{\sum f_i - 1} = \frac{31376 - (788)^2 / 20}{19} = 17.31$

So, $S = \sqrt{S^2} = \sqrt{17.31} = 4.16$


3) Calculate the variance and standard deviation for the following grouped frequency
distribution, where mi is the class mark (midpoint) of each class.

Class intervals   Frequency (fi)   mi    fimi   fimi^2
1-3               1                2     2      4
3-5               9                4     36     144
5-7               25               6     150    900
7-9               35               8     280    2240
9-11              17               10    170    1700
11-13             10               12    120    1440
13-15             3                14    42     588
Total             100                    800    7016

$S^2 = \frac{\sum f_i m_i^2 - (\sum f_i m_i)^2 / \sum f_i}{\sum f_i - 1} = \frac{7016 - (800)^2 / 100}{99} = 6.22$

So, $S = \sqrt{6.22} = 2.49$
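
The three worked examples above all use the same computational (shortcut) formula, which can be sketched in Python as follows. The function name `sample_variance` and its optional `freqs` argument are illustrative assumptions, not notation from the notes.

```python
from math import sqrt

def sample_variance(values, freqs=None):
    """Sample variance via (sum f*x^2 - (sum f*x)^2 / sum f) / (sum f - 1)."""
    if freqs is None:
        freqs = [1] * len(values)  # raw data: every value has frequency 1
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, values))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, values))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Example 1: raw data 11, 12, ..., 21
var1 = sample_variance(list(range(11, 22)))

# Example 2: observations with frequencies
var2 = sample_variance([32, 36, 40, 44, 48], [2, 5, 8, 4, 1])

# Example 3: class marks with frequencies
var3 = sample_variance([2, 4, 6, 8, 10, 12, 14], [1, 9, 25, 35, 17, 10, 3])

for v in (var1, var2, var3):
    print(round(v, 2), round(sqrt(v), 2))
```

For grouped data the class marks stand in for the individual observations, so the result is an approximation to the variance of the underlying raw data.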
Properties of Variance and Standard Deviation
1. The variance and standard deviation are always non-negative.
2. If every value is multiplied by a constant C, the new variance is $S^2_{new} = C^2 S^2_{old}$ and the
new standard deviation is $S_{new} = |C| S_{old}$.
3. When a constant C is added to (or subtracted from) each and every value, the standard
deviation and variance remain the same.
4.2.5. Coefficient of Variation
All absolute measures of dispersion have units. If two or more distributions differ in their units
of measurement, their variability cannot be compared by any of the absolute measures given
before. Also, the size of these measures of dispersion depends upon the size of the values. That
is, if the values are larger, the value of the absolute measure will also be larger. Hence,
in situations where either the two or more data sets have different units of measurement, or their
means differ sufficiently in size, absolute measures fail to be appropriate.
The coefficient of variation (CV) is a relative measure based on the standard deviation. It is the
ratio of the standard deviation to the mean and is expressed as a percentage.

$CV = \frac{\sigma}{\mu} \times 100\%$, for population

$CV = \frac{S}{\bar{X}} \times 100\%$, for sample


It is used for comparing the variability of two or more distributions. The distribution having the
smaller CV is said to be less variable, or more consistent, or more uniform.
Since absolute measures depend on the units of measurement of the data, they fail to be
appropriate for comparing two or more groups if
1. The groups have different units of measurement.
2. The sizes of the values (and hence the means) of the groups are not the same.
When either of these two conditions holds, we have to use relative measures of variation. CV
is a unitless measure of variation and also takes into account the size of the means of the
distributions.
EX: Given Data Set A: 2 Meters, 4 Meters, 6 Meters
Data Set B: 1000 Liters, 800 Liters, 900 Liters
Compare the variability of the two data sets using standard deviation and coefficient of variation.
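
One possible way to work through this comparison is sketched below in Python, treating both data sets as samples (an assumption, since the notes do not say). The helper name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample CV in percent: (S / mean) * 100."""
    return stdev(data) / mean(data) * 100

set_a = [2, 4, 6]           # metres
set_b = [1000, 800, 900]    # litres

sd_a, sd_b = stdev(set_a), stdev(set_b)
cv_a, cv_b = coefficient_of_variation(set_a), coefficient_of_variation(set_b)

print(sd_a, sd_b)                      # standard deviations, in different units
print(round(cv_a, 2), round(cv_b, 2))  # unitless percentages
```

The standard deviations are in different units and cannot be compared directly, whereas the CV values are unitless. Under the sample assumption, Data Set A has the larger CV and is therefore relatively more variable, even though its standard deviation is numerically smaller.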
4.2.6. Standard Score (Z-score)
It is used to determine how many standard deviations a given value lies above or below the
mean: a positive z-score means the value is above the mean and a negative z-score means it is
below the mean.
$Z = \frac{X - \mu}{\sigma}$, for population

$Z = \frac{X - \bar{X}}{S}$, for sample
Example: Suppose Ablakat scored 90 on a basic statistics test in which the mean and standard
deviation of the class were 70 and 10 respectively. In a second test, Meklit scored 60, and the
mean and standard deviation of her class were 56 and 4 respectively. Who is better off relative
to her class?
Solution:
Ablakat: $Z = \frac{90 - 70}{10} = 2.0$
Meklit: $Z = \frac{60 - 56}{4} = 1.0$
The score of Ablakat (90) is 2 standard deviations above the mean of her class, whereas the score
of Meklit (60) is 1 standard deviation above the mean of her class. This implies that Ablakat's
score is the better score relative to her class.
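
The standard-score calculation for this example can be written as a one-line function. The sketch below is illustrative only, and the function name `z_score` is an assumption.

```python
def z_score(value, mean, sd):
    """Number of standard deviations by which value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z_ablakat = z_score(90, 70, 10)
z_meklit = z_score(60, 56, 4)

print(z_ablakat, z_meklit)
```

Because both results are unitless, scores from two different tests can be compared directly.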
4.3. Moments (about the origin and about the mean)
If X is a variable that assumes the values X1, X2, ..., XN, then:

a. The rth moment about a number A is defined as $\mu_r = \frac{\sum (X - A)^r}{N}$ for ungrouped data.
For grouped data, $\mu_r = \frac{\sum f(X - A)^r}{\sum f}$.

b. If A = 0, $\mu_r = \frac{\sum X^r}{N}$ for ungrouped data and $\mu_r = \frac{\sum fX^r}{\sum f}$ for grouped data; then $\mu_r$ is
called the rth moment about the origin.

c. If A = $\bar{X}$, $\mu_r' = \frac{\sum (X - \bar{X})^r}{N}$ for ungrouped data and $\mu_r' = \frac{\sum f(X - \bar{X})^r}{\sum f}$ for grouped data; then
$\mu_r'$ is called the rth central moment.
For any distribution,
• $\mu_0' = 1$, $\mu_1' = 0$ and $\mu_2' = \sigma^2$
• By using the first two moments about any arbitrary value A, the mean and variance may be
computed as $\bar{X} = \mu_1 + A$ and $\sigma^2 = \mu_2 - (\mu_1)^2$, where $\mu_r$ is the rth moment about A.
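
The relations between moments listed above can be checked numerically. The following Python sketch uses the data set 11, 12, ..., 21 from the variance examples and an arbitrary value A = 15; the function name `moment` and the choice of A are illustrative assumptions.

```python
from statistics import mean, pvariance

def moment(data, r, about=0):
    """r-th moment of ungrouped data about the given number."""
    return sum((x - about) ** r for x in data) / len(data)

data = list(range(11, 22))
x_bar = mean(data)

# Central moments of order 0, 1 and 2
central = [moment(data, r, about=x_bar) for r in (0, 1, 2)]

# Mean and variance recovered from the first two moments about A
A = 15
m1, m2 = moment(data, 1, about=A), moment(data, 2, about=A)
mean_from_moments = m1 + A
variance_from_moments = m2 - m1 ** 2

print(central, mean_from_moments, variance_from_moments)
```

The second central moment equals the population variance (division by N), not the sample variance $S^2$ used in the earlier worked examples.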
4.4. Skewness
4.4.1. Frequency Curve
Frequency Curve is one of the graphical methods of data presentation. It is a smooth free-hand
curve drawn through a frequency polygon, which is a graph of line segments joining the points
whose coordinates are the class marks and the corresponding frequencies.
1. Normal (Symmetrical) Curve (bell-shaped curve): a frequency curve that looks the same to
the left and to the right of the central point. The distribution spreads around a central value in
a symmetrical pattern. That is, the observations are equally distributed about the mean of the
distribution.
In this case
• The lengths of both tails (right and left) are the same.
• The mean, median and mode are equal.
• The corresponding pairs of quartiles, deciles and percentiles are equidistant from the
median. For example, the first quartile and the third quartile are at the same distance from
the median.


2. Positively Skewed curve: If one or more observations are extremely large, the mean of the
distribution becomes greater than the median or mode and the distribution is said to be positively
skewed.
In this case
 The right tail is more elongated, longest tail to the right of the central point.
 More values are on the left of the mean.
 The extreme variation is towards large values (to the right).
 Smaller values are more frequent.
 Mean>Median>Mode

3. Negatively Skewed Curve: If one or more extremely small observations are present, the mean
is the smallest of the three averages, and the distribution is said to be negatively skewed.
In this case
 The left tail is more elongated.
 More observations are concentrated on the right of the mean
 The extreme variation is towards lower values (to the left).
 Larger values are more frequent than small values
 Mean<Median<Mode


Relationship between the Arithmetic mean, Median and Mode


 In a symmetrical and uni-modal distribution, mean median and mode coincide.
i.e. Mean=Median=Mode
 For a moderately skewed distribution
Mean-Mode=3(Mean-Median)
4.4.2. Measures of Skewness
Skewness is the lack of symmetry or departure (asymmetry) from the normal curve. If the
frequency curve is symmetrical then it has no skewness that is the skewness is zero.
The measure of the degree of asymmetry is called a measure of skewness. If both tails (left and
right) of a frequency curve are not equally distributed, the curve is asymmetric and is called a
skewed curve.
1. Comparing the three measures of central tendency (Mean, Median and Mode)
If mean>Median>Mode, Positively Skewed Distribution
If Mean<Median<Mode, Negatively skewed Distribution
If Mean=Median=Mode, Symmetrical (Normal) Distribution
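2. The Moment Measure of Skewness ($\alpha_3$)
$\alpha_3 = \frac{\mu_3'}{(\mu_2')^{3/2}} = \frac{\mu_3'}{\sigma^3}$

If $\alpha_3$ = 0, Symmetrical
If $\alpha_3$ > 0, Positively Skewed
If $\alpha_3$ < 0, Negatively Skewed
where $\mu_r'$ is the rth central moment defined in Section 4.3.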
Example:

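3. Karl Pearson's Coefficient of Skewness (Skp)

$Sk_p = \frac{\bar{X} - \hat{X}}{S}$, where $\bar{X}$ is the mean, $\hat{X}$ is the mode and S is the standard deviation.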
If Skp =0, Symmetrical
If Skp >0, Positively Skewed
If Skp <0, Negatively Skewed
4. Bowley's Coefficient of Skewness (Skb)

$Sk_b = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$
If Skb =0, Symmetrical
If Skb >0, Positively Skewed
If Skb <0, Negatively Skewed
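
The three coefficients of skewness can be compared on a small data set with the Python sketch below. It is an illustration under stated assumptions: the data sets are hypothetical, the function names are not from the notes, the moment measure uses central moments with divisor N, and the quartiles come from `statistics.quantiles` with its default "exclusive" method.

```python
from statistics import mean, mode, quantiles, stdev

def central_moment(data, r):
    """r-th central moment (divisor N)."""
    m = mean(data)
    return sum((x - m) ** r for x in data) / len(data)

def moment_skewness(data):
    """Third central moment divided by the 3/2 power of the second."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def pearson_skewness(data):
    """(mean - mode) / sample standard deviation."""
    return (mean(data) - mode(data)) / stdev(data)

def bowley_skewness(data):
    """(Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    q1, q2, q3 = quantiles(data, n=4)
    return (q3 + q1 - 2 * q2) / (q3 - q1)

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 2, 2, 3, 10]   # hypothetical data with one large value

print(moment_skewness(symmetric))
print(moment_skewness(right_skewed),
      pearson_skewness(right_skewed),
      bowley_skewness(right_skewed))
```

All three measures are positive for the second data set, in agreement with the sign rules above, although their magnitudes are not directly comparable with one another.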
4.4.3. Kurtosis
The shape of the peak of a distribution may be sharp or flat. Kurtosis refers to the peakedness or
flatness of a distribution with respect to the normal distribution. It is the extent to which
the curve is more peaked or more flat-topped than the normal curve.
1. If a distribution is more peaked than normal, it is called a leptokurtic distribution.

2. If a distribution is more flat-topped than normal, it is called platykurtic.

3. A distribution which is neither more peaked nor more flat-topped than normal is called
mesokurtic.


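Measures of Kurtosis
1. The coefficient of Kurtosis
$K = \frac{Q_3 - Q_1}{D_9 - D_1}$
2. The Moment Measure of Kurtosis
$\beta = \frac{\mu_4'}{(\mu_2')^2} = \frac{\mu_4'}{\sigma^4}$, where $\mu_r'$ is the rth central moment defined in Section 4.3.
If $\beta$ = 3, Mesokurtic; if $\beta$ > 3, Leptokurtic; and if $\beta$ < 3, Platykurtic.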


CHAPTER FIVE

5. ELEMENTARY PROBABILITY

5.1. Definition of Terms and concepts


As a general concept, probability is the measure of the chance that something will occur. It is a
numerical measure with a value between 0 (0%) and 1 (100%), where a probability of 0
indicates that the given event cannot occur and a probability of 1 (100%) indicates that the event
is certain to occur.
Probability: a quantitative measure of uncertainty, that is, of the chance that a particular result
of an activity governed by chance will occur.
Experiment: an activity or a trial that leads to well-defined results called outcomes, where it is
uncertain which result will occur.
Outcome: a particular result of an experiment.
Sample space: the set of all possible outcomes of the experiment. Each possible outcome is
called a sample point. The sample space is denoted by S.
Examples: Define the sample space for the following probability experiments.
I. Tossing a coin: S={H, T}
II. Tossing two coins: S={HH, HT, TH, TT}
III. Rolling a die: S={1, 2, 3, 4, 5, 6}
Event: An event is a subset of the sample space; in other words, an event is a set containing
sample points of the sample space under consideration.
Example: If we roll a fair die, then the experiment is rolling the die.
The sample space S for this experiment is S = {1, 2, 3, 4, 5, 6}.
If we are interested in the outcomes that are even numbers, then the event is E = {2, 4, 6}.
Elementary or simple event: An event having only one sample point is an elementary or simple
event.
Mutually exclusive events: Two events E1 and E2 are said to be mutually exclusive if
there is no sample point which is common to both E1 and E2, that is, E1 ∩ E2 = Ø.
Mutually exclusive events are events which cannot happen at the same time. Example: consider
the experiment of tossing two coins. Let E1 be the event that no heads are shown, E2 the event
that one head is shown and E3 the event that two heads are shown. Are E1, E2 and E3 mutually
exclusive?
Solution
S = {HH, HT, TH, TT}
E1 = {TT}
E2 = {HT, TH}
E3 = {HH}
E1 ∩ E2 = E2 ∩ E3 = E1 ∩ E3 = Ø
Thus, E1 and E2, E2 and E3, E1 and E3 are mutually exclusive events.
Independent events: Two events E1 and E2 are said to be independent if the occurrence of E1 has
no effect on the occurrence of E2. That means that knowing that E1 has occurred gives no
information about the occurrence of E2. If two events are not independent, they are said
to be dependent.
Equally likely outcomes: If each outcome in the sample space of an experiment has the
same chance of occurring, then we say that the outcomes are equally likely. Example:
in throwing a fair die all possible outcomes are equally likely. That means the
elements of the sample space have the same chance to occur.
Set theory
Set: is any well-defined list or collection of objects.
• Null event: is an event which contains no outcome of the experiment.
• Intersection of two events: let A and B be events; then the intersection of A and B is the set
of elements that are common to both A and B.
• Union of events: let A and B be two events; then the union of the two events is the set of
elements that belong to A or B or both.
• Complement of an event: let A be an event; A' is the event that occurs if A does not occur.
• Mutually exclusive events: two events A and B are said to be mutually exclusive if
they cannot occur together, i.e. A ∩ B = Ø.
• Exhaustive events: events A1, A2, ..., An are said to be exhaustive if their union gives the
sample space.
• Independent events: two events A and B are said to be independent if the occurrence or
non-occurrence of one does not affect the occurrence or non-occurrence of the other.
Concept of Set
In order to discuss the theory of probability, it is essential to be familiar with some ideas and
concepts of mathematical theory of set. A set is a collection of well-defined objects which is
denoted by capital letters like A, B, C, etc.
In describing which objects are contained in set A, two common methods are available.
These methods are:
1. Listing all objects of A. For example, A = {1, 2, 3, 4} describes the set consisting of the
positive integers 1, 2, 3 and 4.
2. Describing a set in words, for example, set A consists of all real numbers between 0 and 1,
inclusive. It can be written as A = {x : 0 ≤ x ≤1}, that is, A is the set of all x‟s where x is a
real number between 0 and 1, inclusive.
If A = {a1, a2, ..., an}, then each object ai, i = 1, 2, ..., n, belonging to set A is called a member or
an element of set A, i.e., ai ∈ A. A set consisting of all possible elements under consideration is
called a universal set (denoted by U). On the other hand, a set containing no element is called
an empty set (denoted by Ø or {}).
If every element of set A is also an element of set B, A is said to be a subset of B, written A ⊆ B.
Every set is a subset of itself, i.e., A ⊆ A. The empty set is a subset of every set. If A ⊆ B and
B ⊆ C, then A ⊆ C. If A ⊆ B and B ⊆ A, then A and B are said to be equal.
5.2. Counting Techniques
Counting techniques are mathematical models which are used to determine the number of
possible ways of arranging or ordering objects. They are used to find the size of a sample space
that is too large to list.
In order to determine the number of outcomes, one can use several rules of counting.
• The addition rule
• The multiplication rule
• Permutation rule
• Combination rule
a. Addition Rule: suppose there are k procedures ($p_1, p_2, ..., p_k$) in which the ith procedure can
be done in $n_i$ ways, i = 1, 2, ..., k. Then the total number of ways of performing $p_1$ or $p_2$
or ... or $p_k$ is $n_1 + n_2 + ... + n_k$, provided that no two procedures can be performed at the same
time or one after the other.
Example:
a) In a certain class a class representative is to be chosen from 3 female and 4 male students.
Count the ways in which a class representative can be chosen.
b) There are 2 bus and 3 train routes from city X to city Y. In how many ways can a person go
from city X to city Y?
Solution a) Here, a female representative can be chosen in 3 ways and a male representative can
be chosen in 4 ways. Therefore, the number of ways in which a class representative can be
chosen is 3 + 4 = 7 ways.
Solution b) The person can take one of the 2 bus routes or one of the 3 train routes, so the
journey can be made in 2 + 3 = 5 ways.
b. The Multiplication Rule: If a choice consists of k steps of which the first can be made in n1
ways, the second can be made in n2 ways, ..., the kth can be made in nk ways, then the whole
choice can be made in $n_1 \times n_2 \times ... \times n_k$ ways.

Example:
i) The instructor gives a 6-question multiple choice examination. There are 4 possible
responses to each question. How many answer keys can be made?
ii) The personnel department of a large corporation wishes to issue an ID card to each employee
with a 4-digit number. How many ID cards can be prepared
A. If repetition is allowed?
B. If repetition is not allowed?
iii) There are 2 bus routes from city X to city Y and 3 train routes from city Y to city Z. In how
many ways can a person go from city X to city Z?
Solution i: There are six questions (N = 6), each with 4 choices, so K1 = K2 = ... = K6 = 4.
Total = 4 × 4 × 4 × 4 × 4 × 4 = 4^6 = 4096
ii A). We have 10 digits (0 to 9), so
K1 = K2 = K3 = K4 = 10.
Total = 10 × 10 × 10 × 10 = 10^4 = 10000
B). K1 = 10, K2 = 9, K3 = 8, K4 = 7 because repetition is not allowed.
Total = 10 × 9 × 8 × 7 = 5040.
iii. The first step (X to Y) can be made in 2 ways and the second step (Y to Z) in 3 ways.
Total = 2 × 3 = 6.
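
The counts in these examples can be confirmed by brute-force enumeration. The Python sketch below is illustrative only; the route labels and variable names are hypothetical.

```python
from itertools import permutations, product

# i) 6 questions with 4 possible responses each
answer_keys = 4 ** 6

# ii A) 4-digit IDs with repetition: every ordered 4-tuple of digits
ids_with_repetition = sum(1 for _ in product(range(10), repeat=4))

# ii B) 4-digit IDs without repetition: ordered selections of 4 distinct digits
ids_without_repetition = sum(1 for _ in permutations(range(10), 4))

# iii) one bus route from X to Y followed by one train route from Y to Z
bus_routes = ["bus 1", "bus 2"]
train_routes = ["train 1", "train 2", "train 3"]
journeys_x_to_z = len(list(product(bus_routes, train_routes)))

print(answer_keys, ids_with_repetition, ids_without_repetition, journeys_x_to_z)
```

Enumeration is practical only for small cases like these. The multiplication rule gives the same totals without listing anything.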
c. Permutation: is the arrangement of objects in a specified order.
a. Permutation Rule 1: The number of permutations of n distinct objects taken all together is n!,
where $n! = n \times (n - 1) \times (n - 2) \times ... \times 3 \times 2 \times 1$. By definition 1! = 0! = 1.
Example: In how many ways can 6 persons be seated in a row? Ans: 6! = 720


Exercise: Suppose that a photographer must arrange 4 people in a row for a photograph. In how
many different ways can the arrangement be done?
b. Permutation Rule 2: The arrangement of n objects in a specified order using r objects at a
time is called the permutation of n objects taken r at a time. It is written as $_nP_r$
and the formula is
$_nP_r = \frac{n!}{(n - r)!}$
Example: In how many ways can 9 books be arranged on a shelf having 4 places?
Ans: $_9P_4 = \frac{9!}{(9 - 4)!} = 3024$
Exercise: How many flags of two colors can be formed from a piece of cloth consisting of six
different colors?
c. Permutation Rule 3: The number of permutations of n objects in which n1 are alike, n2 are
alike, ..., nk are alike is given by
$\frac{n!}{n_1! \, n_2! \cdots n_k!}$
Example: How many different permutations can be made from the letters in the word
"CORRECTION"?
Solution:
Here n = 10, of which 2 are C, 2 are O, 2 are R, 1 is E, 1 is T, 1 is I and 1 is N.
So n1 = 2, n2 = 2, n3 = 2 and n4 = n5 = n6 = n7 = 1.
Using the 3rd rule of permutation, there are
$\frac{10!}{2! \, 2! \, 2! \, 1! \, 1! \, 1! \, 1!} = 453600$ permutations.
Exercise: In how many different ways can the letters in the term "STATISTICS" be arranged?
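
The permutation rules can be sketched in Python using the standard `math` and `collections` modules. The helper name `arrangements` is an assumption introduced for illustration, and `math.perm` requires Python 3.8 or later.

```python
from collections import Counter
from math import factorial, perm

def arrangements(word):
    """Permutations of all letters of word, where repeated letters are alike (Rule 3)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide by n_i! for each group of alike letters
    return total

seating = perm(6)               # Rule 1: 6 persons in a row, 6!
books = perm(9, 4)              # Rule 2: 9 books in 4 places, 9! / (9 - 4)!
correction = arrangements("CORRECTION")

print(seating, books, correction)
```

The same `arrangements` function can be applied to any other word, such as the one in the exercise above.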

d. Combination: is the selection of objects without regard to order.

Combination Rule: The number of combinations of r objects selected from n objects is denoted by
$_nC_r$ or $\binom{n}{r}$ and is given by the formula:
$\binom{n}{r} = \frac{n!}{(n - r)! \, r!}$
Examples:
1. In how many ways can a committee of 5 people be chosen out of 9 people?
Solution:

n = 9, r = 5
$\binom{9}{5} = \frac{9!}{4! \, 5!} = 126$ ways

2. Among 15 clocks there are two defectives. In how many ways can an inspector choose three
of the clocks for inspection so that:
a) There is no restriction.
b) None of the defective clocks is included.
c) Only one of the defective clocks is included.
d) Two of the defective clocks are included.
Solutions: n = 15, of which 2 are defective and 13 are non-defective, and r = 3.

a) If there is no restriction, select three clocks from the 15 clocks. This can be done in:

$\binom{15}{3} = \frac{15!}{12! \, 3!} = 455$ ways

b) None of the defective clocks is included.
This is equivalent to zero defective and three non-defective, which can be done in:

$\binom{2}{0} \times \binom{13}{3} = 286$ ways.
c) Only one of the defective clocks is included.
This is equivalent to one defective and two non-defective, which can be done in:

$\binom{2}{1} \times \binom{13}{2} = 156$ ways.
d) Two of the defective clocks are included.
This is equivalent to two defective and one non-defective, which can be done in:

$\binom{2}{2} \times \binom{13}{1} = 13$ ways.
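
The clock example can be verified with `math.comb` (available from Python 3.8). The sketch below is illustrative, and the helper name `ways` is hypothetical.

```python
from math import comb

DEFECTIVE, GOOD, CHOSEN = 2, 13, 3

def ways(defective_chosen):
    """Ways to choose 3 clocks containing exactly the given number of defectives."""
    return comb(DEFECTIVE, defective_chosen) * comb(GOOD, CHOSEN - defective_chosen)

no_restriction = comb(DEFECTIVE + GOOD, CHOSEN)
none_defective, one_defective, two_defective = ways(0), ways(1), ways(2)

print(no_restriction, none_defective, one_defective, two_defective)
```

The three restricted cases are mutually exclusive and cover every possibility, so by the addition rule their counts add up to the unrestricted count.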
5.3. Definitions of Probability approaches
1. Classical (Mathematical) Probability: Suppose there are N possible outcomes in the
sample space S of an experiment. If, out of these N outcomes, only n are favorable to the event
E, then the probability that the event E will occur is $P(E) = \frac{n(E)}{n(S)} = \frac{n}{N}$.
Examples:
a) What is the probability of getting number 6 in rolling a die?
b) What is the probability of getting two heads in tossing two coins?
c) A family plans to have three children. Describe the sample space for all possible gender
combinations. What is the probability that the family will have two boys?
d) A die is rolled. What is the probability of getting
i. An odd number.
ii. A number greater than 3.
e) Two dice are rolled. Describe the sample space. What is the probability of getting
i. A sum of 10 or more.
ii. A pair in which at least one number is 3.
iii. A sum of 8, 9 or 10.
iv. One number less than 4.
Solutions:
a) S={1, 2, 3, 4, 5, 6} and E=getting number 6={6}. Thus n(S)=6 and n(E)=1
P(E)=n(E)/n(S)=1/6
b) S={HH, HT, TH, TT}and E={HH}. Thus n(S)=4 and n(E)=1
P(E)=n(E)/n(S)=1/4
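
For the two-dice examples, the classical definition can be applied by listing the sample space and counting favorable outcomes. The Python sketch below shows one way to do this for parts e) i to iii, assuming fair dice so that all 36 ordered pairs are equally likely; the function name `probability` is an assumption.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two dice: all ordered pairs (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Classical probability n(E) / n(S) for an event given as a true/false rule."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

p_sum_at_least_10 = probability(lambda o: sum(o) >= 10)
p_at_least_one_3 = probability(lambda o: 3 in o)
p_sum_8_9_10 = probability(lambda o: sum(o) in (8, 9, 10))

print(len(sample_space), p_sum_at_least_10, p_at_least_one_3, p_sum_8_9_10)
```

`Fraction` keeps the results exact, so they can be compared directly with hand calculations.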


2. Empirical or Relative Frequency Probability: It is based on a relative frequency. Given a
frequency distribution, the probability of an event being in a given class is $P(E) = \frac{f}{\sum f}$,
where f is the class frequency and $\sum f = n$ is the total number of observations.

The difference between classical and empirical probability is that the former uses the sample
space to determine the numerical probability while the latter is based on a frequency distribution.

Example: Given the following frequency distribution.

Grade             A    B    C    D    F
No. of students   10   20   50   15   5

What is the probability of selecting a student who scored B?


3. Subjective Probability: calculates probability based on an educated guess or experience or
evaluation of a problem. For example a physician might say that on the basis of his/her
diagnosis, there is a 30% chance the patient will need an operation.
5.4. Properties (Rules) of Probability
1. The probability of an event always lies between 0 and 1, inclusive. It can never be
negative or greater than one, that is, 0 ≤ P(E) ≤ 1.
• P(E) = 0 means it is sure that E can never happen.
• P(E) = 1 means the event E is certain to occur.
Example: What is the probability of getting
a) Number 9 in rolling a die? Ans: 0
b) A number less than 7 in rolling a die? Ans: 1
2. If the probability that an event E will occur is P(E), then the probability that this event will
not occur is P(E'), where P(E') = 1 - P(E).
3. The sum of the probabilities of all outcomes in the sample space S is 1, i.e. ∑Pi = 1.
Example: Rolling a die

Outcome 1 2 3 4 5 6

Probability 1/6 1/6 1/6 1/6 1/6 1/6


∑Pi=1/6+1/6+1/6+1/6+1/6+1/6=1
4. If there are two events E1 and E2, the probability that at least one of these events will occur
is the sum of the probabilities that each event will occur minus the probability that both events
will occur at the same time (simultaneously).
P(E1 ∪ E2) = P(E1) + P(E2) - P(E1 ∩ E2)
Examples:
i. Assume that 50 students take exams in statistics and economics. Out of these
students, 20 passed in statistics, 15 passed in economics and 18 failed in both subjects. If one
of these students is selected at random, find the probability that the student:
A. Passed in both exams.
B. Failed only in statistics.
C. Failed in statistics or economics.
ii. An MBA applies for a job in two firms X and Y. The probability of his being selected in firm
X is 0.7 and of being rejected at Y is 0.5. The probability of at least one of his applications being
rejected is 0.6. What is the probability that he will be selected in at least one of the firms?
Solutions: i. Let E = the event that the student passes in statistics.
F = the event that the student passes in economics.
From this, P(E) = 20/50, P(F) = 15/50, P(E' ∩ F') = 18/50
A. P(E ∩ F) = P(E) + P(F) - P(E ∪ F), but P(E ∪ F) = 1 - P((E ∪ F)') = 1 - P(E' ∩ F') = 1 - 18/50 = 32/50
= 20/50 + 15/50 - 32/50 = 3/50
B. P(E' ∩ F) = P(F) - P(E ∩ F) = 15/50 - 3/50 = 12/50
C. P(E' ∪ F') = P(E') + P(F') - P(E' ∩ F')
= 30/50 + 35/50 - 18/50 = 47/50
ii. Let E = the event that the person is selected in firm X.

F = the event that the person is selected in firm Y.

P(E) = 0.7, P(E') = 0.3
P(F') = 0.5, P(F) = 0.5
P(E' ∪ F') = 0.6
P(E ∪ F) = P(E) + P(F) - P(E ∩ F), but P(E ∩ F) = 1 - P(E' ∪ F') = 1 - 0.6 = 0.4
= 0.7 + 0.5 - 0.4 = 0.8


5.5. Conditional Probability and Independence


5.5.1. Conditional Probability
When the outcome or occurrence of an event affects the outcome or occurrence of another event,
the two events are said to be dependent (conditional).
If two events A and B are dependent on each other, the probability of event B occurring
knowing that event A has already occurred is said to be the conditional probability of B given
that event A has already occurred,
$P(B/A) = \frac{P(A \cap B)}{P(A)}$, P(A) > 0,
and the probability of event A occurring knowing that event B has already occurred is said to be
the conditional probability of A given that event B has already occurred,
$P(A/B) = \frac{P(A \cap B)}{P(B)}$, P(B) > 0.
=> P(A ∩ B) = P(A) P(B/A) and P(A ∩ B) = P(B) P(A/B).
Example: A drawer contains 4 black, 6 brown, and 8 olive socks. Two socks are selected at
random from the drawer.
(a) What is the probability that both socks are of the same color?
(b) What is the probability that both socks are olive if it is known that they are of the same color?
Solution
The sample space S of this experiment consists of all possible pairs of socks that can be drawn
from the 18 socks in the drawer (colors: black B, brown Br, olive O).
The cardinality of S is
$$N(S) = \binom{18}{2} = 153.$$
(a) Let A be the event that the two socks selected at random are of the same color.
Then the cardinality of A is given by
$$N(A) = \binom{4}{2} + \binom{6}{2} + \binom{8}{2} = 6 + 15 + 28 = 49.$$
Then
$$P(A) = \frac{N(A)}{N(S)} = \frac{49}{153} \approx 0.320.$$
Let B be the event that the two socks selected at random are olive. Then the cardinality of B is
given by
$$N(B) = \binom{8}{2} = 28, \quad P(B) = \frac{N(B)}{N(S)} = \frac{28}{153} \approx 0.183.$$
(b) Notice that B is a proper subset of A (B ⊂ A), so A ∩ B = B. Hence
$$P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{P(B)}{P(A)} = \frac{28/153}{49/153} = \frac{28}{49} \approx 0.571.$$
This indicates that the conditional probability of B given A is about 0.571.
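The counting argument in the socks example can be checked with a short script. This is only a sketch: the variable names are illustrative, and exact fractions are used so that no rounding enters until the final print.

```python
from fractions import Fraction
from math import comb

# Counts taken from the example: 4 black, 6 brown, 8 olive socks.
socks = {"black": 4, "brown": 6, "olive": 8}
total = sum(socks.values())                      # 18 socks in the drawer

n_S = comb(total, 2)                             # all unordered pairs of socks
n_A = sum(comb(k, 2) for k in socks.values())    # pairs of the same color
n_B = comb(socks["olive"], 2)                    # pairs that are both olive

p_A = Fraction(n_A, n_S)
p_B = Fraction(n_B, n_S)
# B is a subset of A, so P(A and B) = P(B).
p_B_given_A = p_B / p_A

print(n_S, n_A, n_B)
print(float(p_A), float(p_B), float(p_B_given_A))
```

Working with counts rather than rounded decimals avoids carrying rounding error into the conditional probability.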
5.5.2. Independent events
Two events are said to be independent if the occurrence of one does not affect the occurrence of
the other. If A and B are independent, the probability of A occurring is in no way affected by
event B having occurred, and vice versa. Hence, P(A ∩ B) = P(A)P(B).
Theorem: Let A and B be independent events. Then
I. A and B' are independent.
II. A' and B' are independent.
III. A' and B are independent.
IV. P(A/B) = P(A) if P(B) > 0, and P(B/A) = P(B) if P(A) > 0.
Proof
I. P(A ∩ B') = P(A) - P(A ∩ B)
= P(A) - P(A)P(B)
= P(A)[1 - P(B)]
= P(A)P(B')
So, they are independent.
II. P(A' ∩ B') = P((A ∪ B)')
= 1 - P(A ∪ B)
= 1 - [P(A) + P(B) - P(A ∩ B)]
= 1 - P(A) - P(B) + P(A ∩ B)
= P(A') - P(B) + P(A)P(B)
= P(A') - P(B)[1 - P(A)]
= P(A') - P(B)P(A')
= P(A')[1 - P(B)]
= P(A')P(B')
So, they are independent.
III follows from I by interchanging the roles of A and B.
Example:
i. An urn contains 6 white and 3 black balls. Three balls are drawn. What is the probability that
all the drawn balls will be black?
A. If the selection is done with replacement.
B. If the selection is without replacement.
Solutions
Total balls N = 9, number of black balls n = 3.
Let E1 = the event that the first ball selected is black, E2 = the event that the second ball
selected is black, and E3 = the event that the third ball selected is black.
A. With replacement the draws are independent, so
P(E1 ∩ E2 ∩ E3) = P(E1)P(E2)P(E3) = 3/9 * 3/9 * 3/9 = 0.0370
B. Without replacement each draw depends on the previous ones, so
P(E1 ∩ E2 ∩ E3) = P(E1)P(E2/E1)P(E3/E1 ∩ E2) = 3/9 * 2/8 * 1/7 = 0.0119
ii. A coin is tossed and a die is rolled. What is the probability of getting a head on the coin or
number 4 on the die?
Solution: Let A = getting a head on the coin, so P(A) = 1/2. Let B = getting number 4 on the
die, so P(B) = 1/6.
P(A ∪ B) = P(A) + P(B) - P(A ∩ B), but P(A ∩ B) = P(A) x P(B) because the coin and the die are independent.
P(A ∪ B) = 1/2 + 1/6 - 1/2 x 1/6 = 7/12 = 0.5833
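The two examples above can be reproduced with exact fractions. The following is a minimal sketch with illustrative variable names; it applies the multiplication rule to the urn (with and without replacement) and the addition rule to the coin and die.

```python
from fractions import Fraction

white, black = 6, 3
total = white + black

# With replacement: every draw has the same chance of black (independent draws).
p_with = Fraction(black, total) ** 3

# Without replacement: one fewer black ball and one fewer ball after each draw.
p_without = Fraction(1)
for i in range(3):
    p_without *= Fraction(black - i, total - i)

# Coin and die: independent events combined with the addition rule.
p_head, p_four = Fraction(1, 2), Fraction(1, 6)
p_head_or_four = p_head + p_four - p_head * p_four

print(p_with, float(p_with))
print(p_without, float(p_without))
print(p_head_or_four, float(p_head_or_four))
```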
Exercise
1. A, B and C are three mutually exclusive and exhaustive events. Find P(B) if
1/3 P(A) = 1/2 P(B) = P(C).
2. A part-time student is taking two courses, namely Economics and Statistics. The probability
that the student will pass the economics course is 0.60 and the probability of passing the
statistics course is 0.70. The probability that the student will pass both courses is 0.50. Find the
probability that the student
a. Will pass at least one course.
b. Will fail both courses.
3. A certain travel club has 1000 members. 60% of these members are males. 45% of these
members pay by credit card when they travel, including 175 females. If a member is selected
from the travel club at random, what is the probability that:
a. The member is a female.
b. The member is a female and pays in cash.
c. The member is a male or a credit card user.
d. The member pays cash if we know that the member is a female.
e. Are the sex of the member and the mode of payment statistically independent
events?
4. Two coins are tossed. What is the probability of getting two heads given that at least one coin
shows a head?
5. In a given hospital there are 50 females and 50 males. Among these, 5 males and 25 females
are colorblind. If a colorblind person is chosen at random, what is the probability that the
person is male?

5.6. Basic Concepts of Probability distributions


5.6.1. Definition of Random Variable and probability distributions
A random variable is a variable whose values are determined by chance, that is, with some
probability. It is denoted by a capital letter. The set consisting of all possible values of a random
variable X is called its range space (Rx). There are two common types of random variable.
Discrete random variable: the number of possible values of the random variable X (that is, Rx)
is finite or countably infinite.
Continuous random variable: the random variable assumes an uncountably infinite number
of possible values, which can be expressed with decimal points.

5.6.2. Probability Distribution

Probability Distribution is a listing of all possible values of a random variable together with
their corresponding probabilities. Based on the type of a random variable, a probability
distribution can be discrete or continuous.

5.6.2.1. Discrete Probability Distribution

With each possible value xi of a discrete random variable X, a number p(xi) = P(X = xi), called the
probability of xi, is associated. The numbers p(xi), i = 1, 2, ... must satisfy the following conditions:
0 ≤ p(xi) ≤ 1
∑ P(X = xi) = 1
The function p defined above is called the probability mass function (pmf) of the random variable
X. The collection of pairs (xi, p(xi)), i = 1, 2, ... is called the probability distribution of X.
Examples:
1. Construct a probability distribution for the number of heads observed in tossing a coin
two times.
2. Construct a probability distribution for the number of heads observed in tossing a coin
three times.
3. Construct a probability distribution for the number of girls if a family plans to have four
children.

Solutions:

1. S={HH, HT, TH, TT}

Let X be the number of heads observed in tossing a coin two times. Rx={0, 1, 2}

x          0     1     2     Total
P(X = x)   1/4   2/4   1/4   1

2. S={HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Let X be the number of heads observed in tossing a coin three times. Rx={0, 1, 2, 3}

x          0     1     2     3     Total
P(X = x)   1/8   3/8   3/8   1/8   1
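The same tables can be built by listing the sample space and counting, which also covers the four-children example. The sketch below assumes equally likely outcomes on every trial (a fair coin, and girls and boys equally likely); the function name `count_distribution` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def count_distribution(n_trials, outcomes=("H", "T"), target="H"):
    """Return {x: P(X = x)}, where X counts `target` in n equally likely trials."""
    sample_space = list(product(outcomes, repeat=n_trials))
    counts = Counter(outcome.count(target) for outcome in sample_space)
    return {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

two_tosses = count_distribution(2)
three_tosses = count_distribution(3)
# Assumption: each child is equally likely to be a girl (G) or a boy (B).
four_children = count_distribution(4, outcomes=("G", "B"), target="G")

for name, dist in [("2 tosses", two_tosses),
                   ("3 tosses", three_tosses),
                   ("4 children", four_children)]:
    print(name, {x: str(p) for x, p in dist.items()})
```

Each dictionary maps a value x in Rx to P(X = x), and the probabilities in each dictionary add up to 1.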

The Binomial Distribution


The binomial distribution is one of the simplest and most frequently used discrete probability
distributions and is very useful in many practical situations involving either/or types of events.
Properties of Binomial Experiment
1. Each trial has only two mutually exclusive outcomes or outcomes that can be reduced to
two. One of the outcomes is labeled as Success and the other as Failure.
2. The outcome of each trial is independent.
3. The probability of Success remains the same from trial to trial.
4. The experiment (trial) is performed for fixed number of times, say n.


Let X be the number of successes. Then X follows a binomial distribution with parameters n, the
number of trials performed, and p, the probability of success, written X ~ Bin(n, p). Then
the probability of getting exactly x successes in n trials is given by:
$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n,$$
where p is the probability of success,
q = 1 - p is the probability of failure,
n is the number of trials, and
x is the number of successes.
This is called the Binomial Distribution. The mean of a binomial distribution is E(X) = np and
the variance is V(X) = npq.
Examples:
1. Suppose a coin is tossed 10 times. What is the probability of getting
a) Exactly 3 heads
b) No head
c) At most 3 heads
d) At least 3 heads
e) More than 3 heads
Find the average and variance of the number of heads.
2. The probability of a man kicking the ball into the goal is 2/3. If he kicks 5 times, what is
the probability of scoring
a) At least one goal.
b) At most 3 goals.
Find the average, variance and standard deviation of the number of goals.
3. If the mean and variance of a binomial distribution are 4 and 2 respectively, find the
probability that:
A. Exactly two successes appear.
B. Less than two successes appear.
C. More than six successes appear.
D. At least two successes appear.
Solution:

Let X be the number of heads observed in tossing a fair coin 10 times, Rx = {0, 1, 2, ..., 10}

$$p = P(\text{Success}) = P(\text{Head}) = 1/2, \quad q = 1 - p = 1/2$$
$$X \sim \text{Bin}(n = 10, p = 0.50)$$
$$P(X = x) = \binom{n}{x} p^x q^{n-x} = \binom{10}{x} 0.5^x \, 0.5^{10-x} = \binom{10}{x} 0.5^{10}, \quad x = 0, 1, 2, \ldots, 10$$

a) $P(X = 3) = \binom{10}{3} \left(\frac{1}{2}\right)^{10} = \frac{120}{1024} \approx 0.1172$
b) $P(X = 0) = \binom{10}{0} \left(\frac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.0010$
c) $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \frac{176}{1024} \approx 0.1719$
d) $P(X \ge 3) = P(X = 3) + P(X = 4) + \cdots + P(X = 10) = 1 - P(X < 3) = 1 - \frac{56}{1024} \approx 0.9453$
e) $P(X > 3) = P(X = 4) + P(X = 5) + \cdots + P(X = 10) = 1 - P(X \le 3) = 1 - \frac{176}{1024} \approx 0.8281$
The average number of heads is E(X) = np = 10(0.5) = 5 and the variance is V(X) = npq = 10(0.5)(0.5) = 2.5.
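The binomial formula is easy to evaluate directly, which gives a way to check the coin example and to work the goal-kicking and mean-4/variance-2 examples. The sketch below uses illustrative helper names `binom_pmf` and `binom_cdf`; for the last example it assumes the parameters are recovered from np = 4 and npq = 2, giving q = 0.5, p = 0.5 and n = 8.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Coin example: 10 tosses of a fair coin.
n1, p1 = 10, 0.5
coin = {
    "exactly 3": binom_pmf(3, n1, p1),
    "no head": binom_pmf(0, n1, p1),
    "at most 3": binom_cdf(3, n1, p1),
    "at least 3": 1 - binom_cdf(2, n1, p1),
    "more than 3": 1 - binom_cdf(3, n1, p1),
    "mean": n1 * p1,
    "variance": n1 * p1 * (1 - p1),
}

# Goal example: 5 kicks, each scoring with probability 2/3.
n2, p2 = 5, 2 / 3
goals = {
    "at least one goal": 1 - binom_pmf(0, n2, p2),
    "at most 3 goals": binom_cdf(3, n2, p2),
    "mean": n2 * p2,
    "variance": n2 * p2 * (1 - p2),
    "std dev": sqrt(n2 * p2 * (1 - p2)),
}

# Mean 4, variance 2: q = 2/4 = 0.5, so p = 0.5 and n = 4/0.5 = 8.
n3, p3 = 8, 0.5
successes = {
    "exactly 2": binom_pmf(2, n3, p3),
    "less than 2": binom_cdf(1, n3, p3),
    "more than 6": 1 - binom_cdf(6, n3, p3),
    "at least 2": 1 - binom_cdf(1, n3, p3),
}

for label, results in [("coin", coin), ("goals", goals), ("successes", successes)]:
    print(label, {k: round(v, 4) for k, v in results.items()})
```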

The Poisson distribution


The Poisson distribution is a discrete probability distribution. It differs from the binomial
distribution in the sense that it is not possible to count the number of failures even though the
number of successes is known.
Properties of Poisson distribution:
1. The probability of success, p, is very small.
2. The experiment is performed indefinitely (n is very large).
3. The average number of events per unit of time (λ) is known.
Thus, the random variable X (number of successes) has a Poisson distribution with parameter λ,
written X ~ Poisson(λ), and the probability of getting x successes is given by
$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
where λ is the average number of events per unit of time.
If X is a Poisson random variable, then E(X) = λ and V(X) = λ.


Examples:
1. On average a typist commits 3 errors per page. Find the probability that she will make
a) No mistake on a page.
b) More than one mistake on a page.
2. Customers arrive at a photocopying machine at an average rate of two every 10 minutes.
What is the probability that there will be
a) No arrivals during any period of ten minutes.
b) Exactly one arrival during this time period.
c) More than two arrivals during this time period.
Solution:

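Let X be the number of errors committed on a page, λ = 3.

$$X \sim \text{Poisson}(3) \Rightarrow P(X = x) = \frac{3^x e^{-3}}{x!}$$
a) $P(X = 0) = \frac{3^0 e^{-3}}{0!} = e^{-3} \approx 0.0498$
b) $P(X > 1) = P(X = 2) + P(X = 3) + \cdots = 1 - P(X \le 1) = 1 - (e^{-3} + 3e^{-3}) = 1 - 4e^{-3} \approx 0.8009$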
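The Poisson probabilities in both examples can be evaluated with a few lines of code. This is a sketch with illustrative helper names `poisson_pmf` and `poisson_cdf`; λ is 3 errors per page for the typist and 2 arrivals per 10 minutes for the photocopier.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

# Typist: on average 3 errors per page.
typist = {
    "no mistake": poisson_pmf(0, 3),
    "more than one mistake": 1 - poisson_cdf(1, 3),
}

# Photocopier: on average 2 arrivals per 10 minutes.
copier = {
    "no arrivals": poisson_pmf(0, 2),
    "exactly one arrival": poisson_pmf(1, 2),
    "more than two arrivals": 1 - poisson_cdf(2, 2),
}

print({k: round(v, 4) for k, v in typist.items()})
print({k: round(v, 4) for k, v in copier.items()})
```

"More than" events are computed through the complement because the Poisson range is unbounded above.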
5.6.2.2. Continuous Probability Distribution
A continuous probability distribution is represented by a probability density function (pdf) f(x)
having the following characteristics. Suppose X is continuous on an interval [a, b]. Then
i. $f(x) \ge 0$ for all $x \in (a, b)$
ii. $\int_a^b f(x)\,dx = 1$
iii. $P(c \le X \le d) = \int_c^d f(x)\,dx$ for any $a \le c \le d \le b$; in particular $P(a \le X \le b) = 1$.

Examples:
1. Show that each of the following functions is a pdf.
a. $f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
b. $f(x) = \begin{cases} e^{-x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
2. Find the value of b for the following function to be a pdf.
$f(x) = \begin{cases} b x^2, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
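These examples can be checked numerically by approximating the integrals. The sketch below uses a simple midpoint rule; the helper name `integrate`, the step count and the finite upper limit of 50 that stands in for infinity are all assumptions made for illustration, not part of the course material.

```python
from math import exp

def integrate(f, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

# 1a. f(x) = 1 on [0, 1]: the total area should be 1.
area_uniform = integrate(lambda x: 1.0, 0.0, 1.0)

# 1b. f(x) = e^(-x) for x >= 0: the tail beyond 50 is negligible.
area_exponential = integrate(lambda x: exp(-x), 0.0, 50.0)

# 2. b times the integral of x^2 over [0, 1] must equal 1, so b = 1 / integral.
b = 1.0 / integrate(lambda x: x**2, 0.0, 1.0)

print(round(area_uniform, 4), round(area_exponential, 4), round(b, 4))
```

Analytically, the integral of x^2 over [0, 1] is 1/3, so b = 3; both functions in example 1 are non-negative and integrate to 1.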

Normal Distribution

The most often used continuous probability distribution is the normal distribution. This
distribution plays a very important role in statistical theory and practice, particularly in the area
of statistical inference and statistical quality control. Its importance is due to the fact that in
practice, the experimental results, very often seem to follow the normal distribution or bell
shaped curve.
A random variable X is said to have a normal distribution if its probability density function is
given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty, \; -\infty < \mu < \infty, \; \sigma > 0$$
where $\mu = E(X)$ and $\sigma^2 = \text{Variance}(X)$.
$\mu$ and $\sigma^2$ are the parameters of the normal distribution.

Properties of Normal Distribution:

1. It is bell shaped, symmetrical about its mean, and mesokurtic. The maximum
ordinate is at $x = \mu$ and is given by $f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$.
2. It is asymptotic to the horizontal axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a
different normal distribution. Thus, the normal distribution is completely described by two
parameters: mean and standard deviation.
5. The total area under the curve is 1, and by symmetry the area of the distribution on each
side of the mean is 0.5.
6. It is unimodal, i.e., values mound up only in the center of the curve, so that
mean = median = mode.


The probability that a random variable will have a value between any two points is equal to the
area under the curve between those points.

Note: To facilitate the use of the normal distribution, the following distribution, known as the
standard normal distribution, was derived by using the transformation
$$Z = \frac{X - \mu}{\sigma} \quad \Rightarrow \quad f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2}$$

Properties of the Standard Normal Distribution:

Same as a normal distribution, but also...

- Mean is zero
- Variance is one
- Standard Deviation is one
- The total area under the (standard) normal curve is 1. Hence, the area to the right and left
of the center value (µ=0) of the standard normal distribution is 0.5 (as it is symmetric
about 0).
Examples:

1. Find the area under the standard normal distribution which lies
a) Between Z = 0 and Z = 0.96

Solution:

Area = P(0 < Z < 0.96) = 0.3315

b) Between Z = -1.45 and Z = 0

Solution:

Area = P(-1.45 < Z < 0)
= P(0 < Z < 1.45)
= 0.4265

c) To the right of Z = -0.35

Solution:

Area = P(Z > -0.35)
= P(-0.35 < Z < 0) + P(Z > 0)
= P(0 < Z < 0.35) + P(Z > 0)
= 0.1368 + 0.50 = 0.6368

d) To the left of Z = -0.35

Solution:

Area = P(Z < -0.35)
= 1 - P(Z > -0.35)
= 1 - 0.6368 = 0.3632

e) Between Z = -0.67 and Z = 0.75

Solution:

Area = P(-0.67 < Z < 0.75)
= P(-0.67 < Z < 0) + P(0 < Z < 0.75)
= P(0 < Z < 0.67) + P(0 < Z < 0.75)
= 0.2486 + 0.2734 = 0.5220

f) Between Z = 0.25 and Z = 1.25

Solution:

Area = P(0.25 < Z < 1.25)
= P(0 < Z < 1.25) - P(0 < Z < 0.25)
= 0.3944 - 0.0987 = 0.2957

2. Find the value of Z if

a) The normal curve area between 0 and z (positive) is 0.4726

Solution

P(0 < Z < z) = 0.4726 and from the table
P(0 < Z < 1.92) = 0.4726
=> z = 1.92, by uniqueness of the area.

b) The area to the left of z is 0.9868

Solution

P(Z < z) = 0.9868
= P(Z < 0) + P(0 < Z < z)
= 0.50 + P(0 < Z < z)
=> P(0 < Z < z) = 0.9868 - 0.50 = 0.4868
and from the table
P(0 < Z < 2.22) = 0.4868
=> z = 2.22

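The table look-ups in examples 1 and 2 can be cross-checked in software. The sketch below relies on `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_between` is illustrative. Computed areas may differ from the printed table in the fourth decimal place because table entries are rounded before they are added or subtracted.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)   # standard normal distribution

def area_between(lo, hi):
    """P(lo < Z < hi) for the standard normal distribution."""
    return Z.cdf(hi) - Z.cdf(lo)

areas = {
    "a) 0 to 0.96": area_between(0.0, 0.96),
    "b) -1.45 to 0": area_between(-1.45, 0.0),
    "c) right of -0.35": 1.0 - Z.cdf(-0.35),
    "d) left of -0.35": Z.cdf(-0.35),
    "e) -0.67 to 0.75": area_between(-0.67, 0.75),
    "f) 0.25 to 1.25": area_between(0.25, 1.25),
}
for label, value in areas.items():
    print(label, round(value, 4))

# Reverse look-up: find z from a given area.
z_a = Z.inv_cdf(0.5 + 0.4726)   # area between 0 and z is 0.4726
z_b = Z.inv_cdf(0.9868)         # area to the left of z is 0.9868
print(round(z_a, 2), round(z_b, 2))
```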

3. A random variable X has a normal distribution with mean 80 and standard deviation 4.8.
What is the probability that it will take a value

a) Less than 87.2
b) Greater than 76.4
c) Between 81.2 and 86.0

Solution

X is normal with mean µ = 80 and standard deviation σ = 4.8.

a) P(X < 87.2) = P((X - µ)/σ < (87.2 - µ)/σ)
= P(Z < (87.2 - 80)/4.8)
= P(Z < 1.5)
= P(Z < 0) + P(0 < Z < 1.5)
= 0.50 + 0.4332 = 0.9332

b) P(X > 76.4) = P((X - µ)/σ > (76.4 - µ)/σ)
= P(Z > (76.4 - 80)/4.8)
= P(Z > -0.75)
= P(Z > 0) + P(0 < Z < 0.75)
= 0.50 + 0.2734 = 0.7734

c) P(81.2 < X < 86.0) = P((81.2 - µ)/σ < (X - µ)/σ < (86.0 - µ)/σ)
= P((81.2 - 80)/4.8 < Z < (86.0 - 80)/4.8)
= P(0.25 < Z < 1.25)
= P(0 < Z < 1.25) - P(0 < Z < 0.25)
= 0.3944 - 0.0987 = 0.2957


4. A normal distribution has mean 62.4. Find its standard deviation if 20.05% of the area under
the normal curve lies to the right of 72.9.

Solution

P(X > 72.9) = 0.2005 => P((X - µ)/σ > (72.9 - µ)/σ) = 0.2005
=> P(Z > (72.9 - 62.4)/σ) = P(Z > 10.5/σ) = 0.2005
=> P(0 < Z < 10.5/σ) = 0.50 - 0.2005 = 0.2995
And from the table P(0 < Z < 0.84) = 0.2995
=> 10.5/σ = 0.84 => σ = 12.5

5. A random variable has a normal distribution with σ = 5. Find its mean if the probability
that the random variable will assume a value less than 52.5 is 0.6915.

Solution

P(X < 52.5) = P(Z < z) = 0.6915, where z = (52.5 - µ)/5
=> P(0 < Z < z) = 0.6915 - 0.50 = 0.1915.
But from the table
P(0 < Z < 0.5) = 0.1915
=> z = (52.5 - µ)/5 = 0.5
=> µ = 50

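Examples 3 to 5 on the normal distribution can be verified the same way. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and example 4 is read as P(X > 72.9) = 0.2005, the value used in its solution.

```python
from statistics import NormalDist

# Example 3: X is normal with mean 80 and standard deviation 4.8.
X = NormalDist(mu=80, sigma=4.8)
p_less = X.cdf(87.2)                     # P(X < 87.2)
p_greater = 1 - X.cdf(76.4)              # P(X > 76.4)
p_between = X.cdf(86.0) - X.cdf(81.2)    # P(81.2 < X < 86.0)

Z = NormalDist()                         # standard normal

# Example 4: mean 62.4 and P(X > 72.9) = 0.2005, so (72.9 - 62.4) / sigma = z.
z4 = Z.inv_cdf(1 - 0.2005)
sigma = (72.9 - 62.4) / z4

# Example 5: sigma = 5 and P(X < 52.5) = 0.6915, so (52.5 - mu) / 5 = z.
z5 = Z.inv_cdf(0.6915)
mu = 52.5 - 5 * z5

print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
print(round(sigma, 1), round(mu, 1))
```

The inverse look-ups return z to more decimals than the printed table, so sigma and mu agree with the hand calculation only after rounding.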

CHAPTER SIX
6. Sampling Techniques
6.1.Basic concepts: population, sample, parameter, statistic, sampling frame, sampling unit
Population: It is the totality of the set of subjects or things possessing certain common
characteristics that we are interested in studying. It is a collection of all the units under
investigation over a given space or time. The population should be defined by the investigator
on the basis of the objective of the study. Examples: the total household population of a village
or country, the total number of plants in a field, the total number of patients with a certain disease.

A sample: consists of elements selected from a population with statistical methods for the
purpose of investigation and with the aim of estimating the characteristics of the population. It is
a subgroup or part of the population selected by some method in order to estimate population
characteristics.

Elementary unit (unit of analysis): an element or group of elements on which information is
required; it is the object that we observe or measure. Thus, persons, vehicles, households,
farms, animals, steel cables, etc. are examples of elementary units.

Sampling units: for the purpose of sample selection, the population is divided into a finite
number of distinct, non-overlapping and identifiable units called sampling units. For example, in
cluster sampling, clusters are sampling units and subjects in the cluster are elementary units.

Frame: once a population has been defined, the next step is to establish a means to access it. A
frame provides this means to access it. In its simplest form, a frame is a list of elements covering
the survey population, and serves as a base for sample selection.

Population Parameters: These are facts about the population. Since parameters are descriptions
of the population, a population can have many parameters.

Parameter: is a measure computed from all the observations in the population. Example:
population mean and population variance.

Statistic: it is a characteristic or a fact about a sample, that is, a descriptive measure computed
from sample observations. Sample statistics (plural of statistic) provide information about the
population. Example: sample mean and sample variance.

Sampling: is a statistical process in which one selects and examines a sample instead of
considering the whole population. In other words, it is a valid statistical procedure for drawing a
sample from the population.
A sampling frame is a list of units or elements that defines the target population.
In practice a sample can only be a collection of elements from sampling units drawn from
a sampling frame. Many times the sampling frame and the sampling unit are derived
from administrative data. The definition of the unit may be based on some natural criteria
in administrative data, e.g., households, persons, units of product, tickets, etc.

6.2. Reasons for Sampling


Why is Sampling Important for Researchers?

In many cases sampling is the only way to determine something about the population. Some of
the major reasons why sampling is necessary are:

The destructive nature of certain tests.

The physical impossibility of checking all items in the population.

The cost of studying all the items in a population is often prohibitive.

The adequacy of sample results.

To contact the whole population would often be time consuming.

6.3. Types of Sampling Techniques


What constitutes an appropriate sample depends upon the research question(s), the research
objectives, the researcher's understanding of the phenomenon under study and practical constraints.
These considerations will influence whether the researcher chooses to employ probabilistic or
non-probabilistic sampling techniques. Probabilistic sampling techniques are employed to
generate a formal or statistically representative sample. They are often used for quantitative
research when the researcher has a well-defined population to draw a sample from.
On the other hand, a non-probabilistic sampling technique is the method of choice when the
members of the population are not equally relevant and some participants are more desirable in
advancing the research project's objectives. Non-probability sampling techniques are often the
approach of choice for qualitative research. Because the researcher seeks a strategically chosen
sample, generalizability is more of a theoretical or conceptual issue, and it is not possible to
generalize statistically back to the population.
6.3.1. Non-probability sampling: Basic concepts and definitions
Nonprobability sampling refers to sampling techniques for which a person's (or event's or
researcher's focus) likelihood of being selected for membership in the sample is unknown.
Any sampling plan in which this likelihood cannot be specified is called 'non-probability sampling'.
Small-scale surveys commonly employ non-probability samples. They are usually less
complicated to set up and are acceptable when there is no intention or need to make a
statistical generalization to any population beyond the sample surveyed. Non-probability
sampling is well suited for exploratory research intended to generate new ideas that will be
systematically tested later. However, if the goal is to learn about a large population, it is
imperative to avoid judgment-based, non-probabilistic samples in survey research. Examples of
non-probability sampling are: convenience sampling, quota sampling, purposive sampling
and snowball sampling, etc.
i. Quota sampling: Here the strategy is to obtain representatives of the various elements of a
population, usually in the relative proportions in which they occur in the population. Quota
sampling resembles stratified sampling, except that the units within each stratum are not
selected at random. According to this method, the population
is first divided into different strata. Then the number to be selected from each stratum is
decided. This number is known as the quota.
ii. Purposive sampling: In purposive sampling, sampling is done with a purpose in mind. We
usually would have one or more specific predefined groups we are seeking. The principle
of selection in purposive sampling is the researcher's judgment as to typicality or interest.
A sample is built up which enables the researcher to satisfy his/her specific needs in a
research project. Accordingly, when the researcher deliberately or purposively selects
certain units for study from the population it is known as purposive selection. In this type
of selection the choice of the selector is supreme and nothing is left to chance.
iii. Convenience sampling: In many research contexts, we sample simply by asking for
volunteers. It involves choosing the nearest and most convenient persons to act as
respondents. The process continues until the required sample size is reached. It is
sometimes used as a cheap and dirty way of doing a sample survey, but you do not know
whether or not the findings are representative. This is probably one of the most widely used
and least satisfactory methods of sampling. According to this system, a sample is selected
according to the convenience of the field workers or researchers. The convenience may be in
respect of availability of a source list and accessibility of the units. It is used when the universe
or population is not clearly defined, the sampling unit is not clear or a complete source list is
not available.
iv. Snowball sampling: Here the researcher identifies one or more individuals from the
population of interest. After they have been interviewed, they are used as informants to
identify other members of the population, who are themselves used as informants, and so
on. Snowball sampling is useful when there is difficulty in identifying members of the
population, e.g. when it is a clandestine group.
6.3.2. Probability Sampling: Basic Concepts and Definitions
Probability sampling refers to sampling techniques for which a person's (or event's) likelihood
of being selected for membership in the sample is known. This matters because, in most cases,
researchers who use probability sampling techniques are aiming to identify a representative
sample from which to collect data. A representative sample is one that resembles the population
from which it was drawn in all the ways that are important for the research being conducted.
Obtaining a representative sample is important in probability sampling because a key goal of
studies that rely on probability samples is generalizability. In order to achieve generalizability, a
core principle of probability sampling is that all elements in the researcher's target population
have an equal chance of being selected for inclusion in the study. In research, this is the principle
of random selection. Random selection is a mathematical process that must meet two criteria.
The first criterion is that chance governs the selection process. The second is that every sampling
element has an equal probability of being selected.
There are a variety of probability sampling techniques that researchers may use.
Simple random samples are the most basic type of probability sample, but their use is not
particularly common. Part of the reason for this may be the work involved in generating a simple
random sample. To draw a simple random sample, a researcher starts with a list of every single
member, or element, of his or her population of interest. This list is sometimes referred to as
a sampling frame. Once that list has been created, the researcher numbers each element
sequentially and then randomly selects the elements from which he or she will collect data. To
randomly select elements, researchers use a table of numbers that have been generated
randomly.
If properly conducted, this gives each person an equal chance of being included in the sample,
and also makes all possible combinations of persons for a particular sample size equally likely.
So, random sampling is the form applied when the method of selection assures each element or
individual in the population an equal chance of being chosen. It is more suitable for more
homogeneous and comparatively larger groups. A random sample can be drawn either by the
lottery method or by using a random number table. If the population is small we can easily choose a
simple random sample (SRS) by the lottery method: the units to be included in the sample are
chosen by a lottery. This lottery method can only be used if the population is not very large. If we
have a large population we can perform the same procedure using a computer or a table of random numbers.

Lottery method
List the N individual elements of a finite population, and then take a random sample by choosing
the elements to be included in the sample one at a time without replacement, making sure that in
each of the successive drawings each of the remaining elements of the population has the same
chance of being selected. For instance, to take a random sample of size 12 from a population of
N = 247 we could write each of the 247 figures on a separate slip of paper, mix them thoroughly in a
bag, a box, or a hat and draw (without looking) twelve slips one at a time without replacement.
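On a computer, the lottery method amounts to drawing labels without replacement. The sketch below uses the standard-library `random` module with the sizes from the example (N = 247, n = 12); the seed value is a hypothetical choice made only so that the run can be repeated.

```python
import random

N = 247   # population size from the example
n = 12    # sample size from the example

population = list(range(1, N + 1))   # labels 1..N, one per "slip of paper"

rng = random.Random(2023)            # hypothetical seed, only for repeatability
sample = rng.sample(population, n)   # draws n distinct labels without replacement

print(sorted(sample))
```

`sample` never returns the same label twice, which mirrors drawing slips without putting them back in the bag.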
Table of random numbers

Random number tables are constructed in such a way that every number occurs with equal
chance. Further, the occurrence of any one number in a position is independent of any of the
other numbers that appear in the table. To use a table of random numbers: number N elements in
the population from 1 to N. Then turn to the table of random numbers and select a starting
number in the table. Proceeding from this number either across the row or down the column,
select and record n numbers that are less than or equal to N, from the table. The numbers in the
table may have many digits. But, we consider the first m digits, where m is the number of digits
in N.

Example: The money section of USA Today gives the 1,900 most active New York Stock
Exchange issues. The random numbers in the table below can be used to randomly select 10 of
these issues. Imagine that the issues are numbered from 0001 to 1900. Suppose we randomly
decide to start in row 1 and columns 21 through 24. The four-digit number located here is 0345.
Reading down these four columns and discarding any number exceeding 1900, we obtain the
following eight random numbers between 0001 and 1900: 0345, 1304, 0990, 1580, 1461, 1064,
0676, and 0347. To obtain our other two numbers, we proceed to row 1 and columns 26 through
29. Reading down this column, we find 1149 and 1074. To obtain the 10 stock issues, we read
down the columns and select the ones located in positions 345, 347,676,990, 1064, 1074, 1149,
1304, 1461, and 1580.
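The table-reading procedure can be sketched as a small function. Everything here is illustrative: the function name `select_from_table` is made up, the column of entries is hypothetical (the accepted values match the example, while the discarded values above 1900 are invented), and skipping repeated labels is an assumption consistent with sampling without replacement.

```python
def select_from_table(table_entries, population_size, sample_size):
    """Read random-number-table entries in order and keep the first
    `sample_size` distinct labels between 1 and `population_size`."""
    m = len(str(population_size))      # number of digits in N
    chosen = []
    for entry in table_entries:
        label = int(entry[:m])         # use only the first m digits
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Hypothetical column of four-digit table entries, read top to bottom.
column = ["0345", "7269", "1304", "0990", "5403", "1580", "1461",
          "9835", "1064", "0676", "0347", "2345", "1149", "8291", "1074"]

picked = select_from_table(column, population_size=1900, sample_size=10)
print(sorted(picked))
```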


Systematic random Sampling

Stratified sampling
This involves dividing the population into a number of groups or strata, and a sample is selected
from each stratum. The elements in a stratum are supposed to be homogeneous with respect to a
given characteristic, but to differ in that characteristic from the elements in the other strata.


After the population has been divided into strata, either a proportional or non-proportional
sample can be selected. As the name implies, a proportional sampling procedure requires that the
number of items sampled from each stratum be in the same proportion as found in the population.
In a non-proportional stratified sample, the number of items studied in each stratum is
disproportionate to the respective numbers in the population. We then weight the sample results
according to the stratum's proportion of the total population.

Example: suppose you want to take a sample of 200 learners from a college called
CAES to study their performance. Suppose, further, that there are six departments with
the respective number of learners as shown in Table below.

Table: Number of learners by department in CAES College

Department               Total number of learners
Agricultural Economics   96
Animal science           51
Plant science            180
Dry land science         150
ABVM                     81
RDAI                     42
TOTAL                    600

If stratified random sampling with proportional allocation is to be used for data
collection, determine the sample size to be taken from each department.
Solution:
Let N = the total number of elements in the population, all the strata taken together
Ni = population size in stratum i
n = total sample size required for the study

The sample size in stratum i, ni, is given by:

$$n_i = \left(\frac{N_i}{N}\right) n$$

Accordingly, the sample size for Agricultural Economics is n1 = (96/600) x 200 = 32,
Animal science n2 = (51/600) x 200 = 17, Plant science n3 = (180/600) x 200 = 60,
Dry land science n4 = (150/600) x 200 = 50, ABVM n5 = (81/600) x 200 = 27 and
RDAI n6 = (42/600) x 200 = 14.

The sample sizes sum to 200, the required total. In this case every allocation is a whole number;
when an allocation has decimal places, it is rounded up to the next integer, to get the benefit of
the added sample size.
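Proportional allocation is a one-line calculation per stratum. The sketch below uses the department sizes from the table and rounds any fractional allocation up to the next integer; the variable names are illustrative.

```python
from fractions import Fraction
from math import ceil

# Department sizes from the table (N = 600 in total).
strata = {
    "Agricultural Economics": 96,
    "Animal science": 51,
    "Plant science": 180,
    "Dry land science": 150,
    "ABVM": 81,
    "RDAI": 42,
}
n = 200                       # total sample size required
N = sum(strata.values())      # total population size

# n_i = (N_i / N) * n, rounded up when it is not a whole number.
# Exact fractions avoid floating-point error before rounding.
allocation = {dept: ceil(Fraction(Ni * n, N)) for dept, Ni in strata.items()}

for dept, ni in allocation.items():
    print(dept, ni)
print("Total", sum(allocation.values()))
```

Because rounding is always upward, the allocations can add up to slightly more than n when some of them are fractional; in this example they add up to exactly 200.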

Cluster sampling
If the total area of interest happens to be a big one, a convenient way to take a sample is to
divide the area into a number of smaller non-overlapping areas and then randomly select a
number of these smaller areas (usually called clusters), with the ultimate sample consisting of
all (or samples of) units in the selected clusters. Thus, in cluster sampling the total population
is divided into a number of relatively small subdivisions, which are themselves clusters of still
smaller units, and some of these clusters are randomly selected for inclusion in the overall
sample. Suppose we want to estimate the proportion of defective machine parts in an inventory.
Assume that there are 20,000 machine parts in the inventory at a given point in time, stored in
400 cases of 50 each. Using cluster sampling, we would treat the 400 cases as clusters, randomly
select n cases, and examine all the machine parts in each selected case.
Cluster sampling reduces cost by concentrating the survey in the selected clusters, but it is
generally less precise than simple random sampling: n observations within a cluster usually carry
less information than n randomly drawn observations, because units in the same cluster tend to
be alike. Cluster sampling is used because of its economic advantage; estimates based on cluster
samples are usually more reliable per unit cost.
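
The machine-parts example can be sketched in Python as follows. The inventory here is simulated: the 5% defect rate, the random seed, the choice of n = 20 cases, and all function and variable names are assumptions made only for illustration. Only the 400 cases of 50 parts come from the text.

```python
import random

NUM_CASES = 400       # clusters (cases) in the inventory, from the text
PARTS_PER_CASE = 50   # machine parts per case, from the text


def cluster_sample_defect_rate(inventory, n_cases, rng):
    """Pick n_cases whole cases at random and examine every part in them."""
    chosen = rng.sample(range(len(inventory)), n_cases)
    examined = [part for case in chosen for part in inventory[case]]
    # True counts as 1, so the mean of the flags is the defect proportion.
    return sum(examined) / len(examined), chosen


# Hypothetical inventory: True marks a defective part.
rng = random.Random(2023)
inventory = [
    [rng.random() < 0.05 for _ in range(PARTS_PER_CASE)]
    for _ in range(NUM_CASES)
]

n_cases = 20
estimate, chosen_cases = cluster_sample_defect_rate(inventory, n_cases, rng)
print(f"Examined {n_cases * PARTS_PER_CASE} parts in {n_cases} cases")
print(f"Estimated proportion defective: {estimate:.3f}")
```

Only the selected cases need to be opened, which is where the cost saving of cluster sampling comes from.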


CHAPTER SEVEN
7. SIMPLE LINEAR REGRESSION AND CORRELATION
7.1. Simple Linear Regression
Regression may be defined as the estimation of the unknown value of one variable from the
known values of one or more other variables. The variable whose values are to be estimated is
known as the dependent or explained variable, while the variables used to determine the value of
the dependent variable are called independent or predictor variables. A regression study that
involves only two variables is called simple regression, and one that involves more than two
variables is called multiple regression. If the relationship between the two variables can be
described by a straight line, the regression is known as linear regression; otherwise it is called
non-linear. Regression analysis involving only two variables that have a linear relationship is
called Simple Linear Regression. This linear relationship between the two variables is
represented by a straight line.
Regression Line (Line of Regression): the line that gives the best estimate of one variable for
any given value of another variable. The regression line used to estimate the values of Y for any
given value of X is called the regression line of Y on X.
Regression Equation: a mathematical equation that defines the relationship between two
variables.
Regression of Y on X
Model: Y = α + βX + ε
Where Y is the dependent variable
X is the independent variable
α is the intercept
β is the slope
ε is the error term
Its parameters are interpreted as follows:
• α is the value of the dependent variable when the value of the independent variable is zero.
• β is the change in the value of the dependent variable when the value of the independent
variable increases by 1 unit. There is a direct linear relationship between the two variables if
β is positive, an indirect (inverse) linear relationship if β is negative, and no linear
relationship if β is zero.


7.1.1. Method of Estimation


The objective in the above model is to estimate the regression parameters (α and β) using the
sample data. The most common and widely used method of estimation is Ordinary Least
Squares (OLS), which minimizes the error sum of squares. The estimated regression model is,
therefore,

$\hat{Y} = \hat{\alpha} + \hat{\beta} X$

where
$\hat{Y}$ = estimated value of the dependent variable
X = actual value of the independent variable
$\hat{\alpha}$ = estimated intercept
$\hat{\beta}$ = estimated slope

The estimates of the parameters are obtained as:

$\hat{\beta} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$ and $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$
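
The two OLS formulas translate directly into code. The sketch below is illustrative only: `ols_fit` is a hypothetical helper name, and the four data points are made up so that they lie exactly on the line Y = 1 + 2X, which makes the result easy to check by eye.

```python
def ols_fit(x, y):
    """Return (intercept, slope) from the closed-form OLS formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)

    # slope = (n*SUM(XY) - SUM(X)*SUM(Y)) / (n*SUM(X^2) - (SUM(X))^2)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # intercept = mean(Y) - slope * mean(X)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Hypothetical points lying exactly on Y = 1 + 2X.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

alpha_hat, beta_hat = ols_fit(x, y)
print(f"Estimated line: Y = {alpha_hat:.3f} + {beta_hat:.3f} X")
```

With real data the points scatter around the line, and the same function returns the line that minimizes the error sum of squares.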

Use of Regression analysis


Regression analysis has great practical utility in almost all scientific disciplines. Its
applications extend to all the natural, physical and social sciences. Some of its specific uses
are listed below.
1. It helps in the formulation and determination of functional relationships: It is used for
establishing a functional relationship between two or more variables. This functional
relationship is the basis of various scientific investigations; e.g., various theories in business
and economics have been formulated using such models.
2. It helps in establishing the cause and effect relationship of the variables: In regression,
one variable is treated as dependent and the other as independent, and as such it is possible
to analyze the cause and effect relationship.
3. It helps in prediction and estimation: The basic aim of regression analysis is the
determination of the functional relationship between the variables. This relationship is
obviously very helpful in making predictions and estimates. Predictions of future production,
price, sales, profit, and population are necessary for efficient planning and management.
Regression analysis is widely used in the development of demand, supply, cost, and
consumption curves, which are the basis of economic analysis.


7.2. Correlation Analysis


Most variables in economics and business are related to one another: for example, price and
supply, income and expenditure, advertising expenditure and sales. Correlation analysis is used
to determine the degree and direction of such a relationship between variables.
Correlation is a statistical tool designed to measure the degree of the relationship (degree
of association) between variables. If a change in one variable is accompanied by a change in the
other variable, the variables are correlated. Correlation that involves only two variables is
called simple correlation.
Covariance: a measure of the joint variation between two variables, i.e. it measures the way in
which the values of the two variables vary together. If the covariance is zero, there is no linear
relationship between the two variables. If it is negative, there is an indirect linear relationship
between them. If the covariance is positive, there is a direct linear relationship between the
variables. The sample covariance between two variables is defined as:

$S_{xy} = \frac{1}{n-1}\left[\sum XY - \frac{\sum X \sum Y}{n}\right]$

Pearson’s Coefficient of Correlation (r)
The coefficient of correlation is a measure of the degree or strength of the linear association
between two variables. It is defined as the ratio of the covariance between the two variables to
the product of their standard deviations. The sample correlation coefficient is denoted by r and
the population correlation coefficient by ρ.

$r = \frac{S_{xy}}{S_x S_y} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$

The value of r always lies between -1 and 1.
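
As a check on these definitions, the following Python sketch computes the sample covariance in two ways (from deviations about the means and from the shortcut formula with raw sums) and then forms r as covariance divided by the product of the standard deviations. It uses the age and blood pressure data from the worked example in Section 7.2; the function names are illustrative.

```python
from math import sqrt

# Age (X) and blood pressure (Y) from the worked example in this chapter.
age = [43, 48, 56, 61, 67, 70]
bp = [128, 120, 135, 143, 141, 152]


def covariance_from_deviations(x, y):
    """Sample covariance from deviations about the two means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def covariance_shortcut(x, y):
    """Sample covariance from raw sums: [SUM(XY) - SUM(X)SUM(Y)/n] / (n-1)."""
    n = len(x)
    sum_xy = sum(a * b for a, b in zip(x, y))
    return (sum_xy - sum(x) * sum(y) / n) / (n - 1)


def pearson_r(x, y):
    """r = S_xy / (S_x * S_y); the variance of x is the covariance of x with x."""
    s_xy = covariance_shortcut(x, y)
    s_x = sqrt(covariance_shortcut(x, x))
    s_y = sqrt(covariance_shortcut(y, y))
    return s_xy / (s_x * s_y)


s_xy = covariance_shortcut(age, bp)
r = pearson_r(age, bp)
print(f"S_xy = {s_xy:.1f}, r = {r:.3f}")
```

Both covariance functions give the same value, which is why the shortcut formula is convenient for hand calculation from column totals.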


Spearman’s Rank Correlation


Example
The ranks of 10 students in two courses, Statistics and Economics, are given below.
Calculate the rank correlation and interpret it.
Statistics    5   2   9   8   1  10   3   4   6   7
Economics    10   5   1   3   8   6   2   7   9   4
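
For ranks without ties, Spearman's rank correlation is commonly computed as r_s = 1 - 6∑d²/(n(n² - 1)), where d is the difference between the two ranks of each student. The sketch below applies this standard formula to the ranks above; the function and variable names are illustrative.

```python
# Ranks of the 10 students, copied from the example.
statistics_rank = [5, 2, 9, 8, 1, 10, 3, 4, 6, 7]
economics_rank = [10, 5, 1, 3, 8, 6, 2, 7, 9, 4]


def spearman_rank_correlation(rank_x, rank_y):
    """Spearman's r_s for untied ranks: 1 - 6*SUM(d^2) / (n*(n^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


sum_d2 = sum((a - b) ** 2 for a, b in zip(statistics_rank, economics_rank))
r_s = spearman_rank_correlation(statistics_rank, economics_rank)
print(f"Sum of squared rank differences = {sum_d2}")
print(f"Spearman rank correlation = {r_s:.3f}")
```

For these ranks ∑d² = 216, so r_s = 1 - 6(216)/(10 × 99) ≈ -0.31, which indicates a weak negative association between the two rankings.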
Interpretation of r: The value of the correlation coefficient can be positive, zero or negative,
depending on the sign of the covariance between the two variables, but it always lies between
-1 and +1; that is, -1 ≤ r ≤ 1.
• If the value of r is -1 or +1, there is a perfect negative or perfect positive linear
relationship between the variables, respectively.
• If the value of r is close to -1 or +1, there is a strong negative or strong positive
linear relationship between the variables, respectively.
• If r is approximately -0.5 or 0.5, there is a moderate negative or moderate positive
linear relationship between the variables, respectively.
• If the value of r is near zero, there is little or no linear relationship between the two
variables.
Properties of Correlation Analysis
1. It doesn't describe the cause and effect relationship.
2. It is not used for prediction and estimation.
3. It is used to study the degree or extent of the relationship between the variables.
Coefficient of determination (r²)
So far, we were concerned with the problem of estimating the parameters of the regression model
and the correlation coefficient between two variables. We now consider the goodness of fit of
the estimated model to a set of data; that is, we shall find out how "well" the estimated model
fits the data.
The coefficient of determination tells how well the estimated model fits the data. For simple
linear regression (the two-variable case), it is defined as the square of the sample correlation
coefficient and is denoted by r². Hence r² measures the proportion or percentage of the variation
in the dependent variable explained by the independent variable. r² is a nonnegative quantity
that lies between 0 and 1, i.e., 0 ≤ r² ≤ 1. A value close to 1 indicates a good fit, and a value
close to 0 indicates that the model explains almost none of the variation in the dependent
variable.
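
The "proportion of variation explained" reading of r² can be illustrated numerically. In the sketch below the five data points are hypothetical, chosen only to keep the arithmetic small. The code fits the least-squares line, measures the unexplained variation (the error sum of squares), and compares 1 - SSE/SST with the square of the correlation coefficient.

```python
from math import sqrt

# Hypothetical illustrative data, not taken from the text.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Sums of squared deviations and cross products about the means.
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)  # total variation in y (SST)
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Least-squares line and the variation it leaves unexplained (SSE).
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

r = s_xy / sqrt(s_xx * s_yy)
explained_share = 1 - sse / s_yy

print(f"r^2 = {r ** 2:.3f}, 1 - SSE/SST = {explained_share:.3f}")
```

The two quantities agree (both equal 0.6 for these points), so here the fitted line accounts for 60% of the variation in y.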
Example:
The following data were obtained in a study of age and blood pressure on six randomly selected
people.
Age              43   48   56   61   67   70
Blood pressure  128  120  135  143  141  152
A. Fit the regression line of blood pressure on age.
B. By how much does blood pressure change per unit change in age?
C. Compute the correlation coefficient of blood pressure and age and the coefficient of
determination, and interpret them.
D. Predict (estimate) the blood pressure of a person whose age is 80.
E. Interpret the regression coefficients.
Solution:
Since blood pressure depends on the age of individuals, blood pressure is the dependent variable
(Y) and age is the independent variable (X).
Age (X)    B.P (Y)    Xi²        Yi²         Xi·Yi
43         128        1,849      16,384      5,504
48         120        2,304      14,400      5,760
56         135        3,136      18,225      7,560
61         143        3,721      20,449      8,723
67         141        4,489      19,881      9,447
70         152        4,900      23,104      10,640
∑ = 345    ∑ = 819    ∑ = 20,399 ∑ = 112,443 ∑ = 47,634
The summarized data are:
n = 6, $\sum x_i = 345$, $\sum y_i = 819$, $\sum x_i^2 = 20{,}399$, $\sum y_i^2 = 112{,}443$,
$\sum x_i y_i = 47{,}634$ and $\sum x_i \sum y_i = 282{,}555$
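A. The regression line is $\hat{Y} = \hat{B}_0 + \hat{B}_1 X$, where $\hat{B}_0$ is the estimated intercept ($\hat{\alpha}$) and
$\hat{B}_1$ is the estimated slope ($\hat{\beta}$).

$\hat{B}_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2} = \frac{6(47{,}634) - 282{,}555}{6(20{,}399) - (345)^2} = \frac{3{,}249}{3{,}369} = 0.964$

$\hat{B}_0 = \bar{Y} - \hat{B}_1 \bar{X} = \frac{819}{6} - 0.964\left(\frac{345}{6}\right) = 81.048$ (using the unrounded slope 3,249/3,369)

Therefore, $\hat{Y} = 81.048 + 0.964 X_i$ is the fitted line of B.P on age.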

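B. For each one-unit increase in age, blood pressure increases by 0.964 units on average.

C. $r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - (\sum x_i)^2\right]\left[n\sum y_i^2 - (\sum y_i)^2\right]}}$

$r = \frac{6(47{,}634) - 282{,}555}{\sqrt{\left[6(20{,}399) - (345)^2\right]\left[6(112{,}443) - (819)^2\right]}} = \frac{3{,}249}{\sqrt{(3{,}369)(3{,}897)}}$

r = 0.897. This value indicates that there is a strong positive linear relationship between age
and blood pressure of individuals. The coefficient of determination is r² = (0.897)² ≈ 0.80, so
about 80% of the variation in blood pressure is explained by age.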
Another method of finding the correlation coefficient is

$r = \frac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}$

where r is the correlation coefficient,
$S_{XY}$ is the covariance between x and y,
$S_{XX}$ is the variance of x,
$S_{YY}$ is the variance of y.
We can also determine the regression parameters using the above information:

$\hat{B}_1 = \frac{S_{XY}}{S_{XX}}$ and then $\hat{B}_0 = \bar{Y} - \hat{B}_1 \bar{X}$
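
The whole solution can be checked from the column totals alone. The Python sketch below recomputes the slope, intercept, correlation coefficient and coefficient of determination from the totals in the table, and also evaluates the fitted line at age 80 (part D). Variable names are illustrative.

```python
from math import sqrt

# Column totals from the age / blood pressure table.
n = 6
sum_x, sum_y = 345, 819
sum_x2, sum_y2, sum_xy = 20_399, 112_443, 47_634

# Building blocks shared by the slope and correlation formulas.
numerator = n * sum_xy - sum_x * sum_y   # n*SUM(XY) - SUM(X)*SUM(Y)
denom_x = n * sum_x2 - sum_x ** 2        # n*SUM(X^2) - (SUM(X))^2
denom_y = n * sum_y2 - sum_y ** 2        # n*SUM(Y^2) - (SUM(Y))^2

slope = numerator / denom_x
intercept = sum_y / n - slope * sum_x / n
r = numerator / sqrt(denom_x * denom_y)
r_squared = r ** 2

# Part D: point prediction from the fitted line at age 80.
predicted_bp_at_80 = intercept + slope * 80

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"Predicted blood pressure at age 80 = {predicted_bp_at_80:.1f}")
```

The prediction at age 80 comes out at about 158.2. Since 80 lies outside the observed age range of 43 to 70, it is an extrapolation and should be read with caution.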

Exercise
1. Given the following data on supply (X) and sales (Y) of a certain commodity:
Supply (X)   60   62   65   70   73   75   71
Sales (Y)    10   11   13   15   16   19   14
a) Estimate the regression equation of sales on supply and interpret the coefficients.


b) Calculate the correlation coefficient between supply and sales, and interpret it.
c) Find the coefficient of determination and interpret it.
d) Predict the amount of sales of the commodity if the supply amount is 80.
2. The following summary results were obtained from the price and demand of a commodity:
∑price = 30, ∑demand = 40, ∑(price)(demand) = 214,
∑(price)² = 220, ∑(demand)² = 340, n = 5
a) Identify the dependent and independent variable.
b) Estimate the regression equation.
c) Interpret the estimated coefficients.
d) Calculate the correlation coefficient between price and demand, and interpret it.
e) Find the coefficient of determination and interpret it.
3. Given n = 25, $\bar{X}$ = 3.95, $\bar{Y}$ = 2.03, $S_x^2$ = 85.35, $S_y^2$ = 98.75, $S_{xy}$ = 90
a) Fit the regression equation of Y on X.
b) Interpret the estimated coefficients.
c) Calculate the correlation coefficient and interpret it.
d) Find the coefficient of determination and interpret it.
