Overview of Biostatistics
► Definition of statistics
► Definition and classification of
biostatistics
► History of biostatistics
► Stages in statistical investigation
► Definition of Some Basic terms
► Applications, uses and limitations of
Statistics
► Population Vs Sample
► Types of variables
► Measurement scales
► Sampling distribution
► Normal distribution
Definition Of Statistics
“Statistics is the art and science of learning from data” (Agresti &
Hall, 2009, p. 5).
Statistics & Decision Making
History of Biostatistics
Biostatistics is the branch of applied statistics directed
toward applications in the health sciences and
biology.
[Figure: sample, study population, and target population]
Variable
A variable is a characteristic or attribute that can
assume different values in different persons, places, or
things.
Example:
► Age,
► Diastolic blood pressure,
► Heart rate,
► The height of adult males,
► The weights of preschool children,
► Gender of Biostatistics students,
► Marital status of instructors at University of Gondar,
► Ethnic group of patients
Types of Variables
A. Qualitative (Categorical) variable
► A variable or characteristic which cannot be
measured in quantitative form but can only be
identified by name or categories,
► for instance place of birth, ethnic group, type of drug,
stages of breast cancer (I, II, III, or IV), degree of pain
(minimal, moderate, severe or unbearable).
► The categories should be clear cut, not overlapping,
and cover all the possibilities. For example, sex
(male or female), vital status (alive or dead), disease
stage (depends on disease), ever smoked (yes or no).
Types of Variables
B. Quantitative (Numerical) variable:
► is one that can be measured and expressed numerically.
Example:
► survival time
► systolic blood pressure
► height, age, body mass index
► number of children in a family
They can be of two types:
1. Discrete Variables
► Have a set of possible values that is either finite or countably
infinite.
► The values of a discrete variable are usually whole numbers.
► Numerical discrete data occur when the observations are
integers that correspond with a count of some sort
2. Continuous Variables
► A continuous variable has a set of possible values including all
values in an interval of the real line.
► No gaps between possible values.
► Each observation theoretically falls somewhere along a
continuum
Types of variables
Examples of discrete variables
► Number of pregnancies,
► The number of bacteria colonies on a plate,
► The number of cells within a prescribed area upon
microscopic examination,
► The number of heart beats within a specified time
interval,
► A mother’s history of number of births
(parity) and pregnancies (gravidity),
► The number of episodes of illness a patient
experiences during some time period, etc.
Types of Variables
Examples of Continuous variables
► Body mass index
► Height
► Blood pressure
► Serum cholesterol level
► Weight,
► Age etc...
Observations are not restricted to particular numerical
values; they are often measurements (e.g.,
height, weight, age).
Continuous data report a measurement on an
individual that can take on any value within an
acceptable range.
Scales of Measurement
There are four types of measurement scales:
1. Nominal scales of measurement
2. Ordinal scales of measurement
Example:
► Socio-economic status (very low, low, medium, high,
very high)
► severity (mild, moderate, severe)
► blood pressure (very low, low, high, very high) etc.
Scale of Measurement
3. Interval Scales of Measurement
► Possible to categorize, rank and tell the real distance
between any two measurements
► Zero is not absolute
Example:
► Body temperature measured in degrees Fahrenheit or
Celsius.
► The difference between two readings is meaningful.
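A minimal sketch of why "zero is not absolute" matters for an interval scale. The three temperature readings are hypothetical values chosen for illustration. Ratios of readings change when the unit changes, while ratios of differences do not.

```python
import math


def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


# Hypothetical temperature readings in degrees Celsius (illustration only).
a_c, b_c, c_c = 40.0, 38.0, 36.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Ratio of two readings: depends on where the scale puts its zero.
ratio_c = a_c / c_c
ratio_f = a_f / c_f

# Ratio of two differences: the arbitrary zero cancels out.
diff_ratio_c = (a_c - b_c) / (b_c - c_c)
diff_ratio_f = (a_f - b_f) / (b_f - c_f)

print("ratio of readings, Celsius vs Fahrenheit:", ratio_c, ratio_f)
print("ratio of differences, Celsius vs Fahrenheit:", diff_ratio_c, diff_ratio_f)
print("difference ratios agree:", math.isclose(diff_ratio_c, diff_ratio_f))
```

On a scale with a true zero, such as height in metres versus centimetres, the ratio of two readings would stay the same after a change of unit.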
Scale of Measurement
4. Ratio Scales of Measurement
► volume
► height
► weight
► length
► time until death, etc...
Diagram of Variables and Scale of Measurement
[Figure: primary scales of measurement]
Exercise 1
The following is a list of different attributes, variables or data.
Classify each one into its measurement scale.
1. Your checking account number as a name for your
account.
2. Your score on a Biostatistics test as a measure of
your knowledge of Biostatistics.
3. A response to the statement “Abortion is a woman’s
right” where “Strongly Disagree” = 1, “Disagree” = 2,
“No Opinion” = 3, “Agree” = 4, and “Strongly Agree” = 5, as a
measure of attitude toward abortion.
4. Times for swimmers to complete a 50-meter race
5. Months of the year as September, October. . .
6. Economic status of a family when classified as low, middle
and upper classes.
7. Blood type of individuals as A, B, AB and O.
8. Regions of Ethiopia as region 1, region 2, region 3. . .
Assignment One
Categorize the following variables into nominal, ordinal, interval
or ratio
► Gender
► Grade (A, B, C, D and F)
► Rating scale (poor, good, excellent)
► Eye color
► Political affiliation
► Religious affiliation
► Ranking of tennis players
► Major field
► Nationality
► Height
► Weight
► Time
► Age
► IQ
► Temperature
► Salary
SAMPLING DISTRIBUTION
The sampling distribution of a statistic calculated
from a sample of n measurements is the probability
distribution of the values taken by that statistic over all
possible samples of size n drawn from the same
population.
In other words, it describes how the statistic varies
across the different samples that could be drawn
from the population.
As more samples are taken, the average (i.e.
mean) of the sample means tends toward the
mean of the population.
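The definition can be illustrated on a very small population, where every possible sample can be listed. This is only a sketch: the population values and the sample size are hypothetical, and samples are assumed to be drawn without replacement, ignoring order.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population of five measurements (illustration only).
population = [2, 4, 6, 8, 10]
n = 2  # sample size

# Every possible sample of size n (without replacement, order ignored).
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

# Sampling distribution of the mean: each possible value of the sample
# mean paired with the proportion of samples that produce it.
counts = Counter(sample_means)
sampling_distribution = {
    value: counts[value] / len(samples) for value in sorted(counts)
}

print("number of possible samples:", len(samples))
print("sampling distribution of the mean:", sampling_distribution)
print("mean of the sample means:", mean(sample_means))
print("population mean:", mean(population))
```

Because all possible samples are included here, the mean of the sample means matches the population mean exactly. With only some of the samples, it would be close to it rather than equal.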
For sufficiently large samples, the sampling distribution
of the mean is approximately normal (Central Limit Theorem).
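A rough simulation sketch of the Central Limit Theorem. The choices here are assumptions made for illustration: the population is a skewed exponential distribution with mean 1, each sample has 30 observations, and 2000 samples are drawn with a fixed seed. If the sample means are roughly normal, about two thirds of them should fall within one standard deviation of their centre.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable


def draw_sample_means(n, n_samples):
    """Draw n_samples random samples of size n from a skewed population
    (exponential, population mean 1) and return the mean of each sample."""
    return [
        mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]


sample_means = draw_sample_means(n=30, n_samples=2000)

centre = mean(sample_means)   # should be close to the population mean, 1
spread = stdev(sample_means)  # variation of the sample means

# Share of sample means lying within one standard deviation of the centre.
within_one_sd = sum(abs(m - centre) <= spread for m in sample_means)
share_within_one_sd = within_one_sd / len(sample_means)

print("mean of sample means:", round(centre, 3))
print("spread of sample means:", round(spread, 3))
print("share within one SD of the centre:", round(share_within_one_sd, 3))
```

Re-running with a smaller n (for example n=2) leaves the sample means visibly skewed, which is why the theorem is stated for large samples.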
Normal Distribution
The normal curve is a symmetrical distribution of
scores with an equal number of scores above and
below the midpoint of the abscissa (the horizontal
axis, or ‘x-axis’, for the curve).
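The symmetry of the normal curve can be checked numerically. This is a small sketch using Python's standard library `statistics.NormalDist`; the mean of 100 and standard deviation of 15 are hypothetical values, not taken from these notes.

```python
import math
from statistics import NormalDist

# Hypothetical normal curve of scores (mean and SD chosen for illustration).
curve = NormalDist(mu=100, sigma=15)

# Proportion of scores below the midpoint of the curve.
share_below_midpoint = curve.cdf(curve.mean)

# Height of the curve at equal distances on either side of the midpoint.
distance = 10
height_left = curve.pdf(curve.mean - distance)
height_right = curve.pdf(curve.mean + distance)

print("share of scores below the midpoint:", share_below_midpoint)
print("curve height left and right of the midpoint:", height_left, height_right)
print("symmetric:", math.isclose(height_left, height_right))
```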
Assumptions
The samples must be drawn at random.
The samples should be unrelated to one another.
One sample should not impact the others.
Deductions
In the long run:
The mean of our sample means is likely to end up equal
to the population mean.
With large samples, the distribution of the sample means
approaches a normal distribution.
And we know that the variation in the sample means,
known as the standard error, is (more or less)