Overview of Biostatistics

The document provides an introduction to biostatistics, covering its definition, history, and classification into descriptive and inferential statistics. It outlines the stages of statistical investigation, key terms, and the applications and limitations of statistics in various fields, particularly health sciences. Additionally, it discusses types of variables, measurement scales, sampling distributions, and the normal distribution in relation to statistical analysis.

By

Hawawu Hussein (PhD)

Presented by: ABDUL-RAHAMAN Mohammed

Tamale Technical University


Introduction to Biostatistics
Course objective:

► Definition of statistics
► Definition and classification of biostatistics
► History of biostatistics
► Stages in statistical investigation
► Definition of some basic terms
► Applications, uses and limitations of statistics
► Population Vs Sample
► Types of variables
► Measurement scales
► Sampling distribution
► Normal distribution
Definition Of Statistics

“Statistics is the science of making effective use of numerical data
relating to groups of individuals or experiments.” (Wikipedia,
http://en.wikipedia.org/wiki/Statistics)

“Statistics is the art and science of learning from data” (Agresti &
Franklin, 2009, p. 5).

Statistics: A field of study concerned with the collection,
organization and summarization of data, and the drawing of
inferences about a body of data when only part of the data are
observed.

Statistics is the body of methodology concerned with the art and
science of gathering, analyzing and using data to identify and solve
problems, and to make decisions.

Statistics & Decision Making

• Sound decisions involve the collection of pertinent data and the
application of appropriate statistical techniques for extracting the
information contained in the data.

• Applied Statistics: the application of statistical methods to
solve real problems involving randomly generated data, and the
development of new statistical methodology motivated by real
problems.

• Note: statistical methods should not replace critical thinking and
common sense.

History of Biostatistics
Biostatistics is the branch of applied statistics directed
toward applications in the health sciences and
biology.

Biostatistics: The tools of statistics are employed in many fields
(business, education, psychology, agriculture, and economics, to
mention only a few). When the data being analyzed are derived from
public health, the biological sciences and medicine, we use the term
biostatistics to distinguish this particular application of
statistical tools and concepts.

Francis Galton (1822-1911) has been called the ‘Father of
Biostatistics’.
• He created the statistical concept of ‘correlation’.
• He was the first to use statistical tools to study differences
among human populations.
• He invented the use of questionnaires and surveys for data
collection in human communities.
Definition and classification of Biostatistics
Definition of Biostatistics
Biostatistics is a growing field with applications in many
areas of biology including epidemiology, medical
sciences, health sciences, educational research and
environmental sciences.
Biostatistics: The application of statistical methods to
biological phenomena.
Classification of Biostatistics
Descriptive statistics: Statistical methods concerned
with the collection, organization, summarization, and
analysis of data from a sample of a population.
Inferential statistics: Statistical methods concerned
with drawing conclusions (inferring) about a particular
population by selecting and measuring a random sample
from that population.
Descriptive Statistics
• Statistical procedures used to summarise, organise, and simplify
data. This should be carried out in a way that reflects the overall
findings.
► Raw data is made more manageable
► Raw data is presented in a logical form
► Patterns can be seen from organized data

Some statistical summaries which are especially common in
descriptive analyses are:
► Measures of central tendency
► Measures of dispersion
► Measures of association
► Cross-tabulation, contingency table
► Histogram
► Quantile, Q-Q plot
► Scatter plot
► Box plot
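
As a rough illustration of the numerical summaries listed above, the following Python sketch computes measures of central tendency and dispersion with the standard `statistics` module. The blood pressure readings are made-up values chosen only for the example; they do not come from the slides.

```python
import statistics

# Hypothetical diastolic blood pressure readings (mmHg) for ten patients.
dbp = [70, 72, 75, 78, 80, 80, 82, 85, 88, 90]

# Measures of central tendency
mean_dbp = statistics.mean(dbp)
median_dbp = statistics.median(dbp)
mode_dbp = statistics.mode(dbp)

# Measures of dispersion
data_range = max(dbp) - min(dbp)
sample_variance = statistics.variance(dbp)  # sample variance (divides by n - 1)
sample_sd = statistics.stdev(dbp)           # square root of the sample variance

# Quartiles: three cut points that split the ordered data into four parts
q1, q2, q3 = statistics.quantiles(dbp, n=4)

print("Mean:", mean_dbp, "Median:", median_dbp, "Mode:", mode_dbp)
print("Range:", data_range)
print("Variance:", round(sample_variance, 2), "SD:", round(sample_sd, 2))
print("Quartiles:", q1, q2, q3)
```

The graphical summaries (histogram, scatter plot, box plot) would need a plotting library and are left out of this sketch.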
Inferential Statistics
• This branch of statistics deals with techniques for drawing
conclusions about the population.
• Inferential statistics builds upon descriptive statistics.
• The inferences are drawn from particular properties of the
sample to particular properties of the population.
• Inferential statistics are used to make generalizations
from a sample to a population.
• It encompasses a variety of procedures to ensure that
the inferences are sound and rational, even though they
may not always be correct.
In short, inferential statistics enables us to make confident
decisions in the face of uncertainty.
Examples of such inferences:
► Antibiotics reduce the duration of viral throat infections by 1-2
days
► Five percent of women aged 30-49 consult their GP each
year with heavy menstrual bleeding
Statistical Methods
Stages in statistical investigation
There are five stages or steps in any statistical investigation
1. Collection of data
► The process of obtaining measurements or counts.
2. Organization of data
► Includes editing, classifying, and tabulating the data
collected
3. Presentation of data
► Gives an overall view of what the data actually look like
► Facilitates further statistical analysis
► Can be done in the form of tables, graphs or diagrams
4. Analysis of data
► To dig out useful information for decision making
► It involves extracting relevant information from the
data (like mean, median, mode, range,
variance. . . )
5. Interpretation of data
► Concerned with drawing conclusions from the data
collected and analyzed; and giving meaning to
Definition of Some Basic Terms
Population: The complete set of possible measurements
for which inferences are to be made.

Census: A complete enumeration of the population. In
most real problems this cannot be achieved, hence we take
a sample.

Sample: A sample from a population is the set of
measurements that are actually collected in the course
of an investigation.

Parameter: A characteristic or measure obtained from a
population.

Statistic: A numerical quantity computed from sample
data (e.g. the mean, the median, the maximum...).

Data: Refers to a collection of facts, values,

Definitions of Some Basic Terms
Statistics: A branch of mathematics dealing with data
collection, organization, analysis, interpretation and
presentation.

Sampling: The process or method of selecting a sample
from the population.

Sample Size: The number of elements or observations to be
included in the sample.

Variable: An item of interest that can take on many
different values.
Some examples of variables include:
► Diastolic blood pressure
► Heart rate
► Heights
► Weights
► Stage of bladder cancer patients
Applications, Uses and Limitations of statistics
Applications of Statistics

Health Information Managers
► Evaluating healthcare data quality and completeness.
► Analyzing trends in patient health outcomes for
decision-making.
► Assessing the efficiency of electronic health
record (EHR) systems.
Medical Laboratory Scientists
► Designing and analyzing clinical trials for
diagnostic tests.
► Evaluating the accuracy and precision of
laboratory methods.
► Studying the prevalence and distribution of
diseases.
Nutrition and Dietetics
► Analyzing dietary patterns and their impact on
health outcomes.
► Designing nutritional interventions for
populations.
► Studying the relationship between nutrient intake
and chronic diseases.
Public Health
► Monitoring and controlling disease outbreaks.
► Evaluating public health interventions and
programs.
Uses of Statistics
The main function of statistics is to enlarge our knowledge
of complex phenomena. The following are some uses of
statistics:
i. It presents facts in a definite and precise form.
ii. Data reduction.
iii. Measuring the magnitude of variations in data.
iv. Furnishes a technique of comparison.
v. Estimating unknown population characteristics.
vi. Formulating and testing hypotheses.
vii. Studying the relationship between two or more
variables.
viii. Forecasting future events.
Limitations of statistics
As a science, statistics has its own limitations. The
following are some of them:
I. Deals with only quantitative information.
II. Deals with only aggregates of facts and not with
individual data items.
III. Statistical data are only approximately, not
mathematically, correct.
IV. Statistics can be easily misused and therefore
should be used by experts.
Population Vs Sample
• Population: A complete set of items or subjects
which can be studied.
► Target population: A collection of items that have
something in common and about which we wish to draw
conclusions at a particular time.
► Study population: The specific population from which
data are collected.
• Sample: A subset of the study population (a smaller
part of that population).
Generalizability is a two-stage procedure: we want to generalize
conclusions from the sample to the study population and then
from the study population to the target population.
Example
In a study of the prevalence of malaria among
secondary students in Ghana, a random sample of
secondary students in Sagnarigu was taken.
Target population: All secondary students in Ghana
Study population: All secondary students in Sagnarigu
Sample: The secondary students in Sagnarigu who were selected

Sample ⊂ Study population ⊂ Target population
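
The difference between a population parameter and a sample statistic can be sketched in Python as follows. The population size (5,000 students), the 20% prevalence and the sample size (200) are invented for illustration and are not taken from the malaria example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical study population: malaria status (1 = infected, 0 = not)
# for 5,000 students, with a true prevalence of 20%.
study_population = [1] * 1000 + [0] * 4000

# Parameter: computed from the whole population (usually unknown in practice).
population_prevalence = statistics.mean(study_population)

# Statistic: computed from a simple random sample of 200 students,
# drawn without replacement.
sample = random.sample(study_population, k=200)
sample_prevalence = statistics.mean(sample)

print(f"Population prevalence (parameter): {population_prevalence:.3f}")
print(f"Sample prevalence (statistic):     {sample_prevalence:.3f}")
```

The sample prevalence will usually be close to, but not exactly equal to, the population prevalence. That gap is what inferential statistics tries to quantify.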
Variable
A variable is a characteristic or attribute that can
assume different values in different persons, places, or
things.

Example:
► Age,
► Diastolic blood pressure,
► Heart rate,
► The height of adult males,
► The weights of preschool children,
► Gender of Biostatistics students,
► Marital status of instructors at University of Gondar,
► Ethnic group of patients
Types of Variables
A. Qualitative (Categorical) variable
► A variable or characteristic which cannot be
measured in quantitative form but can only be
identified by name or category, for instance place of
birth, ethnic group, type of drug, stages of breast
cancer (I, II, III, or IV), degree of pain (minimal,
moderate, severe or unbearable).
► The categories should be clear cut, not overlapping,
and cover all the possibilities. For example, sex
(male or female), vital status (alive or dead), disease
stage (depends on disease), ever smoked (yes or no).
Types of Variables
B. Quantitative (Numerical) variable:
► is one that can be measured and expressed numerically.
Example:
► survival time
► systolic blood pressure
► height, age, body mass index
► number of children in a family
They can be of two types:
1. Discrete variables
► Have a set of possible values that is either finite or countably
infinite.
► The values of a discrete variable are usually whole numbers.
► Numerical discrete data occur when the observations are
integers that correspond with a count of some sort.
2. Continuous variables
► A continuous variable has a set of possible values including all
values in an interval of the real line.
► No gaps between possible values.
► Each observation theoretically falls somewhere along a
continuum.
Types of variables
Examples of discrete variables
► Number of pregnancies,
► The number of bacteria colonies on a plate,
► The number of cells within a prescribed area upon
microscopic examination,
► The number of heart beats within a specified time
interval,
► A mother’s history of numbers of births
( parity) and pregnancies (gravidity),
► The number of episodes of illness a patient
experiences during some time period, etc.
Types of Variables
Examples of Continuous variables
► Body mass index
► Height
► Blood pressure
► Serum cholesterol level
► Weight
► Age, etc.
Observations are not restricted to certain numerical
values; they are often measurements (e.g., height,
weight, age).
Continuous data are used to report a
measurement of the individual that can take on
any value within an acceptable range.
Scales of Measurement
There are four types of measurement scales:
1. Nominal Scales of Measurement
► Only “naming” and classifying observations is
possible. When numbers are assigned to
categories, it is only for coding purposes and it
does not provide a sense of size.
Example:
► Sex of a person (M, F)
► Eye color (e.g. brown, blue)
► Religion (Muslim, Christian)
► Place of residence (urban, rural), etc.
Scale of Measurement
2. Ordinal Scales of Measurement
► Categorization and ranking (ordering) of observations is
possible
► We can talk of greater than or less than, and the order
conveys meaning, but
► It is impossible to express the real difference between
measurements in numerical terms

Example:
► Socio-economic status (very low, low, medium, high,
very high)
► Severity (mild, moderate, severe)
► Blood pressure (very low, low, high, very high), etc.
Scale of Measurement
3. Interval Scales of Measurement
► Possible to categorize, rank and tell the real distance
between any two measurements
► Differences between measurements are meaningful
► Zero is not absolute
Example:
► Body temperature measured in degrees Fahrenheit or
Celsius
Scale of Measurement
4. Ratio Scales of Measurement
► The highest level of measurement scale, characterized
by the fact that equality of ratios as well as equality
of intervals can be determined
► There is a true zero point, i.e. zero is absolute
Example:
► Volume
► Height
► Weight
► Length
► Time until death, etc.
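
One way to see why ratios are meaningful on a ratio scale but not on an interval scale is to change the unit of measurement and check what survives. The Python sketch below does this for temperature (interval) and weight (ratio). The specific values and the helper function names are illustrative choices, not part of the slides.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def kg_to_lb(kg):
    """Convert a weight from kilograms to pounds (approximate factor)."""
    return kg * 2.20462


# Interval scale: temperature. Zero is arbitrary, so ratios depend on the unit.
t1_c, t2_c, t3_c = 10.0, 20.0, 30.0
ratio_c = t2_c / t1_c
ratio_f = celsius_to_fahrenheit(t2_c) / celsius_to_fahrenheit(t1_c)

# Equality of intervals, however, is preserved in both units.
equal_steps_c = (t2_c - t1_c) == (t3_c - t2_c)
equal_steps_f = (
    celsius_to_fahrenheit(t2_c) - celsius_to_fahrenheit(t1_c)
    == celsius_to_fahrenheit(t3_c) - celsius_to_fahrenheit(t2_c)
)

# Ratio scale: weight. Zero is absolute, so ratios do not depend on the unit.
w1_kg, w2_kg = 30.0, 60.0
ratio_kg = w2_kg / w1_kg
ratio_lb = kg_to_lb(w2_kg) / kg_to_lb(w1_kg)

print("Temperature ratio in Celsius:   ", ratio_c)
print("Temperature ratio in Fahrenheit:", ratio_f)
print("Equal intervals (C, F):", equal_steps_c, equal_steps_f)
print("Weight ratio in kg:", ratio_kg)
print("Weight ratio in lb:", ratio_lb)
```

So 20 °C is not "twice as hot" as 10 °C, whereas 60 kg is twice as heavy as 30 kg in any unit.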
Diagram of Variables and Scales of Measurement
[Figure: primary scales of measurement]
Exercise 1
The following is a list of different attributes, variables or data.
Classify each one into the appropriate measurement scale.
1. Your checking account number as a name for your
account.
2. Your score on a Biostatistics test as a measure of
your knowledge of Biostatistics.
3. A response to the statement “Abortion is a woman’s
right”, where “Strongly Disagree” = 1, “Disagree” = 2,
“No Opinion” = 3, “Agree” = 4, and “Strongly Agree” = 5,
as a measure of attitude toward abortion.
4. Times for swimmers to complete a 50-meter race.
5. Months of the year as September, October...
6. Economic status of a family when classified as low, middle
and upper classes.
7. Blood type of individuals as A, B, AB and O.
8. Regions of Ethiopia as region 1, region 2, region 3...
Assignment One
Categorize the following variables into nominal, ordinal, interval
or ratio
► Gender
► Grade (A, B, C, D and F)
► Rating scale (poor, good, excellent)
► Eye color
► Political affiliation
► Religious affiliation
► Ranking of tennis players
► Major field
► Nationality
► Height
► Weight
► Time
► Age
► IQ
► Temperature
► Salary
SAMPLING DISTRIBUTION
The sampling distribution of a sample statistic calculated
from a sample of n measurements is the probability
distribution of the values taken by the statistic in all
possible samples of size n taken from the same
population.
In other words, it is the distribution of the statistic
over all the different samples of size n that could be
drawn from the population.
As more samples are taken, the average (i.e. mean) of
the sample means tends towards the mean of the
population.
For large n, the sampling distribution of the mean is
also approximately normal (Central Limit Theorem).
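
For a very small population, the sampling distribution of the mean can be written out exactly by listing every possible sample. The Python sketch below assumes a made-up population of four values and samples of size 2 drawn with replacement. Both choices are only for illustration.

```python
from collections import Counter
from itertools import product
import statistics

population = [2, 4, 6, 8]  # hypothetical tiny population
n = 2                      # sample size

# Every possible ordered sample of size n, drawn with replacement (4**2 = 16).
samples = list(product(population, repeat=n))
sample_means = [statistics.mean(s) for s in samples]

# Sampling distribution: each distinct sample mean and its probability.
counts = Counter(sample_means)
sampling_distribution = {
    mean: count / len(samples) for mean, count in sorted(counts.items())
}

population_mean = statistics.mean(population)
mean_of_sample_means = statistics.mean(sample_means)

for mean, probability in sampling_distribution.items():
    print(f"sample mean = {mean}: probability = {probability:.4f}")
print("Population mean:     ", population_mean)
print("Mean of sample means:", mean_of_sample_means)
```

In this enumeration the mean of all possible sample means equals the population mean, and the middle values of the sample mean are the most probable.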
Normal Distribution
The normal curve is a symmetrical distribution of
scores with an equal number of scores above and
below the midpoint of the abscissa (the horizontal
axis, or ‘x-axis’, for the curve).

Since the distribution of scores is symmetric, the
mean, median, and mode are all at the same point
on the abscissa. In other words, the mean = the
median = the mode.

If we divide the distribution up into standard
deviation units, a known proportion of scores lies
within each portion under the curve.
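
The "known proportions" can be checked numerically. This Python sketch uses `statistics.NormalDist` from the standard library to compute the share of a normal distribution lying within 1, 2 and 3 standard deviations of the mean. The helper name `proportion_within` is just an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k} SD of the mean: {proportion_within(k):.4f}")
```

These are the familiar figures of roughly 68%, 95% and 99.7%. They are the same for every normal distribution, whatever its mean and standard deviation.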
Normal Distribution (Central Limit Theorem)
“If repeated (simple random) samples of size N are drawn from a
normally distributed population, the means of such samples will
be normally distributed with mean μ and standard error [i.e.
standard deviation] σ/√N ... if the N of each sample drawn is large,
then regardless of the shape of the population distribution the
sample means will tend to distribute themselves normally with
mean μ and standard error σ/√N”
The distribution of sample means approximates a normal
distribution as the sample size gets larger, regardless of the
population's distribution.
Sample sizes equal to or greater than 30 are often considered
sufficient for the CLT to hold.
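
A small simulation can make the theorem more concrete. The Python sketch below assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), draws many samples of size 30, and compares the spread of the sample means with the standard error given by the theorem. The population, the sample size and the number of repetitions are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

N = 30          # size of each sample
REPEATS = 5000  # number of samples drawn


def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, SD 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])


means = [sample_mean(N) for _ in range(REPEATS)]

observed_mean = statistics.mean(means)  # should be close to the population mean, 1
observed_se = statistics.stdev(means)   # spread of the sample means
theoretical_se = 1.0 / math.sqrt(N)     # population SD divided by sqrt(N)

print(f"Mean of sample means:     {observed_mean:.3f}")
print(f"SD of sample means:       {observed_se:.3f}")
print(f"Theoretical standard error: {theoretical_se:.3f}")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.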

Assumptions
► The data must be sampled at random.
► The samples should be unrelated to one another.
► One sample should not influence the others.
Deductions
In the long run:
► The mean of our sample means is likely to end up as the
population mean.
► A large sample size eventually leads to an approximately
normal distribution of the sample means.
► The variation in the sample means, known as the standard
error, is (more or less) σ/√N, where σ is the population
standard deviation and N is the sample size.