Intro to Biostat (1)

The document provides an introduction to biostatistics, covering its definition, applications, and the roles of statisticians in data collection and analysis. It distinguishes between descriptive and inferential statistics, outlines the stages of statistical investigation, and defines key terms such as population, sample, and variable. Additionally, it explains different measurement scales and types of variables, emphasizing the importance of proper statistical methods in biological research.

Uploaded by

mianumer4t7
Copyright
© © All Rights Reserved

Introduction to Biostatistics

Course objective:

• Definition and classification of Statistics
• Definition of some basic terms
• Stages in statistical investigation
• Applications, uses and limitations of Statistics
• Types of variables and measurement scales
Biostatistics is a growing field with
applications in many areas of biology
including epidemiology, medical
sciences, health sciences,
educational research and
environmental sciences.

Biostatistics

The application of statistics to a wide range of topics in biology.
Role of statisticians

• To guide the design of an experiment or survey prior to data collection
• To analyze data using proper statistical procedures and techniques
• To present and interpret the results to researchers and other decision makers
Classification of Biostatistics

Descriptive statistics: A statistical method that is concerned with the collection, organization, summarization, and analysis of data from a sample of a population.

Inferential statistics: A statistical method that is concerned with drawing conclusions about a particular population by selecting and measuring a random sample from that population.
Descriptive Statistics

Statistical procedures used to summarize, organize, and simplify data. This process should be carried out in such a way that it reflects the overall findings.
• Raw data is made more manageable
• Raw data is presented in a logical form
• Patterns can be seen from organized data
Inferential statistics

• This branch of statistics deals with techniques of making conclusions about the population.
• Inferential statistics builds upon descriptive statistics.
• The inferences are drawn from particular properties of a sample to particular properties of the population.
• Inferential statistics are used to make generalizations from a sample to a population.
• They encompass a variety of procedures to ensure that the inferences are sound and rational, even though they may not always be correct.
In short, inferential statistics enables us to make confident decisions in the face of uncertainty.

E.g.
• Antibiotics reduce the duration of viral throat infections by 1-2 days
• Five percent of women aged 30-49 consult their GP each year with heavy menstrual bleeding
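
A statement such as "five percent of women consult their GP" is a generalization from a sample to a population. The sketch below shows one common way such an estimate could be reported with its uncertainty: a sample proportion with an approximate 95% confidence interval (normal approximation). The sample counts and the function name `proportion_ci` are hypothetical, invented for illustration; they do not come from the slides.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Estimate a population proportion from a sample.

    Returns (estimate, lower, upper) using the normal-approximation
    interval; z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n                      # sample proportion (a statistic)
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
    return p_hat, p_hat - z * se, p_hat + z * se

# Hypothetical sample: 50 of 1000 surveyed women consulted their GP.
p_hat, lower, upper = proportion_ci(successes=50, n=1000)
print(f"estimate = {p_hat:.3f}, approx. 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval expresses the point made above: the inference may not be exactly correct, but the procedure quantifies how far off it is likely to be.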
Statistical Methods
Definition of Some basic terms
Population: The complete set of possible measurements for which inferences are to be made.

Sample: A sample from a population is the set of measurements that are actually collected in the course of an investigation.

Parameter: Characteristic or measure obtained from a population.

Statistic: A statistic refers to a numerical quantity computed from sample data (e.g. the mean, the median, the maximum...).

Data: Refers to a collection of facts, values, observations, or measurements that the variables can assume.
Statistics: A branch of mathematics dealing with data collection, organization, analysis, interpretation and presentation.
Sampling: The process or method of selecting a sample from the population.
Sample Size: The number of elements or observations to be included in the sample.
Variable: A variable is any characteristic, number, or quantity that can be measured or counted. It is an item of interest that can take on many different values, numerical or categorical.
Some examples of variables include:
• Diastolic blood pressure
• Heart rate
• Heights
• Weights
• Stage of bladder cancer patients
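
The terms defined above can be tied together in a small simulation. This is only a sketch: the population of diastolic blood pressure values is randomly generated (mean 80, standard deviation 10 are arbitrary assumptions), and all variable names are illustrative.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Population: every measurement we want to make inferences about.
# Hypothetical diastolic blood pressure values (mmHg), simulated.
population = [random.gauss(80, 10) for _ in range(10_000)]

# Parameter: a measure obtained from the whole population.
parameter = statistics.mean(population)

# Sampling: selecting a simple random sample from the population.
sample = random.sample(population, 50)
sample_size = len(sample)

# Statistic: a numerical quantity computed from the sample data.
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic), n={sample_size}: {statistic:.2f}")
```

In practice the parameter is usually unknown; it can be computed here only because the whole population was simulated.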
Stages in statistical investigation

There are five stages or steps in any statistical investigation.

1. Collection of data
• The process of obtaining measurements or counts.
2. Organization of data
• Includes editing, classifying, and tabulating the data collected
3. Presentation of data
• Overall view of what the data actually looks like
• Facilitate further statistical analysis
• Can be done in the form of tables and graphs or diagrams
4. Analysis of data
• To dig out useful information for decision
making
• It involves extracting relevant information
from the data (like mean, median, mode,
range, variance, ...)
5. Interpretation of data
• Concerned with drawing conclusions from the
data collected and analyzed; and giving
meaning to analysis results
• A difficult task and requires a high degree of
skill and experience
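
The analysis stage described above (mean, median, mode, range, variance) can be sketched with Python's standard `statistics` module. The heart-rate values are hypothetical, made up for illustration.

```python
import statistics

# Hypothetical resting heart rates (beats per minute) for eight people.
heart_rates = [72, 68, 75, 72, 80, 66, 72, 78]

summary = {
    "mean": statistics.mean(heart_rates),
    "median": statistics.median(heart_rates),
    "mode": statistics.mode(heart_rates),
    "range": max(heart_rates) - min(heart_rates),
    # statistics.variance is the sample variance (divides by n - 1).
    "variance": statistics.variance(heart_rates),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Interpretation, the fifth stage, is not something the code does: deciding what these numbers mean for the question at hand still requires judgment.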
Variable

• If, as we observe a characteristic, we find that it takes on different values in different persons, places, or things, we label the characteristic a variable.
• We do this for the simple reason that the characteristic is not the same when observed in different possessors of it.

1. Quantitative Variables
2. Qualitative Variables
3. Random Variable
   1. Discrete Random Variable
   2. Continuous Random Variable
Quantitative Variables
• A quantitative variable is one that can be measured in the usual sense.
• We can, for example, obtain measurements on the heights of adult males, the weights of preschool children, and the ages of patients seen in a clinic.
• These are examples of quantitative variables.
• Measurements made on quantitative variables convey information regarding amount.
Qualitative Variables
Some characteristics are not capable of being measured in the sense that height, weight, and age are measured.

Many characteristics can be categorized only, as, for example, when an ill person is given a medical diagnosis, a
person is designated as belonging to an ethnic group, or a person, place, or object is said to possess or not to
possess some characteristic of interest.

In such cases measuring consists of categorizing.

We refer to variables of this kind as qualitative variables.

Measurements made on qualitative variables convey information regarding attribute.


Random Variable
Whenever we determine the height, weight, or age of an individual, the result is frequently referred to as a value of the respective variable.

When the values obtained arise as a result of chance factors, so that they cannot be exactly
predicted in advance, the variable is called a random variable.

An example of a random variable is adult height.

When a child is born, we cannot predict exactly his or her height at maturity.

Attained adult height is the result of numerous genetic and environmental factors.

Values resulting from measurement procedures are often referred to as observations or measurements.
Discrete Random Variable
Variables may be characterized further as to whether they are discrete or continuous.

A discrete variable is characterized by gaps or interruptions in the values that it can assume.

These gaps or interruptions indicate the absence of values between particular values that
the variable can assume.

The number of daily admissions to a general hospital is a discrete random variable since the
number of admissions each day must be represented by a whole number, such as 0, 1, 2, or
3.

The number of admissions on a given day cannot be a number such as 1.5, 2.997, or 3.333.

Continuous Random Variable
• A continuous random variable does not possess the gaps or interruptions characteristic of a discrete random variable.
• A continuous random variable can assume any value within a specified relevant interval of values assumed by the variable.
• Examples of continuous variables include the various measurements that can be made on individuals such as height, weight, and skull circumference.

No matter how close together the observed heights of two people,
for example, we can, theoretically, find another person whose
height falls somewhere in between.

Because of the limitations of available measuring instruments, however, observations on variables that are inherently continuous are recorded as if they were discrete.

Height, for example, is usually recorded to the nearest one-quarter, one-half, or whole inch, whereas, with a perfect measuring device, such a measurement could be made as precise as desired.
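
The point about recording continuous values as if they were discrete can be illustrated with a short sketch. The function `record_height` and the two "true" heights are hypothetical; the code simply rounds a value to the nearest multiple of a chosen resolution.

```python
def record_height(true_height_in, resolution=0.25):
    """Round a height (inches) to the nearest multiple of `resolution`."""
    return round(true_height_in / resolution) * resolution

# Two hypothetical people whose true heights differ slightly.
person_a = 67.3812
person_b = 67.40

# Recorded to the nearest quarter inch, both are written down as 67.5,
# so the recorded data show gaps even though height itself is continuous.
recorded_a = record_height(person_a)
recorded_b = record_height(person_b)

# Recorded to the nearest whole inch, the first height becomes 67.0.
recorded_whole = record_height(person_a, resolution=1.0)

print(recorded_a, recorded_b, recorded_whole)
```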
Sources of Data

• The raw materials of statistics can be collected from various sources. Broadly, the sources of data are classified according to whether the data are collected by conducting a new study or experiment, or taken from an existing source already compiled by some other organization.
• In other words, data may be collected for the first time by conducting a study or experiment if the objectives of the study cannot be fulfilled on the basis of data from existing sources. In some cases, there is no need to collect a new set of data because similar studies have been conducted earlier, even though some of the analysis required to fulfill the objectives of the new study may not have been performed before.
• We may classify the sources of data as the following types:
1. Primary data
2. Secondary data
Primary Data:
• Primary data refer to data collected for the first time, by conducting a new study or experiment, that have not been analyzed before.
• Primary data may be collected by using either observational studies for obtaining descriptive measures or analytical studies for analyzing the underlying relationships.
• Such data are usually available from one or more of the following sources:
1. Routinely kept records
2. Surveys
3. Experiments
Routinely kept records
• It is difficult to imagine any type of organization that does not keep records of day-to-day transactions of its activities.
• Hospital medical records, for example, contain immense amounts of information on patients.
• When the need for data arises, we should look for them first among routinely kept records.
Surveys

If the data needed to answer a question are not available from routinely kept records, the logical source may be a survey.

Suppose, for example, that the administrator of a clinic wishes to obtain information regarding the mode of transportation used by patients to visit the clinic.

If admission forms do not contain a question on mode of transportation, we may conduct a survey among patients to obtain this information.
Experiments

Frequently the data needed to answer a question are available only as the result of an experiment.

A nurse may wish to know which of several strategies is best for maximizing patient compliance.

The nurse might conduct an experiment in which the different strategies of motivating compliance are tried with different patients.

Subsequent evaluation of the responses to the different strategies might enable the nurse to decide which is most effective.
Secondary Data:
Secondary data refer to a set of data collected by others sometime in the past.

The data needed to answer a question may already exist in the form of published reports, commercially available data banks, or the research literature.

In other words, we may find that someone else has already asked the same question, and the answer obtained may be applicable to our present situation.
MEASUREMENT AND MEASUREMENT SCALES

• Measurement: This may be defined as the assignment of numbers to objects or events according to a set of rules. The various measurement scales result from the fact that measurement may be carried out under different sets of rules.

The Nominal Scale
• The lowest measurement scale is the nominal scale.
• As the name implies it consists of “naming” observations or classifying them into various mutually exclusive and collectively exhaustive categories.
• The practice of using numbers to distinguish among the various medical diagnoses constitutes measurement on a nominal scale.
• Other examples include such dichotomies as male–female, well–sick, under 65 years of age–65 and over, child–adult, and married–not married.
The Ordinal Scale

Whenever observations are not only different from category to category but can
be ranked according to some criterion, they are said to be measured on an
ordinal scale.

Patients may be characterized as unimproved, improved, and much improved. Individuals may be classified according to socioeconomic status as low, medium, or high. The intelligence of children may be above average, average, or below average. In each of these examples the members of any one category are all considered equal, but the members of one category are considered lower, worse, or smaller than those in another category, which in turn bears a similar relationship to another category.
The Interval Scale

The interval scale is a more sophisticated scale than the nominal or ordinal in that with this scale not only is it possible to order measurements, but also the distance between any two measurements is known.

We know, say, that the difference between a measurement of 20 and a measurement of 30 is equal to the difference between measurements of 30 and 40.

• Temperature scales like Celsius (°C) and Fahrenheit (°F) are interval scales. On both scales, the difference between 100° and 60° is 40°, so differences make sense. Ratios do not, because on both scales 0 is not the absolute lowest temperature: temperatures like -10 °F and -15 °C exist and are colder than 0.
• The selected zero point is not necessarily a true zero in that it does not have to indicate a total absence of the quantity being measured.
The Ratio Scale

The highest level of measurement is the ratio scale. This scale is characterized by the fact that equality of ratios as well as equality of intervals may be determined.

Fundamental to the ratio scale is a true zero point.

The measurement of such familiar traits as height, weight, and length makes use of the ratio scale.
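
The difference between the interval and ratio scales described above can be checked numerically. This is a sketch using standard unit conversions; the specific temperatures and weights are arbitrary illustrative values, and the kilogram-to-pound factor is approximate.

```python
import math

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def kg_to_lb(kg):
    return kg * 2.20462  # approximate conversion factor

# Interval scale: equal differences stay equal after changing units...
c_low, c_mid, c_high = 20, 30, 40
f_low, f_mid, f_high = (celsius_to_fahrenheit(c) for c in (c_low, c_mid, c_high))
equal_intervals_c = (c_mid - c_low) == (c_high - c_mid)
equal_intervals_f = math.isclose(f_mid - f_low, f_high - f_mid)

# ...but ratios do not, because the zero point is arbitrary.
ratio_c = c_high / c_low   # 40 °C vs 20 °C
ratio_f = f_high / f_low   # the same temperatures in °F

# Ratio scale: a true zero means ratios survive a change of units.
ratio_kg = 80 / 40
ratio_lb = kg_to_lb(80) / kg_to_lb(40)

print(equal_intervals_c, equal_intervals_f)
print(ratio_c, ratio_f)
print(ratio_kg, ratio_lb)
```

So "40 °C is twice as hot as 20 °C" is not a meaningful statement, while "80 kg is twice as heavy as 40 kg" is.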
Classification of variables by measurement scales

Types of Qualitative Random Variables

Nominal Qualitative Variables
• A nominal variable takes attributes such as names or categories that are used for identification only but cannot be ordered or ranked.
• Examples: sex, nationality, place of residence, blood type, etc.

Ordinal Qualitative Variables
• It is possible to order or rank the categories of an ordinal variable.
• Examples: severity of disease, level of satisfaction about the healthcare services provided in a community, level of education, etc.

Types of Quantitative Random Variables

Discrete Quantitative Variables
• Examples: number of children ever born, number of accidents during specified time intervals, number of obese children in a family, etc.

Continuous Quantitative Variables
• Examples: height, blood sugar level, weight, waiting time in a hospital, etc.
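
The four variable types above differ in what can sensibly be done with their values. The sketch below illustrates this with small hypothetical records (all values and the severity ordering are invented for illustration): nominal values are only counted, ordinal values can be ranked, discrete values are whole-number counts, and continuous values can be averaged to any precision.

```python
from collections import Counter

# Hypothetical records, invented for illustration.
blood_types = ["O", "A", "O", "B", "AB", "O", "A"]   # nominal
severity = ["severe", "mild", "moderate", "mild"]     # ordinal
children_ever_born = [0, 2, 1, 3, 2]                  # discrete
heights_cm = [162.4, 175.0, 158.9]                    # continuous

# Nominal: categories can be counted but not ranked.
blood_type_counts = Counter(blood_types)

# Ordinal: categories can also be ranked, given an agreed ordering.
SEVERITY_ORDER = ["mild", "moderate", "severe"]
ranked_severity = sorted(severity, key=SEVERITY_ORDER.index)

# Discrete: values are whole-number counts.
all_counts_whole = all(isinstance(x, int) for x in children_ever_born)

# Continuous: values can fall anywhere in an interval.
mean_height = sum(heights_cm) / len(heights_cm)

print(blood_type_counts.most_common(1))
print(ranked_severity)
print(all_counts_whole, round(mean_height, 1))
```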
Determine what the key terms refer to in the following study.
• As part of a study designed to test the safety of automobiles, the National Transportation Safety Board collected and reviewed data about the effects of an automobile crash on test dummies. Here is the criterion they used:

• Cars with dummies in the front seats were crashed into a wall at a speed of 35 miles per hour. We want to know the proportion of dummies in the driver’s seat that would have had head injuries, if they had been actual drivers. We start with a simple random sample of 75 cars.
Determine what the key terms refer to in the following study.

• A study was conducted at a local college to analyze the average cumulative GPAs of students who graduated last year. Fill in the letter of the phrase that best describes each of the items below.
• 1. Population _____ 2. Statistic _____ 3. Sample _____ 4. Variable _____ 5. Data _____
• a) all students who attended the college last year
• b) the cumulative GPA of one student who graduated from the college last year
• c) 3.65, 2.80, 1.50, 3.90
• d) a group of students who graduated from the college last year, randomly selected
• f) all students who graduated from the college last year
• g) the average cumulative GPA of students in the study who graduated from the college last year
Name data sets that are quantitative discrete, quantitative continuous, and qualitative.

• You go to the supermarket and purchase three cans of soup (19 ounces tomato bisque, 14.1 ounces lentil, and 19 ounces Italian wedding), two packages of nuts (walnuts and peanuts), four different kinds of vegetables (broccoli, cauliflower, spinach, and carrots), and two desserts (16 ounces pistachio ice cream and 32 ounces chocolate chip cookies).
• Quantitative Discrete Data
  • The three cans of soup, two packages of nuts, four kinds of vegetables and two desserts are quantitative discrete data because you count them.
• Quantitative Continuous Data
  • The weights of the soups (19 ounces, 14.1 ounces, 19 ounces) are quantitative continuous data because you measure weights as precisely as possible.
• Qualitative Data
  • Types of soups, nuts, vegetables and desserts are qualitative data because they are categorical.
