MBC Stat For Nonstat v1.0 Final

This document provides an overview of basic statistical concepts for non-statisticians. It begins with definitions of statistics and probability. Descriptive statistics such as measures of central tendency (mean, median, mode), dispersion (range, variance, standard deviation), and relative standing (percentiles) are explained. The differences between continuous and categorical data are outlined. Probability concepts like sample spaces, events, and how to compute simple probabilities are defined. The goal is to introduce key foundational statistical topics in a clear and accessible manner.

Statistics for Non-Statisticians

Kay M. Larholt, Sc.D.

Vice President, Biometrics & Clinical Operations
Abt Bio-Pharma Solutions

Topics
1) Basic Statistical Concepts
2) Study Design
3) Blinding and Randomization
4) Hypothesis testing
5) Power and Sample Size

Basic Statistical Concepts

Statistics
Per the American Heritage Dictionary:
"The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling."
Two broad areas:
Descriptive: the science of summarizing data
Inferential: the science of interpreting data in order to make estimates, test hypotheses, make predictions, or reach decisions about the target population from the sample.

Introduction to Clinical Statistics

Statistics - The science of making decisions
in the face of uncertainty
Probability - The mathematics of uncertainty
The probability of an event is a measure of how
likely the event is to happen

Sample versus Population

Clinical Statistics
Biostatisticians are statisticians who
apply statistics to the biological
sciences.
Clinical statistics are statistics that are
applied to clinical trials

Basic Statistical Concepts

Types of data
Descriptive statistics
Graphs
Basic probability concepts
Type of probability distributions in clinical
statistics
Sample vs. population

Types of Data

Qualitative
Gender: Male/Female
Eye color: Blue, Brown, Other
Race
Diabetic: Yes/No

Quantitative
Age in years
Number of children in family
Height in inches
Annual salary
WBC count

Types of Quantitative Variables

Discrete variables: can only assume certain values, and there are usually gaps between values.
Example: the number of children in a family (1, 2, 3, ...)

Continuous variables: can assume any value within a specific range.
Example: the time it takes to fly from Boston to New York, the price of a house.

Continuous Data
Data should be collected in its rawest
form. We can always categorize data later.
(We can never uncategorize data.)
e.g. If you measure prostate size as part of the
clinical trial then capture the size in mm on the CRF.

We can categorize into 0-20 mm, 21-40 mm, etc. later.

Patient  Size (mm)  Category
1        24         Between 21 and 40
2        ...        Between 41 and 60
3        26         Between 21 and 40
4        23         Between 21 and 40
5        67         Between 61 and 80
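
As a minimal sketch of categorizing after the fact, the Python below bins raw prostate sizes into 20 mm bands. The function name `size_category` and the exact band edges (0-20, 21-40, 41-60, ...) are assumptions chosen to match the labels in the example; the point is only that categories can always be derived from raw values, never the reverse.

```python
def size_category(size_mm: int) -> str:
    """Map a raw size in whole mm to a 20 mm band (assumed bands: 0-20, 21-40, ...)."""
    if size_mm <= 20:
        return "0-20 mm"
    # Lower edge of the band that contains size_mm: 21, 41, 61, ...
    lower = ((size_mm - 1) // 20) * 20 + 1
    return f"{lower}-{lower + 19} mm"

raw_sizes = [24, 26, 23, 67]  # sizes in mm that appear in the slide example
categories = [size_category(s) for s in raw_sizes]
print(list(zip(raw_sizes, categories)))
```

Had only the categories been recorded on the CRF, the values 24, 26 and 23 would be indistinguishable.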

Basic Data Summarization Techniques

The objective of data summarization is to
describe the characteristics of a data set.
Ultimately, we want to make the data set
more comprehensible and meaningful.
To put data in a concise form, use

Summary descriptive statistics

Graphs
Tables

Descriptive Statistics for Continuous Variables
Measures of central tendency
Mean, Median, Mode
Measures of dispersion
Range, Variance, Standard deviation
Measures of relative standing
Lower quartile (Q1)
Upper quartile (Q3)
Interquartile range (IQR)

Mean
Arithmetic average: sum of all observations divided by # of observations.

$$\bar{X} = \frac{\sum X}{N}$$

Example:
The average age of a group of 10 people
is 24.2 years
Who are they?

Mean
Answer:
They could be ten twenty-somethings
who go out to dinner together:
Pete aged 24, Jane aged 26, Louise aged 21, Bob aged 22, Julie aged
23, Sue aged 22, Jenn aged 27, John aged 28, Jeff aged 20 and
Mark aged 29.

The mean age for these 10 people is :

(24+26+21+22+23+22+27+28+20+29)/10
= 24.2 years

Mean

Or alternatively:
They could be Mr. & Mrs. Smith and their 8
grandchildren:

Susie aged 3, Abby aged 5, Max aged 8, Laura aged 10, Joshua aged
10, Emma aged 12, Jane aged 13, Sarah aged 18, Mrs. Smith aged
80, Mr. Smith aged 83.

The mean age for these 10 people is:

(3+5+8+10+10+12+13+18+80+83)/10
= 24.2 years

Mean
Presenting the average alone does
not give you much information about
the data you are looking at.
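
The two groups of ten can be checked with a short Python sketch. The variable names are illustrative; the ages are the ones listed in the two examples. It shows that the means agree while the minimum, maximum and standard deviation differ widely.

```python
from statistics import mean, stdev

friends = [24, 26, 21, 22, 23, 22, 27, 28, 20, 29]
smith_family = [3, 5, 8, 10, 10, 12, 13, 18, 80, 83]

for name, ages in [("friends", friends), ("Smith family", smith_family)]:
    # Same mean, very different spread
    print(name, "mean:", mean(ages), "min:", min(ages), "max:", max(ages),
          "std dev:", round(stdev(ages), 1))
```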

Median
The midpoint of the values after they have been
ordered from the smallest to the largest, or the
largest to the smallest.
There are as many values above the median as
below it in the data array.

Median
Example
The age of the people in our data set is:

24, 26, 21, 23, 22, 27, 28, 20, 29 ( I took out one
of the 22 year olds to make this example easier)

Arranging the data in ascending order gives:

20, 21, 22, 23, 24, 26, 27, 28, 29

The median is 24
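
A small Python sketch of the same calculation, using the standard library. With an odd number of values the median is the middle one; with an even number, the usual convention (and the one `statistics.median` follows) is to average the two middle values, which is what happens if the second 22-year-old is put back.

```python
from statistics import median

ages_nine = [24, 26, 21, 23, 22, 27, 28, 20, 29]
ages_ten = ages_nine + [22]  # the original ten diners

print(sorted(ages_nine), "median:", median(ages_nine))
print(sorted(ages_ten), "median:", median(ages_ten))
```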

"There are three kinds of lies: lies, damned lies, and statistics."

This well-known saying is part of a phrase attributed to Benjamin Disraeli and popularized in the U.S. by Mark Twain

Median Home Price

Connecticut: Darien

Median home price: $1,295,000

Location: about 40 miles northeast of
midtown Manhattan
Population: 20,209, households 6,592

Properties of Mean and Median

There are unique means and medians for
each variable in the data set.
Median is not affected by extremely large
or small values and is therefore a valuable
measure of central tendency when such
values occur.
Mean is a poor measure of central
tendency in skewed distributions.


Mode
The value of the observation that appears most
frequently.
Example
The exam scores for ten students are:
81, 93, 84, 75, 68, 87, 81, 75, 81, 87.
Since the score of 81 occurs the most, the modal score is 81.
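
A minimal Python sketch of finding the mode by counting how often each score occurs, using the exam scores from the example:

```python
from collections import Counter

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]
counts = Counter(scores)

# most_common(1) returns the single (value, frequency) pair with the highest count
modal_score, frequency = counts.most_common(1)[0]
print("mode:", modal_score, "appears", frequency, "times")
```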

Averages and What Else?

As we have seen, just knowing the mean or
even the median of a data set does not tell
us enough about the data. We need more
information to really describe the data.

Measures of Dispersion
Once we know something about the centre
of the data we need to understand how the
data are dispersed around this centre.
How variable are the data?

Range
Maximum value in the data set minus minimum value in the data set
1. The age of the patients in our data set is:
21, 25, 19, 20, 22
Range = 25 - 19 = 6
2. The age of the patients in our data set is:
21, 45, 19, 20, 22.
Range = 45 - 19 = 26

When max and min are unusual values, range may be a misleading measure of dispersion. The range only uses the 2 extreme values in the data.
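
The two range calculations can be reproduced in a few lines of Python. The helper name `value_range` is illustrative.

```python
def value_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

ages_1 = [21, 25, 19, 20, 22]
ages_2 = [21, 45, 19, 20, 22]  # same data with one unusual value (45)

print(value_range(ages_1), value_range(ages_2))
```

Changing a single observation from 25 to 45 more than quadruples the range, which is why it can mislead when the extremes are unusual.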

Variance and Standard Deviation

The variance of a data set measures how far each
data point is from the mean of the data set.
It provides a measure of how spread out the data
points are

The Standard Deviation is the square root of the variance

Variance and Standard Deviation

Variance:
Measure of dispersion: the average of the squared deviations of the data from the mean
Standard deviation:
positive square root of the variance
Small std dev:
observations are clustered tightly around the mean
Large std dev:
observations are scattered widely about the mean

Standard Deviation

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$

Take each observation and subtract the mean of the observations from it
Square the answer
Sum up all the results
Divide by n-1
Take the square root

Example: Standard Deviation

1. The age of the patients in our data set is:
21, 25, 19, 20, 22

Mean = 21.4, Median = 21, StdDev = 2.302

2. The age of the patients in our data set is:
21, 45, 19, 20, 22.

Mean = 25.4, Median = 21, StdDev = 11.014
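
The five steps for the sample standard deviation (subtract the mean, square, sum, divide by n-1, take the square root) can be sketched in Python as follows. The function name `sample_std_dev` is illustrative; the standard library's `statistics.stdev` uses the same n-1 divisor.

```python
from math import sqrt

def sample_std_dev(values):
    n = len(values)
    mean = sum(values) / n
    # Deviation of each observation from the mean, squared
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Sum, divide by n - 1, then take the square root
    return sqrt(sum(squared_deviations) / (n - 1))

ages_1 = [21, 25, 19, 20, 22]
ages_2 = [21, 45, 19, 20, 22]

print(round(sample_std_dev(ages_1), 3))  # 2.302
print(round(sample_std_dev(ages_2), 3))  # 11.014
```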

Choosing an Appropriate Method of Central Tendency
The mean is ordinarily the preferred measure of central
tendency. The mean should always be presented along
with the variance or the standard deviation
There are situations when a median might be more
appropriate:
- a skewed distribution
- a small number of subjects

Measures of Relative Standing

Descriptive measures that locate the relative position of an observation in relation to the other observations.

Measures of Relative Standing

The pth percentile is a number such that p% of
the observations of the data set fall below and
(100-p)% of the observations fall above it.

Lower quartile = 25th percentile (Q1)

Mid-quartile = 50th percentile (median or Q2)
Upper quartile = 75th percentile (Q3)
Interquartile range (IQR = Q3-Q1)

Measures of Relative Standing: an Example

The age of the patients in our data set is: 21, 25, 19, 20, 22

Q1 = 20, Q2 = 21, Q3 = 22, IQR = 2

The age of the patients in our data set is: 21, 45, 19, 20, 22

Q1 = 20, Q2 = 21, Q3 = 22, IQR = 2
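
A hedged Python sketch of the quartile calculation. Several conventions exist for computing quartiles on small samples; `method="inclusive"` in `statistics.quantiles` is used here as an assumption because it reproduces the values in the example, while the default (`"exclusive"`) gives different answers for five observations. The helper name `quartile_summary` is illustrative.

```python
from statistics import quantiles

def quartile_summary(values):
    """Return (Q1, Q2, Q3, IQR) using the 'inclusive' quartile convention."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3, q3 - q1

ages_1 = [21, 25, 19, 20, 22]
ages_2 = [21, 45, 19, 20, 22]

print(quartile_summary(ages_1))
print(quartile_summary(ages_2))
```

Both data sets give the same quartiles and IQR, even though their ranges and standard deviations differ: the quartiles are not moved by the single extreme value.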

Definitions
Statistics - The science of making decisions
in the face of uncertainty
Probability - The mathematics of
uncertainty
The probability of an event is a measure of
how likely the event is to happen

Basic Probability Concepts

Sample spaces and events

Simple probability
Joint probability

Sample Spaces
Collection of all possible outcomes
Example: All six faces of a die
Example: All 52 cards in a deck

Sample Space

Gumballs in a gumball machine
60 red
50 green
40 yellow
30 white
25 pink
20 blue
16 purple
Total: 241 gumballs

Events
Simple event

Outcome from a sample space with one characteristic

Examples: A red card from a deck of cards

A purple gumball from the gumball machine
Joint event

Involves two outcomes simultaneously

Example: An ace that is also red from a deck of cards

Events
Mutually exclusive events
Two events cannot occur together
Example: Drawing one card from a deck
A: Drawing a queen of diamonds
B: Drawing a queen of clubs
As only one of these can happen
Events A and B are mutually exclusive

Probability
Probability is the numerical measure of the likelihood that an event will occur
Value is between 0 and 1

1 = Certain
.5
0 = Impossible

Computing Probabilities
The probability of an event E:

$$P(E) = \frac{\text{Number of event outcomes}}{\text{Total number of possible outcomes in the sample space}}$$

Assumes each of the outcomes in the sample space is equally likely to occur

Computing Probabilities
Example:
What is the probability of rolling a 4 when you roll
a die?
# of possible outcomes in the sample space = 6
# of 4s in the sample space = 1
Prob (rolling a 4 when you roll a die) = 1/6

Computing Probabilities
Example:
What is the probability of rolling a six and a four when you roll 2 dice?
# of possible outcomes in the sample space = 36
# of ways to roll one 6 and one 4 = 2
P(one 6 and one 4) = 2/36 = .0556
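
The counting argument can be checked by enumerating the sample space. This is a minimal Python sketch assuming fair dice, so that all outcomes are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# One die: 6 equally likely outcomes, one of which is a 4
p_single_four = Fraction(sum(1 for f in faces if f == 4), len(faces))

# Two dice: 36 equally likely ordered outcomes
two_dice = list(product(faces, repeat=2))
favourable = [roll for roll in two_dice if sorted(roll) == [4, 6]]  # (4,6) and (6,4)
p_six_and_four = Fraction(len(favourable), len(two_dice))

print(p_single_four, p_six_and_four, round(float(p_six_and_four), 4))
```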

Computing Joint Probability

The probability of a joint event, A and B:

$$P(A \text{ and } B) = P(A \cap B) = \frac{\text{number of outcomes from both } A \text{ and } B}{\text{total number of possible outcomes in sample space}}$$

Computing Joint Probability

$$P(\text{Red Card and an Ace}) = \frac{\text{2 Red Aces}}{\text{Total \# Cards}} = \frac{2}{52} = \frac{1}{26}$$
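
The red-ace example can be verified by building the deck and counting the cards that satisfy both conditions. A minimal Python sketch, assuming a standard 52-card deck; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}

deck = list(product(ranks, suits))  # 13 ranks x 4 suits

# Joint event: the card is an ace AND the card is red
red_aces = [(rank, suit) for rank, suit in deck
            if rank == "A" and suits[suit] == "red"]
p_red_ace = Fraction(len(red_aces), len(deck))

print(len(red_aces), len(deck), p_red_ace)
```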

Type of Probability Distributions in Clinical Statistics
Bernoulli
Binomial
Normal

Bernoulli Distribution
The Bernoulli distribution is the coin flip distribution.
X is Bernoulli if its probability function is:

$$X = \begin{cases} 1 & \text{w.p. } p \\ 0 & \text{w.p. } 1-p \end{cases}$$

(w.p. = with probability)
Examples: X=1 for heads in coin toss
X=1 for male in survey
X=1 for defective in a test of product

Binomial Distribution
The binomial distribution is just the sum of n independent
Bernoulli variables.
It is the number of successes in n trials.
Probability of success is usually denoted by p,
and therefore probability of failure is 1-p.
Example: Number of heads when we flip a coin 10
times. Here n = 10, p=0.5 (the probability of
getting a head when we toss the coin once).

Binomial Distribution

The binomial probability function

$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

Example: X = Number of heads when we flip a coin 10 times. Here X ~ Binomial (n = 10, p=0.5)
n! = n factorial = n·(n-1)·(n-2)···1
10! = 10·9·8·7·6·5·4·3·2·1 = 3,628,800

Binomial Distribution
Expectation: $E(X) = np$
Variance: $V(X) = np(1-p)$

X = Number of heads when we flip a coin 10 times. Here X ~ Binomial (n = 10, p=0.5).
Then E(X)=5 (on average we expect to get 5 heads) and Var(X) = 2.5.
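
As a check on the formulas for the coin-flip example, the Python sketch below evaluates the binomial probability function for n = 10, p = 0.5 and computes the expectation and variance directly from the probabilities. The function name `binomial_pmf` is illustrative; `math.comb(n, x)` computes n!/(x!(n-x)!).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Expectation and variance computed from the definition, to compare with np and np(1-p)
expectation = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - expectation) ** 2 * px for x, px in enumerate(pmf))

print("P(X = 5):", round(pmf[5], 4))
print("E(X):", expectation, "np:", n * p)
print("Var(X):", variance, "np(1-p):", n * p * (1 - p))
```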

Gaussian or Normal Distribution aka Bell Curve
Most important probability distribution in the statistical analysis of experimental data.
Data from many different types of processes follow a normal distribution:
Heights of American women
Returns from a diversified asset portfolio

Even when the data do not follow a normal distribution, the normal distribution often provides a good approximation

Gaussian or Normal Distribution aka Bell Curve
The Normal Distribution is specified by two parameters
The mean, μ
The standard deviation, σ

Standard Normal Distribution

Characteristics of the Standard Normal Distribution
Mean of 0 and standard deviation of 1.
It is symmetric about 0 (the mean, median
and the mode are the same).
The total area under the curve is equal to
one. One half of the total area under the
curve is on either side of zero.

Area in the Tails of Distribution

The total area under the curve that is more
than 1.96 units away from zero is equal to
5%. Because the curve is symmetrical, there
is 2.5% in each tail.

Normal Distribution
About 68% of observations lie within 1 std dev of the mean
About 95% of observations lie within 2 std dev of the mean
About 99.7% of observations lie within 3 std dev of the mean
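
These areas can be computed from the standard normal cumulative distribution function. The Python sketch below uses `statistics.NormalDist` from the standard library; the helper name `area_within` is illustrative. It evaluates the area within 1, 2 and 3 standard deviations of the mean and the combined tail area beyond 1.96.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal distribution

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} std dev: {area_within(k):.4f}")

# Area more than 1.96 units from zero, both tails combined
two_tail_area = 1 - area_within(1.96)
print(f"beyond 1.96: {two_tail_area:.4f}")
```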

Study Design

Sample versus Population

A population is a whole, and a sample is a
fraction of the whole.
A population is a collection of all the elements
we are studying and about which we are
trying to draw conclusions.
A sample is a collection of some, but not all,
of the elements of the population

Sample versus Population

To make generalizations from a sample, it
needs to be representative of the larger
population from which it is taken.
In the ideal scientific world, the individuals for
the sample would be randomly selected. This
requires that each member of the population
has an equal chance of being selected each
time a selection is made.

Type of Studies and Study Design

Phase I-IV

Controlled vs. non-controlled studies

Single arm, parallel groups, cross-over designs, and stratified designs

Selecting an appropriate study design

Analysis population: Intent-to-treat vs. per-protocol

Phases of Clinical Trials

Clinical trials are generally categorized into
four phases.
An investigational medicine or product may
be evaluated in two or more phases
simultaneously in different trials, and some
trials may overlap two different phases.

Phase 1 Studies - Safety and Dosing

Initial safety trials in which investigators
attempt to establish the dose range tolerated
by 20-80 healthy volunteers.
Although usually conducted on healthy
volunteers, Phase 1 trials are sometimes
conducted with severely ill patients, for
example those with cancer or AIDS.

Phase 2 Studies - Safety and Limited Efficacy

Pilot clinical trials to evaluate safety and efficacy
in selected populations of about 100-300
patients who have the disease or condition to be
treated, diagnosed, or prevented. Often referred
to as feasibility studies
Used as dose finding studies as different doses
and regimens are investigated

Phase 3 studies - efficacy

Large definitive studies that are carried out
once safety has been established and doses
that are likely to be effective have been found
Often called pivotal studies
FDA usually requires 2 Phase III studies for
registration

Phase 4 studies - post marketing surveillance
After the product is marketed, Phase 4
studies provide additional details about the
product's safety and efficacy.
May be used to evaluate formulations,
dosages, durations of the treatment,
medicine interactions, and other factors.
Patients from various demographic groups
may be studied.

Phase 4 studies - post marketing surveillance
Important part of many Phase 4 studies:
detecting and defining previously unknown
or inadequately quantified adverse
reactions and related risk factors.
Phase 4 studies are often observational
studies rather than experimental.

Hierarchy of medical evidence

From weakest to strongest evidence
Case reports
Case series
Database studies
Observational studies
Controlled clinical trials
Randomized controlled trial

Byar, 1978

Clarke MJ. Ovarian ablation in breast cancer, 1896 to 1998: milestones along hierarchy of evidence from case report to Cochrane review. BMJ 1998; 317

Controlled studies
Studies in which a test article is compared
with a treatment that has known effects.
The control group may receive no
treatment, standard treatment or placebo.

What is a randomized clinical trial?

A prospective study in humans
Randomization
Comparable control group
Complete accounting of all cases
Carefully monitored for safety and efficacy
Adheres to regulatory requirements:
GCP, FDA, ICH guidelines

Blinded studies
Blinded study: one in which subject or the
investigator (or both) are unaware of what trial
product a subject is receiving.
Single-blind study: subjects do not know what
treatment they are receiving (active or control)
Double-blind study: neither the subjects nor the
investigators know what treatment a subject is
receiving

Analysis Populations

Intent-to-Treat Principle
Primary analysis in most randomized clinical
trials testing new therapies or devices.
Requires that any comparison among treatment groups in a randomized clinical trial is based on the results for all subjects in the treatment group to which they were randomly assigned.
Full analysis: includes compliers and non-compliers

Intent-to-Treat
ITT Population includes the following:
All Randomized patients: Preserve initial
randomization
- Prevents biased comparison
- Basis for statistical tests and inference

Intent-to-Treat
Problems: Predictable or Unpredictable
Ineligible Patients allowed in the trial
Non-compliance, ie. not following the
assigned treatment
Patients refusing a trial procedure
Prohibited medication
Early withdrawal/termination
Invalid data

Intent-to-Treat
FDA guideline related to regulatory submission states:
"As a general rule, even if the sponsor's preferred analysis is based on a reduced subset of the patients with data, there should be an additional intent-to-treat analysis using all randomized patients."
Ref: ICH E3: Structure and Content of Clinical Study Reports

Intent-to-Treat
When can we exclude randomized patients?
Failure to satisfy major entry criteria
Failure to take at least one dose of medication
Failure to complete procedure
Lack of any data post-randomization
Lost to follow up
Missing data randomly, not related to treatment
assignment

Intent-to-Treat
Problem: In a 6-Month study, what should be done
with the patient who drops out and provides no
further data after 2 months ?

Intent-to-Treat
Last Observation Carried Forward (LOCF)
Use last available valid observation post-baseline
on a particular variable for the missing visit through
the end of study

[Figure: LOCF (last observation carried forward). Y Data (axis from 8 to 26) plotted against Time: Baseline, Week 1, Week 2, Week 4, Week 8, Week 12]

Last Observation Carried Forward (LOCF)

Biased if the early withdrawal is treatment related
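
A minimal Python sketch of LOCF for one patient. The function name `locf`, the use of `None` for a missing visit, and the visit values are all hypothetical and only illustrate the rule: each missing post-baseline visit takes the last available post-baseline observation.

```python
def locf(post_baseline):
    """Replace each missing visit (None) with the last observed post-baseline value.

    Visits that are missing before any post-baseline value exists stay None,
    because there is nothing to carry forward.
    """
    filled = []
    last_seen = None
    for value in post_baseline:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical patient with visits at Week 1, 2, 4, 8, 12 who drops out after Week 2
observed = [22, 19, None, None, None]
print(locf(observed))
```

The carried-forward value assumes the patient would have stayed at the Week 2 level, which is the source of bias when withdrawal is related to treatment.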

Example
The primary analysis sample will be based on
the principle of intention-to-treat. All patients
who sign the written Informed Consent form,
meet the study entry criteria, and undergo
randomization will be included in the analysis,
regardless of whether or not the assigned
treatment device was implanted.

Intent-to-Treat Principle
Using the complete analysis data set:
Preserves the randomization at the time of analysis
which helps prevent bias
Provides the foundation for statistical testing.
Provides estimates of treatment effects which are
more likely to mirror those observed in clinical
practice.

Argument against ITT

An ITT analysis, by including subjects who were randomized to the drug but received little or no drug, will dilute the treatment effect when compared to the placebo group

How can we improve the ITT analysis?

Careful identification of inclusion/exclusion criteria
Careful review of reasons for failure, missing data,
and exclusions
Adherence to Good Clinical Practices
Better monitoring practices to reduce the protocol
deviations and non compliance
Appropriate and detailed statistical plan and
analysis

Per-Protocol aka Evaluable patient population

Subset of the ITT population who are compliant with the protocol, excluding patients with:
Major protocol violation/deviation
Use of prohibited medication as per protocol
Technical or procedural failure
Loss to follow up, lack of efficacy/response
Wrong treatment assignment

Per-Protocol Population
Advantages and disadvantages:
Analysis in its pure form, completely as per the
protocol
Maximize the efficacy from new treatment
Not a conservative approach, results in bias
due to exclusion

Per-Protocol Population
Advantages and disadvantages:
May not have enough power and sample size
Both analyses are done in confirmatory trials
If the results and conclusions are the same from
two analyses, the confidence is higher.

Blinding and Randomization

Randomisation

History
The concept of randomisation was
introduced by R.A. Fisher in 1926 in the
area of agricultural research.
Before that, clinical trials in the 18th and 19th centuries had used controls from the literature, other historical controls and concurrent controls.

Randomisation

To guard against any use of judgement or systematic arrangements, i.e. to avoid bias

To provide a basis for the standard methods of statistical analysis such as significance tests

Assures that treatment groups are balanced (on average) in all regards, i.e. balance occurs for known prognostic variables and for unknown or unrecorded variables

Inferential statistics calculated from a clinical trial make an allowance for differences between patients, and this allowance will be correct on average if randomisation has been employed.

Randomisation promotes confidence that we have acted in utmost good faith. It is not to be used as an excuse for ignoring the distribution of known prognostic factors.
Randomisation is essential for the effective blinding of a clinical trial.

Non-Randomised Trials
It is difficult to obtain a reliable assessment
of treatment effect from non-randomised
studies.

Uncontrolled Trials
Medical Practice implies that a doctor
prescribes a treatment for a patient that in
his/her judgement, based on past
experience, offers the best prognosis.
Clinicians are always looking for new
therapies, improvements in therapies and
alternative therapies.

When a new treatment is proposed, some clinicians might try it on a few patients in an uncontrolled trial.
The new treatment is studied without any direct comparison with a similar group of patients on more standard therapy.

Uncontrolled trials have the potential to provide a very distorted view of therapy.
Why?

Laetrile
In the 1970s in the US Laetrile achieved
widespread popular support for treating
advanced cancer of all types without any formal
testing in clinical trials.
NCI tried to collect documented cases of tumour
response after Laetrile therapy. Although an
estimated 70,000 cancer patients had tried
Laetrile only 93 cases were submitted for
evaluation and 6 were judged to have a
response.


Laetrile
An uncontrolled trial of 178 patients found
no benefit and evidence of cyanide toxicity
The final conclusion of NCI was that
Laetrile is a toxic drug that is not effective
as a cancer treatment


Uncontrolled trials are much more likely to lead to enthusiastic recommendation of the treatment as compared with properly controlled trials.

Historical Controls
Instead of randomising groups, studies compare the current patients on the new treatment with previous patients who had received the standard treatment.
This is a Historical Control group.

Major flaw: How can we be sure that the comparison is fair? How do we know whether the 2 groups differ with respect to any feature other than the treatment itself?

Patient Selection
Historical control group is less likely to have
clearly defined criteria for patient inclusion
because the patients on the standard
treatment were not known to be in the clinical
trial when their treatment began.
Historical controls were recruited earlier and
possibly from a different source and therefore
might be a different type of patients.
Investigator might be more restrictive in
choice of patients for new treatment


Concurrent Non-randomised Controls

Use some pre-determined systematic method or investigator judgement to assign patients to groups

Non-Randomised controls
Date of birth: odd/even day of birth = new/standard treatment
Date of presentation: odd/even days = new/standard treatment
Alternate assignment: odd/even patients = new/standard treatment

Example
Trial of anticoagulant therapy for MI
Patients admitted on odd days of the
month received anticoagulant and patients
admitted on even days did not.

Treated: 589
Control: 442

Is it ethical to randomise?
Assuming we have sufficient supply of the new treatment, why shouldn't every new patient be given the new treatment?

Tendency is to do a non-randomised trial first and then follow up with an RCT.
However, it is difficult to do the RCT if the results from the non-randomised trial are too good.

We assume that the new treatment has a reasonable chance of being an improvement.

Before agreeing to enter patients into a randomised trial, the investigator must be prepared to stay objective about the treatments involved.

Randomised trials often produce scientific evidence that contradicts prior beliefs.

Equipoise
What is equipoise and why is it important?
A state of being equally balanced.

Clinical equipoise provides the ethical basis for medical research involving randomly assigning patients to different treatment arms.

Clinical Equipoise
Term was first used by B. Freedman in 1987, in the article 'Equipoise and the ethics of clinical research', NEJM 1987; 317(3).
The ethics of clinical research requires equipoise
- a state of genuine uncertainty on the part of
the clinical investigator regarding the
comparative therapeutic merits of each arm in a
trial. Should the investigator discover that one
treatment is of superior therapeutic merit, he or
she is ethically obliged to offer that treatment.


Clinical Equipoise

Freedman suggests that as long as there is genuine uncertainty within the expert medical community about the preferred treatment, then there can be clinical equipoise, even if a specific investigator has a preference.

Randomisation


Randomisation
Randomised trial with two treatments, A or
B
How do we assign treatments:
Toss a coin each time: Heads = A, Tails = B
Random Numbers Table
Random Permuted Blocks


Flip a coin

Could flip coin for each participant: called complete randomisation or simple randomisation
Problem: can get imbalance in groups, especially in smaller trials
Imbalance in prognostic factors more likely
Inefficient for estimating treatment effect

Probability of 5 Treated and 5 Controls in 10 patients
What is the probability of getting 5 Treated patients out of 10?
Remember the binomial distribution

Binomial Distribution

The binomial probability function

$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

X ~ Binomial (n = 10, p=0.5)
In this case, we want x=5

Imbalance with 10 Participants

(#T, #C)            Probability  Efficiency
(5,5)               .246         1
(4,6) or (6,4)      .410         0.96
(3,7) or (7,3)      .234         0.84
(2,8) or (8,2)      .088         0.64
(1,9) or (9,1)      .020         0.36
(0,10) or (10,0)    .002         0
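
The probabilities for 10 participants follow from the binomial distribution with p = 0.5. The Python sketch below recomputes them. The efficiency formula used here, 4 x (#T/n) x (#C/n), is an assumption: the slides do not state how efficiency was calculated, but this common definition of relative efficiency against a balanced split reproduces the listed values. The function name `imbalance_row` is illustrative.

```python
from math import comb

def imbalance_row(n, treated):
    """Probability of a (treated, n - treated) split in either direction,
    plus relative efficiency (assumed formula: 4 * (T/n) * (C/n))."""
    control = n - treated
    # An uneven split can occur two ways, e.g. (4,6) or (6,4)
    ways = comb(n, treated) if treated == control else 2 * comb(n, treated)
    probability = ways / 2**n
    efficiency = 4 * (treated / n) * (control / n)
    return probability, efficiency

for treated in range(5, -1, -1):
    probability, efficiency = imbalance_row(10, treated)
    print(f"({treated},{10 - treated}) split: "
          f"probability {probability:.3f}, efficiency {efficiency:.2f}")
```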


Even if treatment balanced at end of trial, may be unbalanced at some time
E.g., may be balanced at end with 400 participants, but first 10 might be CCCCTCTCTC

Random Permuted Blocks

To balance over time, could randomize in
blocks (called random permuted blocks)
Conceptually, for blocks of size 4: put 2 T
labels & 2 C labels in hat: for next 4
participants, draw labels at random without
replacement from hat
TTCC TCTC TCCT CTTC CTCT CCTT
all equally likely
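
The "labels in a hat" procedure can be sketched in Python as follows. This is an illustration only, not a validated randomisation system: the function name `permuted_block_schedule`, its parameters and the seed are hypothetical, and shuffling a block of labels stands in for drawing them from the hat without replacement.

```python
import random

def permuted_block_schedule(n_participants, block_size=4, seed=None):
    """Assign T/C using random permuted blocks with equal T and C in each block."""
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_participants:
        # Put block_size/2 T labels and block_size/2 C labels in the hat
        block = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
        rng.shuffle(block)  # draw all labels at random without replacement
        schedule.extend(block)
    return schedule[:n_participants]

schedule = permuted_block_schedule(12, block_size=4, seed=2024)
print("".join(schedule))
```

Whatever the seed, every consecutive group of four participants contains exactly two T and two C, so the arms can never differ by more than two at any point in enrollment.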

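Example with blocks of size 4 (2 T labels and 2 C labels in the hat for each block):
Participants 1-4: TCTC
Participants 5-8: CCTT
Participants 9-12: CTCT

Forces balance after every 4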

Randomisation by blocks: 5 sites, 6 patients per site

[Table: treatment assignments by patient and site]

Incomplete Blocks

What happens if a site does not enroll all the patients in a block?
What happens if multiple sites do not enroll all the patients in a block?

The smaller the block size, the more often balance is forced: e.g., in a trial of 100, blocks of size 2 force balance after every 2
A block of size 100 forces balance only at the end

With blocks of size 2 in an unblinded trial, we know every second participant's assignment in advance
I can veto potential participants until I find one I like (sick one if next assignment is control, healthy one if next assignment is treatment)
Schulz KF. Subverting Randomization in Controlled Trials. JAMA 1995; Vol. 274

Even with larger blocks, in an unblinded trial you know some assignments in advance
With blocks of size 8, if the first 6 are TCTTCT, we know the next 2 are C
Using a variable block size in a study makes it harder to guess
Never include the block size in a protocol

Subgroup balance
Sometimes want to balance treatment
assignments within subgroups
Especially important if subgroup size is
small
E.g., with 6 diabetics in a trial, with a
complete randomisation, there is 22%
chance of 5-1 or 6-0 split!
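
The 22% figure can be checked with the binomial distribution: under complete randomisation each of the 6 diabetics is assigned independently with probability 0.5. A minimal Python sketch, with illustrative variable names:

```python
from math import comb

n = 6  # diabetics in the trial

# Splits of 5-1 or 6-0 in either direction: 0, 1, 5 or 6 of them on treatment
extreme_counts = (0, 1, 5, 6)
p_extreme = sum(comb(n, k) for k in extreme_counts) / 2**n

print(f"{p_extreme:.3f}")  # 14/64
```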


Stratified Randomisation
To avoid this problem could stratify the
randomisation (use blocked randomisation
separately for factors such as diabetics &
nondiabetics)
E.g., for blocks of size 6:
Diabetics: CTTCCT
Nondiabetics: TTCTCC TCCTTC

Stratified Block randomisation

Typical examples of such factors are age
group, severity of condition, and
treatment centre. Stratification simply
means having separate block
randomisation schemes for each
combination of characteristics (stratum)
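
A hedged Python sketch of stratified block randomisation: one independent permuted-block schedule is generated for each stratum. The function name `stratified_block_schedules`, its parameters, the stratum labels and the seed are hypothetical and serve only to illustrate the idea; a real trial would use a validated randomisation system.

```python
import random

def stratified_block_schedules(strata, block_size=6, blocks_per_stratum=2, seed=None):
    """Build a separate permuted-block T/C schedule for each stratum."""
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)
    schedules = {}
    for stratum in strata:
        labels = []
        for _ in range(blocks_per_stratum):
            block = ["T"] * (block_size // 2) + ["C"] * (block_size // 2)
            rng.shuffle(block)  # random order within the block
            labels.extend(block)
        schedules[stratum] = labels
    return schedules

schedules = stratified_block_schedules(["diabetic", "nondiabetic"], seed=7)
for stratum, labels in schedules.items():
    print(stratum, "".join(labels))
```

Each new participant takes the next unused label from the schedule of their own stratum, so treatment and control stay balanced within every stratum as well as overall.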


Stratified Block randomisation

For example, in a study where you expect treatment effect to differ with age and sex you may have four strata:
male over 65,
male under 65,
female over 65,
female under 65

Stratification
If we believe that gender is a prognostic factor,
that is, the treatment effect for males may be
different than the treatment effect for females then
we should stratify the randomisation (and the
analysis) on gender
This does not mean that we need identical
numbers of males and females in the trial, but
rather that the males be equally distributed
between treatment and control and the females
also be equally distributed between treatment and
control


Stratification
Example:
In RA trials there are usually about 70% females
and 30% males.
Stratification at randomisation would help ensure
that each treatment group had about 70% females
and 30% males.
If we believe that males and females may have
different responses to treatment this would be
important.


Blinding


Blinding
Many potential problems can be avoided if
everyone involved in the study is blinded to the
actual treatment the patient is receiving.
Blinding (also called masking or concealment of
treatment) is intended to avoid bias caused by
subjective judgment in reporting, evaluation, data
processing, and analysis due to knowledge of
treatment.


Hierarchy of Blinding
open label: no blinding
single blind: patient blinded to treatment
double blind: patient and assessors blinded to treatment
complete blind: everyone involved in the study blinded to
treatment


Open Label Studies

These may be useful for
pilot studies
dose ranging studies
However knowledge of treatment can lead to:
over or under reporting of toxicity
over estimation of efficacy
Even a small fraction of patients assigned at
random to placebo will reduce these potential
problems substantially.


Single Blind Studies

Usually justified when it is practically
infeasible to blind the investigator
Patients should be blinded if the endpoints
are patient-reported outcomes, and for
safety reporting
Where possible use blinded assessor to
elicit adverse events or patient outcomes

Double Blind Studies

When both the subjects and the investigators are kept
from knowing who is assigned to which treatment, the
experiment is called "double blind".
Double blind studies serve as the standard by which all
studies are judged, since they minimize both potential
patient biases and potential assessor biases.

Double Blinding: Techniques

Coded treatment groups
Sham treatments
If double blinding is impossible, try to use a blinded
assessor for assessing endpoints.

Double Blind Studies: issues

Side effects:
Side effects (observable by patient or
assessor) are much harder to blind and are
one of the major ways in which blinding is
broken
Efficacy:
A truly effective treatment can be recognized
by its efficacy in patients

Hypothesis Testing

Steps in hypothesis testing: state the problem, define the
endpoint, formulate the hypotheses, choose the statistical
test, set the decision rule, calculate, decide, and
interpret
Statistical significance: types of errors, p-value, one-tail
vs. two-tail tests, confidence intervals
Significance vs. non-significance
Equivalence vs. superiority tests

Descriptive and inferential statistics

Descriptive statistics is devoted to the
summarization and description of data
(population or sample).
Inferential statistics uses sample data to
make an inference about a population.

Objectives and Hypotheses

Objectives are questions that the trial was
designed to answer
Hypotheses are more specific than objectives
and are amenable to explicit statistical
evaluation

Examples of Objectives
To determine the efficacy and safety of Product
ABC in diabetic patients
To evaluate the efficacy of Product DEF in the
prevention of disease XYZ
To demonstrate that images acquired with product
GHI are comparable to images acquired with
product JKL for the diagnosis of cancer

How do you measure the objectives?

Endpoints need to be defined in order to
measure the objectives of a study.

Endpoints: Examples

Primary Effectiveness Endpoint

Percentage of patients requiring intervention due
to pain, where an intervention is defined as:
1. Change in pain medication
2. Early device removal

Endpoints: Examples:

Primary Endpoint:
Percentage of patients with a reduction in
pain:
Reduction in the Brief Pain Inventory (BPI)
worst pain scores of 2 points at 4 weeks
over baseline.

Endpoints: Examples
Patient Survival
Proportion of patients surviving two years post-treatment
Average length of survival of patients post-treatment

Objectives and Hypotheses

Primary outcome measure
the outcome of greatest importance in the study
used for the sample size calculation
More than one primary outcome measure leads to multiplicity issues

Hypothesis Testing
Null Hypothesis (H0)
Status Quo
Usually Hypothesis of no difference
Hypothesis to be questioned/disproved

Alternative Hypothesis (HA)

Ultimate goal
Usually Hypothesis of difference
Hypothesis of interest

Hypothesis Testing
Decision            If H0 is True        If H0 is False
Fail to reject      No Error             Type II Error (β)
Reject              Type I Error (α)     No Error

Type I Error: Society's Risk
Type II Error: Sponsor's Risk

Hypothesis testing
Null Hypothesis
No difference between Treatment and Control

Type I error, aka alpha (α), the significance level

The probability of declaring a difference
between treatment and control groups even
though one does not exist (i.e., treatment is
truly no different from control)
The p-value calculated from the data is compared
with alpha; the two are not the same thing
As this is society's risk, it is conventionally set
at 0.05 (5%)

Hypothesis testing
Type II error, aka beta (β)
The probability of not declaring a difference
between treatment and control groups even
though one does exist (i.e., treatment truly
differs from control)
1 - β is the power of the study
Power is often set at 0.8 (80% power); however, many
companies use 0.9
Underpowered studies have a lower probability of
showing a difference if one exists
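
To make the link between alpha, beta and sample size concrete, here is a minimal sketch for comparing two group means. It assumes a two-sided test, equal group sizes, a known common standard deviation and the normal approximation; the function names and the example effect size (0.5 standard deviations) are illustrative assumptions, not values from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per group needed to detect a true mean difference `delta`."""
    z_alpha = Z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_beta = Z.inv_cdf(power)           # power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def approximate_power(n_per_group, delta, sd, alpha=0.05):
    """Approximate power (the negligible opposite-tail term is ignored)."""
    se = sd * sqrt(2 / n_per_group)     # standard error of the difference
    return Z.cdf(delta / se - Z.inv_cdf(1 - alpha / 2))


n80 = sample_size_per_group(delta=0.5, sd=1.0, power=0.80)  # 63 per group
n90 = sample_size_per_group(delta=0.5, sd=1.0, power=0.90)  # 85 per group
print(n80, n90)

# An underpowered study: only 20 patients per group for the same difference.
print(round(approximate_power(20, delta=0.5, sd=1.0), 2))
```

Moving from 80% to 90% power raises the required sample size by roughly a third in this setting, which is why the choice of power is a cost decision as much as a statistical one.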

Steps in Hypothesis Testing

1. Choose the null hypothesis (H0) that is to be
tested
2. Choose an alternative hypothesis (HA) that is
of interest
3. Select a test statistic, define the rejection
region for decision making about when to
reject H0
4. Draw a random sample by conducting a
clinical trial

5. Calculate the test statistic and its
corresponding p-value
6. Draw a conclusion according to the predetermined rule specified in step 3
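
The steps can be sketched for a simple case: comparing response rates in two arms with a two-sided z-test for two proportions. The counts (45 of 100 responders on treatment, 30 of 100 on control) are invented for illustration, and the function name is hypothetical; a real trial would use the test pre-specified in its analysis plan.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x_trt, n_trt, x_ctl, n_ctl, alpha=0.05):
    """Two-sided z-test of H0: response rates are equal in both arms."""
    # Steps 1-3: H0 (no difference), HA (a difference), test statistic z,
    # and the decision rule "reject H0 if the p-value is below alpha".
    p_trt, p_ctl = x_trt / n_trt, x_ctl / n_ctl
    pooled = (x_trt + x_ctl) / (n_trt + n_ctl)  # response rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_trt + 1 / n_ctl))
    # Step 5: calculate the test statistic and its p-value.
    z = (p_trt - p_ctl) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 6: apply the predetermined rule.
    return z, p_value, p_value < alpha


# Step 4: the (hypothetical) trial data.
z, p_value, reject_h0 = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

The decision rule is fixed before the data are seen; the data only supply the test statistic and p-value that the rule is applied to.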

Hypothesis Testing Normal Distribution

Test of Significance and p-value

Statistically significant:
Conclusion that the results of a study are
not likely to be due to chance alone.
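Clinical significance is a separate question from
statistical significance: a statistically significant
difference may be too small to matter clinically, and a
clinically important difference may fail to reach
statistical significance in a small study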

Test of Significance and p-value

p-value
Probability of observing a relationship (e.g.,
between variables) or a difference (e.g., between
means) at least as large as the one seen in the sample,
assuming that in the population from which the sample
was drawn no such relationship or difference exists.
It is not the probability that a given result is wrong,
nor the probability that the null hypothesis is true.

Test of Significance and p-value

p-value
The smaller the p-value, the stronger the evidence
against the null hypothesis, that is, the less plausible
it is that the relation observed in the sample arose by
chance alone when no such relation exists in the population.

Test of Significance and p-value

A p-value of .05 (i.e., 1/20) means that, if there were no
relation between the variables in the population, a relation
at least as strong as the one found in our sample would arise
by chance alone 5% of the time.
In other words, assuming that in the population there
was no relation between those variables whatsoever,
and we were repeating experiments like ours one after
another, we could expect that in approximately one of every
20 replications of the experiment the relation between the
variables in question would be equal to or stronger than
in ours.
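
The "one in every 20 replications" idea can be checked with a small simulation. This is a sketch under stated assumptions: both groups are drawn from the same normal distribution (so the null hypothesis is true by construction), the standard deviation is treated as known, and the number of experiments, group size and seed are arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def null_rejection_rate(n_experiments=2000, n_per_group=30, alpha=0.05, seed=1):
    """Fraction of experiments with p < alpha when there is no true effect."""
    rng = random.Random(seed)
    se = sqrt(2 / n_per_group)  # known sd of 1 in both groups
    rejections = 0
    for _ in range(n_experiments):
        # Treatment and control come from the same population.
        trt = [rng.gauss(0, 1) for _ in range(n_per_group)]
        ctl = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(trt) - mean(ctl)) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / n_experiments


rate = null_rejection_rate()
print(f"Share of null experiments with p < 0.05: {rate:.3f}")
```

The printed share should land close to 0.05: with no real effect, about one experiment in 20 still crosses the significance threshold by chance.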

Sample versus population

Estimation
We use results from our sample to make
inference about the population
How reliable are the sample data at
representing the population data?
Is the sample mean a good estimate of the
population mean?

Confidence Intervals
The results of the analysis are estimates of
the truth in the population.
The average reduction in pain score is an
estimate based on the sample in the study.
Confidence Intervals indicate the precision
of the estimate. The wider the confidence
interval, the less precise the estimate

Confidence Intervals
Example:
Average reduction in pain score from baseline to month
6 was 9.7 (95% Confidence Interval: 8.3 to 11.1)
This does not mean that there is a 95% probability that
the true result lies between 8.3 and 11.1; rather, if we
were to repeat the study many times with the same
sample size and characteristics and calculate a 95%
confidence interval each time, about 95% of those
intervals would contain the true mean reduction in
pain score
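
The repeated-study idea can be illustrated numerically. The sketch below first computes a 95% confidence interval using the normal approximation, then simulates many studies to see how often the interval captures the true mean. The standard deviation (5.0) and sample size (49) are hypothetical values chosen only because they reproduce the interval quoted in the example; the slides do not give them.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def confidence_interval(sample_mean, sample_sd, n):
    """95% confidence interval for a mean (normal approximation)."""
    half_width = Z95 * sample_sd / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


low, high = confidence_interval(9.7, 5.0, 49)
print(f"95% CI: {low:.1f} to {high:.1f}")


def coverage(true_mean=9.7, true_sd=5.0, n=49, n_studies=2000, seed=7):
    """Share of simulated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lo, hi = confidence_interval(mean(sample), stdev(sample), n)
        hits += lo <= true_mean <= hi
    return hits / n_studies


print(f"Coverage over repeated studies: {coverage():.3f}")
```

Each simulated study produces a different interval; the 95% refers to how often the procedure captures the true value, not to any single interval.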

What have we learnt?

Statistics doesn't have to be frightening.
Statistics is all about a way of thinking.
If you don't have uncertainty you don't need
statistics.
p-values are probability statements that tell you
something about your experiment.

What haven't we learnt?

All the detailed theory and formulae that back up
everything we have discussed
How to be a statistician (for that you do have to go
to graduate school)
How to get the perfect answer each time we run a
clinical trial:
We are working with patients not widgets and human
beings are incredibly complex

References
ICH Guidelines E9, E3 and others
Senn S. Statistical Issues in Drug Development. John Wiley & Sons, 1997
Freedman B. Equipoise and the ethics of clinical research. NEJM 1987; 317(3)
Schulz KF. Subverting randomization in controlled trials. JAMA 1995; 274

Thank You!