GraphPad Ebook - Essential Dos and Don'ts
STATISTICS 101
CHAPTER 1
The Importance of Statistics in Science
CHAPTER 2
Don't Start Without a Plan
CHAPTER 3
Don't Be a P-Hacker
CHAPTER 4
Don't Add Subjects Until You Hit Significance
CHAPTER 5
Statistical Analysis with GraphPad Prism
Don't Be a P-Hacker

Important concepts: Hypotheses and P values

Most statistical tests work by generating not one, but two hypotheses: the Null Hypothesis and the Alternative Hypothesis. Before you perform an experiment and record your observations, you should understand these terms:

∙ Alternative Hypothesis (Ha): the hypothesis that the observations are due to some real effect
∙ Null Hypothesis (H0): the hypothesis that the observations are due to random chance
∙ Significance level (α): the probability of rejecting H0 when it is actually true

What is a P value?

Generally, the reason you perform an experiment is because you're interested in Ha (for example, that …

Three Types of P-Hacking

1. Changing the values analyzed.
The first kind of P-hacking involves changing the actual values analyzed. Examples include ad hoc sample size selection, switching to an alternate control group (if you don't like the first results and your experiment involved two or more control groups), trying various combinations of independent variables to include in a multiple regression (whether the selection is manual or automatic), trying analyses with and without outliers, and analyzing various subgroups of the data.

2. Reanalyzing a single data set with different statistical tests.
Examples: trying a parametric and then a nonparametric test; analyzing the raw data, then analyzing the logarithms of the data.
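This second kind of P-hacking can be illustrated with a short simulation. The sketch below (Python with scipy; the group size, trial count, and lognormal distribution are arbitrary choices for illustration, not from the text) draws both groups from the same skewed distribution, so every "significant" result is a false positive, and compares the false-positive rate of one pre-chosen test against the rate when the best of three tests is reported:

```python
import numpy as np
from scipy import stats

def any_test_significant(rng, n=20, trials=2000, alpha=0.05):
    """Under a true null, compare the false-positive rate of one
    pre-chosen test with the rate when the best of three tests is kept."""
    hits_single = 0  # unpaired t test on the raw data only
    hits_any = 0     # t test OR Mann-Whitney OR t test on the logarithms
    for _ in range(trials):
        # Both groups come from the SAME skewed distribution, so any
        # "significant" result is a false positive.
        a = rng.lognormal(3.0, 0.5, n)
        b = rng.lognormal(3.0, 0.5, n)
        p = [
            stats.ttest_ind(a, b).pvalue,                  # parametric, raw
            stats.mannwhitneyu(a, b).pvalue,               # nonparametric
            stats.ttest_ind(np.log(a), np.log(b)).pvalue,  # parametric, logs
        ]
        hits_single += p[0] < alpha
        hits_any += min(p) < alpha
    return hits_single / trials, hits_any / trials

single, best_of_three = any_test_significant(np.random.default_rng(0))
print(f"one pre-chosen test : {single:.1%} false positives")
print(f"best of three tests : {best_of_three:.1%} false positives")
```

Reporting whichever of the three P values happens to be smallest can only raise the false-positive rate above the nominal 5%, which is exactly why the test should be chosen before the data are seen.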
The P-hacking loop, shown as a flowchart: Begin → Analyze data → P < 0.05?
∙ If YES: stop, report results.
∙ If NO: do one or more of the following, then reanalyze:
  ∙ Include more variables in the model
  ∙ Adjust data (e.g. divide by body weight)
  ∙ Transform the data (i.e. logarithms)
  ∙ Remove suspicious outliers
  ∙ Pick a different control group
  ∙ Compare a different outcome variable
  ∙ Use a different statistical test
Adding subjects until you hit significance may be tempting, but it is also misleading.

Here's a common scenario. Rather than choosing a sample size before beginning a study, you simply repeat the statistical analyses as you collect more data, and then:

∙ If the result is not statistically significant, collect some more data, and reanalyze
∙ If the result is statistically significant, stop the study

The problem with this approach is that you'll keep going if you don't like the result, but stop if you do like the result. The consequence is that the chance of a Type I error is much higher than 5%.

The graph below illustrates this point via simulation. We began by simulating two groups of data by drawing values from a Gaussian distribution (mean=40, SD=15, but these values are arbitrary). Both groups were simulated using exactly the same distribution, and so have the exact same true mean value. We picked N=5 in each group, computed an unpaired t test (comparing the means of the two groups), and recorded the P value. Then we added one subject to each group (so N=6) and recomputed the t test and P value. We repeated this until N=100 in each group. Then we repeated the entire simulation three times. Because these simulations were done comparing two groups with identical population means, any "statistically significant" result we obtain must be a coincidence -- a Type I error.

The graph plots P value on the Y axis vs. sample size (per group) on the X axis. The green shaded area at the bottom of the graph shows P values less than 0.05, so deemed "statistically significant".
Experiment 1 (purple) reached a P value of less than 0.05 when N=7, but the P value is higher than 0.05 for all other sample sizes. Experiment 2 (blue) reached a P value less than 0.05 when N=61 and also at N=88 and 89. The Experiment 3 (orange) curve hit a P value less than 0.05 when N=92 and remained lower than this value until N=100.

If we followed the sequential approach, we would have declared the results in all three experiments to be "statistically significant". We would have stopped when N=7 in the first (purple) experiment, so would never have seen the dotted parts of its curve. We would have stopped the second (blue) experiment when N=61, and the third (orange) experiment when N=92.

Since these simulations were created for values where the true mean in both groups was identical, any declaration of "statistical significance" is a Type I error. If the null hypothesis is true (the two population means are identical), we expect to see this kind of Type I error in 5% of experiments (if we use the traditional definition of α=0.05, so P values less than 0.05 are declared to be significant).
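The simulation described above can be sketched in a few lines of Python (a minimal sketch using scipy's unpaired t test; the random seed is arbitrary and the plotting is omitted):

```python
import numpy as np
from scipy import stats

def sequential_p_values(rng, mean=40.0, sd=15.0, n_max=100):
    """Grow two groups drawn from the SAME Gaussian (mean=40, SD=15)
    from N=5 to N=100, recording the unpaired t test P value at each N."""
    a = list(rng.normal(mean, sd, 5))
    b = list(rng.normal(mean, sd, 5))
    p_values = {}
    for n in range(5, n_max + 1):
        p_values[n] = stats.ttest_ind(a, b).pvalue
        a.append(rng.normal(mean, sd))  # add one subject to each group
        b.append(rng.normal(mean, sd))
    return p_values

rng = np.random.default_rng(0)  # arbitrary seed
for experiment in range(1, 4):  # repeat the whole simulation three times
    p = sequential_p_values(rng)
    # The sequential (mis)approach: stop at the first N where P < 0.05.
    hits = [n for n, pv in p.items() if pv < 0.05]
    if hits:
        print(f"Experiment {experiment}: would have stopped at N={hits[0]}")
    else:
        print(f"Experiment {experiment}: never reached P < 0.05")
```

Because both groups share one population mean, any run that "stops" here has produced a Type I error, just as in the three experiments plotted in the graph.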
DO
It is important that you choose a sample size and stick with it. You’ll fool yourself if you
stop when you like the results, but keep going when you don’t. The alternative is using
specialized sequential or adaptive methods that take into account the fact that you
analyze the data as you go. To learn more about these techniques, research ‘sequential’
or ‘adaptive’ methods in advanced statistics books.
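To see why a fixed sample size matters, the cost of repeated peeking can be quantified by simulation (a sketch assuming interim looks every 10 subjects up to N=100; these look points are an arbitrary choice, not a prescription from the text):

```python
import numpy as np
from scipy import stats

def type_i_rates(rng, looks=range(10, 101, 10), trials=1000):
    """Fraction of null experiments that EVER cross P < 0.05 at any interim
    look, vs. the fraction that cross when tested once at the final N."""
    ever_crossed = 0
    final_only = 0
    n_max = max(looks)
    for _ in range(trials):
        # Same population for both groups, so the null hypothesis is true.
        a = rng.normal(40, 15, n_max)
        b = rng.normal(40, 15, n_max)
        ps = [stats.ttest_ind(a[:n], b[:n]).pvalue for n in looks]
        final_only += ps[-1] < 0.05
        ever_crossed += min(ps) < 0.05
    return final_only / trials, ever_crossed / trials

once, peeking = type_i_rates(np.random.default_rng(0))
print(f"test once at the final N  : {once:.1%} Type I errors")
print(f"peek at every interim look: {peeking:.1%} Type I errors")
```

A single pre-planned test holds the Type I error near the nominal 5%, while stopping at the first look that crosses 0.05 inflates it well above that; formal sequential and adaptive designs exist precisely to spend the 5% error budget across the interim looks.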
www.graphpad.com