0 - Research Methods 1
Sample
Population
• If we spent enough time, we could figure out the proportion of people in the whole crowd who were wearing red shirts.
• That would again be a descriptive statistic, because it describes the people in the entire crowd.
• Our goal with inferential statistics is to be able to draw valid conclusions about a large or even infinite population even though we’ve only measured from a small subset of that population (a sample of the population).
• However, these conclusions are only probabilistic.
© S. J. Luck
All rights reserved
Memory experiment with a sample of 8 subjects who are randomly selected from the 200 students in a cognitive psychology class.
Our goal is to estimate the average memory ability of the whole class, but we don’t have time to do this with all 200, so we just take a sample of 8 students from the class.
They come to the lab, view the 100 words, and then come back a week later for the memory test.

Subject        Percent Correct on Test
S1             68
S2             51
S3             60
S4             70
S5             58
S6             67
S7             60
S8             64
Sample Mean    62.3

Goal: estimate the population mean
The sample mean is an estimate of the population mean; how good an estimate it is depends on:
- Number of subjects (N)
- Variability among subjects (standard deviation)
Sample Size:
• The letter N denotes the number of subjects in a sample.
• In our imaginary experiment, N is 8.
• All else being equal, the sample mean becomes a better and better estimate of the population mean as N increases.
• It can be time-consuming and expensive to test a large sample, so most experiments in cognitive psychology use sample sizes of between 8 and 50 subjects.

Variability:
Percent Correct on Test: S1 68, S2 51, S3 60, S4 70, S5 58, S6 67, S7 60, S8 64 (Sample Mean 62.3, N 8)
• In this sample, all subjects got between 51 and 70 percent correct on the memory test.
• Given that the sample mean was 62.3% correct and all 8 subjects scored between 51 and 70 percent correct, is it very likely that the actual population mean is 90% correct?
• In other words, if we took the time to test all 200 subjects, is it very likely that the mean of all 200 students would be 90% correct?
• If the average across the whole class is 90% correct, it’s pretty unlikely that, just by chance, we would select 8 people who all scored 70% or less.
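The claim that the sample mean becomes a better estimate as N increases can be illustrated with a small simulation. This is only a sketch: the class of 200 scores is made up (drawn around 62% with a spread of about 6 points), and the function name `typical_error` is hypothetical, not something from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical class of 200 students; these scores are invented for illustration.
population = [min(100.0, max(0.0, random.gauss(62, 6))) for _ in range(200)]
population_mean = statistics.mean(population)

def typical_error(n, repeats=2000):
    """Average absolute gap between a sample mean and the population mean."""
    gaps = []
    for _ in range(repeats):
        sample = random.sample(population, n)  # random selection without replacement
        gaps.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(gaps)

error_n8 = typical_error(8)
error_n50 = typical_error(50)
print(f"population mean: {population_mean:.1f}")
print(f"typical error with N = 8:  {error_n8:.2f}")
print(f"typical error with N = 50: {error_n50:.2f}")
```

With these made-up scores, samples of 50 should land noticeably closer to the class mean than samples of 8, which is the pattern the Sample Size slide describes.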
Subject        Percent Correct    Difference from    Absolute Value
               on Test            Sample Mean        of Difference
S1             68                 5.8                5.8
S2             51                 -11.3              11.3
S3             60                 -2.3               2.3
S4             70                 7.8                7.8
S5             58                 -4.3               4.3
S6             67                 4.8                4.8
S7             60                 -2.3               2.3
S8             64                 1.8                1.8
Sample Mean    62.3
N              8

• The standard way to quantify variability is to look at how different each individual is from the average across the individuals.
• In our example experiment, we can do this by computing the difference between each subject’s test score and the sample mean.

• For example, the score for subject 1, 68, is 5.8 higher than the sample mean of 62.3.
• The score for subject 2, 51, is 11.3 lower than the sample mean.
• The average of all these differences is zero, because some are positive and some are negative, so we can’t just average together these differences to get a measure of the variability across subjects.
Average Deviation = 5.0
• Instead, we could just take the absolute value of each difference, making the negative numbers into positive numbers.
• We could then average together these difference scores.
• This would give us what’s called the average deviation, which in this case is 5.
• This means that, on average, the single-subject scores are 5 away from the sample mean.

Std Dev = 6.3
• Instead of the average deviation, statisticians use something called the standard deviation.
• It’s like the average deviation, but computed slightly differently (don’t worry about the actual formula).
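The average deviation and standard deviation described above can be checked directly from the eight scores. This is a minimal sketch; it assumes that the "Std Dev" on the slides is the usual sample standard deviation (dividing by N - 1), which the slides do not state explicitly.

```python
import statistics

scores = [68, 51, 60, 70, 58, 67, 60, 64]  # percent correct for S1..S8

n = len(scores)
sample_mean = sum(scores) / n                     # 62.25, shown as 62.3 on the slides
differences = [x - sample_mean for x in scores]   # signed differences from the mean
average_deviation = sum(abs(d) for d in differences) / n

# Assumption: "Std Dev" means the sample standard deviation, which squares
# each difference, divides the total by N - 1, and takes the square root.
std_dev = statistics.stdev(scores)

print(f"sample mean        = {sample_mean:.2f}")
print(f"sum of differences = {sum(differences):.2f}")
print(f"average deviation  = {average_deviation:.1f}")
print(f"standard deviation = {std_dev:.1f}")
```

The signed differences cancel to zero, while the absolute differences average to 5.0 and the standard deviation rounds to 6.3, matching the values in the table.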
Sample Mean 62.3, N 8, Average Deviation = 5.0, Std Dev 6.3
• The smaller the standard deviation, the more accurately we can use the sample mean as an estimate of the population mean.
• To have a really good estimate of the population mean, we need both a small standard deviation and a large N.

Estimating the Population Mean
• In our imaginary memory experiment, we might want to know if subjects remembered anything at all a week after they learned the word list.
• If they didn’t remember any of the words, and they had just guessed for each word pair, they should have gotten a score of about 50% correct.
• Some people might be lucky guessers and get above 50%, and others might be unlucky and get less than 50%.
• But we’d expect that the population mean would be 50% if none of the subjects remembered any of the words from the original list, and they were just guessing.
Sample Mean 62.3, N 8, Std Dev 6.3, Chance 50
• In other words, we want to know if the population mean is greater than would be expected by chance (50%).
• Our sample mean was 62.3% correct, which is quite a bit higher than 50%, but it seems possible that we might get 8 subjects with an average score of 62.3%, even if they were guessing.

• But it seems possible that, if the average across all 200 students was 50%, just by chance we might get 8 subjects with an average score of 62.3%, even if they were guessing.
• And if we randomly chose 8 different people, we might get an average of something like 41% for those 8.
• So we need a way
One-Sample t Test

PSC100Y
Introduction to Cognitive Psychology
Research Methods 1-3
One Sample t-Test Overview

• Fish or Soap?
• Rug or Candy?
• Cake or Guitar?
• Window or Saw?
• Toothbrush or Finger?
• Car or Water?
• Printer or Salt?
• Movie or Grass?

One-Sample t Test
Null Hypothesis (H0): Population mean = 50%
Alternative Hypothesis (H1): Population mean ≠ 50%
Use sample data to compute t value and p value
Reject null hypothesis (accept alternative) if p < .05

Subject        Percent Correct    Difference from    Absolute Value
               on Test            Sample Mean        of Difference
S1             68                 5.8                5.8
S2             51                 -11.3              11.3
S3             60                 -2.3               2.3
S4             70                 7.8                7.8
S5             58                 -4.3               4.3
S6             67                 4.8                4.8
S7             60                 -2.3               2.3
S8             64                 1.8                1.8
Sample Mean    62.3
N              8
Std Dev        6.3
Chance         50
t              5.54
p              0.001
One-Sample t Test
Null Hypothesis (H0): Population mean = 50%
Alternative Hypothesis (H1): Population mean ≠ 50%
Use sample data to compute t value and p value
Reject null hypothesis (accept alternative) if p < .05

                                Population mean = 50%    Population mean ≠ 50%
                                (H0 is true)             (H1 is true)
Sample mean is not              TRUTH                    Type II error
significantly different         (True negative)          (False negative)
from 50%
Sample mean is                  Type I error             TRUTH
significantly different         (False positive)         (True positive)
from 50%

True Negative
• True Negative: When the null hypothesis is true and we accepted it.
• When the null hypothesis is true, we will get a true negative in 95% of experiments.

False Positive
• In 5% of experiments in which the null hypothesis is true, just by chance we may get a sample with unusually high or unusually low scores, leading to a large t value and a p value that’s < .05.
True Positive
• True positive: The null hypothesis is false (the alternative hypothesis is true), and we conclude that the null hypothesis is false.
• In other words, the population mean was greater than 50%, and we concluded that it was greater than 50% because our p value was less than .05.

False Negative
• False negative: Even though the null hypothesis is false, the p value is greater than .05, and you have to accept the null hypothesis.
• In other words, you’ve accepted the hypothesis that subjects are at chance, but this is an incorrect conclusion.
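The four outcomes can be illustrated by simulating many imaginary experiments. This is only a sketch under stated assumptions: scores are drawn from a normal population with a standard deviation of 6, the "H1 is true" case uses a hypothetical population mean of 55%, and 2.365 is the standard two-tailed cutoff for p < .05 with 7 degrees of freedom. None of these numbers come from the slides.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

CHANCE = 50.0
N = 8
CRITICAL_T = 2.365  # two-tailed cutoff for p < .05 with N - 1 = 7 degrees of freedom

def is_significant(population_mean, population_sd=6.0):
    """Run one imaginary experiment and report whether it reaches p < .05."""
    sample = [random.gauss(population_mean, population_sd) for _ in range(N)]
    mean = sum(sample) / N
    variance = sum((x - mean) ** 2 for x in sample) / (N - 1)
    standard_error = math.sqrt(variance / N)
    t = (mean - CHANCE) / standard_error
    return abs(t) > CRITICAL_T

def significant_rate(population_mean, experiments=20000):
    """Fraction of simulated experiments that come out significant."""
    hits = sum(is_significant(population_mean) for _ in range(experiments))
    return hits / experiments

false_positive_rate = significant_rate(50.0)   # H0 is true: subjects are guessing
true_positive_rate = significant_rate(55.0)    # H1 is true (hypothetical effect)
false_negative_rate = 1 - true_positive_rate

print(f"false positives when H0 is true: {false_positive_rate:.3f}")
print(f"false negatives when H1 is true: {false_negative_rate:.3f}")
```

The false positive rate should come out close to 5%, because the cutoff fixes it. The false negative rate depends on the assumed effect size, variability, and N, which is why it is usually unknown in a real experiment.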
p > .05    p < .05
• So we use this p < .05 criterion, which means that we have false positives in only 5% of the experiments in which the null hypothesis is true.
• But the flip side is that we get a fairly large number of false negatives.

PSC100Y
Introduction to Cognitive Psychology
Research Methods 1-4
One Sample t-Test Equation

One-Sample t Test
One-Sample t Test
• More variability among the individual test scores means that we’re less certain about what the population mean is.
• This means that we’re less certain that the population mean differs from chance.
• So, more variability means a smaller t.

One-Sample t Test
• To see how this works, let’s look at a different set of data values with the same mean but greater variability.

Subject        Percent Correct on Test    Percent Correct on Test
               (original data)            (more variable data)
S1             68                         78
S2             51                         41
S3             60                         40
S4             70                         75
S5             58                         48
S6             67                         77
S7             60                         55
S8             64                         84
Sample Mean    62.3                       62.3
N              8                          8
Std Dev        6.3                        18.1
Chance         50                         50
t              5.54                       1.91
p              0.001                      0.098
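The t and p values for the two data sets can be recomputed as a check. This is a hedged sketch, not the slides' own procedure: it assumes the usual one-sample formula, t = (sample mean - chance) / (standard deviation / sqrt(N)), and a two-tailed p value with N - 1 degrees of freedom. The helper names `t_density`, `two_tailed_p`, and `one_sample_t` are hypothetical, and the p value is obtained by numerically integrating the t distribution so that only the standard library is needed.

```python
import math
import statistics

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coefficient = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value, integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area = total * h / 3  # probability of a t value between 0 and |t|
    return 1 - 2 * area

def one_sample_t(scores, chance):
    """Return (t, p) for a test of the sample mean against a fixed chance level."""
    n = len(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    t = (statistics.mean(scores) - chance) / standard_error
    return t, two_tailed_p(t, n - 1)

original = [68, 51, 60, 70, 58, 67, 60, 64]
more_variable = [78, 41, 40, 75, 48, 77, 55, 84]

t_original, p_original = one_sample_t(original, 50)
t_variable, p_variable = one_sample_t(more_variable, 50)
print(f"original data:      t = {t_original:.2f}, p = {p_original:.4f}")
print(f"more variable data: t = {t_variable:.2f}, p = {p_variable:.4f}")
```

Both data sets have the same mean and the same N, so the numerator is identical. Only the standard deviation in the denominator changes, which is what shrinks t and pushes p above .05 for the more variable data.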
One Sample t Test - Summary
• The denominator of the t equation is the standard deviation divided by the square root of the number of subjects.
• If there is more variability, the standard deviation is bigger, and the t value is smaller.
• So, more variability means less confidence, a smaller t value, and a larger p value.
• The standard deviation gets divided by the square root of N, so a larger N gives you a smaller denominator and a larger t value.
• This makes sense, because your sample mean is a better estimate of the population mean when you have more subjects.

PSC100Y
Introduction to Cognitive Psychology
Research Methods 1-5
Two Sample t-Test Overview
Comparing 2 Conditions
Condition 1: Rote Memorization
Condition 2: Elaborative Encoding

Rote Memorization Group    Elaborative Encoding Group
• Fish                     • Snow
• Candy                    • Home
• Guitar                   • Sky
• Window                   • Heart
• Finger                   • Radio
• Car                      • Watch
• Printer                  • Book
• Grass                    • Towel

• In this imaginary experiment, we’re going to randomly sample 8 subjects from our class of 200 students and assign them to the rote memorization group.
• And we’re going to randomly sample 8 different subjects for the elaborative encoding group.
Summary of Results
• After a week, the elaborative encoding group scored an average of 75.4% correct on the memory test.
• The sample mean for the rote memorization group is 63.4% correct.
• The p value was < .05, so this is a statistically significant difference between the two groups.
• This would allow us to conclude that the population mean for the elaborative encoding condition is truly greater than the population mean for the rote memorization condition.
• When the two population means are actually equal, a t test with the p < .05 criterion will wrongly indicate a difference only 5% of the time.
• In other words, if the population means were equal, a difference this large would occur by chance less than 5% of the time.

Subject        Percent Correct on Test    Subject        Percent Correct on Test
               Elaborative Encoding                      Rote Memorization
E1             66                         R1             77
E2             72                         R2             73
E3             95                         R3             75
E4             66                         R4             70
E5             85                         R5             70
E6             83                         R6             76
E7             75                         R7             67
E8             61                         R8             63
Sample Mean    75.4                       Sample Mean    71.4
N              8                          N              8
Std Dev        11.5                       Std Dev        4.8
t = 0.91
p = 0.380

• Imagine that we had gotten the results shown in the table above.
• The sample mean is now only a little higher for the elaborative encoding group than for the rote memorization group.
• As a result, the t value is a little smaller, and the p value is no longer < .05.
• We could no longer conclude that elaborative encoding leads to better memory than rote memorization.

• The population mean might be quite a bit higher for the elaborative encoding condition than for the rote memorization condition, but due to random chance, our samples might not have done a good job of reflecting the population means.
• In other words, this might be a false negative.
• We know that the probability of a false positive is only 5%, but we don’t usually know the probability of a false negative.
• False negatives are pretty common, so we just can’t draw a strong conclusion from the lack of a significant effect.
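The non-significant comparison (sample means of 75.4 and 71.4) can be recomputed from the sixteen scores. This is a sketch under an assumption: the slides do not say which form of the two-sample test was used, so the code uses the common pooled-variance t test for two independent samples with N1 + N2 - 2 degrees of freedom. The helper names are hypothetical, and the p value is found by numerical integration to stay within the standard library.

```python
import math
import statistics

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coefficient = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value, integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 1 - 2 * (total * h / 3)

def independent_t(group1, group2):
    """Pooled-variance t test for two independent samples (assumed form)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * statistics.variance(group1)
                       + (n2 - 1) * statistics.variance(group2)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, df, two_tailed_p(t, df)

elaborative = [66, 72, 95, 66, 85, 83, 75, 61]  # subjects E1..E8
rote = [77, 73, 75, 70, 70, 76, 67, 63]         # subjects R1..R8

t_value, df, p_value = independent_t(elaborative, rote)
print(f"t({df}) = {t_value:.2f}, p = {p_value:.3f}")
```

Under these assumptions the result agrees with the t = 0.91 and p = 0.380 shown with the table, so a 4-point difference between samples of 8 is well within what chance alone could produce.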
Dependent and Independent Variables
Dependent Variables

Dependent Variable
• The dependent variable is what you measure from the subject.
• It’s called a dependent variable because it depends on which level of the independent variable a given subject is in.
• In our memory experiment, the dependent variable is the percent correct on the memory test.

Percent Correct on Test, Elaborative Encoding (E1 to E8): 66, 72, 95, 66, 85, 83, 75, 61 (Sample Mean 75.4, N 8, Std Dev 11.5)
Percent Correct on Test, Rote Memorization (R1 to R8): 77, 73, 75, 70, 70, 76, 67, 63 (Sample Mean 71.4, N 8, Std Dev 4.8)
Rote Memorization Population    Elaborative Encoding Population

t Test for Two Independent Samples
Null Hypothesis (H0): Population mean for Condition 1 = Population mean for Condition 2
Alternative Hypothesis (H1): Population mean for Condition 1 ≠ Population mean for Condition 2
Use sample data to compute t value and p value
Reject null hypothesis (accept alternative) if p < .05

                                Population means are     Population means differ
                                equal to each other      from each other
                                (H0 is true)             (H1 is true)
Sample means are not            TRUTH                    Type II error
significantly different         (True negative)          (False negative)
from each other                                          Probability = ???
Sample means are                Type I error             TRUTH
significantly different         (False positive)         (True positive)
from each other                 Probability = 5%
Population
• The goal of experiments in cognitive psychology is to draw generalizable conclusions about people.
• These conclusions don’t necessarily apply to every individual person in the population, but we’d like to think that our experimental results apply to people on average.
• But we have limited time and resources, so we can’t test everybody. We can just test a sample of the population.
• Inferential statistics allow us to draw conclusions about the population on the basis of a sample of the population.