
Comparing Two Groups:

Between-Groups Designs
Dadang Sudana
(Source: Hatch & Farhady, 1982; Hatch & Lazaraton, 1991)
Parametric Comparison of Two Groups: t-test

• We will discuss tests involving one independent variable with two levels
and one dependent variable. That is, we will compare the performance of
two groups on some dependent variable.
• The measurement of the dependent variable will be continuous (i.e.,
interval scores or ordinal scales).
• The comparison will be of two different groups (a comparison of
independent groups, a between-groups design).
• There are several options available to us for comparing two groups.
• The choice has to do with the type of measurement and the best estimate of
central tendency for the data.
• We will begin with the t-test, a procedure that tests the difference between
two groups for normally distributed interval data (where the mean, x̄, and the
s.d. are appropriate measures of central tendency and variability of the scores).
• Then we will turn to procedures used when the median is the best measure of
central tendency or where certain assumptions of the t-test cannot be met.
• In research, we are seldom interested in the score of an individual student;
rather, we are interested in the performance of a group.
• When we have collected data from a group, found the x̄ and s.d. (and
determined that these are accurate descriptions of the data), we still want
to know whether that x̄ is exceptional in any way.
• To make this judgment, we need to compare the mean with that of some
other group.
• In Case 1 studies, we compare the group mean with that of the population
from which it was drawn.
• We want to know whether the group x̄ is different from that of the
population at large.
• In Case 2 studies we have means from two groups (perhaps an
experimental group and a control group). We want to know whether the
means of these two groups truly differ.
Sampling Distribution of Means
• Assume there are 36 Ss represented in the x̄ for your class. We need to
compare this mean with that of many, many x̄s from other groups of 36
Ss.
• Imagine that we could draw samples of 36 Ss from many, many other
German I classes and that we gave them the test.
• Once we got all these sample x̄s, we could turn them into a distribution.
The normal distribution is made up of individual scores.
• This distribution would differ in that, instead of being made up of
individual scores, it is composed of class means.
• As we gathered more and more data, we would expect a curve to evolve
that would be symmetric.
• This symmetric curve is, however, not called a normal distribution but
rather a sampling distribution of means.
• The mean of this distribution of group x̄s can be taken as the population mean because it is
drawn from a large enough number of sample groups (selected at random from the population)
that it forms a normal distribution.
• Its central point should be equal to that of the population.
• The sampling distribution of means has three basic characteristics (see the simulation sketch after this list).
1. For 30 or more samples (with 30 or more Ss per sample), it is normally distributed.
2. Its mean is equal to the mean of the population.
3. Its standard deviation, called the standard error of means, is equal to the standard deviation
of the population divided by the square root of the sample size.
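A minimal simulation sketch of these three characteristics in Python; it borrows, for illustration, the 36-Ss class size and the population mean of 65 and s.d. of 30 that appear later in this handout:

```python
import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd, n = 65, 30, 36   # population mean, population s.d., Ss per sample

# Draw many samples of n Ss each and keep each sample's mean.
# A histogram of sample_means would look normal (characteristic 1).
sample_means = [rng.normal(pop_mean, pop_sd, n).mean() for _ in range(10_000)]

print(np.mean(sample_means))   # characteristic 2: close to the population mean, 65
print(np.std(sample_means))    # characteristic 3: close to pop_sd / sqrt(n) = 30 / 6 = 5
```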
• While you already have the published population mean for the German
test, you do not know the standard error of means for
the test.
• You don't have time to find 30 other classes with 36 Ss each. There is a
much easier way. We will estimate it, and that estimate will be quite
precise.
• When we carry out research, we gather data on a sample. The information
that we present is called a statistic.
• We use the sample statistic to estimate the same information in the
population.
• While a statistic is used to talk about the sample, you may see the term
parameter used to talk about the population. (Again, this is a lexical
shift: the statistician's parameter and the linguist's parameter have little
to do with each other.)
• Perhaps the following diagram will make the picture clear:
Sample (statistic) → estimates → Population (parameter)
• Let's see how this works for estimating the standard error of means, the
parameter (written with a small Greek sigma):

σx̄ = σ / √N

• Since we will use our sample data to estimate the parameter, we use our
sample statistics for the formula:

sx̄ = s / √N

• The formula says that we can find the standard deviation of the means of all
the groups (now called the standard error of means) by dividing the standard
deviation of our sample group by the square root of the sample size.
• Sample statistics are the best available information for estimating the
population parameters. The correspondence between population parameter
symbols and sample statistic symbols is:

Mean: μ (parameter) ↔ x̄ (statistic)
Standard deviation: σ (parameter) ↔ s (statistic)
Standard error of means: σx̄ (parameter) ↔ sx̄ (statistic)
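Where raw scores are available, the estimate can be computed directly from the sample. A minimal sketch in Python (the scores are made up for illustration; scipy's stats.sem performs the same division):

```python
import numpy as np
from scipy import stats

# Hypothetical raw scores for one class (made up for illustration).
scores = np.array([72, 65, 80, 58, 90, 77, 63, 85, 70, 68])

s = scores.std(ddof=1)              # sample standard deviation
print(s / np.sqrt(len(scores)))     # standard error of means, by hand
print(stats.sem(scores))            # same value, via scipy
```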
Case 1 Comparisons
• A Case l study compares a sample mean with an established population
mean.
• The H₀ for a Case 1 t-test would be: There is no effect of group on the
dependent variable. That is, the test scores result in no difference between
the sample mean and the mean of the population.
• To discover whether the null hypothesis is, in fact, true, we follow a
familiar procedure. First we ask how far our sample x̄ is from μ. Then
we check to see how many "ruler lengths" that difference is from the
mean.
• The ruler, this time, is the standard error of means rather than the
standard deviation.
• The formula for our observed t value is:

t = (x̄ − μ) / sx̄
• If the x̄ for our class was 80 and the published μ for the test was 65, you
can fill in the values for the top part of the formula: x̄ − μ = 80 − 65 = 15.
• To find the value for sx̄, refer back to our formula for the standard error of
means. The standard deviation for the scores of your German students was
30, so sx̄ = 30 / √36 = 5, and t = 15 / 5 = 3.0.
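A minimal sketch of the same calculation in Python, using the figures from this example:

```python
import math

x_bar, mu = 80, 65      # class mean and published population mean
s, n = 30, 36           # sample standard deviation and sample size

sem = s / math.sqrt(n)             # standard error of means: 30 / 6 = 5.0
t_observed = (x_bar - mu) / sem    # (80 - 65) / 5 = 3.0
print(t_observed)
```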
• That's the end of our calculations, but what does this t value tell us (or the chair of your
German department)?
• Visualize yourself as one of many, many teachers who have given the German vocabulary test
to groups of exactly 36 students.
• All of the means from these groups have been gathered and they form a sampling distribution
of means.
• The task was to find exactly how well the x̄ of your class fits in this distribution. Can you say
that they are really spectacularly better, i.e., your class falls so far out under the right tail of
the distribution that they "don't belong," "are not typical"?
• Can you reject the null hypothesis?
• Before we can answer, there is one more concept we need to present--the
concept of degrees of freedom (df).
• You already know that the t value is influenced by the sample size.
Sample size relates to degrees of freedom.
• You'll remember in several of our formulas, we averaged not by dividing
by N but by N − 1.
• This, too, is related to degrees of freedom.
• We can now find the df for our study and check the probability to
determine whether we can or cannot reject the null hypothesis.
• To obtain the df we use N − 1. The N of our sample was 36, so df = 35.
• To find the probability, we will consult the appropriate distribution in
Table 2, Appendix C. Now we can find the answer to our question.
• In the table, notice that the probability levels are given across the top of the table. In the first
column are the degrees of freedom.
• If the df were 1, you would look at the values in the first row to determine whether you could or
could not reject the null hypothesis.
• For example, if you chose an .05 level for rejecting the null hypothesis, you would look across
the first row to the second column.
• The number 12.706 is the t critical value that you need to meet or exceed in order to reject the
null hypothesis.
• If you chose an .01 (rejection point), you would need a t value of 63.657 or better.
• If your study had 2 degrees of freedom, you would need a t value of 4.303 or better to reject the
null hypothesis at the .05 level.
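Rather than reading a printed table, the same critical values can be pulled from the t-distribution directly. A minimal sketch in Python with scipy, using the df values from the examples above:

```python
from scipy import stats

# Two-tailed critical values: half the rejection area goes in each tail.
for df in (1, 2):
    t_05 = stats.t.ppf(1 - 0.05 / 2, df)   # .05 level
    t_01 = stats.t.ppf(1 - 0.01 / 2, df)   # .01 level
    print(df, round(t_05, 3), round(t_01, 3))
# df=1: 12.706 and 63.657; df=2: 4.303 and 9.925
# (the first three values match the table discussion above)
```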
• Let's see how this works with our study. We have 35 degrees of freedom.
Look down the df column for the appropriate number of degrees of freedom.
• Unfortunately, there is no 35 in this particular table. If there is no number
for the df of your study, move to the next lower value on the chart (as the
more conservative estimate).
• The closest df row to 35 df in this table is 30. Assuming you chose an .05 level,
look at the critical value of t given in the column labeled .05.
• The intersection of df row and p column shows the critical value needed to
reject the null hypothesis.
• The probability levels given in this table are for two-tailed, nondirectional
hypotheses.
• If your particular study has led you to state a directional, one-tailed hypothesis,
you need to "double" these values.
• That is, a one-tailed hypothesis puts the rejection under only one tail (rather than
splitting it between the two tails).
• So, to find a one-tailed critical value for an .05 level, use the .10 column. To set
an .01 rejection level, use the column labeled .02.
• Thus, for 1 df and a one-tailed .05 rejection area, you need a t critical value of
6.314 or better.
• For an .01 level, the one-tailed critical value of t would be 31.821.
• If your study had 16 df and you wished to reject a one-tailed hypothesis at
the .05 level, you would need a t value of 1.746 or greater. If your study
had 29 df, a t critical of 2.462 would be needed for rejection at the .01
level.
• For the more usual two-tailed hypothesis at the .01 level, the values
needed would be 2.921 for 16 df and 2.756 for 29 df.
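A minimal sketch of the one-tailed "column doubling" rule in Python with scipy: the one-tailed .05 critical value is the same number the two-tailed table lists under .10 (the 16 df case matches the example above):

```python
from scipy import stats

df = 16
one_tailed_05 = stats.t.ppf(1 - 0.05, df)      # all of .05 under one tail
two_tailed_10 = stats.t.ppf(1 - 0.10 / 2, df)  # .10 split between two tails
print(round(one_tailed_05, 3))                 # 1.746, as in the text
print(round(two_tailed_10, 3))                 # the same value: 1.746
```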
• Returning to our 35 df study (using the more conservative 30 df row), the
table shows that you need to obtain or exceed a t value of 2.042
to reject the null hypothesis.
• For an .01 level, you would need to meet or exceed a t value of 2.750.
These cut-off values are called the critical value of t or t critical.
• When the observed value of t meets or exceeds the critical value for the
level selected (.05 or .01), you can reject the null hypothesis. (The critical
value works for either positive or negative t values.)
• We can reject the null hypothesis in this case because our observed t value
of 3.0 exceeds the critical value (t critical) of 2.042 needed for a
probability level of .05.
• In the research report, this information would be given as the simple
statement:
• t = 3.0, df = 35, p < .05. We can reject the H₀ and conclude that, indeed,
our German class excelled!
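Finally, a minimal sketch of this decision rule in Python with scipy, taking the observed t and df from above. Note that software gives the exact df = 35 critical value (about 2.030) rather than the more conservative 30-row table value of 2.042; the decision is the same either way:

```python
from scipy import stats

t_observed, df = 3.0, 35                      # from the calculation above

t_critical = stats.t.ppf(1 - 0.05 / 2, df)    # two-tailed .05 critical value
p_value = 2 * stats.t.sf(t_observed, df)      # two-tailed p for our observed t
print(round(t_critical, 3), round(p_value, 4))   # about 2.030 and 0.0049
if t_observed >= t_critical:
    print("Reject H0: t = 3.0, df = 35, p < .05")
```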
