
What Is The Sign Test?: Step 1

The sign test is a non-parametric test that compares the medians of two groups without assuming a particular data distribution. It involves subtracting one group's values from the other and counting the number of positive and negative differences. The null hypothesis is that the median difference is zero. To perform the test, the differences are categorized as positive or negative and the number of each is compared to the expected values under the null hypothesis using a binomial distribution. A low p-value leads to rejection of the null in favor of an alternative hypothesis about the medians differing.

What is the Sign Test?

The sign test compares two related samples by looking only at the signs (positive or negative) of the paired differences. It is a non-parametric or “distribution free” test, which means the
test doesn’t assume the data comes from a particular distribution, like the normal distribution. The sign test is an
alternative to a one sample t test or a paired t test. It can also be used for ordered (ranked) categorical data.
The null hypothesis for the sign test is that the median of the differences is zero.
For a one sample sign test, where the median for a single sample is analyzed, see: One Sample Median Tests.

How to Calculate a Paired/Matched Sample Sign Test


Assumptions for the test (your data should meet these requirements before running the test) are:

1. The data should be from two samples.


2. The two dependent samples should be paired or matched. For example, depression scores from before
a medical procedure and after.
To set up the test, put your two sets of sample data into a table (I used Excel). This set of data represents test
scores at the end of Spring and the beginning of the Fall semesters. The hypothesis is that the summer break means
a significant drop in test scores.
• H0: No difference in median of the signed differences.
• H1: Median of the signed differences is less than zero.
Step 1: Subtract set 2 from set 1 and put the result in the third column.

Step 2: Add a fourth column indicating the sign of the number in column 3.
Step 3: Count the number of positives and negatives.
• 4 positives.
• 12 negatives.
12 negatives seems like a lot, but we can’t say for sure that it’s significant (i.e. that it didn’t happen by chance) until
we run the sign test.
Step 4: Add up the number of items in your sample and subtract any that had a difference of zero (in column 3).
The sample size in this question was 17, with one zero, so n = 16.

Step 5: Find the p-value using a binomial distribution table or a binomial calculator. I used the calculator,
putting in:
• .5 for the probability. The null hypothesis is that there are an equal number of signs (i.e. 50/50).
Therefore, the test is a simple binomial experiment with a .5 chance of the sign being negative and .5 of
it being positive (assuming the null hypothesis is true).
• 16 for the number of trials.
• 4 for the number of successes. “Successes” here is the smaller of the positive and negative
sign counts from Step 3.
The p-value (the probability of 4 or fewer successes in 16 trials) is 0.038, which is smaller than the alpha level of 0.05. We can reject the null hypothesis and say there
is a significant difference.
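
The calculator step can also be reproduced in a few lines of code. This is a minimal sketch, assuming Python; the helper name `sign_test_p_value` is made up for illustration, and the only inputs are the counts reported above (4 positives, 12 negatives).

```python
from math import comb


def sign_test_p_value(successes: int, trials: int) -> float:
    """One-tailed p-value: P(X <= successes) for X ~ Binomial(trials, 0.5)."""
    # With p = 0.5 every sequence of signs is equally likely, so the
    # probability is (number of favourable sequences) / 2**trials.
    favourable = sum(comb(trials, k) for k in range(successes + 1))
    return favourable / 2 ** trials


n_positive, n_negative = 4, 12          # counts from Step 3
n = n_positive + n_negative             # zero differences already dropped
p = sign_test_p_value(min(n_positive, n_negative), n)

print(f"n = {n}, p = {p:.3f}")          # prints "n = 16, p = 0.038"
```

The sum runs over 0 to 4 successes because the test asks how likely a split at least this lopsided would be if positive and negative signs were equally likely.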
Wilcoxon Signed Rank Test

Another popular nonparametric test for matched or paired data is called the Wilcoxon Signed Rank Test. Like the
Sign Test, it is based on difference scores, but in addition to analyzing the signs of the differences, it also takes into
account the magnitude of the observed differences.

Let's use the Wilcoxon Signed Rank Test to re-analyze the data in Example 4 on page 5 of this module. Recall that
this study assessed the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with
autism. A total of 8 children with autism enroll in the study and the amount of time that each child is engaged in
repetitive behavior during three hour observation periods are measured both before treatment and then again after
taking the new medication for a period of 1 week. The data are shown below.

Child   Before Treatment   After 1 Week of Treatment
1       85                 75
2       70                 50
3       40                 50
4       65                 40
5       80                 20
6       75                 65
7       55                 40
8       20                 25

First, we compute difference scores for each child.

Child   Before Treatment   After 1 Week of Treatment   Difference (Before-After)
1       85                 75                          10
2       70                 50                          20
3       40                 50                          -10
4       65                 40                          25
5       80                 20                          60
6       75                 65                          10
7       55                 40                          15
8       20                 25                          -5
The next step is to rank the difference scores. We first order the absolute values of the difference scores and assign
rank from 1 through n to the smallest through largest absolute values of the difference scores, and assign the mean
rank when there are ties in the absolute values of the difference scores.

Observed Differences   Ordered Absolute Values of Differences   Ranks
10                     -5                                       1
20                     10                                       3
-10                    -10                                      3
25                     10                                       3
60                     15                                       5
10                     20                                       6
15                     25                                       7
-5                     60                                       8

The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below.

Observed Differences   Ordered Absolute Values of Difference Scores   Ranks   Signed Ranks
10                     -5                                             1       -1
20                     10                                             3       3
-10                    -10                                            3       -3
25                     10                                             3       3
60                     15                                             5       5
10                     20                                             6       6
15                     25                                             7       7
-5                     60                                             8       8

Similar to the Sign Test, hypotheses for the Wilcoxon Signed Rank Test concern the population median of the
difference scores. The research hypothesis can be one- or two-sided. Here we consider a one-sided test.

H0: The median difference is zero versus

H1: The median difference is positive, with α=0.05


Test Statistic for the Wilcoxon Signed Rank Test

The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ (sum of the positive ranks)
and W- (sum of the negative ranks). If the null hypothesis is true, we expect to see similar numbers of lower and
higher ranks that are both positive and negative (i.e., W+ and W- would be similar). If the research hypothesis is true
we expect to see more higher and positive ranks (in this example, more children with substantial improvement in
repetitive behavior after treatment as compared to before, i.e., W+ much larger than W-).

In this example, W+ = 32 and W- = 4. Recall that the sum of the ranks (ignoring the signs) will always equal n(n+1)/2.
As a check on our assignment of ranks, we have n(n+1)/2 = 8(9)/2 = 36 which is equal to 32+4. The test statistic is
W = 4.
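
The ranking and summing steps above can be checked with a short script. This is a sketch only, assuming Python; the helper `average_ranks` is a hypothetical name, and the data are the eight before/after values from the table. It does not handle zero differences, which do not occur in this example.

```python
def average_ranks(values):
    """Rank values 1..n (smallest first); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of values tied with the one at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


before = [85, 70, 40, 65, 80, 75, 55, 20]
after = [75, 50, 50, 40, 20, 65, 40, 25]

diffs = [b - a for b, a in zip(before, after)]
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
n = len(diffs)
W = min(w_plus, w_minus)

print(w_plus, w_minus, W)               # prints "32.0 4.0 4.0"
print(w_plus + w_minus == n * (n + 1) / 2)  # rank-sum check, prints "True"
```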

Next we must determine whether the observed test statistic W supports the null or research hypothesis. This is done
following the same approach used in parametric testing. Specifically, we determine a critical value of W such that if
the observed value of W is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value
of W exceeds the critical value, we do not reject H0.

Table of Critical Values of W

The critical value of W can be found in the table below:


To determine the appropriate one-sided critical value we need sample size (n=8) and our one-sided level of
significance (α=0.05). For this example, the critical value of W is 6 and the decision rule is to reject H0 if W ≤ 6. Thus,
we reject H0, because 4 ≤ 6. We have statistically significant evidence at α=0.05 to show that the median difference
is positive (i.e., that repetitive behavior improves).

Note that when we analyzed the data previously using the Sign Test, we failed to find statistical significance.
However, when we use the Wilcoxon Signed Rank Test, we conclude that the treatment results in a statistically
significant improvement at α=0.05. The discrepant results are due to the fact that the Sign Test uses very little
information in the data and is a less powerful test.
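
To make the difference in power concrete, the sketch below computes a one-sided p-value for each test on the same eight difference scores. It is an illustration under stated assumptions, not the method used above: instead of a table of critical values, it enumerates every possible assignment of signs to the ranks (an exact calculation that is feasible here because n is only 8). The Sign Test uses the 2 negative differences out of 8; the Wilcoxon test uses the ranks 1, 3, 3, 3, 5, 6, 7, 8 and the observed W- = 4.

```python
from itertools import product
from math import comb

# Sign Test: probability of 2 or fewer negative signs in 8 fair trials.
n = 8
n_negative = 2
sign_p = sum(comb(n, k) for k in range(n_negative + 1)) / 2 ** n

# Wilcoxon Signed Rank Test: under H0 each rank is equally likely to be
# "+" or "-", so count the sign patterns whose negative rank sum is at
# most the observed W- = 4.
ranks = [1, 3, 3, 3, 5, 6, 7, 8]
observed_w_minus = 4
hits = 0
for is_negative in product((False, True), repeat=len(ranks)):
    w_minus = sum(r for r, neg in zip(ranks, is_negative) if neg)
    if w_minus <= observed_w_minus:
        hits += 1
wilcoxon_p = hits / 2 ** len(ranks)

print(f"Sign Test p = {sign_p:.3f}")        # prints "Sign Test p = 0.145"
print(f"Wilcoxon p  = {wilcoxon_p:.3f}")    # prints "Wilcoxon p  = 0.031"
```

Only the Wilcoxon p-value falls below 0.05, because the two negative differences are also the small ones, and the Sign Test ignores that information.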

Example:

A study is run to evaluate the effectiveness of an exercise program in reducing systolic blood pressure in patients
with pre-hypertension (defined as a systolic blood pressure between 120-139 mmHg or a diastolic blood pressure
between 80-89 mmHg). A total of 15 patients with pre-hypertension enroll in the study, and their systolic blood
pressures are measured. Each patient then participates in an exercise training program where they learn proper
techniques and execution of a series of exercises. Patients are instructed to do the exercise program 3 times per week
for 6 weeks. After 6 weeks, systolic blood pressures are again measured. The data are shown below.

Patient   Systolic Blood Pressure Before Exercise Program   Systolic Blood Pressure After Exercise Program
1         125                                               118
2         132                                               134
3         138                                               130
4         120                                               124
5         125                                               105
6         127                                               130
7         136                                               130
8         139                                               132
9         131                                               123
10        132                                               128
11        135                                               126
12        136                                               140
13        128                                               135
14        127                                               126
15        130                                               132
Is there a difference in systolic blood pressures after participating in the exercise program as compared to before?
• Step 1. Set up hypotheses and determine level of significance.
H0: The median difference is zero versus
H1: The median difference is not zero, with α=0.05

• Step 2. Select the appropriate test statistic.


The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ and W- which are the sums of
the positive and negative ranks, respectively.

• Step 3. Set up the decision rule.


The critical value of W can be found in the table of critical values. To determine the appropriate critical value from
Table 7 we need sample size (n=15) and our two-sided level of significance (α=0.05). The critical value for this
two-sided test with n=15 and α=0.05 is 25 and the decision rule is as follows: Reject H0 if W ≤ 25.

• Step 4. Compute the test statistic.


Because the before and after systolic blood pressure measures are paired, we compute difference scores for each
patient.
Patient   Systolic Blood Pressure Before Exercise Program   Systolic Blood Pressure After Exercise Program   Difference (Before-After)
1         125                                               118                                              7
2         132                                               134                                              -2
3         138                                               130                                              8
4         120                                               124                                              -4
5         125                                               105                                              20
6         127                                               130                                              -3
7         136                                               130                                              6
8         139                                               132                                              7
9         131                                               123                                              8
10        132                                               128                                              4
11        135                                               126                                              9
12        136                                               140                                              -4
13        128                                               135                                              -7
14        127                                               126                                              1
15        130                                               132                                              -2
The next step is to rank the ordered absolute values of the difference scores using the approach outlined in Section
10.1. Specifically, we assign ranks from 1 through n to the smallest through largest absolute values of the difference
scores, respectively, and assign the mean rank when there are ties in the absolute values of the difference scores.

Observed Differences   Ordered Absolute Values of Differences   Ranks
7                      1                                        1
-2                     -2                                       2.5
8                      -2                                       2.5
-4                     -3                                       4
20                     -4                                       6
-3                     -4                                       6
6                      4                                        6
7                      6                                        8
8                      -7                                       10
4                      7                                        10
9                      7                                        10
-4                     8                                        12.5
-7                     8                                        12.5
1                      9                                        14
-2                     20                                       15
The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below.

Observed Differences   Ordered Absolute Values of Differences   Ranks   Signed Ranks
7                      1                                        1       1
-2                     -2                                       2.5     -2.5
8                      -2                                       2.5     -2.5
-4                     -3                                       4       -4
20                     -4                                       6       -6
-3                     -4                                       6       -6
6                      4                                        6       6
7                      6                                        8       8
8                      -7                                       10      -10
4                      7                                        10      10
9                      7                                        10      10
-4                     8                                        12.5    12.5
-7                     8                                        12.5    12.5
1                      9                                        14      14
-2                     20                                       15      15

In this example, W+ = 89 and W- = 31. Recall that the sum of the ranks (ignoring the signs) will always equal
n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 15(16)/2 = 120 which is equal to 89 + 31. The
test statistic is W = 31.

• Step 5. Conclusion.

We do not reject H0 because 31 > 25. Therefore, we do not have statistically significant evidence at α=0.05, to show
that the median difference in systolic blood pressures is not zero (i.e., that there is a significant difference in systolic
blood pressures after the exercise program as compared to before).
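
Steps 4 and 5 of this example can be reproduced as follows. This is a sketch, assuming Python; `mean_rank` is a hypothetical helper that uses a counting shortcut (number of smaller values plus half of the tied count plus one half), which gives the same mean ranks as sorting. The critical value of 25 is taken from Step 3 rather than computed.

```python
before = [125, 132, 138, 120, 125, 127, 136, 139, 131, 132, 135, 136, 128, 127, 130]
after = [118, 134, 130, 124, 105, 130, 130, 132, 123, 128, 126, 140, 135, 126, 132]

diffs = [b - a for b, a in zip(before, after)]
abs_diffs = [abs(d) for d in diffs]


def mean_rank(value, values):
    """Mean rank of `value` among `values` (ties share the average rank)."""
    smaller = sum(v < value for v in values)
    tied = sum(v == value for v in values)
    return smaller + (tied + 1) / 2


ranks = [mean_rank(a, abs_diffs) for a in abs_diffs]
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
W = min(w_plus, w_minus)

n = len(diffs)
assert w_plus + w_minus == n * (n + 1) / 2   # ranks must sum to 120

CRITICAL_VALUE = 25                          # two-sided, n=15, α=0.05 (from Step 3)
reject_h0 = W <= CRITICAL_VALUE              # reject only at or below the critical value

print(w_plus, w_minus, W, reject_h0)         # prints "89.0 31.0 31.0 False"
```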
What is Friedman’s Test?
Friedman’s test is a non-parametric test for finding differences in treatments across multiple attempts.
Nonparametric means the test doesn’t assume your data comes from a particular distribution (like the normal
distribution). Basically, it’s used in place of a repeated measures ANOVA when you can’t assume your data is normally distributed.
Friedman’s test is an extension of the sign test, used when there are multiple treatments. In fact, if there are only two
treatments the two tests are identical.

Running the test


Your data should meet the following requirements:

• Data should be ordinal (e.g. the Likert scale) or continuous,


• Data comes from a single group, measured on at least three different occasions,
• The sample was created with a random sampling method,
• Blocks are mutually independent (i.e. all of the pairs are independent — one doesn’t affect the other),
• Observations are ranked within blocks with no ties.
The null hypothesis for the test is that the treatments all have identical effects. The alternate hypothesis is that the
treatments do have different effects, that is, the samples differ in some way. For example, they have different
centers, spreads, or shapes.

1. Prepare your data for the test.


Step 1: Arrange your data in a table with one row per subject (block) and one column per treatment. For this example, we have 12 patients getting three
different treatments.

Step 2: Rank the scores within each row separately. The smallest score should get a rank of 1. I am ranking across rows here so
each patient is being ranked a 1, 2, or 3 for each treatment.
Step 3: Sum the ranks (find a total for each column).
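
The ranking in Steps 2 and 3 can be sketched in code. The scores below are hypothetical, invented purely to show the mechanics for three patients; they are not the 12-patient data from this example, which is not reproduced here. The helper name `rank_row` is also an assumption.

```python
# Hypothetical scores: one row per patient, one column per treatment.
scores = [
    [7.0, 5.5, 3.0],   # patient 1 (made-up values)
    [6.0, 6.5, 4.0],   # patient 2 (made-up values)
    [8.0, 7.0, 5.0],   # patient 3 (made-up values)
]


def rank_row(row):
    """Rank one patient's scores across treatments; smallest score gets rank 1."""
    return [
        sum(v < x for v in row) + (sum(v == x for v in row) + 1) / 2
        for x in row
    ]


ranked = [rank_row(row) for row in scores]            # Step 2: rank within each row
column_totals = [sum(col) for col in zip(*ranked)]    # Step 3: total per treatment

print(ranked)          # prints "[[3.0, 2.0, 1.0], [2.0, 3.0, 1.0], [3.0, 2.0, 1.0]]"
print(column_totals)   # prints "[8.0, 7.0, 3.0]"
```

As a check, the column totals always add up to n·k(k+1)/2, which is 3·3·4/2 = 18 for this toy table and 12·3·4/2 = 72 for the 12-patient example.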

2. Run the Test


Note: This test isn’t usually run by hand, as the calculations are time consuming and labor-intensive. Nearly all
popular statistical software packages can run this test. However, I’m including the manual steps here for reference.
Step 4: Calculate the test statistic. You’ll need:
1. n: the number of subjects (12)
2. k: the number of treatments (3)
3. R: The total ranks for each of the three columns (32, 27, 13).
Insert these into the following formula and solve:
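
A minimal sketch of this calculation, assuming the intended formula is the standard Friedman statistic without a tie correction, FM = [12 / (n·k·(k+1))] · ΣR² − 3·n·(k+1). The function name `friedman_statistic` is made up for illustration.

```python
def friedman_statistic(rank_totals, n):
    """Standard (uncorrected) Friedman statistic from per-treatment rank totals."""
    k = len(rank_totals)                        # number of treatments
    sum_of_squares = sum(r ** 2 for r in rank_totals)
    return 12 / (n * k * (k + 1)) * sum_of_squares - 3 * n * (k + 1)


rank_totals = [32, 27, 13]                      # R for each of the three columns
n = 12                                          # number of subjects

# Sanity check: rank totals must add up to n * k * (k + 1) / 2 = 72.
assert sum(rank_totals) == n * 3 * 4 / 2

fm = friedman_statistic(rank_totals, n)
print(round(fm, 3))                             # prints "16.167"
```

With these rank totals the uncorrected formula gives roughly 16.17, not the 15.526 quoted in Step 6. The arithmetic behind the quoted figure is not shown, so the reason for the gap is unclear. Either value is above the 6.17 critical value, so the conclusion is the same.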

Step 5: Find the FM critical value from the table of critical values for Friedman (see table below).
Use the k=3 table (as that is how many treatments we have) and an alpha level of 5%. You could choose a higher or
lower alpha level, but 5% is fairly common, so use the 5% table if you don’t know your alpha level.
Looking up n=12 in that table, we find a FM critical value of 6.17.
Step 6: Compare the calculated FM test statistic (Step 4) to the FM critical value (Step 5). Reject the null hypothesis
if the calculated FM value is larger than the FM critical value:
• Calculated FM Test Statistic = 15.526.
• FM Critical value from table = 6.17.
The calculated FM statistic is larger, so you would reject the null hypothesis.
