Chapter - 11 Non Parametric Tests
11.0. Objectives:
Having gone through the lesson, the learner will be able to understand:
1. The types and applications of non-parametric statistics
A parametric statistical test is one which specifies certain conditions about the parameters of the population from which a sample is taken. Such statistical tests are considered to be more powerful than non-parametric tests, and their assumptions are based upon the nature of the population distribution as well as upon the type of measurement scale used in quantifying the data. A non-parametric statistical test is one which does not specify any conditions about the parameters of the population from which the sample is drawn. Since these statistical tests do not make any specific and precise assumptions about the form of the distribution of the population, they are also known as distribution-free statistics. Non-parametric statistics do not impose rigid conditions the way parametric statistical tests do, although certain assumptions are associated with them: for non-parametric tests, the variables under study should be continuous and the observations should be independent. Some of the important non-parametric tests are the chi-square test, the sign test, the Wilcoxon matched-pairs signed-ranks test, the median test and the Mann Whitney U test. The advantages of non-parametric statistics are as follows.
1. Simplicity and facilitation in derivation: Most of the non-parametric statistics can be derived by using simple computational formulas. This advantage does not lie with most of the parametric statistics, the derivation of which requires an advanced knowledge of mathematics.
2. Wider scope of applicability: Since non-parametric statistics are based upon fewer and less rigid and elaborate assumptions regarding the form of the population distribution, they can easily be applied to much wider situations.
3. Fewer chances of violation of assumptions: Since in non-parametric statistics the assumptions are fewer and less elaborate than in the case of parametric statistics, there is less chance of their violation. Not only this, such violations are easier to check and can be readily detected.
4. Less demanding level of measurement: Non-parametric statistics require measurement based upon a nominal or an ordinal scale, whereas parametric statistics require measurement based upon the interval scale and/or the ratio scale. As statistical treatments associated with either the nominal or the ordinal scale are easier than treatments associated with either the interval or the ratio scale, the non-parametric statistics have a better case for applicability than the parametric statistics.
6. Impact of sample size: When the sample size is 10 or smaller, non-parametric statistics are easier, quicker and more efficient than parametric statistics. If the assumptions of parametric statistics are violated with such small samples, the result is likely to be badly affected. Therefore, for this sample size, non-parametric statistics are always superior to parametric statistics. The reader should note, however, that as the sample size increases, non-parametric statistics become less efficient than the parametric tests. If the data are such that they fulfil all the assumptions of parametric tests and non-parametric statistics are nevertheless applied to them, the distribution-free statistics are more efficient with a small sample size but they become less and less efficient as the sample size increases.
Despite these merits, non-parametric statistics suffer from certain disadvantages. They have lower statistical efficiency than parametric statistics when the sample size is large, say above 30.
Siegel & Castellan (1988) consider the use of non-parametric statistics as simply
"wasteful of data".
It is also said that the probability tables for testing the significance of non-parametric statistics are scattered over many different sources and are not as easily available as those for parametric tests.
The chi-square test is used when the data are expressed in terms of frequencies of occurrence. Moreover, any continuous data can be reduced to categories in such a way that they can be treated as discrete data, and then the application of chi-square is justified. Chi-square is not stable when computed from a table in which any observed frequency is less than 5, unless a correction for continuity, called Yates' correction, is made.
The additive property of chi-square: When several X²s have been computed from independent experiments (i.e., from tables based upon different samples), these may be summed to give a new chi-square with df equal to the sum of the separate dfs. The pooled chi-square will often yield a conclusive result when the separate experiments, taken singly, are inconclusive.
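As an illustration of this additive property, the short Python sketch below pools two hypothetical chi-square values using SciPy's chi2 distribution. The specific values and the use of SciPy are assumptions made only for this example; the lesson itself relies on printed tables.

    from scipy.stats import chi2

    # Hypothetical chi-squares from two independent experiments
    chi_sq_1, df_1 = 3.2, 2
    chi_sq_2, df_2 = 4.1, 3

    pooled_chi_sq = chi_sq_1 + chi_sq_2          # chi-squares are added
    pooled_df = df_1 + df_2                      # degrees of freedom are added
    p_value = chi2.sf(pooled_chi_sq, pooled_df)  # P(chi-square >= pooled value)
    print(pooled_chi_sq, pooled_df, p_value)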
X² = Σ (fo - fe)² / fe
Table - 11.1: One hundred responses distributed over five attitude categories, tested against the equal (chance) frequencies.

                    Strongly                                        Strongly
                    approve    Approve   Indifferent   Disapprove   disapprove   Total
Observed (fo)          23         18         24            17           18        100
Expected (fe)          20         20         20            20           20        100
(fo - fe)              +3         -2         +4            -3           -2
(fo - fe)²              9          4         16             9            4
(fo - fe)²/fe          .45        .20        .80           .45          .20

X² = Σ (fo - fe)²/fe = 2.10,  df = 4,  P lies between .70 and .80,

which shows that the chi-square is not significant: the observed frequencies do not differ significantly from the frequencies expected on the hypothesis of chance.
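The same computation can be checked with software. The sketch below uses SciPy's chisquare function on the frequencies of Table 11.1; SciPy is assumed only for this illustration, as the lesson works the example by hand.

    from scipy.stats import chisquare

    observed = [23, 18, 24, 17, 18]   # fo from Table 11.1
    expected = [20, 20, 20, 20, 20]   # fe under the chance hypothesis
    stat, p_value = chisquare(observed, f_exp=expected)
    print(stat, p_value)              # about 2.10 and 0.72, i.e. P between .70 and .80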
The sign test gets its name from the fact that it uses plus and minus signs rather than quantitative measures as its data. It is particularly useful for research in which quantitative measurement is impossible or infeasible, but in which it is possible to rank with respect to each other the two members of each pair. The sign test is applicable to the case of two related samples when the experimenter wishes to establish that two conditions are different. The only assumption underlying this test is that the variable under consideration has a continuous distribution.
Example: Sign test applied to data consisting of 10 pairs of scores obtained under
two conditions of spelling (1) words in context (2) words spelled separately.
Condition (1)    Condition (2)    Sign of difference (1) - (2)
     15               12                     +
     18               15                     +
      9               10                     -
     15               16                     -
     18               18                     0
     12               10                     +
     15               12                     +
     16               13                     +
     14               12                     +
     22               19                     +

Summary of signs: plus signs = 7, minus signs = 2, ties (0) = 1, total pairs = 10.
Tables are available which give the number of signs necessary for significance at various levels for different values of N.
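The sign test for these ten pairs can also be carried out as an exact binomial test on the signs. The sketch below assumes SciPy (version 1.7 or later for binomtest); the counting of signs follows the table above.

    from scipy.stats import binomtest

    pairs = [(15, 12), (18, 15), (9, 10), (15, 16), (18, 18),
             (12, 10), (15, 12), (16, 13), (14, 12), (22, 19)]

    diffs = [a - b for a, b in pairs if a != b]   # the tied pair (18, 18) is dropped
    n = len(diffs)                                # N = 9 usable pairs
    plus = sum(d > 0 for d in diffs)              # 7 plus signs
    minus = n - plus                              # 2 minus signs

    # Under the null hypothesis the signs follow a binomial distribution with p = 0.5
    result = binomtest(min(plus, minus), n, 0.5, alternative='two-sided')
    print(plus, minus, result.pvalue)             # two-tailed P is about 0.18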
The test we have just discussed, the sign test, utilizes information simply
about the direction of the differences within pairs. If the relative magnitude as well as
the direction of the differences is considered, a more powerful test can be made.
The Wilcoxon matched-pairs signed-ranks test does just that: it gives more weight to a pair which shows a large difference between the two conditions than to a pair which shows a small difference.
The Wilcoxon test is a most useful test for the behavioural scientist. With behavioural data, it is not uncommon that the researcher can (a) tell which member of a pair is "greater than" which, i.e., tell the sign of the difference between any pair, and (b) rank the differences in order of absolute size. That is, he can make the judgement of "greater than" between any pair's two performances, and also can make that judgement between any two difference scores arising from any two pairs. With such information, the Wilcoxon test may be used.
Method: Let d = the difference score for any matched pair, representing the
difference between the pair’s scores under the two treatments. Each pair has one d.
To use the Wilcoxon test, rank all the d’s without regard to sign: give rank of 1 to the
smallest d, the rank of 2 to the next smallest, and so on. When one ranks d's without respect to sign, a d of -1 is given a lower rank than a d of either +2 or -2. Then to each rank affix the sign of the difference: that is, indicate which ranks arose from negative d's and which ranks arose from positive d's.
Occasionally the two scores of any pair are equal. That is, no difference
between the two treatments is observed for that pair, so that d=0. Such pairs are
dropped from the analysis. This is the same practice that we follow with the sign test.
N = the number of matched pairs minus the number of pairs whose d = 0.
Another sort of tie can occur. Two or more d’s can be of the same size. We
assign such tied cases the same rank. The rank assigned is the average of the
ranks which would have been assigned if the d’s had differed slightly. Thus three
pairs might yield d’s of –1, -1, and +1. Each pair would be assigned the rank of 2, for
(1+2+3)/3 = 2. Then the next d in order would receive the rank of 4, because ranks 1, 2, and 3 have already been used.
Let T = the smaller sum of like-signed ranks; that is, T is either the sum of the positive ranks or the sum of the negative ranks, whichever is smaller. The table of signed ranks gives various values of T and their associated levels of significance. That is, if an observed T is equal to or less than the value given in the signed-ranks table under a particular significance level for the observed value of N, the null hypothesis may be rejected at that level of significance.
When N is larger than 25, the signed-ranks table of significance cannot be used. However, it can be shown that in such cases the sum of the ranks, T, is practically normally distributed, so that

    z = [T - N(N+1)/4] / √[N(N+1)(2N+1)/24]

may be treated as a normal deviate.
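The ranking procedure just described is easy to carry out in software. The sketch below applies it to two small sets of hypothetical paired scores (the data are invented purely for illustration) and uses scipy.stats.rankdata for the average ranks given to ties; scipy.stats.wilcoxon, which drops zero differences in the same way, is shown as a cross-check.

    from scipy.stats import rankdata, wilcoxon

    x = [15, 18, 12, 20, 14, 16, 19, 13]   # hypothetical scores under condition 1
    y = [12, 19, 10, 15, 14, 11, 14, 15]   # hypothetical scores under condition 2

    d = [a - b for a, b in zip(x, y) if a != b]   # pairs with d = 0 are dropped
    N = len(d)
    ranks = rankdata([abs(v) for v in d])         # tied |d| values share the average rank
    t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    t_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    T = min(t_plus, t_minus)                      # the smaller sum of like-signed ranks
    print(N, T)

    # Large-sample normal approximation from the formula above (shown only to
    # illustrate the formula; with N this small the table of T would be used):
    z = (T - N * (N + 1) / 4) / (N * (N + 1) * (2 * N + 1) / 24) ** 0.5
    print(z)

    # Cross-check (may fall back to a normal approximation when ties are present):
    print(wilcoxon(x, y))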
The median test is used to determine whether two independent groups (not necessarily of the same size) come from the same population or from populations having the same median. In the median test the null hypothesis is that there is no difference between the two sets of scores because they have been taken from the same population.
To perform the median test, we first determine the median score for the
combined group (i.e. the median for all scores in both samples). Then we
dichotomize both sets of scores at that combined median and cast these data in a
2 x 2 table and calculate the chi-square. Above vs. below the common median constitutes one category, and group I vs. group II the other category. However, when N1 and N2 are small, the chi-square test is not accurate and the exact method of Fisher should be used instead.
Example: The following example examines the effect of a certain drug upon hand tremor. Fourteen psychiatric patients are given the drug, and 18 other patients matched for age and sex are given a placebo (i.e., a harmless dose). Since the medication is in pill form, the patients do not know whether they are getting the drug or not. The first group is the experimental group, the second the control group. The table below gives the scores of the two groups together with their signs: a + sign indicates a score above the common median, a - sign a score below the common median. Does the drug increase steadiness, as shown by lower scores in the experimental group? As we are concerned only with whether the drug increases steadiness, a one-tailed test is appropriate.
Table: Median test applied to experimental and control groups. Plus signs indicate scores above the common median, minus signs scores below the common median.
Experimental (N = 14)   Sign      Control (N = 18)   Sign
53 + 48 -
39 - 65 +
63 + 66 +
36 - 38 -
47 - 36 -
58 + 45 -
44 - 59 +
38 - 53 +
59 + 58 +
36 - 42 -
42 - 70 +
43 - 71 +
46 - 65 +
46 - 46 -
55 +
61 +
62 +
53 +
Common median = 49.5
The common median is 49.5. In the experimental group, 4 scores are above and 10 below the common median, instead of the 7 above and 7 below to be expected by chance. In the control group, 12 scores are above and 6 below the common median, instead of the expected 9 in each category. These frequencies are entered in a 2 x 2 table and X² is computed by the formula with the correction for continuity.
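The computation that follows from these frequencies can be reproduced with a short Python sketch. It takes the common median of 49.5 as given above, rebuilds the 2 x 2 table from the scores in the table, and applies the chi-square formula with Yates' correction; SciPy is assumed only for obtaining the P value (scipy.stats.median_test automates essentially the same steps).

    from scipy.stats import chi2

    experimental = [53, 39, 63, 36, 47, 58, 44, 38, 59, 36, 42, 43, 46, 46]
    control = [48, 65, 66, 38, 36, 45, 59, 53, 58, 42, 70, 71, 65, 46,
               55, 61, 62, 53]
    common_median = 49.5   # as given in the text

    a = sum(s > common_median for s in experimental)   # experimental, above: 4
    b = sum(s > common_median for s in control)        # control, above: 12
    c = sum(s < common_median for s in experimental)   # experimental, below: 10
    d = sum(s < common_median for s in control)        # control, below: 6
    n = a + b + c + d                                  # 32

    # 2 x 2 chi-square with Yates' correction for continuity
    chi_sq = n * (abs(a * d - b * c) - n / 2) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d))
    p_value = chi2.sf(chi_sq, df=1)
    print(a, b, c, d)
    print(chi_sq, p_value)   # about 3.17 and 0.075 (two-tailed)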
11.5. Mann Whitney U test
When at least ordinal measurement has been achieved, the Mann Whitney U
test may be used to test whether two independent groups have been drawn from the
same population. This is one of the most powerful of the nonparametric tests, and it
is a most useful alternative to the parametric t test when the researcher wishes to
avoid the t test's assumptions, or when the measurement in the research is weaker than interval scaling.
Let n1 = the number of cases in the smaller of the two independent groups, and n2 = the number of cases in the larger. To apply the U test, we first combine the observations or scores from both groups, and rank these in order of increasing size. In this ranking, algebraic size is considered, i.e., the lowest ranks are assigned to the largest negative numbers, if any.
U = n1n2 + n1(n1 + 1)/2 - R1

and

U = n1n2 + n2(n2 + 1)/2 - R2
where R1 = the sum of the ranks assigned to the group whose sample size is n1, and
R2 = the sum of the ranks assigned to the group whose sample size is n2.
The above formulas yield different U's. It is the smaller of these that we want; the larger value is U'. The significance of U is obtained by using tables of significance. Two types of tables are available: one for when neither n1 nor n2 is larger than 8, and another for when n2 is between 9 and 20. When n2 > 20, significance is determined by converting U into a z score. A z score of 1.96 or above indicates significance at the .05 level (two-tailed).
z = [U - n1n2/2] / √[n1n2(n1 + n2 + 1)/12]
For example, we might have used these formulas to find the value of U for the small-sample data discussed earlier (E and C groups with n1 = 4 and n2 = 5). For those data, R1 = 19 and R2 = 26. Applying the formulas, U = (4)(5) + (4)(5)/2 - 19 = 11 and U = (4)(5) + (5)(6)/2 - 26 = 9, so the smaller value, U = 9, is the one referred to the table of significance.
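These computations are easily reproduced in Python. The sketch below recomputes U from the rank sums of the example and then illustrates the large-sample z formula; SciPy's normal distribution is assumed only for the P value, and with samples this small the exact tables of U would of course be consulted instead.

    from scipy.stats import norm

    n1, n2 = 4, 5      # group sizes from the example
    R1, R2 = 19, 26    # rank sums from the example

    U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1   # = 11
    U2 = n1 * n2 + n2 * (n2 + 1) / 2 - R2   # = 9
    U = min(U1, U2)                         # U = 9; the larger value is U'
    print(U1, U2, U)

    # Large-sample (n2 > 20) normal approximation from the formula above
    z = (U - n1 * n2 / 2) / (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    p_value = 2 * norm.sf(abs(z))           # two-tailed P value
    print(z, p_value)

    # When the raw scores themselves are available, scipy.stats.mannwhitneyu(x, y)
    # performs the whole test directly.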
11.6. Unit end questions
1. Differentiate parametric tests from non-parametric tests.
2. Describe the chi-square test.
3. Differentiate between the conditions under which the Wilcoxon signed-ranks test and the Mann Whitney U test are used.
11.7. Source
1. Siegel, S. Nonparametric statistics for the behavioural sciences. New Delhi: McGraw-Hill.
2. Singh, A. K. (1997). Tests, measurements and research methods in behavioural sciences. Patna: Bharati Bhavan Publishers and Distributors.