Normal distribution check
It is very unlikely that a histogram of sample data will produce a perfectly smooth normal curve like
the one displayed over the histogram, especially if the sample size is small. As long as the data is
approximately normally distributed, with a peak in the middle and a fairly symmetrical shape, the
assumption of normality can be considered met.
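The histogram check can be approximated numerically. The sketch below, using only the Python standard library, counts how many observations fall in each bin and compares that with the count a fitted normal curve would predict. The sample, the seed, the number of bins and the function name `histogram_vs_normal` are all assumptions made for illustration; they do not come from the examples in this sheet.

```python
import random
from statistics import NormalDist, mean, stdev


def histogram_vs_normal(data, n_bins=7):
    """Return (bin_start, bin_end, observed, expected) for each bin."""
    # Normal curve with the same mean and standard deviation as the sample.
    fitted = NormalDist(mean(data), stdev(data))
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    rows = []
    for i in range(n_bins):
        start = lo + i * width
        end = lo + (i + 1) * width
        if i == n_bins - 1:
            # The last bin includes the maximum value.
            observed = sum(start <= x <= hi for x in data)
        else:
            observed = sum(start <= x < end for x in data)
        expected = len(data) * (fitted.cdf(end) - fitted.cdf(start))
        rows.append((start, end, observed, expected))
    return rows


# Hypothetical sample of 35 values drawn from a normal distribution.
random.seed(1)
sample = [random.gauss(50, 10) for _ in range(35)]

for start, end, observed, expected in histogram_vs_normal(sample):
    bar = "#" * observed
    print(f"{start:6.1f} to {end:6.1f} | {bar:<15} observed={observed:2d} expected={expected:4.1f}")
```

With a small sample the observed counts will rarely match the expected counts closely, which is the point made above: a rough single-peaked, symmetrical shape is what matters.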
The normal Q-Q plot is an alternative graphical method to the histogram for assessing normality
and is easier to interpret when the sample size is small. For the data to be considered normally
distributed, the points should lie as close to the line as possible, with no obvious pattern of
departure from it. Below are the same examples of normally distributed and skewed data.
[Figures: Q-Q plot of approximately normally distributed data; Q-Q plot of skewed data]
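To show what a Q-Q plot actually plots, the sketch below pairs each sorted observation with the value a normal distribution would be expected to give at the same position. It uses only the Python standard library. The data set is hypothetical and the plotting position (i - 0.5) / n is one common convention among several; neither is taken from the examples in this sheet.

```python
from statistics import NormalDist, mean, stdev


def qq_points(data):
    """Pair each theoretical normal quantile with the matching sorted observation."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    ordered = sorted(data)
    # (i - 0.5) / n is one common choice of plotting position (an assumption here).
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical right-skewed sample, invented for illustration.
skewed = [1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 2.8, 3.3, 4.1, 5.6, 8.9, 14.2]

for expected, observed in qq_points(skewed):
    # A gap of zero means the point sits exactly on the reference line.
    print(f"theoretical={expected:6.2f}  observed={observed:6.2f}  gap={observed - expected:+6.2f}")
```

For this skewed sample the gaps are positive at both ends and negative in the middle, which is the kind of curved pattern away from the line that suggests non-normality.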
For both of these examples, the sample size is 35, so the Shapiro-Wilk test should be used. For
the skewed data, p = 0.002, which is strong evidence of non-normality. For the approximately
normally distributed data, p = 0.585, so normality can be assumed and, provided any other test
assumptions are satisfied, an appropriate parametric test can be used.
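For readers working outside Excel, the Shapiro-Wilk test can be run as in the sketch below, which assumes the third-party SciPy library is installed and uses its `scipy.stats.shapiro` function. The two samples are artificial (evenly spaced quantiles of a normal and of an exponential distribution), so their p-values will not match the 0.585 and 0.002 quoted above. The 0.05 cut-off is the conventional significance level, assumed here rather than stated in this sheet.

```python
import math
from statistics import NormalDist

from scipy import stats  # third-party: SciPy

ALPHA = 0.05  # conventional significance level (an assumption)

# Two artificial samples of size 35, built from evenly spaced quantiles.
n = 35
positions = [(i - 0.5) / n for i in range(1, n + 1)]
approx_normal = [NormalDist(50, 10).inv_cdf(p) for p in positions]
skewed = [-10 * math.log(1 - p) for p in positions]  # exponential shape


def shapiro_report(sample, alpha=ALPHA):
    """Return (W statistic, p-value, True if normality can be assumed)."""
    statistic, p_value = stats.shapiro(sample)
    return float(statistic), float(p_value), bool(p_value >= alpha)


for label, sample in [("approximately normal", approx_normal), ("skewed", skewed)]:
    w, p, normal_ok = shapiro_report(sample)
    verdict = "normality can be assumed" if normal_ok else "evidence of non-normality"
    print(f"{label}: W = {w:.3f}, p = {p:.3f} -> {verdict}")
```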
What if the data is not normally distributed?
If the checks suggest that the data is not normally distributed, there are two options:
• Transform the dependent variable (repeating the normality checks on the transformed data):
Common transformations include taking the log or square root of the dependent variable.
• Use a non-parametric test: Non-parametric tests are often called distribution-free tests and
can be used instead of their parametric equivalents.
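The effect of the two transformations mentioned in the first option can be sketched as follows, using only the Python standard library. Sample skewness is used here as a quick numeric indicator of asymmetry; it is a stand-in chosen for this illustration, not a replacement for repeating the normality checks on the transformed data. The data set and the function name `skewness` are hypothetical.

```python
import math
from statistics import mean


def skewness(data):
    """Sample skewness: third central moment divided by variance to the power 1.5."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed, strictly positive dependent variable.
raw = [1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 2.8, 3.3, 4.1, 5.6, 8.9, 14.2]

# Both transformations require non-negative values (log requires values above zero).
rooted = [math.sqrt(x) for x in raw]
logged = [math.log(x) for x in raw]

for label, values in [("raw", raw), ("square root", rooted), ("log", logged)]:
    print(f"{label:12s} skewness = {skewness(values):.2f}")
```

A skewness closer to zero after transforming suggests the transformed variable is more symmetrical, but the histogram, Q-Q plot or Shapiro-Wilk check should still be repeated on it.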
Note: The residuals are the differences between the observed and expected values.
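As a small illustration of the note above, the sketch below computes residuals for a comparison of two groups, taking the expected value of each observation to be its group mean. That choice of expected value is an assumption made for this example (in a regression it would be the fitted value instead), and the scores are invented.

```python
from statistics import mean

# Hypothetical scores for two groups, invented for illustration.
groups = {
    "control": [23.0, 27.5, 25.0, 30.5, 24.0],
    "treatment": [31.0, 35.5, 29.0, 38.0, 33.5],
}


def group_residuals(groups):
    """Residual = observed value minus expected value (here, the group mean)."""
    out = {}
    for name, values in groups.items():
        expected = mean(values)
        out[name] = [x - expected for x in values]
    return out


residuals = group_residuals(groups)

# Pool the residuals from all groups; this pooled list is what would be
# passed to a normality check.
pooled = [r for values in residuals.values() for r in values]
print(pooled)
```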
Excel will not perform non-parametric tests, even with the Data Analysis ToolPak add-in. However,
both AI-therapy and the ‘Real-Statistics’ add-in will.
Select ‘Descriptive Statistics and Normality’ and click OK to open the following dialog box.
If your data is skewed and a non-parametric test is needed, tests comparing two sets of data are
available at https://www.ai-therapy.com/psychology-statistics/hypothesis-testing/two-samples
and tests comparing more than two sets at:
https://www.ai-therapy.com/psychology-statistics/hypothesis-testing/anova
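As an alternative to the online calculators, non-parametric comparisons can also be run in Python. The sketch below assumes the third-party SciPy library is installed and uses two of its functions: `scipy.stats.mannwhitneyu` for two independent groups and `scipy.stats.kruskal` for more than two. These are commonly used non-parametric counterparts of the independent t-test and one-way ANOVA; this sheet does not name specific tests, so the choice is an assumption, and the data are invented.

```python
from scipy import stats  # third-party: SciPy

# Hypothetical skewed measurements for three independent groups.
group_a = [12, 15, 11, 14, 13, 16, 10, 17]
group_b = [22, 31, 27, 45, 24, 38, 29, 60]
group_c = [70, 85, 66, 120, 95, 74, 150, 81]

# Two sets of data: Mann-Whitney U test.
u_stat, p_two = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_two:.4f}")

# More than two sets of data: Kruskal-Wallis test.
h_stat, p_three = stats.kruskal(group_a, group_b, group_c)
print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_three:.4f}")
```

Both tests work on the ranks of the observations rather than their raw values, which is why they do not rely on the data being normally distributed.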