5 Statistical Analysis
Statistics, Mathematics, and Measurement
A Statistical Flow Chart
In the first four chapters of the text, we have focused on concerns of research
design: the scientific method, types of research, proposal elements, measurement
types, defining variables, and problem and hypothesis statements. But designing a
plan to gather research data is only half the picture. When we complete the gathering
portion of a study, we have nothing more than a group of numbers. The information
is meaningless until the numbers are reduced, condensed, summarized, analyzed and
interpreted.
Statistical analysis converts numbers into meaningful conclusions in accordance
with the purposes of a study. We will spend chapters 15-26 mastering the most popular
statistical tools. But you must understand something of statistics now in order to
properly plan how you should collect your data. That is, the proper development of a
research proposal is dependent on what kind of data you will collect and what
statistical procedures exist to analyze that data.
The fields of research design and statistical analysis are distinct and separate
disciplines. In fact, in most graduate schools, you would take one or more courses in
research design and other courses in statistics. My experience with four different
graduate programs has been that little effort is made to bridge the two disciplines. Yet,
the fields of research and statistics have a symbiotic relationship. They depend on each
other. One cannot have a good research design with a bad plan for analysis. And the
best statistical computer program is powerless to derive real meaning from badly
collected data. So before we get too far into the proposal writing process, some time
must be given to establishing a sense of direction in the far-ranging field of statistics.
Descriptive Statistics
Descriptive statistical procedures are used to describe a group of numbers. These
tools reduce raw data to a more meaningful form. You’ve used descriptive statistics
when averaging test grades during the semester to determine what grade you’ll get.
The single average, say, a 94, represents all the grades you’ve earned in the course
throughout the entire semester. (Whether this 94 translates to an “A” or a “C” depends
on factors outside of statistics!) Descriptive statistics are covered in chapters 15
(mean and standard deviation) and 22 (correlation).
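To make the grade-averaging idea concrete, here is a minimal sketch in Python. The five grades are hypothetical numbers chosen for illustration; they are not taken from the text. The sketch reduces them to a mean and a standard deviation.

```python
import statistics

# Hypothetical test grades for one student (illustrative, not from the text).
grades = [90, 92, 94, 96, 98]

# The mean condenses all five grades into one representative number.
mean_grade = statistics.mean(grades)

# The sample standard deviation (n - 1 in the denominator) describes
# how far the grades typically fall from that mean.
sd_grade = statistics.stdev(grades)

print(f"mean = {mean_grade}, standard deviation = {sd_grade:.2f}")
```

The two summary numbers stand in for the whole list of grades, which is what "reducing raw data to a more meaningful form" amounts to in practice.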
Inferential Statistics
Inferential statistics are used to infer findings from a smaller group to a larger
one. You will recall the brief discussion of “population” and “sample” in chapter 2.
When the group we want to study is too large to study as a whole, we can draw a
sample of subjects from the group and study them. Descriptive statistics about the
sample are not our interest. We want to develop conclusions about the large group as a
whole. Procedures that allow us to make inferences from samples to populations are
called inferential statistics.
For example, there are over 36,000 pastors in the Southern Baptist Convention. It
is impossible to interview or survey or test all 36,000 subjects. Round-trip postage
alone would cost over $21,000. But we could randomly select, say, one percent (1%) or
360 pastors for the study, analyze the data of the 360, and infer the characteristics of
the 36,000. Inferential procedures are covered in chapters 16, 17, 18, 19, 20, and 21.
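The random selection described above can be sketched in a few lines of Python. The ID numbers and the seed are assumptions made for illustration; a real study would draw from an actual roster of pastors.

```python
import random

POPULATION_SIZE = 36_000   # pastors, from the example above
SAMPLE_FRACTION = 0.01     # one percent

# A fixed seed (an arbitrary choice) makes the draw repeatable.
rng = random.Random(2024)

# Hypothetical ID numbers 1..36000 stand in for a real roster.
pastor_ids = range(1, POPULATION_SIZE + 1)

sample_size = round(POPULATION_SIZE * SAMPLE_FRACTION)

# Simple random sample without replacement: every pastor has the
# same chance of selection and no pastor is chosen twice.
sample_ids = rng.sample(pastor_ids, sample_size)

print(f"selected {len(sample_ids)} of {POPULATION_SIZE} pastors")
```

Only the selected pastors would be surveyed; inferential procedures then carry the findings back to the full population.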
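[Figure: A Statistical Flow Chart. The chart first asks what you are studying: (1) similarities, that is, relationships between variables (ASSOCIATION), or (2) differences between groups (DIFFERENCE). It then asks the measurement type of the data: interval/ratio, ordinal, or nominal. The resulting branches are numbered to match the sections that follow:
3 Interval/ratio association: 2 variables (3a), 3+ variables (3b)
4 Ordinal association: 2 ranks (4a), 3+ ranks (4b)
5 Nominal association: dichotomous* variables (5a), Chi-square tests with 1 or 2 variables (5b)
6 Interval/ratio difference: 1 group (6a), 2 groups (6b), 3+ groups (6c)
7 Ordinal difference: 2 groups (7a, 7b), 3+ groups (7c)
Legible procedure notes: Point Biserial (1 dichotomous and 1 interval/ratio variable); one-sample z-test (sample mean and population mean, σ known or n > 30); one-sample t-test (sample mean and population mean, σ unknown); Chi-square (χ²) Goodness of Fit (equal or proportional expected frequencies); Kruskal-Wallis H test (rankings divided into 3+ groups).
*Dichotomous: two and only two categories.]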
-3a-
Interval/Ratio Correlation with 2 Variables
The two procedures we will study are Pearson’s Product Moment Correlation
Coefficient (rxy or simply r) and simple linear regression. Pearson’s r directly mea-
sures the degree of association between two interval/ratio variables. See Chapter 22.
Simple linear regression computes an equation of a line which allows researchers to
predict one interval/ratio variable from another. See Chapter 26.
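The sketch below computes Pearson's r and a simple regression line from first principles, using the usual sums of squares and cross-products. The five pairs of scores (years in the ministry and a job satisfaction score) are hypothetical, and the function names are ours.

```python
from math import sqrt

def pearson_r(x, y):
    """Degree of association between two interval/ratio variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def simple_linear_regression(x, y):
    """Slope and intercept of the least-squares line predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: years in the ministry and job satisfaction score.
years = [1, 2, 3, 4, 5]
satisfaction = [2, 4, 5, 4, 5]

r = pearson_r(years, satisfaction)
slope, intercept = simple_linear_regression(years, satisfaction)
print(f"r = {r:.3f}; predicted satisfaction = {intercept:.1f} + {slope:.1f} * years")
```

Pearson's r answers "how strongly are the two variables related?", while the regression line answers "what score would we predict for a given number of years?"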
-3b-
Interval/Ratio Correlation with 3+ Variables
The procedure we will study which analyzes three or more interval/ratio vari-
ables simultaneously is multiple linear regression. This procedure is quickly becom-
ing the dominant statistical procedure in the social sciences. With this procedure, you
develop “models” which relate two or more “predictor variables” to a single predicted
variable. We will confine our study to understanding the printouts of a statistical
computer program called SYSTAT. See Chapter 26.
-4a-
Ordinal Correlation with 2 Variables
The two procedures which compute a correlation coefficient between two ordinal
variables are Spearman’s rho (rs) and Kendall’s tau (τ). Spearman’s rho should be
used when you have ten or more pairs of rankings; Kendall’s tau when you have fewer
than ten. Both measures give you the same kind of information. If you had pastors and
ministers of education rank order seven statements of “characteristics of Christian
leadership,” you would compute the degree of agreement between the rankings of the two
groups with Kendall’s tau. See Chapter 22.
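As a rough illustration of the seven-statement example, the sketch below computes Kendall's tau by counting concordant and discordant pairs. It assumes there are no tied ranks, and both sets of rankings are invented for illustration.

```python
from itertools import combinations

def kendall_tau(ranks_a, ranks_b):
    """Kendall's tau for two rankings of the same items (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(ranks_a)), 2):
        # A pair is concordant when both rankings order items i and j the same way.
        product = (ranks_a[i] - ranks_a[j]) * (ranks_b[i] - ranks_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n = len(ranks_a)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical ranks given to seven leadership statements by each group.
pastors = [1, 2, 3, 4, 5, 6, 7]
educators = [2, 1, 3, 5, 4, 7, 6]

tau = kendall_tau(pastors, educators)
print(f"Kendall's tau = {tau:.3f}")
```

A tau of +1 would mean the two groups ranked the statements identically, and -1 would mean the rankings were exactly reversed.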
-4b-
Ordinal Correlation with 3+ Variables
Kendall’s Coefficient of Concordance (W) measures the degree of agreement in
ranking from more than two groups. Using our example above, you could compute
the degree of agreement in rankings of pastors, ministers of education and seminary
professors using Kendall’s W. See Chapter 22.
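Extending the same example to three groups, the sketch below computes Kendall's W from the rank sums with the standard no-ties formula W = 12S / (m²(n³ - n)), where m is the number of groups, n the number of statements, and S the sum of squared deviations of the rank sums from their mean. The rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's Coefficient of Concordance for m rankings of n items (no ties assumed)."""
    m = len(rankings)        # number of groups doing the ranking
    n = len(rankings[0])     # number of items ranked
    rank_sums = [sum(item_ranks) for item_ranks in zip(*rankings)]
    mean_rank_sum = sum(rank_sums) / n
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical ranks given to seven statements by three groups.
pastors = [1, 2, 3, 4, 5, 6, 7]
educators = [2, 1, 3, 5, 4, 7, 6]
professors = [1, 3, 2, 4, 6, 5, 7]

w = kendalls_w([pastors, educators, professors])
print(f"Kendall's W = {w:.3f}")
```

W ranges from 0 (no agreement among the groups) to 1 (complete agreement).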
-5a-
Nominal Correlation with Dichotomous Variables
When you have two variables which can take two and only two values (“dichoto-
mous variables”), use the Phi Coefficient. When you have one dichotomous and one
rank variable, use Rank Biserial. When you have one dichotomous and one interval/
ratio variable, use Point Biserial. See Chapter 22.
-5b-
Nominal Correlation with 3+ Categories
Procedures which determine whether two nominal variables are independent (not
related) or not independent (related) are called Chi-square (χ²) tests. The word "chi"
is pronounced "ki" as in "kite."
The Chi-square Goodness of Fit test compares observed category counts (30
males [75%], 10 females [25%]) with expected counts based on school enrollment [85%
male, 15% female] to determine if class enrollment “fits well” the expected enrollment.
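The enrollment example can be worked through directly. The sketch below uses the counts and percentages given above (30 males and 10 females observed; 85% and 15% expected) and the usual formula, the sum of (observed - expected)² / expected. The function name is ours, and 3.841 is the standard critical value of χ² for one degree of freedom at the .05 level.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return the chi-square statistic, degrees of freedom, and expected counts."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi_square, df, expected

observed = [30, 10]                # males, females in the class
school_proportions = [0.85, 0.15]  # males, females in the school

chi_square, df, expected = chi_square_goodness_of_fit(observed, school_proportions)

CRITICAL_VALUE_05_DF1 = 3.841      # chi-square critical value, df = 1, alpha = .05
print(f"expected counts = {expected}")
print(f"chi-square = {chi_square:.3f} with df = {df}")
print("significant at .05" if chi_square > CRITICAL_VALUE_05_DF1 else "not significant at .05")
```

With these numbers the expected counts are 34 males and 6 females, and the computed χ² falls below 3.841, so at the .05 level the class enrollment would not be judged significantly different from the school enrollment.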
The Chi-square Test of Independence compares two nominal variables to determine
if they are independent. Are “educational philosophy” (5 categories) and “leadership
style” (5 categories) independent of each other?
When you want to determine the strength of the relationship between the two
nominal variables, use Cramer’s Phi (φc). This procedure computes a Pearson’s r type
coefficient from the computed χ² value. See Chapter 23.
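A minimal sketch of the Test of Independence and Cramer's Phi follows. The 3 x 3 table of counts is hypothetical (the example above uses 5 categories per variable; 3 keeps the arithmetic easy to follow). The coefficient uses the standard formula φc = √(χ² / (N(k - 1))), where N is the total count and k is the smaller of the number of rows and columns.

```python
from math import sqrt

def chi_square_independence(table):
    """Chi-square statistic and total count for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    return chi_square, n

def cramers_phi(table):
    """Strength of relationship between two nominal variables (0 to 1)."""
    chi_square, n = chi_square_independence(table)
    k = min(len(table), len(table[0]))
    return sqrt(chi_square / (n * (k - 1)))

# Hypothetical counts: rows are three philosophies, columns are three styles.
table = [
    [20, 5, 5],
    [5, 20, 5],
    [5, 5, 20],
]

chi_square, n = chi_square_independence(table)
phi_c = cramers_phi(table)
print(f"chi-square = {chi_square:.1f}, N = {n}, Cramer's Phi = {phi_c:.2f}")
```

The χ² value tells you whether the variables are related; Cramer's Phi rescales it to a 0-to-1 coefficient so that the strength of the relationship can be read much like a correlation.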
-6a-
1-Sample Parametric Tests of Difference
The first type of interval/ratio difference procedure tests whether the mean of
one sample is significantly different from the mean of the population from which it
was drawn. If the population standard deviation (σ) is known, or you have more than 30
subjects in the sample, use the one-sample z-test. If σ is unknown and you have 30 or
fewer subjects, use the one-sample t-test. Here’s an example: You know the
average income of all Southern Baptist pastors in Texas. You collect information on
income of a sample of Southern Baptist pastors who are also seminary graduates. Is
there a significant difference in average income between the sample and the popula-
tion? See Chapter 19.
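The two one-sample statistics can be sketched as follows. All of the income figures (in thousands of dollars) are hypothetical, and the function names are ours. Each function computes only the test statistic; deciding significance still requires comparing it with a critical value from a z or t table.

```python
from math import sqrt
import statistics

def one_sample_z(sample_mean, population_mean, population_sd, n):
    """z statistic when the population standard deviation is known."""
    return (sample_mean - population_mean) / (population_sd / sqrt(n))

def one_sample_t(sample, population_mean):
    """t statistic and degrees of freedom when sigma is estimated from the sample."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / sqrt(n)
    t = (statistics.mean(sample) - population_mean) / standard_error
    return t, n - 1

# Hypothetical: population mean income 44, population SD 6, and a
# sample of 36 seminary graduates with a mean income of 47.
z = one_sample_z(sample_mean=47, population_mean=44, population_sd=6, n=36)

# Hypothetical small sample of five incomes, sigma unknown.
incomes = [42, 45, 47, 50, 51]
t, df = one_sample_t(incomes, population_mean=44)

print(f"z = {z:.2f}")
print(f"t = {t:.3f} with df = {df}")
```

Both statistics express the same idea: how many standard errors the sample mean lies from the population mean.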
-6b-
2-Sample Parametric Tests of Difference
The second type computes whether data from two samples is significantly differ-
ent. There are two different procedures which are used. The first is used when the two
samples are randomly selected independently of each other: a sample of Texas pastors
and a second sample of Texas ministers of youth. For this situation, use the Indepen-
dent Samples t-test. See Chapter 20.
The second procedure is used when pairs are sampled. Examples of sampling
pairs include husbands and their wives, pastors and their deacon chairmen, fathers
and their sons, counselors and their clients, and so forth. If you have two groups of
paired subjects (husbands and their wives), use the Matched Samples t-test. See
Chapter 20.
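The difference between the two procedures shows up clearly when both are applied to the same numbers. In the sketch below the scores are hypothetical, the function names are ours, and the independent-samples version assumes equal variances in the two groups (the pooled-variance form of the t-test).

```python
from math import sqrt
import statistics

def independent_samples_t(group_a, group_b):
    """Pooled-variance t statistic and df for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

def matched_samples_t(first, second):
    """t statistic and df computed on the differences within each pair."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical scores; position i in each list is one married couple.
husbands = [10, 12, 14, 16, 18]
wives = [9, 12, 12, 15, 17]

t_independent, df_independent = independent_samples_t(husbands, wives)
t_matched, df_matched = matched_samples_t(husbands, wives)

print(f"treated as independent: t = {t_independent:.3f}, df = {df_independent}")
print(f"treated as matched:     t = {t_matched:.3f}, df = {df_matched}")
```

With these made-up scores the matched statistic is much larger than the independent one, because pairing removes the variation between couples from the comparison. This is why the way subjects are sampled must drive the choice of test.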
-6c-
n-Sample Parametric Tests of Difference
The third type computes “significant difference” across three or more groups.
Again, procedures depend on whether the groups are matched (correlated, related) or
independent.
If the groups are independent, and you are examining one independent (“group-
ing”) variable, use One-Way Analysis of Variance. For example, is there a significant
difference in Integration of Faith with Life among Southern Baptists, Episcopalians,
and members of the Assemblies of God? See Chapter 21.
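A One-Way Analysis of Variance boils down to an F ratio: the variation between the group means divided by the variation within the groups. The sketch below computes that ratio for three small groups. The scores are hypothetical, not real data about any denomination, and the function name is ours.

```python
def one_way_anova_f(groups):
    """F ratio with its between-groups and within-groups degrees of freedom."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical Integration of Faith with Life scores for three groups.
group_1 = [21, 22, 23]
group_2 = [22, 23, 24]
group_3 = [26, 27, 28]

f_ratio, df_between, df_within = one_way_anova_f([group_1, group_2, group_3])
print(f"F = {f_ratio:.1f} with df = ({df_between}, {df_within})")
```

A large F means the group means differ by more than the spread within the groups would lead us to expect; the computed value is then compared with a critical value from an F table.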
If you are studying two or more independent variables simultaneously, use n-Way
Analysis of Variance (also called Factorial ANOVA), where n is the number of inde-
pendent variables. The importance of Factorial ANOVA is in the ability to study
-7a-
The Wilcoxon Matched Pairs Test (Wilcoxon T) is analogous to the Matched Samples
t-test.
-7b-
The Wilcoxon Rank Sum Test and the Mann-Whitney U Test are analogous
to the Independent Samples t-test.
-7c-
The Kruskal-Wallis H Test is analogous to the One-Way ANOVA.
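To give a feel for how these rank-based procedures work, here is a minimal sketch of the Mann-Whitney U statistic computed by direct pair counting. The ratings are hypothetical and the function name is ours. Only the statistic is computed; judging significance requires a U table or a normal approximation.

```python
def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U: the smaller of the two pairwise 'win' counts."""
    wins_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins_a += 1      # a outranks b
            elif a == b:
                wins_a += 0.5    # ties are split evenly
    wins_b = len(group_a) * len(group_b) - wins_a
    return min(wins_a, wins_b)

# Hypothetical ordinal ratings from two independent groups.
group_a = [3, 5, 8, 9]
group_b = [1, 2, 4, 6]

u = mann_whitney_u(group_a, group_b)
print(f"U = {u}")
```

Because the statistic depends only on which score in each pair is larger, it uses the order of the scores and not the distances between them, which is what makes it suitable for ordinal data.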
Summary
In this chapter we introduced you to statistical analysis. We linked statistics to the
process of research design. We looked at two major divisions of statistics. We sepa-
rated the practical application of statistical procedures from the need for higher level
mathematics skills. We differentiated statistical differences by measurement type. And
finally, we laid out a mental map of statistical procedures we will be studying so that
you can determine which procedures might be of use to you in your own proposal.
Vocabulary
correlation coefficient: a number which reflects the degree of association between two variables
Cramer’s Phi: measures strength of correlation between two nominal variables
descriptive statistics: measures population or sample variables
Factorial ANOVA: two-way, three-way ANOVA
Goodness of Fit: compares observed counts with expected counts on 1 nominal variable
Independent Samples t-test: tests whether the average scores of two groups are statistically different
Inferential statistics: INFERS population measures from the analysis of samples
Kendall’s tau: correlation coefficient between two sets of ranks (n < 10)
Kendall’s W: correlation coefficient among three or more sets of ranks
Kruskal-Wallis H Test: non-parametric equivalent of ANOVA
Linear Regression: establishes the relationship between one variable and one predictor variable
Mann-Whitney U Test: non-parametric equivalent of the independent t-test
Matched Samples t-test: tests whether the paired scores of two groups are statistically different
Multiple Regression: establishes the relationship between one variable and multiple predictor variables
one-sample z-test: tests whether a sample mean is different from its population mean (n > 30)
one-sample t-test: tests whether a sample average is different from its population average
One-Way ANOVA: tests whether average scores of three or more groups are statistically different
Pearson’s r: correlation coefficient between two interval/ratio variables
Study Questions
1. Differentiate between descriptive and inferential statistics.
2. Consider your own proposal. Review the types of data (Chapter 3). List several statistical
procedures you might consider for your proposal. Scan the chapters in this text which
deal with the procedures you’ve selected.
3. Give one example of each data type (Review Chapter 3). Identify one statistical procedure
for each example you give.
Identify which statistical procedure should be used for the following kinds of studies. Write the
letter of the procedure in the blank.
____ 1. Difference between fathers and their adult sons on a Business Ethics test.
____ 3. Analysis of six predictor variables for job satisfaction in the ministry.
____ 4. Difference in Bible Knowledge test scores across three groups of youth
ministers.
____ 6. Relationship between number of years in the ministry and job satisfaction
score.