Bivariate Analysis: Measures of Association
Measures of Association
Refers to bivariate statistical techniques used to measure the strength of a relationship between two variables.
The chi-square (χ²) test provides information about whether two or more less-than interval variables are interrelated. Correlation analysis is most appropriate for interval or ratio variables. Regression can accommodate less-than interval independent variables, but the dependent variable must be continuous.
Covariance: Extent to which two variables are associated systematically with each other.
Common Bivariate Tests
Type of Measurement          Measure of Association
Nominal                      Chi-Square, Phi Coefficient, Contingency Coefficient
Ordinal Scales               Chi-Square, Rank Correlation
Interval and Ratio Scales    Correlation Coefficient, Bivariate Regression
Walkup's First Laws of Statistics
Law No. 1 Everything correlates with everything, especially when the same individual defines the variables to be correlated.
Law No. 2 It won't help very much to find a good correlation between the variable you are interested in and some other variable that you don't understand any better.
Walkup's First Laws of Statistics
Law No. 3
Unless you can think of a logical reason why two variables should be connected as cause and effect, it doesn't help much to find a correlation between them.
In Columbus, Ohio, the mean monthly rainfall correlates very nicely with the number of letters in the names of the months!
Correlation
... is the measure of association between two variables that are at least interval scaled, such as age and income, or sales and selling expenses.
Correlation
... is a mathematical relationship, measured by a coefficient that ranges from +1 to -1. It can never prove a causal connection. It does, however, give support to an explanation based on logic.
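The correlation coefficient (r) for two variables (X, Y) is r_xy.
Simple Correlation Coefficient
$$ r_{xy} = r_{yx} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} $$
The correlation coefficient r_xy (R) is a measure of the strength and direction of association, on a scale from -1 to +1.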
R ranges from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship). An R near zero reflects the absence of linear association.
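The definition can be sketched in a few lines of Python: the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the brand awareness and market share figures are hypothetical, chosen only for illustration; they do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant variable gives ss = 0, so r is undefined (division by zero).
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical illustration data (assumed, not from the source).
brand_awareness = [10, 20, 30, 40, 50]
market_share = [2, 4, 5, 4, 5]

r = pearson_r(brand_awareness, market_share)
print(round(r, 4))  # 60 / sqrt(1000 * 6) = 0.7746
```

Swapping the two arguments gives the same value, which is why r_xy = r_yx.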
Correlation Patterns
NO CORRELATION R=0
Correlation Patterns
Negative correlation . . . The variables move in opposite directions. A high value on one variable will be associated with a low value on the second variable.
PERFECT NEGATIVE CORRELATION R = -1.0
Positive Correlation
. . . As one variable (x) increases or decreases, the second variable (y) increases or decreases with it. The variables move in the same direction.
Example: Brand Awareness (x) and Market Share (y). Positive correlation: as brand awareness increases, market share increases.
Correlation Coefficient
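[Figure: number line for rxy running from -1 through 0 to +1, with a decision point on each side of 0. Between the two decision points there is no linear correlation; beyond them, toward -1 or +1, there is linear correlation.]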
Correlation Coefficient
Decision Points for Statistical Significance at P < .05

Sample size (n)    Decision point for rxy
5                  .878
6                  .811
7                  .754
8                  .707
9                  .666
10                 .632
100                .196
Statistical Significance
Ho: r = 0
$$ t = r_{xy} \sqrt{\frac{n - 2}{1 - r_{xy}^{2}}} $$
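As a rough sketch, the test statistic for Ho: r = 0 is r multiplied by the square root of (n - 2) divided by (1 - r squared), and it is compared with a critical t value on n - 2 degrees of freedom. The helper name `t_for_r` is hypothetical. The example uses the decision point for n = 10 from the table of sample sizes, which should land at about the two-tailed .05 critical t for 8 degrees of freedom (2.306 in standard t tables).

```python
import math

def t_for_r(r, n):
    """t statistic for testing Ho: r = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("n must be at least 3 so that n - 2 is positive")
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Decision point for n = 10 at P < .05, taken from the table of sample sizes.
t = t_for_r(0.632, 10)
print(round(t, 2))  # about 2.31, close to the tabled critical t of 2.306 for 8 df
```

Any sample r whose absolute value exceeds the decision point for its sample size gives a t beyond the critical value, so the two checks agree.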
Correlation Coefficient Interpretation
[Figure: scatter plot of agreement ratings for "Good Taste" against "High price", each on a scale from Strongly Disagree through Neutral to Strongly Agree.]
Statistical Results: r = -.61, p = .07, n = 100
As the taste of Seven Up increases, the price ...
Simple Correlation Coefficient
Calculation of r
$$ r_{xy} = \frac{6.3389}{\sqrt{(17.837)(5.589)}} = \frac{6.3389}{\sqrt{99.712}} = .635 $$
(p. 629)
Coefficient of Determination
Coefficient of Determination (R²)
A measure obtained by squaring the correlation coefficient; the proportion of the total variance of a variable accounted for by knowing the value of another variable. It measures that part of the total variance of Y that is accounted for by knowing the value of X.
$$ R^{2} = \frac{\text{Explained variance}}{\text{Total variance}} $$
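One way to see why squaring r gives the share of variance accounted for: fit a least-squares line of Y on X, then divide the variation of the fitted values about the mean of Y by the total variation of Y. For a simple linear fit this ratio equals r squared. The sketch below assumes hypothetical data and hypothetical function names; it is an illustration, not the textbook's worked example.

```python
def explained_over_total(xs, ys):
    """Explained variation / total variation for a least-squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * x for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total

def r_squared_direct(xs, ys):
    """Square of the correlation coefficient, computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy ** 2 / (s_xx * s_yy)

# Hypothetical illustration data (assumed, not from the source).
x = [10, 20, 30, 40, 50]
y = [2, 4, 5, 4, 5]

ratio = explained_over_total(x, y)
r2 = r_squared_direct(x, y)
print(round(ratio, 3), round(r2, 3))  # both 0.6: X accounts for 60% of the variance in Y
```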
EXHIBIT 23.3
Correlation Analysis of Number of Hours Worked in Manufacturing Industries with Unemployment Rate
Correlation Matrix
Correlation matrix: The standard form for reporting correlation coefficients for more than two variables.
The Significance of the Correlation: The procedure for determining the statistical significance of a correlation coefficient is the t-test.
Correlation Matrix for 3 variables
The standard form for reporting correlation results.
        Var1    Var2    Var3
Var1    1.0     0.45    0.31
Var2    0.45    1.0     0.10
Var3    0.31    0.10    1.0
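A correlation matrix is just the pairwise correlation coefficient computed for every pair of variables, which is why it has 1.0 on the diagonal and is symmetric about it. The sketch below builds one in plain Python; the function names and the three data columns are hypothetical and will not reproduce the 0.45, 0.31 and 0.10 values shown in the slide.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Return {row_name: {col_name: r}} for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical illustration data (assumed, not from the source).
data = {
    "Var1": [1, 2, 3, 4, 5, 6],
    "Var2": [2, 1, 4, 3, 6, 5],
    "Var3": [5, 3, 6, 2, 4, 1],
}

matrix = correlation_matrix(data)
for row in data:
    # One printed line per variable, coefficients rounded to two decimals.
    print(row, ["%.2f" % matrix[row][col] for col in data])
```

Because the matrix is symmetric, published tables often report only the lower or upper triangle.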
EXHIBIT 23.4
Pearson Product-Moment Correlation Matrix for Salesperson Example (a)
(a) Numbers below the diagonal are for the sample; those above the diagonal are omitted.
(b) p < .001. (c) p < .01. (d) p < .05.
Correlation Does Not Mean Causation
When two variables covary, they display concomitant variation. Systematic covariation (a high correlation) does not in and of itself establish causality.
Roosters crowing and the rising of the sun: the rooster does not cause the sun to rise.
Variables may covary because they are both influenced by a third variable.
Teachers' salaries and the consumption of liquor.
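The third-variable point can be illustrated with a small simulation. In this sketch, which assumes made-up normally distributed data, a hidden variable z drives both x and y, and x and y never influence each other. They still correlate strongly, and the correlation disappears once z is subtracted out. All names and parameters here are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

random.seed(0)  # fixed seed so the run is repeatable

# z is the common cause; x and y each equal z plus their own independent noise.
z = [random.gauss(0, 1) for _ in range(1000)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

r_xy = pearson_r(x, y)

# Remove the shared influence of z and correlate what is left over.
x_rest = [xi - zi for xi, zi in zip(x, z)]
y_rest = [yi - zi for yi, zi in zip(y, z)]
r_rest = pearson_r(x_rest, y_rest)

print(round(r_xy, 2), round(r_rest, 2))  # r_xy is high; r_rest is near zero
```

With these settings the expected correlation between x and y is about 0.8, even though neither variable has any causal effect on the other.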
Excel Spreadsheet
The correlation coefficient for two variables, X and Y, is computed by the following Excel instruction:
=CORREL(array1, array2)
where array1 is the range of cells holding X's data (column 2) and array2 is the range of cells holding Y's data (column 3).
Correlation
[Figure: "Correlation: Player Salary and Ticket Price". Chart of Change in Ticket Price and Change in Player Salary by year, 1995 to 2001, on a vertical axis running from -20 to 30. Correlation coefficient, r = .75.]
Regression Analysis