Categorical Predictors
Shankar Venkatagiri
Scenario 1
Filename: Salaries.R
[Figure: "Salaries by Gender", Salary in $'000s (axis from 110 to 170) for Female and Male]
Open up GenderSalaries.csv
Conditions
[Figure: normal quantile plots of Males$Salary and Females$Salary against norm quantiles]
Results
Attribution
Q: Could there be lurking variables - some other explanation
for the salary difference?
Years of experience?
[Figure: Scatterplot Matrix of YearsExp, Salary, and Group, with Frequency panels; r = 0.36 is printed on the plot]
Confounding
Exp=5
Two Sample t-test of managers with 5 years of experience
data: Females5$Salary and Males5$Salary
t = -0.0885, df = 22, p-value = 0.9303
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval for the difference:
-9.228310 8.472755
sample estimates:
mean of x mean of y
137.7333 138.1111
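The comparison above holds experience fixed and then tests for a salary difference. A minimal sketch of that step follows. It runs on simulated data, because GenderSalaries.csv is not reproduced here: the data frame, the coding 0 = female / 1 = male, and the simulated salary equation are all assumptions made for illustration, so the numbers will not match the slide.

```r
# Simulated stand-in for GenderSalaries.csv (hypothetical values, not the real file).
set.seed(1)
n <- 174
Salaries <- data.frame(
  Group    = rep(c(0, 1), times = c(59, 115)),       # assumed coding: 0 = female, 1 = male
  YearsExp = rep_len(c(2, 5, 10, 15, 20), n)         # assumed experience levels
)
Salaries$Salary <- 133 + 0.9 * Salaries$YearsExp + rnorm(n, sd = 12)

# Hold experience fixed at 5 years, then split by group.
Females5 <- subset(Salaries, Group == 0 & YearsExp == 5)
Males5   <- subset(Salaries, Group == 1 & YearsExp == 5)

# var.equal = TRUE requests the pooled "Two Sample t-test" (integer df),
# rather than R's default Welch test.
res <- t.test(Females5$Salary, Males5$Salary, var.equal = TRUE)
print(res)
```

With the pooled test, the degrees of freedom equal the two subset sizes added together minus 2, which is a quick check that the subsets were built as intended.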
Subsets
Group-wise
WOMEN: lm(formula = Salary ~ YearsExp, data = Females)
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  130.9888      3.3241    39.41
YearsExp       1.1760      0.3882     3.03
Multiple R-squared: 0.1387, Adjusted R-squared: 0.1236
p-value: 0.003677
***************************************************************
MEN: lm(formula = Salary ~ YearsExp, data = Males)
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  135.6001      2.9015   46.735
YearsExp       0.7611      0.2236    3.404  0.00092 ***
Adjusted R-squared: 0.08497
p-value: 0.0009203
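The group-wise fits above can be sketched as two separate calls to lm, one per subset. The data below are simulated stand-ins (GenderSalaries.csv is not available here), and the coding 0 = female / 1 = male is an assumption, so the estimates will differ from the slide.

```r
# Simulated stand-in for GenderSalaries.csv (hypothetical values, not the real file).
set.seed(2)
n <- 174
Salaries <- data.frame(
  Group    = rep(c(0, 1), times = c(59, 115)),   # assumed coding: 0 = female, 1 = male
  YearsExp = runif(n, min = 0, max = 25)
)
Salaries$Salary <- 133 + 0.9 * Salaries$YearsExp + rnorm(n, sd = 12)

# One data frame per group, as in the slide's Females and Males.
Females <- subset(Salaries, Group == 0)
Males   <- subset(Salaries, Group == 1)

# Separate simple regressions of Salary on YearsExp.
fit_f <- lm(Salary ~ YearsExp, data = Females)
fit_m <- lm(Salary ~ YearsExp, data = Males)

# Intercepts and slopes side by side, for contrasting the two lines.
print(rbind(Women = coef(fit_f), Men = coef(fit_m)))
```

Calling summary(fit_f) and summary(fit_m) prints the full coefficient tables in the layout shown above.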
Contrast
Combining
Full model
lm(Salary ~ YearsExp + factor(Group) + factor(Group) * YearsExp,
data = Salaries)
                         Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)              130.9888      3.4902   37.531   < 2e-16
YearsExp                   1.1760      0.4076    2.885   0.00442
factor(Group)1             4.6113      4.4970    1.025   0.30663
YearsExp:factor(Group)1   -0.4149      0.4625   -0.897   0.37088
Residual standard error: 11.79 on 170 degrees of freedom
Multiple R-squared: 0.1352, Adjusted R-squared: 0.1199
F-statistic: 8.859 on 3 and 170 DF, p-value: 1.728e-05
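The full model's estimates line up with the group-wise fits: its intercept and YearsExp slope are those of the Group 0 regression, while the factor(Group)1 and interaction terms are the differences in intercept and slope between the groups (in the output above, 135.6001 - 130.9888 = 4.6113 and 0.7611 - 1.1760 = -0.4149). The sketch below checks this identity on simulated data; the data frame and the 0/1 coding are assumptions for illustration only.

```r
# Simulated stand-in for GenderSalaries.csv (hypothetical values, not the real file).
set.seed(3)
n <- 174
Salaries <- data.frame(
  Group    = rep(c(0, 1), times = c(59, 115)),   # assumed coding: 0 = female, 1 = male
  YearsExp = runif(n, min = 0, max = 25)
)
Salaries$Salary <- 133 + 0.9 * Salaries$YearsExp + rnorm(n, sd = 12)

# Separate fits per group.
fit0 <- lm(Salary ~ YearsExp, data = subset(Salaries, Group == 0))
fit1 <- lm(Salary ~ YearsExp, data = subset(Salaries, Group == 1))

# Single model with a dummy for Group and a Group-by-YearsExp interaction.
full <- lm(Salary ~ YearsExp * factor(Group), data = Salaries)
b <- coef(full)

# Rebuild each group's line from the full model's coefficients.
rebuilt <- rbind(
  Group0 = c(b[["(Intercept)"]], b[["YearsExp"]]),
  Group1 = c(b[["(Intercept)"]] + b[["factor(Group)1"]],
             b[["YearsExp"]] + b[["YearsExp:factor(Group)1"]])
)
colnames(rebuilt) <- c("Intercept", "Slope")
print(rebuilt)
print(rbind(Group0 = coef(fit0), Group1 = coef(fit1)))
```

The point estimates agree exactly, but the standard errors do not: the full model pools the residual variance across both groups.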
Case of déjà vu?
SPSS: Dummy
Interaction
Interpret
Ok to Regress?
Variance
Proceed
Principle of Marginality
Drop
VIFs
No interaction
lm(Salary ~ YearsExp + factor(Group), data = Salaries)
                Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)     133.4676      2.1315   62.616
YearsExp          0.8537      0.1925
factor(Group)1    1.0242      2.0576    0.498     0.619
p-value: 6.047e-06
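The sequence of models in these slides (interaction, then dummy only, then YearsExp alone) respects the principle of marginality: the interaction is examined first, and the Group main effect is considered for removal only once the interaction has been dropped. One way to sketch that sequence is a nested-model comparison with anova. This is an illustrative alternative to reading the individual t-tests, not necessarily the procedure used in Salaries.R, and the data are simulated stand-ins with an assumed 0/1 coding.

```r
# Simulated stand-in for GenderSalaries.csv (hypothetical values, not the real file).
set.seed(4)
n <- 174
Salaries <- data.frame(
  Group    = rep(c(0, 1), times = c(59, 115)),   # assumed coding: 0 = female, 1 = male
  YearsExp = runif(n, min = 0, max = 25)
)
Salaries$Salary <- 133 + 0.9 * Salaries$YearsExp + rnorm(n, sd = 12)

# Three nested models, from smallest to largest.
simple   <- lm(Salary ~ YearsExp, data = Salaries)
additive <- lm(Salary ~ YearsExp + factor(Group), data = Salaries)
full     <- lm(Salary ~ YearsExp * factor(Group), data = Salaries)

# Row 2 tests adding the Group dummy; row 3 tests adding the interaction.
cmp <- anova(simple, additive, full)
print(cmp)
```

A large p-value in row 3 supports dropping the interaction, and a large p-value in row 2 then supports dropping the dummy as well.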
SPSS
Q: Is D significant?
Conclusion
R
lm(formula = Salary ~ YearsExp, data = Salaries)
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)  133.7420      2.0545   65.098
YearsExp       0.8920      0.1761
p-value: 1.039e-06
Scenario 2
Open up BankSalaries.csv
Difference
[Figure: "Salaries by Gender" for Female and Male (Salary axis from 30000 to 90000), with normal quantile plots of Females$Salary and the other group against norm quantiles]
Dummy
Estimate  Std. Error  t value
 37209.9       894.5   41.597
  8295.5      1564.5
p-value: 2.935e-07
Regress
Estimate  Std. Error  t value
 34528.3      1138.0   30.342
   280.0       102.5    2.733
 -4098.3      1665.8   -2.460
  1247.8       136.7    9.130
0.6333
Visualise
Improved
Final