S Doc1
Q1. To decide whether a company is discriminating against women, the following data were
collected from the company's records: Salary is the annual salary in thousands of dollars,
Qualification is an index of employee qualification, and Gender is an indicator (1 if the employee is a man,
0 if the employee is a woman). Two linear models were fit to the data and the
regression outputs are shown in the tables below. Suppose that the usual regression
assumptions hold.
(a) Are men paid more than equally qualified women?
(b) Are men less qualified than equally paid women?
(c) Do you detect any inconsistency in the above results? Explain.
(d) Which model would you advocate if you were the defense lawyer? Explain.
Model 1: Dependent Variable is Salary
Variable Coefficient s.e. t-Test p-Value
Constant 20009.5 0.8244 24271 <0.0001
Qualification 0.935253 0.0500 18.7 <0.0001
Gender 0.224337 0.4681 0.479 0.6329
Model 2: Dependent Variable is Qualification
Variable Coefficient s.e. t-Test p-Value
Constant -16744.4 896.4 -18.7 <0.0001
Gender 0.850979 0.4349 1.96 0.0532
Salary 0.836991 0.0448 18.7 <0.0001
Ans 1.
(a) Using Model 1, the regression equation for Salary (response variable) and Qualification and
Gender (predictor variables) is found to be:
Salary = 20009.5 + 0.935253 Q + 0.224337 G + \(\varepsilon\)    (1)
Gender is categorical:
0 ----------- Women
1 ----------- Men
Keeping the qualification constant, the salary for women and men can be determined as:
Salary for women: E[Salary | Qualification, Gender = 0]
= \(\beta_0 + \beta_1 Q + \beta_2 G\) (from (1))
= 20009.5 + 0.935253 Q (substituting G = 0 in (1))
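Salary for men: E[Salary | Qualification, Gender = 1]
= \(\beta_0 + \beta_1 Q + \beta_2 G\) (from (1))
= \((\beta_0 + \beta_2) + \beta_1 Q\) (substituting G = 1 in (1))
= 20009.72 + 0.935253 Q
When qualifications are the same, the difference in salary between men and women
= 20009.72 - 20009.5
= 0.22 (the Gender coefficient, 0.224337)
(b) Using Model 2, the regression equation for Qualification (response variable) and Gender and
Salary (predictor variables) is:
Qualification = -16744.4 + 0.850979 G + 0.836991 S + \(\varepsilon\)    (2)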
Gender is categorical:
0 ----------- Women
1 ----------- Men
Keeping the salary constant, the qualification for women and men can be determined as:
Qualification for women: E[Q | Salary, Gender = 0]
= \(\beta_0 + \beta_1 G + \beta_2 S\) (from (2))
= -16744.4 + 0.836991 S (substituting G = 0 in (2))
Qualification for men: E[Q | Salary, Gender = 1]
= \(\beta_0 + \beta_1 G + \beta_2 S\) (from (2))
= -16744.4 + 0.850979 + 0.836991 S (substituting G = 1 in (2))
= -16743.55 + 0.836991 S
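When salaries are the same, the difference in qualification between women and men
= -16744.4 - (-16743.55)
= -0.85
Thus, on average, women score 0.85 index points lower on Qualification than equally paid men. In other words, men are more qualified, not less qualified, than equally paid women (Gender coefficient 0.850979, p-value 0.0532).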
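The group comparisons in (a) and (b) only depend on the constant and the Gender coefficient of each model. A minimal sketch of that arithmetic follows; the dictionary layout and the function names are illustrative assumptions, and the numbers are copied from the two regression tables.

```python
# Coefficients copied from the two regression tables (dictionary layout is an assumption).
MODEL_1 = {"constant": 20009.5, "slope": 0.935253, "gender": 0.224337}   # Salary on Q and G
MODEL_2 = {"constant": -16744.4, "slope": 0.836991, "gender": 0.850979}  # Q on G and Salary


def group_intercepts(model):
    """Return (women, men) intercepts implied by the Gender dummy (0 = woman, 1 = man)."""
    women = model["constant"]                    # G = 0 drops the Gender term
    men = model["constant"] + model["gender"]    # G = 1 adds the Gender coefficient
    return women, men


def men_minus_women(model):
    """Fitted difference (men minus women) when the other predictor is held fixed."""
    women, men = group_intercepts(model)
    return men - women


if __name__ == "__main__":
    for name, model in (("Model 1", MODEL_1), ("Model 2", MODEL_2)):
        women, men = group_intercepts(model)
        print(f"{name}: women {women:.2f}, men {men:.2f}, men - women {men - women:.2f}")
```

Because both lines in a model share the same slope, the gap between the groups equals the Gender coefficient at every value of the other predictor.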
Q2. The table below shows the regression output of a multiple regression model relating the
beginning salaries in dollars of employees in a given company to the following predictor
variables:
Gender An indicator variable ( 1= Man and 0 = Woman)
Education Years of schooling at the time of hire
Experience Number of months of previous work experience
Months Number of months with the company
In (a) and (b) below, specify the null and the alternative hypotheses, the test used, and your
conclusion using a 5% level of significance.
(a) Conduct the F-test for the overall fit of the regression.
(b) Is there a positive linear relationship between Salary and Experience, after accounting
for the effect of the variables Gender, Education and Months?
(c) What salary would you forecast for a man with 12 years of education, 10 months of
experience, and 15 months with the company?
(d) What salary would you forecast, on an average, for men with 12 years of education, 10
months of experience and 15 months with the company?
(e) What salary would you forecast, on an average, for women with 12 years of education,
10 months of experience and 15 months with the company?
Regression output when salary is related to four predictor variables
ANOVA Table
Source Sum of Squares df Mean Square F-Test
Regression 23665352 4 5916338 22.98
Residuals 22657938 88 257477
Ans 2.
(a) To test the overall fit of the model:
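RM: \(H_0\): \(Y = \beta_0 + \varepsilon\)
FM: \(H_1\): \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon\)
where \(X_1\) = Gender, \(X_2\) = Education, \(X_3\) = Experience and \(X_4\) = Months.
(Alternatively, we can have the following hypotheses:
\(H_0\): \(\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0\)
\(H_1\): not \(H_0\))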
The ANOVA table gives the observed F-statistic as 22.98.
The critical F-value with 4 and 88 degrees of freedom at the 5% (0.05) level of significance is 2.475.
Since the observed F-value is greater than the critical value, the null hypothesis is rejected. Thus, at
least one of the \(\beta_j\) (j = 1, ..., 4) is not zero.
The goodness of fit can also be seen from the value of \(R^2\) = 0.515, which implies that the predictor
variables considered in our analysis explain about 51% of the variation in the response variable.
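The F-statistic used in (a) can be reproduced from the ANOVA table as the ratio of the two mean squares. The sketch below assumes only the sums of squares and degrees of freedom printed in the table; the critical value 2.475 is taken from the answer above rather than computed, and the function name is illustrative.

```python
def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    """Overall F-statistic: mean square for regression over mean square for residuals."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual


# Values copied from the ANOVA table.
f_observed = f_statistic(23665352, 4, 22657938, 88)

# Critical value F(4, 88) at the 5% level, as quoted in the answer (not computed here).
F_CRITICAL = 2.475

reject_h0 = f_observed > F_CRITICAL

if __name__ == "__main__":
    print(f"F = {f_observed:.2f}, reject H0: {reject_h0}")
```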
(b) To test a positive linear relationship between Salary and Experience, we can have the
following null and alternate hypotheses:
Coefficients Table
Variable Coefficient s.e. t-Test p-value
Constant 3526.4 327.7 10.76 0.000
Gender 722.5 117.8 6.13 0.000
Education 90.02 24.69 3.65 0.000
Experience 1.2690 0.5877 2.16 0.034
Months 23.406 5.201 4.50 0.000
n = 93, \(R^2\) = 0.515, \(R_a^2\) = 0.489, \(\hat{\sigma}\) = 507.4, df = 88
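RM: \(H_0\): \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_4 X_4 + \varepsilon\)
FM: \(H_1\): \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon\)
(Alternatively, with \(\beta_3\) the coefficient of Experience, we can have the following hypotheses:
\(H_0\): \(\beta_3 = 0\)
\(H_1\): not \(H_0\))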
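The t-statistic for Experience is 2.16 and its p-value, 0.034, is less than 0.05, so the null hypothesis is rejected at the 5% level. Moreover, the estimated coefficient of
Experience is positive (+1.2690).
Hence, we can say that there is a positive linear relationship between Salary and Experience, after accounting for Gender, Education and Months.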
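The t-statistic for Experience is simply the coefficient divided by its standard error. The sketch below recomputes it from the Coefficients Table. It also halves the reported p-value, which is valid for a one-sided alternative \(\beta_3 > 0\) when the estimate is positive; reading the question as one-sided is an assumption here, since the hypotheses above are stated as two-sided.

```python
def t_statistic(coefficient, standard_error):
    """t-statistic for H0: coefficient = 0."""
    return coefficient / standard_error


def one_sided_p(two_sided_p, t_value):
    """p-value for H1: coefficient > 0, derived from a two-sided p-value."""
    if t_value > 0:
        return two_sided_p / 2
    return 1 - two_sided_p / 2


# Experience row of the Coefficients Table.
t_experience = t_statistic(1.2690, 0.5877)
p_two_sided = 0.034                      # as reported in the table
p_one_sided = one_sided_p(p_two_sided, t_experience)

if __name__ == "__main__":
    print(f"t = {t_experience:.2f}, one-sided p = {p_one_sided:.3f}")
```

Under either reading the p-value is below 0.05, so the conclusion in (b) does not change.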
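(c) \(\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3 + \hat{\beta}_4 X_4\)
= 3526.4 + 722.5 (1) + 90.02 (12) + 1.2690 (10) + 23.406 (15)
\(\hat{Y}\) = 5692.92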
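The point forecast is the fitted equation evaluated at the given predictor values. A minimal sketch, with the coefficients copied from the Coefficients Table; the dictionary keys and the function name are illustrative assumptions.

```python
# Estimated coefficients from the Coefficients Table.
COEFFICIENTS = {
    "constant": 3526.4,
    "gender": 722.5,        # 1 = man, 0 = woman
    "education": 90.02,     # years of schooling
    "experience": 1.2690,   # months of previous work experience
    "months": 23.406,       # months with the company
}


def forecast_salary(gender, education, experience, months):
    """Evaluate the fitted regression equation at the given predictor values."""
    return (
        COEFFICIENTS["constant"]
        + COEFFICIENTS["gender"] * gender
        + COEFFICIENTS["education"] * education
        + COEFFICIENTS["experience"] * experience
        + COEFFICIENTS["months"] * months
    )


if __name__ == "__main__":
    print(f"man:   {forecast_salary(1, 12, 10, 15):.2f}")
    print(f"woman: {forecast_salary(0, 12, 10, 15):.2f}")
```

Setting the gender argument to 0 instead of 1 lowers the forecast by exactly the Gender coefficient, 722.5.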
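(d) We need to find the confidence interval for \(\mu_0\), which is given by:
[\(\hat{\mu}_0 - t_{(n-p-1,\,\alpha/2)}\) * s.e.(\(\hat{\mu}_0\)), \(\hat{\mu}_0 + t_{(n-p-1,\,\alpha/2)}\) * s.e.(\(\hat{\mu}_0\))]
\(\hat{\mu}_0\) is found by substituting the given values in the fitted regression equation:
\(\hat{\mu}_0\) = 3526.4 + 722.5 (1) + 90.02 (12) + 1.2690 (10) + 23.406 (15)
\(\hat{\mu}_0\) = 5692.92
The t-value (from the table) with n-p-1 = 93-4-1 = 88 degrees of freedom at \(\alpha/2\) = 0.05/2 = 0.025 is:
t-value = 1.987
The exact s.e.(\(\hat{\mu}_0\)) = \(\hat{\sigma} \sqrt{x_0^T (X^T X)^{-1} x_0}\) requires \((X^T X)^{-1}\), which the regression output does not report, so the residual standard error is used in its place:
s.e.(\(\hat{\mu}_0\)) \(\approx \hat{\sigma} = \sqrt{\sum (y_i - \hat{y}_i)^2 / (n-p-1)} = \sqrt{MSE}\) = 507.42
Confidence interval: [5692.92 - (1.987 * 507.42), 5692.92 + (1.987 * 507.42)]
= [4684.68, 6701.16]
Thus, the average salary for men with the given characteristics is estimated to lie between $4684.68 and
$6701.16.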
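(e) We need to find the confidence interval for \(\mu_0\), which is given by:
[\(\hat{\mu}_0 - t_{(n-p-1,\,\alpha/2)}\) * s.e.(\(\hat{\mu}_0\)), \(\hat{\mu}_0 + t_{(n-p-1,\,\alpha/2)}\) * s.e.(\(\hat{\mu}_0\))]
\(\hat{\mu}_0\) is found by substituting the given values in the fitted regression equation:
\(\hat{\mu}_0\) = 3526.4 + 722.5 (0) + 90.02 (12) + 1.2690 (10) + 23.406 (15)
\(\hat{\mu}_0\) = 4970.42
The t-value (from the table) with n-p-1 = 93-4-1 = 88 degrees of freedom at \(\alpha/2\) = 0.05/2 = 0.025 is:
t-value = 1.987
s.e.(\(\hat{\mu}_0\)) is again taken as the residual standard error, since the exact value \(\hat{\sigma} \sqrt{x_0^T (X^T X)^{-1} x_0}\) cannot be computed from the regression output:
s.e.(\(\hat{\mu}_0\)) \(\approx \hat{\sigma} = \sqrt{\sum (y_i - \hat{y}_i)^2 / (n-p-1)} = \sqrt{MSE}\) = 507.42
Confidence interval: [4970.42 - (1.987 * 507.42), 4970.42 + (1.987 * 507.42)]
= [3962.18, 5978.66]
Thus, the average salary for women with the given characteristics is estimated to lie between $3962.18 and
$5978.66.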
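The interval arithmetic in (d) and (e) can be checked with a few lines. This sketch follows the answer as written: it plugs the residual standard error 507.42 in as the standard error and uses the tabulated t-value 1.987. Both numbers are taken from the text, not computed, and the function name is an illustrative assumption.

```python
T_VALUE = 1.987        # t with 88 degrees of freedom at alpha/2 = 0.025, as quoted in the answer
STD_ERROR = 507.42     # residual standard error, used in the answer as the standard error


def interval(center, t_value=T_VALUE, std_error=STD_ERROR):
    """Return (lower, upper) = center -/+ t_value * std_error."""
    half_width = t_value * std_error
    return center - half_width, center + half_width


men_low, men_high = interval(5692.92)       # point estimate from (d)
women_low, women_high = interval(4970.42)   # point estimate from (e)

if __name__ == "__main__":
    print(f"men:   [{men_low:.2f}, {men_high:.2f}]")
    print(f"women: [{women_low:.2f}, {women_high:.2f}]")
```

Both intervals have the same half-width, so they differ only by the 722.5 shift in the point estimate.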