
BUS 173: Multiple Regression Project

The document discusses a project analyzing the effects of various factors on employee starting salary. It aims to examine how age, education, gender, work experience, and minority classification impact salary. It provides the estimated regression equation relating these independent variables to the dependent variable of starting salary. It also explains the meaning of the coefficients in the equation and asks several questions about interpreting the results, such as determining the predicted salary for a specific employee and identifying significant factors based on p-values.

Uploaded by

Nahian Sattar

BUS 173 Zkh3

Applied Statistics (North South University)


PROJECT ON MULTIPLE REGRESSIONS

Instructed by:

Dr. Zakir Hossain (ZKH3)

Course Instructor

Department of Management

North South University

BUS 173: Applied Statistics

Submitted by:

MD. Mazbaul Islam | 1620236030

Sayed Rubayet Hossain | 1821513030

Sazzat Hossain | 1610721030

Md. Rezowan | 1731468030

Rafsan Jahangir | 1731307030

Section 17

Team 4

North South University

Submission date: December 14, 2019


December 14, 2019

Dr. Zakir Hossain (ZKH3)

Course Instructor
Department of Management

North South University

Subject: Submission of report.

Dear Sir,

We are pleased to submit the following report of the project on multiple regressions as a part
of the applied statistics course. The following report contains the overall development of the
topic.

The main purpose of this report is to be able to apply the theories learnt so far in the course.
We appreciate the opportunity of doing this report. We are certain that this report will merit
your approval.

Sincerely yours,

Section 17, Team 4.

MD. Mazbaul Islam (1620236030)

Sayed Rubayet Hossain (1821513030)

Sazzat Hossain (1610721030)

Md. Rezowan (1731468030)

Rafsan Jahangir (1731307030)


Acknowledgement

We would like to express our gratitude to our respected faculty Dr. Zakir Hossain (ZKH3) for
giving us the opportunity to write a report on this topic and for guiding us throughout the
course while we prepared this report on multiple regressions. We would also like to thank him
for giving his valuable time and effort to make sure that we learned something fun and
effective that we will be able to apply in real life in the long run.


Table of Contents

Acknowledgement

Project outline

Answer to question number 1

Answer to question number 2

Answer to question number 3

Answer to question number 4

Answer to question number 5

Answer to question number 6

Answer to question number 7

Statistical Data Output

Regression

Notes
Variables Entered/Removed
Model Summary
ANOVA
Coefficients


Project outline:

BUS173.17 - Project (ZKH3)


Fall 2019

The goal of the project is to examine the effect of age, education, gender, work experience
and minority classification on the beginning salary of the employees in a company. Different
data sets are given to different groups (individual) of students.

[Use SPSS/MINITAB/any other computer packages for data analysis]


1. Introduction: Write a paragraph about introduction of the project including data and
variables.
2. Find the estimated regression equation of the beginning salary on the age, education,
gender, and work experience and minority classification.
3. Explain the meaning of the estimated regression coefficients (slope) values.
4. What is the value of coefficient of multiple determination? Interpret this value.
5. What is the predicted beginning salary for the employee who is non-white male of age
30 years, and has 20 years of education with 12 years of work experience?
6. Determine the 95% confidence intervals for the regression coefficients.
7. Discussion and conclusion: Write two paragraphs: one for association between
dependent and independent variables [use sign of estimated coefficients] and the other
about the significant (important) factors for determining the beginning salary of the
employees on the basis of your results obtained from the analysis [use p-values].

Answer to question number 1:

We want to investigate the effect of age (x1), education (x2), gender (x3), work experience (x4)
and minority classification (x5) on the beginning salary (y) of the employees in a company.
A sample of 230 employees was used, with age, education, gender, work experience and
minority classification as the independent variables and beginning salary as the
dependent variable.

Answer to question number 2:

The estimated regression equation of the beginning salary on age, education, gender, work
experience and minority classification is:

ŷ = a + b1x1 + b2x2 + b3x3 + b4x4 + b5x5

Beginning Salary = -673.716 + (29.535×age) + (526.941×education) + (-1597.772×gender) +
(16.183×work experience) + (-998.747×minority classification)

Answer to question number 3:

Explanation of the estimated regression coefficient values:

First of all, x1 represents age, x2 represents education, x3 represents gender, x4 represents
work experience and x5 represents minority classification.

The value a = -673.716 in the estimated regression equation gives the value of ŷ when
x1 = 0, x2 = 0, x3 = 0, x4 = 0 and x5 = 0. That is, the predicted beginning salary of a white
male employee of age 0 with no education and no work experience is -673.716 dollars. No
employee can have these values, so the intercept has no practical meaning on its own; it
only fixes the level of the fitted equation.

The variables are measured and coded as follows:

Variable                   Unit / coding
Age of employee            years
Education                  years
Work experience            years
Gender                     Male (0), Female (1)
Minority classification    White (0), Non-white (1)

*The category coded 0 (male, white) is the reference group.

*The negative sign of b3 means female employees get $1597.772 less than male employees
on average.

*The negative sign of b5 means non-whites get $998.747 less than whites on average.

Therefore,

The value of b1 = 29.535 means that the beginning salary increases by $29.535 on average for
each additional year of age when x2, x3, x4 and x5 are held constant.

The value of b2 = 526.941 means that the beginning salary increases by $526.941 on average
for each additional year of education when x1, x3, x4 and x5 are held constant.

The value of b3 = -1597.772 means that female employees get $1597.772 less than male
employees on average when x1, x2, x4 and x5 are held constant.

The value of b4 = 16.183 means that the beginning salary increases by $16.183 on average for
each additional year of work experience when x1, x2, x3 and x5 are held constant.

The value of b5 = -998.747 means that non-whites get $998.747 less than whites on average
when x1, x2, x3 and x4 are held constant.

[All the values are found from the Coefficients table of our Analysis]
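
The "held constant" reading of each slope can be checked numerically. The following is a minimal Python sketch, not part of the original SPSS analysis: the coefficients come from the Coefficients table, while the baseline employee and the names predict_salary and effect_of_unit_change are hypothetical and chosen only for illustration. Raising one variable by one unit and leaving the others unchanged should move the prediction by exactly that variable's coefficient.

```python
# Coefficients copied from the report's Coefficients table.
INTERCEPT = -673.716
COEF = {
    "age": 29.535,
    "education": 526.941,
    "gender": -1597.772,    # 0 = male, 1 = female
    "work": 16.183,
    "minority": -998.747,   # 0 = white, 1 = non-white
}


def predict_salary(employee):
    """Fitted beginning salary for a dict of predictor values."""
    return INTERCEPT + sum(COEF[name] * employee[name] for name in COEF)


# Hypothetical baseline employee (an assumption, used only for the comparison).
base = {"age": 30, "education": 16, "gender": 0, "work": 5, "minority": 0}


def effect_of_unit_change(name):
    """Change in predicted salary when one variable rises by 1, others fixed."""
    changed = dict(base)
    changed[name] += 1
    return predict_salary(changed) - predict_salary(base)


effects = {name: round(effect_of_unit_change(name), 3) for name in COEF}
for name, value in effects.items():
    print(f"{name:10s} {value:10.3f}")
```

For the two dummy variables, the one-unit change is the switch from the reference group (0) to the other group (1), which is why b3 and b5 are read as group differences.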


Answer to question number 4:

The coefficient of multiple determination, R2 = 0.491

The value 0.491 (49.1%) indicates that the model explains just under half of the variability of
the response data around its mean. In general, the higher the value of R2, the better the model
fits the data: the more variance that is accounted for by the regression model, the closer the
data points fall to the fitted regression equation. Here, 49.1% of the total variation in the
beginning salary is explained by age, education, gender, work experience and minority
classification, and the remaining 50.9% depends on other factors.

[Values are taken from the Model Summary table of our Analysis]
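
As a cross-check, R2 can be recomputed from the sums of squares in the ANOVA table. This is a small Python sketch rather than part of the SPSS output; the variable names are illustrative, and the adjusted R2 uses the usual formula with n = 230 observations and k = 5 predictors.

```python
# Sums of squares copied from the ANOVA table of the SPSS output.
ss_regression = 870259235.5
ss_residual = 901739069.8
n = 230   # sample size
k = 5     # number of independent variables

ss_total = ss_regression + ss_residual

# R2 is the share of total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R2 penalises the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R2          = {r_squared:.3f}")      # R2          = 0.491
print(f"Adjusted R2 = {adj_r_squared:.3f}")  # Adjusted R2 = 0.480
```

Both values agree with the Model Summary table (.491 and .480).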

Answer to question number 5:

The formula which we get is

ŷ = a + b1x1 + b2x2 + b3x3 + b4x4 + b5x5 (1)

Now, the predicted beginning salary for an employee who is a non-white male aged 30 with
20 years of education and 12 years of work experience will be

ŷ = -673.716 + (29.535×30) + (526.941×20) + (-1597.772×0) + (16.183×12) + (-998.747×1) [From 1]

= -673.716 + 886.05 + 10538.82 - 0 + 194.196 - 998.747

= 9946.603
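
The same arithmetic can be reproduced term by term. The sketch below is in Python and is only an illustration of the calculation above; the dictionary keys are names chosen here, not SPSS variable names.

```python
# Intercept and slopes from the Coefficients table.
intercept = -673.716
coefficients = {
    "age": 29.535,
    "education": 526.941,
    "gender": -1597.772,    # 0 = male, 1 = female
    "work": 16.183,
    "minority": -998.747,   # 0 = white, 1 = non-white
}

# The employee described in question 5: non-white male, aged 30,
# 20 years of education, 12 years of work experience.
employee = {"age": 30, "education": 20, "gender": 0, "work": 12, "minority": 1}

# Contribution of each variable to the prediction (coefficient x value).
contributions = {name: coefficients[name] * employee[name] for name in coefficients}

predicted = intercept + sum(contributions.values())
print(f"predicted = {predicted:.3f}")  # predicted = 9946.603
```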

Answer to question number 6:

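The 95% confidence intervals for the regression coefficients are found from

b ± t × sb

where sb is the standard error of the coefficient and t is the critical value of the
t distribution for 95% confidence with n - k - 1 = 230 - 5 - 1 = 224 degrees of freedom,
which is approximately 1.971. (The t column of the Coefficients table is the test statistic
b / sb, not this critical value.)

For B1 (Age)

b1 ± t sb1

= 29.535 ± (1.971×18.459)

= 29.535 ± 36.383

= -6.848 to 65.918

For B2 (Education)

b2 ± t sb2

= 526.941 ± (1.971×52.725)

= 526.941 ± 103.921

= 423.020 to 630.862

For B3 (Gender)

b3 ± t sb3

= -1597.772 ± (1.971×307.051)

= -1597.772 ± 605.198

= -2202.970 to -992.574

For B4 (Work experience)

b4 ± t sb4

= 16.183 ± (1.971×25.322)

= 16.183 ± 49.910

= -33.727 to 66.093

For B5 (Minority Classification)

b5 ± t sb5

= -998.747 ± (1.971×317.155)

= -998.747 ± 625.113

= -1623.860 to -373.634

The intervals for age and work experience contain 0, while those for education, gender and
minority classification do not.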

[All the values are found from the Coefficients table of our Analysis]
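
A 95% confidence interval for a coefficient is conventionally b ± t × sb, where t is the critical value of the t distribution with n - k - 1 = 224 degrees of freedom and not the coefficient's own t statistic. The Python sketch below illustrates this with the B and Std. Error values from the Coefficients table. The critical value 1.971 is an assumption taken as an approximate t-table value, not a figure reported in the SPSS output, and the function name confidence_interval is illustrative.

```python
# Approximate two-sided 95% critical value of t with 224 degrees of freedom.
# Assumption: read from a t table; it is not part of the SPSS output.
T_CRIT = 1.971

# (B, Std. Error) pairs copied from the Coefficients table.
table = {
    "age": (29.535, 18.459),
    "education": (526.941, 52.725),
    "gender": (-1597.772, 307.051),
    "work": (16.183, 25.322),
    "minority": (-998.747, 317.155),
}


def confidence_interval(b, se, t_crit=T_CRIT):
    """Return (lower, upper) limits of b +/- t_crit * se."""
    half_width = t_crit * se
    return b - half_width, b + half_width


intervals = {name: confidence_interval(b, se) for name, (b, se) in table.items()}

for name, (lower, upper) in intervals.items():
    contains_zero = lower < 0 < upper
    print(f"{name:10s} {lower:10.3f} to {upper:10.3f}  contains 0: {contains_zero}")
```

An interval that contains 0 corresponds to a coefficient that is not significant at the 5% level, so the intervals and the Sig. column of the Coefficients table should tell the same story.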


Answer to question number 7:

Sign of estimated coefficients: Here age (29.535), educational level (526.941) and work
experience (16.183) are positively associated with beginning salary. On the other hand,
gender (-1597.772) and minority classification (-998.747) are negatively associated with
beginning salary, meaning that female and non-white employees start on lower salaries on
average.

Significant factors: Here α = 5% (1 - 0.95), or 0.05. On the basis of the results obtained
from the analysis, the p-values are 0.000 for both educational level and sex of employee and
0.002 for minority classification. All three are below 0.05, which signifies that these three
are significant factors for determining beginning salary. The p-value of age of employee is
0.111 and that of work experience is 0.523. Both are above 0.05, which signifies that age of
employee and work experience are not significant factors for determining beginning salary.
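
The significance decision can be laid out as a simple rule: a factor is significant when its p-value is below α = 0.05. The Python sketch below applies that rule to the Sig. column of the Coefficients table and also recomputes each t statistic as B divided by its standard error. The variable names are illustrative, and a p-value printed by SPSS as .000 is stored here as 0.000, although it only means the value is smaller than 0.0005.

```python
ALPHA = 0.05  # significance level, 1 - 0.95

# (B, Std. Error, Sig.) copied from the Coefficients table.
rows = {
    "age": (29.535, 18.459, 0.111),
    "education": (526.941, 52.725, 0.000),
    "gender": (-1597.772, 307.051, 0.000),
    "work": (16.183, 25.322, 0.523),
    "minority": (-998.747, 317.155, 0.002),
}

# t statistic of each coefficient: estimate divided by its standard error.
t_stats = {name: b / se for name, (b, se, _) in rows.items()}

# A factor is significant when its p-value is below ALPHA.
significant = [name for name, (_, _, p) in rows.items() if p < ALPHA]
not_significant = [name for name in rows if name not in significant]

for name in rows:
    print(f"{name:10s} t = {t_stats[name]:7.3f}  p = {rows[name][2]:.3f}")
print("significant:    ", significant)
print("not significant:", not_significant)
```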


Statistical Data Output:


GET
FILE='C:\Users\Classroom\Desktop\report\g4_230.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
REGRESSION /MISSING LISTWISE /STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10) /NOORIGIN /DEPENDENT salbeg
/METHOD=ENTER age edlevel sex work minority.

Regression
Notes

Output Created: 07-DEC-2019 [Link]
Comments:

Input
  Data: C:\Users\Classroom\Desktop\report\g4_230.sav
  Active Dataset: DataSet1
  File Label: SPSS/PC+
  Filter: <none>
  Weight: <none>
  Split File: <none>
  N of Rows in Working Data File: 230

Missing Value Handling
  Definition of Missing: User-defined missing values are treated as missing.
  Cases Used: Statistics are based on cases with no missing values for any variable used.

Syntax
  REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT salbeg
  /METHOD=ENTER age edlevel sex work minority.

Resources
  Processor Time: [Link].02
  Elapsed Time: [Link].05
  Memory Required: 4976 bytes
  Additional Memory Required for Residual Plots: 0 bytes

[DataSet1] C:\Users\Classroom\Desktop\report\g4_230.sav

Variables Entered/Removed (a)

Model   Variables Entered                                   Variables Removed   Method
1       MINORITY CLASSIFICATION, EDUCATIONAL LEVEL,         .                   Enter
        WORK EXPERIENCE, SEX OF EMPLOYEE,
        AGE OF EMPLOYEE (b)

a. Dependent Variable: BEGINNING SALARY
b. All requested variables entered.

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .701 (a)   .491       .480                2006.395

a. Predictors: (Constant), MINORITY CLASSIFICATION, EDUCATIONAL LEVEL, WORK EXPERIENCE,
SEX OF EMPLOYEE, AGE OF EMPLOYEE
ANOVA (a)

Model 1       Sum of Squares   df    Mean Square    F        Sig.
Regression    870259235.5      5     174051847.1    43.236   .000 (b)
Residual      901739069.8      224   4025620.848
Total         1771998305       229

a. Dependent Variable: BEGINNING SALARY
b. Predictors: (Constant), MINORITY CLASSIFICATION, EDUCATIONAL LEVEL, WORK EXPERIENCE,
SEX OF EMPLOYEE, AGE OF EMPLOYEE

Coefficients (a)

                          Unstandardized Coefficients   Standardized Coefficients
Model 1                   B            Std. Error       Beta                        t        Sig.
(Constant)                -673.716     999.580                                      -.674    .501
AGE OF EMPLOYEE           29.535       18.459           .128                        1.600    .111
EDUCATIONAL LEVEL         526.941      52.725           .552                        9.994    .000
SEX OF EMPLOYEE           -1597.772    307.051          -.286                       -5.204   .000
WORK EXPERIENCE           16.183       25.322           .052                        .639     .523
MINORITY CLASSIFICATION   -998.747     317.155          -.152                       -3.149   .002

a. Dependent Variable: BEGINNING SALARY
