Trees and Neural Networks in Salary Prediction

The document outlines an assignment focused on tree-based methods and neural networks using the Hitters dataset. It includes tasks for data preparation, fitting regression and classification trees, applying bagging, random forests, and boosting, as well as fitting a neural network and comparing its performance with previous models. The assignment emphasizes model evaluation through test MSE and encourages exploration of regularization effects on the neural network.


EE353 Assignment 4

Date: 1-Nov-2025
Coding Assignment: Trees and Neural Networks

Trees
1. The goal of this exercise is to explore a variety of tree-based regression and classification
methods using the Hitters dataset from the ISLP package. We will predict both the
quantitative and qualitative aspects of player salaries using regression trees, bagging,
random forests, and boosting.
1. Data Preparation.
(a) Remove all observations for which the Salary variable is missing.
(b) Create a new variable HighSalary, defined as Yes if Salary exceeds the median
salary and No otherwise.
(c) Split the data into appropriate training and test sets.
(d) For regression tasks, use log(Salary) as the response. For classification tasks,
use HighSalary as the response.
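The data-preparation steps above can be sketched in Python with pandas and scikit-learn. The small synthetic DataFrame below is only a stand-in so the snippet runs on its own; in the assignment you would instead load the real data with `from ISLP import load_data; Hitters = load_data('Hitters')`:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Hitters data (assumption: the real frame
# comes from ISLP's load_data('Hitters') and has many more columns).
rng = np.random.default_rng(0)
Hitters = pd.DataFrame({
    'Hits':   rng.integers(1, 240, size=100),
    'Years':  rng.integers(1, 20, size=100),
    'Salary': np.where(rng.random(100) < 0.1, np.nan,
                       rng.uniform(70, 2500, size=100)),
})

# (a) Remove observations with missing Salary.
Hitters = Hitters.dropna(subset=['Salary'])

# (b) HighSalary: Yes if Salary exceeds the median, No otherwise.
Hitters['HighSalary'] = np.where(
    Hitters['Salary'] > Hitters['Salary'].median(), 'Yes', 'No')

# (d) log(Salary) as the quantitative response.
Hitters['logSalary'] = np.log(Hitters['Salary'])

# (c) Split into training and test sets.
train, test = train_test_split(Hitters, test_size=0.3, random_state=0)
print(train.shape, test.shape)
```

The split proportion and random seed are illustrative choices, not requirements of the assignment.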
2. Regression Tree.
(a) Fit a regression tree predicting log(Salary) using the training data.
(b) Plot the tree and interpret the main splits.
(c) Compute and report the test MSE.
(d) Use cross-validation to determine the optimal tree size, and prune the tree
accordingly.
(e) Report and compare the test MSE before and after pruning.
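A minimal sketch of parts (a), (c), (d), and (e) using scikit-learn's cost-complexity pruning, with synthetic data standing in for the Hitters training set (assumption: `X` holds the numeric predictors and `y` is log(Salary)):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the predictors and log-salary response.
rng = np.random.default_rng(1)
X = rng.uniform(0, 20, size=(200, 3))
y = np.log(50 + 30 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 10, 200))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# (a) Fit an unpruned regression tree; (c) test MSE.
tree = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
mse_full = mean_squared_error(y_te, tree.predict(X_te))

# (d) Choose the pruning level alpha by 5-fold cross-validation.
alphas = tree.cost_complexity_pruning_path(X_tr, y_tr).ccp_alphas
best_alpha, best_cv = 0.0, -np.inf
for a in alphas[:-1]:                     # last alpha collapses to a stump
    cv = cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0),
                         X_tr, y_tr, cv=5,
                         scoring='neg_mean_squared_error').mean()
    if cv > best_cv:
        best_alpha, best_cv = a, cv

# (e) Refit at the chosen alpha and compare test MSE before/after pruning.
pruned = DecisionTreeRegressor(ccp_alpha=best_alpha,
                               random_state=0).fit(X_tr, y_tr)
mse_pruned = mean_squared_error(y_te, pruned.predict(X_te))
print(f"test MSE full: {mse_full:.4f}, pruned: {mse_pruned:.4f}")
```

For part (b), `sklearn.tree.plot_tree(tree)` draws the fitted tree for interpretation.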
3. Classification Tree.
(a) Fit a classification tree predicting HighSalary.
(b) Report the training and test error rates, and display a confusion matrix for the
test data.
(c) Plot the tree and discuss the key predictors.
(d) Perform cross-validation to find the optimal tree size, and prune the tree if
appropriate.
(e) Compare the classification accuracy between the pruned and unpruned trees.
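The classification-tree error rates and confusion matrix in part (b) can be computed as below; again the data are a synthetic stand-in, with `y` playing the role of HighSalary:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

# Synthetic stand-in: binary Yes/No labels in place of HighSalary.
rng = np.random.default_rng(2)
X = rng.uniform(0, 20, size=(200, 3))
y = np.where(X[:, 0] + X[:, 1] + rng.normal(0, 3, 200) > 20, 'Yes', 'No')
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# (a) Fit the classification tree (max_depth here is an arbitrary cap).
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

# (b) Training and test error rates plus a test-set confusion matrix.
train_err = 1 - accuracy_score(y_tr, clf.predict(X_tr))
test_err = 1 - accuracy_score(y_te, clf.predict(X_te))
cm = confusion_matrix(y_te, clf.predict(X_te), labels=['No', 'Yes'])
print(f"train err {train_err:.3f}, test err {test_err:.3f}")
print(cm)
```

Pruning in part (d) follows the same `cost_complexity_pruning_path` pattern as the regression tree, with `DecisionTreeClassifier` in place of the regressor.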
4. Bagging and Random Forests.
(a) Apply bagging to predict log(Salary) using the training data. Report the test
MSE and display variable importance values.
(b) Fit a random forest model for the same prediction problem. Experiment with
different values of m, the number of variables considered at each split (e.g.,
m = p, m = p/2, and m = √p).
(c) Report the test MSE for each case and discuss how m affects performance.
(d) Plot and interpret the variable importance measures from the random forest.
5. Boosting.
(a) Perform boosting on the training set with 1000 trees for a range of shrinkage
parameters λ (e.g., from 0.001 to 0.5).
(b) Produce plots of training and test MSE versus λ.
(c) Report the test MSE for the best-performing model.
(d) Identify the most important predictors in the boosted model.
(e) Compare the boosting test MSE to those obtained from the regression tree,
bagging, and random forest models.
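Parts (a)-(c) of the boosting exercise amount to a grid search over the shrinkage parameter λ, which scikit-learn calls `learning_rate`. A sketch on synthetic stand-in data (the `max_depth=2` interaction depth is an assumption, not specified by the assignment):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the predictors and log-salary response.
rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=(300, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.3, 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# (a) 1000 trees for a grid of shrinkage values lambda.
lambdas = [0.001, 0.01, 0.05, 0.1, 0.5]
test_mse = {}
for lam in lambdas:
    gbr = GradientBoostingRegressor(n_estimators=1000, learning_rate=lam,
                                    max_depth=2,
                                    random_state=0).fit(X_tr, y_tr)
    test_mse[lam] = mean_squared_error(y_te, gbr.predict(X_te))

# (c) Test MSE of the best-performing model.
best = min(test_mse, key=test_mse.get)
print("best lambda:", best, "test MSE:", round(test_mse[best], 4))
```

For part (b), plot `test_mse` against `lambdas`; for part (d), inspect `gbr.feature_importances_` of the refitted best model.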
6. Comparison with Linear Methods.
(a) Fit a multiple linear regression model and a ridge or lasso regression model
(from Chapters 3 and 6) predicting log(Salary).
(b) Report their test MSE values.
(c) Compare the performance of all models — regression tree, pruned tree, bagging,
random forest, boosting, and linear models — and summarize your findings in
a short paragraph.
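The linear baselines in part (a) can be fit in a few lines; cross-validated ridge and lasso are shown here with standardized inputs, again on synthetic stand-in data (the penalty grids are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeCV, LassoCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the predictors and log-salary response.
rng = np.random.default_rng(5)
X = rng.normal(size=(300, 5))
y = X @ np.array([1.5, -2.0, 0.0, 0.5, 0.0]) + rng.normal(0, 1, 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    'OLS': LinearRegression(),
    'ridge': make_pipeline(StandardScaler(),
                           RidgeCV(alphas=np.logspace(-3, 3, 25))),
    'lasso': make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0)),
}

# (b) Test MSE for each linear model.
mses = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    mses[name] = mean_squared_error(y_te, model.predict(X_te))
    print(name, round(mses[name], 4))
```

Collecting these values alongside the tree-based test MSEs makes the summary comparison in part (c) straightforward.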

Neural Network
2. In this exercise, you will fit a single-layer neural network to the Hitters dataset and
compare its predictive performance with the models developed in the previous exercise.

1. Data Preparation.
(a) Use the same training and test sets created in the previous exercise.
(b) Remove all missing salary observations and use log(Salary) as the quantitative
response variable.
(c) Standardize all numeric predictors so that each has mean zero and standard
deviation one.
2. Neural Network Architecture.
(a) Construct a feed-forward neural network with a single hidden layer.
(b) Let the number of hidden units h take values in {1, 3, 5, 10, 20}.
(c) Use the ReLU activation function in the hidden layer and a linear activation in
the output layer.
(d) Train the network using the training data to predict log(Salary).
(e) Use stochastic gradient descent (SGD) with an appropriate learning rate and
either early stopping or a fixed number of epochs (e.g., 50).
3. Model Selection and Evaluation.
(a) Compute the training and test MSE for each value of h.
(b) Plot the test MSE as a function of the number of hidden units.
(c) Report the test MSE corresponding to the best-performing network.
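Items 2 and 3 above can be sketched with scikit-learn's `MLPRegressor`, which gives a single-hidden-layer network with ReLU units, a linear output, and SGD training; with `solver='sgd'`, `max_iter` counts epochs. The learning rate and synthetic data are assumptions for illustration (the ISLP labs use PyTorch, but any single-layer implementation satisfies the exercise):

```python
import warnings
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

warnings.filterwarnings('ignore')   # short runs may raise convergence warnings

# Synthetic stand-in for the predictors and log-salary response.
rng = np.random.default_rng(6)
X = rng.normal(size=(300, 5))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.2, 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Standardize predictors using training statistics only.
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# Single hidden layer of h ReLU units, SGD, fixed 50 epochs.
test_mse = {}
for h in [1, 3, 5, 10, 20]:
    net = MLPRegressor(hidden_layer_sizes=(h,), activation='relu',
                       solver='sgd', learning_rate_init=0.01,
                       max_iter=50, random_state=0).fit(X_tr, y_tr)
    test_mse[h] = mean_squared_error(y_te, net.predict(X_te))
    print(f"h={h:2d}  test MSE {test_mse[h]:.4f}")

best_h = min(test_mse, key=test_mse.get)
print("best h:", best_h)
```

Plotting `test_mse` against h completes part 3(b); 3(c) is the value at `best_h`.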
4. Comparison with Previous Models.
(a) Compare the neural network’s test MSE with the results obtained from:
• the regression tree and pruned tree,
• bagging and random forest,
• boosting,
• and linear methods (OLS, ridge, or lasso).
(b) Summarize your observations in a short paragraph. Discuss whether the neural
network provides any improvement in predictive accuracy or captures nonlinear
patterns missed by the linear models.
5. Regularization.
(a) Explore the effect of adding an ℓ2 regularization term (weight decay) to the
network.
(b) Report how regularization affects the test MSE and the stability of the model.
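In `MLPRegressor`, the ℓ2 penalty (weight decay) is the `alpha` parameter, so the regularization study reduces to a loop over penalty strengths. The grid below is an illustrative assumption:

```python
import warnings
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

warnings.filterwarnings('ignore')   # short runs may raise convergence warnings

# Synthetic stand-in for the standardized predictors and response.
rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.2, 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# (a) Sweep the l2 weight-decay strength; (b) compare test MSE.
mses = {}
for alpha in [0.0, 0.001, 0.01, 0.1, 1.0]:
    net = MLPRegressor(hidden_layer_sizes=(10,), activation='relu',
                       alpha=alpha, solver='sgd', learning_rate_init=0.01,
                       max_iter=200, random_state=0).fit(X_tr, y_tr)
    mses[alpha] = mean_squared_error(y_te, net.predict(X_te))
    print(f"alpha={alpha:<6} test MSE {mses[alpha]:.4f}")
```

Refitting at each `alpha` with several random seeds indicates how the penalty affects stability across runs.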
