
Machine Learning: Applications and Opportunities in the Social Sciences

ICPSR Summer Program in Quantitative Methods of Social Research


24 June-19 July, 2019, Ann Arbor, MI

Instructor: Christopher Hare, Assistant Professor, Department of Political Science, University of California, Davis ([email protected])

TA: Sam Fuller, PhD Student, Department of Political Science, University of California, Davis ([email protected])

Course Description:
A growing number of social scientists are taking advantage of machine learning methods
to uncover hidden structure in their data, improve model predictive power, and gain a
better understanding of complex relationships between variables. This workshop covers
the mechanics underlying machine learning methods and discusses how these techniques
can be leveraged by social scientists to gain new insight from their data. Specifically, the
workshop will cover both supervised and unsupervised methods: decision trees, random
forests, boosting, support vector machines, neural networks, deep and adversarial learning,
ensemble learning, principal components analysis, factor analysis, and manifold learning/
multidimensional scaling. We will also discuss best practices in fitting and interpreting these
models, including cross-validation techniques, bootstrapping, and presenting output. The
workshop will demonstrate how these models can be estimated in R (and, time permitting,
Python).

Recommended Texts/Readings:
1. Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of
Statistical Learning: Data Mining, Inference, and Prediction, second edition. New
York: Springer.
2. Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. 2016. Deep Learning. Cambridge,
MA: MIT Press.
3. Murphy, Kevin P. 2012. Machine Learning: A Probabilistic Perspective. Cambridge,
MA: MIT Press.
4. Kuhn, Max and Kjell Johnson. 2013. Applied Predictive Modeling. New York: Springer.
5. Berk, Richard A. 2016. Statistical Learning from a Regression Perspective, second
edition. New York: Springer.
6. Mullainathan, Sendhil and Jann Spiess. 2017. “Machine Learning: An Applied Econometric
Approach.” Journal of Economic Perspectives 31 (2): 87-106.
7. Künzel, Sören R., Jasjeet S. Sekhon, Peter J. Bickel, and Bin Yu. 2019. “Metalearners
for Estimating Heterogeneous Treatment Effects using Machine Learning.” Proceedings
of the National Academy of Sciences 116 (10): 4156-4165.

8. Grimmer, Justin, Solomon Messing, and Sean J. Westwood. 2017. “Estimating Heterogeneous
Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods.”
Political Analysis 25 (4): 413-434.
9. Sechidis, Konstantinos and Gavin Brown. 2018. “Simple Strategies for Semi-supervised
Feature Selection.” Machine Learning 107 (2): 357-395.
10. Szegedy, Christian et al. 2013. “Intriguing Properties of Neural Networks.” https://arxiv.org/abs/1312.6199.
11. R

(a) James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An
Introduction to Statistical Learning with Applications in R. New York: Springer.

12. Python

(a) VanderPlas, Jake. 2017. Python Data Science Handbook: Essential Tools for
Working with Data. Sebastopol, CA: O’Reilly.

Course materials: Course materials (including slides, code, and problem sets) will be
available on a private Dropbox folder.

Tentative Schedule:
This schedule is subject to change.

• Week One: Machine Learning: Theory and Concepts
  – Computational Learning Theory and the Development of Machine Learning
  – The Bias-Variance Tradeoff and Error Rates
  – Model Validation and Tuning
  – Resampling Techniques
  – Predictions and Counterfactuals
  – Quick Review of Linear Regression Models
  – Programming in R
  – Computing Performance and Practical Tips
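As a preview of the resampling and model-validation material above, here is a minimal pure-Python sketch of k-fold cross-validation applied to a simple linear fit. The workshop itself works in R; the function names below are illustrative, not course code.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_mse(x, y, k=5):
    """k-fold cross-validated mean squared error for a fit y = a + b*x."""
    folds = k_fold_indices(len(x), k)
    squared_errors = []
    for held_out in folds:
        train = [i for i in range(len(x)) if i not in held_out]
        # closed-form OLS slope and intercept on the training fold only
        xm = sum(x[i] for i in train) / len(train)
        ym = sum(y[i] for i in train) / len(train)
        b = (sum((x[i] - xm) * (y[i] - ym) for i in train)
             / sum((x[i] - xm) ** 2 for i in train))
        a = ym - b * xm
        # score on the held-out fold, never on the training data
        squared_errors.extend((y[i] - (a + b * x[i])) ** 2 for i in held_out)
    return sum(squared_errors) / len(squared_errors)
```

Because every point is scored exactly once while held out of the fit, the returned MSE estimates test error rather than training error, which is the core idea behind the validation and tuning topics listed above.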
• Week Two: Supervised and Semi-Supervised Learning
  – Generalized Linear Models and Extensions
  – Shrinkage/Regularization Methods and the Lasso
  – Regression Splines and Generalized Additive Models
  – Linear and Flexible Discriminant Analysis
  – Naive Bayes
  – Bayesian Model Averaging
  – Neural Networks and Generative Adversarial Networks
  – Graphical Models
  – Support Vector Machines and Relevance Vector Machines
  – k-Nearest Neighbors
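The shrinkage/lasso topic above has a compact core: in the special case of orthonormal predictors, the lasso solution is the OLS coefficient pushed toward zero by soft-thresholding (see Hastie et al. 2009; the exact scaling of the penalty varies across texts). A small illustrative sketch:

```python
def soft_threshold(b, lam):
    """Shrink a coefficient b toward zero by lam, setting it to exactly
    zero once |b| <= lam. With orthonormal predictors this is the lasso
    solution applied coordinate by coordinate."""
    sign = 1.0 if b >= 0 else -1.0
    return sign * max(abs(b) - lam, 0.0)

# Shrinking a vector of OLS coefficients: small ones drop out entirely,
# which is why the lasso performs variable selection as well as shrinkage.
coefs = [2.5, -0.3, 0.8, -1.7]
shrunk = [soft_threshold(b, 0.5) for b in coefs]
```

Unlike ridge regression, which shrinks every coefficient but keeps all of them nonzero, the flat region around zero here is what produces sparse models.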
• Week Three: Tree-Based Methods and Learning Ensembles
  – Classification and Regression Trees
  – Ensemble Methods: Random Forests and Boosting
  – Assessing Variable Importance and Effects
  – Partial Dependence Plots and Model Visualization
  – Ensemble Modeling and Heterogeneous Treatment Effects
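The ensemble idea behind random forests, fitting many trees on bootstrap resamples and taking a majority vote, can be sketched with depth-one trees. Stumps stand in for full CART trees here, and all names are illustrative rather than course code:

```python
import random

def fit_stump(X, y):
    """Exhaustively pick the (feature, threshold, direction) rule with the
    fewest training errors; a depth-one stand-in for a CART tree."""
    best = None  # (errors, feature, threshold, flipped)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            preds = [1 if row[j] > t else 0 for row in X]
            err = sum(p != yi for p, yi in zip(preds, y))
            for flipped in (False, True):
                e = len(y) - err if flipped else err
                if best is None or e < best[0]:
                    best = (e, j, t, flipped)
    return best[1:]

def stump_predict(stump, row):
    j, t, flipped = stump
    p = 1 if row[j] > t else 0
    return 1 - p if flipped else p

def fit_bagged_stumps(X, y, n_trees=11, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def bagged_predict(stumps, row):
    """Majority vote across the ensemble."""
    votes = sum(stump_predict(s, row) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0
```

A real random forest additionally samples a random subset of features at each split, which further decorrelates the trees; that refinement is omitted here for brevity.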
• Week Four: Unsupervised Learning
  – k-Means Clustering
  – Principal Components Analysis
  – Manifold Learning and Multidimensional Scaling
  – Self-Organizing Maps
  – Deep Learning
  – Mixture Models and Latent Class Analysis
  – Novelty/Outlier Detection
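As a preview of the clustering unit, Lloyd's algorithm for k-means fits in a few lines of pure Python. The deterministic initialization from the first k points is a simplification; practical implementations use random restarts or k-means++ seeding.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(p[i] for p in cluster) / n for i in range(len(cluster[0])))

def kmeans(points, k, iters=20):
    """Lloyd's algorithm: repeatedly assign each point to its nearest
    centroid, then recompute each centroid as its cluster's mean."""
    cents = [points[i] for i in range(k)]  # simplistic deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, cents[j]))
            clusters[nearest].append(p)
        # keep the old centroid if a cluster ever empties out
        cents = [centroid(c) if c else cents[j] for j, c in enumerate(clusters)]
    return cents
```

Each pass can only lower the total within-cluster squared distance, so the algorithm converges, though only to a local optimum, which is why initialization matters.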
