
Chapter 12: Dimension Reduction
Introduction to Data Science

The Need for Dimension Reduction


• High Dimensionality: Refers to datasets with a large number of
predictors (e.g., 100 predictors describe a 100-dimensional space).
Managing high-dimensional data can be challenging.

Why Reduce Dimension?


1. Multicollinearity:
• Definition: When predictors are highly correlated, leading to unstable regression models.
• Issue: Correlated predictors can cause difficulties in estimating accurate model coefficients.
2. Double-Counting:
• Example: When correlated predictors like height and weight are both used to estimate age,
it overemphasizes the physical aspect, effectively double-counting.
• Solution: Dimension reduction can help eliminate redundant predictors.
3. Curse of Dimensionality:
• Explanation: As the number of dimensions (predictors) increases, the volume of the
predictor space grows exponentially, making data sparse.
• Impact: Even large datasets can struggle with high-dimensional spaces, making it difficult
to draw meaningful conclusions.

4. Violation of Parsimony:
• Principle: Parsimony emphasizes simplicity; models should use as few predictors as
necessary for accurate interpretation.
• Issue: Too many predictors complicate models, violating this principle.
5. Overfitting:
• Problem: Models with many predictors may perform well on training data but poorly on
new data, as they are too specific to the training set.
• Benefit of Dimension Reduction: Reduces overfitting by simplifying the model.
6. Missing the Bigger Picture:
• Example: Variables like savings account balance, checking account balance, and 401(k)
balance might be better grouped under a single component (e.g., assets).
• Dimension Reduction Goal: Helps reveal underlying relationships among variables by
grouping them.

Variance Inflation Factor (VIF) and Multicollinearity

Detecting Multicollinearity:
• Even if the correlations among predictors are not checked before running a
regression, the regression output can still warn of multicollinearity through the
Variance Inflation Factors (VIFs).

What is VIF?
• The Variance Inflation Factor (VIF) measures how much the variance of a
regression coefficient is inflated due to multicollinearity.
• The VIF for the i-th predictor is given by:

  VIFᵢ = 1 / (1 − R²ᵢ)

• R²ᵢ: Represents the R² value obtained by regressing the predictor xᵢ on all other predictor
variables.
• Interpretation: A high R²ᵢ indicates that xᵢ is highly correlated with the other predictors,
leading to a high VIFᵢ.

Interpreting VIF Values:


• VIF ≥ 5: Indicates moderate multicollinearity.
• VIF ≥ 10: Indicates severe multicollinearity.

Example:
• A VIF of 5 corresponds to R²ᵢ = 0.80, meaning that 80% of the variance in xᵢ is
explained by the other predictors.
• A VIF of 10 corresponds to R²ᵢ = 0.90, indicating even more severe multicollinearity.
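This check is easy to run in software. Below is a minimal Python sketch on a small simulated dataset; the predictor names height, weight, and income are hypothetical (chosen to mirror the earlier double-counting example), and statsmodels is used to compute VIFᵢ = 1 / (1 − R²ᵢ) for each predictor so the thresholds above can be applied.

import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Hypothetical predictors: weight is constructed to be strongly correlated with height
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 500)
weight = 0.9 * height + rng.normal(0, 2, 500)
income = rng.normal(50, 12, 500)
predictors = pd.DataFrame({"height": height, "weight": weight, "income": income})

# Each VIF comes from an auxiliary regression of one predictor on all the others
X = add_constant(predictors)
vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
).drop("const")

print(vifs)  # expect VIFs well above 10 for height and weight, near 1 for income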

• High VIF values suggest that the model may struggle to determine the
true effect of individual predictors due to their interdependence.
o Solution: Address multicollinearity by removing or combining correlated
predictors.

Principal Components Analysis


What to Do When Multicollinearity is Detected?
• Solution: Apply Principal Components Analysis (PCA) to address
multicollinearity.
What is PCA?
• Purpose: PCA transforms a set of correlated predictors into a smaller set of
uncorrelated linear combinations, called components.
• Dimension Reduction: Instead of working with m correlated predictors, PCA
allows you to reduce the data to k < m components that capture most of the
original variability.

How PCA Works


1. Correlation Structure
• PCA accounts for the correlation structure among predictor variables, simplifying
the data.
2. Components
• Each component is an uncorrelated linear combination of the original predictors.
• The first component captures the most variability, followed by the second, which is
uncorrelated with the first, and so on.
3. Replacing Predictors:
• The original m variables can be replaced with the k components, reducing the
dataset's dimensionality.
• This results in a new dataset of n records on k components instead of n records on
m predictors.

Key Points About PCA


• Focus on Predictors Only: PCA operates only on the predictor
variables and does not consider the target variable.
• Standardization Required: The predictors should be standardized or
normalized before applying PCA.
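Putting these points together, here is a minimal scikit-learn sketch on a hypothetical dataset (the account-balance names savings, checking, and retirement echo the earlier "assets" example and are assumptions, not the chapter's data). The predictors are standardized first, the target variable is never used, and m = 3 correlated predictors are replaced by k = 2 components.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical correlated predictors: n records on m = 3 variables
rng = np.random.default_rng(0)
n = 500
savings = rng.normal(10_000, 3_000, n)
checking = 0.6 * savings + rng.normal(0, 1_500, n)
retirement = 1.5 * savings + rng.normal(0, 3_000, n)
X = np.column_stack([savings, checking, retirement])

# Standardize the predictors before applying PCA
X_std = StandardScaler().fit_transform(X)

# Replace the m = 3 predictors with k = 2 principal components
pca = PCA(n_components=2)
components = pca.fit_transform(X_std)

print(components.shape)               # (500, 2): n records on k components
print(pca.explained_variance_ratio_)  # share of the original variability each component captures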

Characteristics of Principal Components
1. First Component: Accounts for the most variability in the dataset.
2. Second Component: Accounts for the second-most variability and
is uncorrelated with the first.
3. Subsequent Components: Continue capturing variability, each being
uncorrelated with the preceding components.
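These properties can be verified directly. The standalone sketch below (again on hypothetical simulated predictors) keeps all components: the explained-variance ratios decrease from the first component onward, and the correlation matrix of the component scores has off-diagonal entries that are essentially zero.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical correlated predictors
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
X_std = StandardScaler().fit_transform(np.column_stack([x1, x2, x3]))

pca = PCA()                  # keep all components for inspection
scores = pca.fit_transform(X_std)

print(pca.explained_variance_ratio_)               # decreasing: the first component captures the most
print(np.cumsum(pca.explained_variance_ratio_))    # cumulative share, useful for choosing k
print(np.corrcoef(scores, rowvar=False).round(6))  # off-diagonal ≈ 0: components are uncorrelated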

Benefits of PCA
• Reduces Dimensionality: Helps to simplify complex datasets.
• Removes Multicollinearity: By creating uncorrelated components,
PCA addresses issues caused by correlated predictors.