Feature Selection
• Feature Selection is the process of selecting a subset of relevant features from the dataset to be used in a machine-learning model. It is an important step in the feature engineering process, as it can have a significant impact on the model's performance.
• After generating a large set of features, use statistical and machine-learning techniques to identify the most relevant ones. This includes:
1. Correlation Analysis: checking how each feature correlates with the target variable (e.g., user retention).
2. Model-Based Selection: using algorithms like random forest or LASSO regression to identify important features.
3. Cross-Validation: ensuring the selected features improve model performance on unseen data.

Benefits of Feature Selection:
1. Reduces Overfitting: by using only the most relevant features, the model can generalize better to new data.
2. Improves Model Performance: selecting the right features can improve the accuracy, precision, and recall of the model.
3. Decreases Computational Costs: a smaller number of features requires less computation and storage resources.
4. Improves Interpretability: with fewer features, it is easier to understand and interpret the results of the model.

There can be various reasons to perform feature selection:
• Simplification of the model.
• Less computational time.
• To avoid the curse of dimensionality.
• Improved compatibility of the data with models.

Roughly, feature selection techniques can be divided into three parts:
1. Filters
2. Wrappers
3. Embedded methods

Filter Methods
• Filters rank features based on a statistical measure of their relationship with the outcome variable. This is a good initial step, but it does not account for interactions between features.
• The filter method filters out irrelevant features and redundant columns from the model by ranking them with different metrics.
• Advantages:
• Simple and fast to compute; does not overfit the data.
• Provides a preliminary ranking of features.
• Disadvantages:
• Ignores feature interactions and redundancy.
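To make the filter idea concrete, here is a minimal sketch that ranks features by the absolute Pearson correlation with the target. It assumes a pandas DataFrame with numeric feature columns and a numeric target column named "retention"; the function and column names are placeholders for illustration, not part of the original notes.

```python
import pandas as pd

def rank_by_correlation(df: pd.DataFrame, target: str) -> pd.Series:
    """Filter method: rank features by |Pearson correlation| with the target."""
    features = df.drop(columns=[target])
    # corrwith computes the correlation of every feature column with the target
    scores = features.corrwith(df[target]).abs()
    return scores.sort_values(ascending=False)

# Hypothetical usage: df has numeric feature columns plus a 'retention' target.
# ranked = rank_by_correlation(df, target="retention")
# top_features = ranked.head(10).index.tolist()
```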
Filter Methods - Example
• Linear Regression Test: for each feature, run a linear regression with only that feature as a predictor.
• Rank features by p-value or R-squared.
• Steps:
1. Compute correlation: measure the correlation between each feature and the target variable (e.g., user retention).
2. Rank features: order features by their p-values or R-squared values.
3. Select top features: choose a subset of top-ranked features for further analysis.

Wrapper Method
• In the wrapper methodology, feature selection is treated as a search problem: different combinations of features are made, evaluated, and compared with one another. The algorithm is trained iteratively on a subset of features; based on the model's output, features are added or removed, and the model is trained again with the new feature set.
• Wrappers consider feature interactions but can be computationally expensive and prone to overfitting.
• Types of Wrappers:
1. Forward Selection (see the sketch after this list):
   a. Start with no features.
   b. Add features one at a time, selecting the one that improves the model the most.
   c. Stop when adding more features does not improve the model.
2. Backward Elimination:
   a. Start with all features.
   b. Remove features one at a time, selecting the one whose removal improves the model the most.
   c. Stop when removing more features degrades the model.
3. Combined Approach: use a hybrid of forward selection and backward elimination to balance feature inclusion and exclusion.
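As a rough illustration of forward selection, the sketch below uses scikit-learn's SequentialFeatureSelector with cross-validation. The synthetic dataset, the linear-regression estimator, and the choice to keep 4 features are assumptions made for the example.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data standing in for a real feature matrix (hypothetical).
X, y = make_regression(n_samples=200, n_features=10, n_informative=4, random_state=0)

# Forward selection: start with no features, greedily add the one that
# improves cross-validated performance the most, stop at 4 features.
selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=4,
    direction="forward",
    cv=5,
)
selector.fit(X, y)
print("Selected feature indices:", selector.get_support(indices=True))
```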
Steps involved in Wrapper Methods
• Steps:
• Select an algorithm: choose forward selection, backward elimination, or a combined approach.
• Evaluate subsets: use cross-validation to evaluate the performance of different feature subsets.
• Optimize selection: use criteria such as R-squared, p-values, AIC, or BIC to select the best subset.

Selection criteria of features
1. R-squared:
   a. Measures the proportion of variance explained by the model.
   b. A higher R-squared indicates a better fit.
2. P-values:
   a. Assess the significance of individual features.
   b. Lower p-values indicate higher significance.
3. AIC (Akaike Information Criterion):
   a. Balances model fit and complexity.
   b. A lower AIC indicates a better model.
4. BIC (Bayesian Information Criterion):
   a. Similar to AIC but with a stronger penalty for model complexity.
   b. A lower BIC indicates a better model.
5. Entropy:
   a. Calculate entropy: compute the entropy for the entire dataset.
   b. Compute information gain: for each feature, calculate the information gain resulting from splitting the dataset on that feature.
   c. Select features: choose features with the highest information gain, as these contribute the most to reducing uncertainty.

• The Akaike information criterion (AIC) is a mathematical method for evaluating how well a model fits the data it was generated from. In statistics, AIC is used to compare different possible models and determine which one is the best fit for the data. AIC is calculated from:
1. the number of independent variables used to build the model, and
2. the maximum likelihood estimate of the model (how well the model reproduces the data).
• The best-fit model according to AIC is the one that explains the greatest amount of variation using the fewest possible independent variables.

Model selection example
In a study of how hours spent studying and test format (multiple choice vs. written answers) affect test scores, you create two models:
1. Final test score in response to hours spent studying
2. Final test score in response to hours spent studying + test format
• You find an R-squared of 0.45 with a p-value less than 0.05 for model 1, and an R-squared of 0.46 with a p-value less than 0.05 for model 2. Model 2 fits the data slightly better, but was it worth adding another parameter just to get this small increase in model fit?
• You run an AIC test to find out, which shows that model 1 has the lower AIC score because it requires less information to predict with almost the same level of precision. Another way to think of this is that the increased precision in model 2 could have happened by chance.
• From the AIC test, you decide that model 1 is the best model for your study.
• The Bayesian information criterion (BIC) is a criterion for model selection among a finite set of models. It is based, in part, on the likelihood function, and it is closely related to the Akaike information criterion (AIC).
• When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. The BIC resolves this problem by introducing a penalty term for the number of parameters in the model. The penalty term is larger in BIC than in AIC.
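To show how AIC and BIC are used for model comparison (AIC = 2k - 2 ln(L), BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the sample size, and L the maximized likelihood), the sketch below fits the two models from the study-hours example with statsmodels and prints their criteria. The data are simulated placeholders for illustration, not results from the original notes.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated placeholder data: hours studied, test format (0/1), and scores.
n = 100
hours = rng.uniform(0, 10, n)
test_format = rng.integers(0, 2, n)
score = 50 + 4 * hours + rng.normal(0, 8, n)  # test format has ~no real effect here

# Model 1: score ~ hours
X1 = sm.add_constant(hours)
m1 = sm.OLS(score, X1).fit()

# Model 2: score ~ hours + test format
X2 = sm.add_constant(np.column_stack([hours, test_format]))
m2 = sm.OLS(score, X2).fit()

# Lower AIC/BIC indicates the preferred model; BIC penalizes the extra
# parameter in model 2 more heavily than AIC does.
print(f"Model 1: R2={m1.rsquared:.3f}  AIC={m1.aic:.1f}  BIC={m1.bic:.1f}")
print(f"Model 2: R2={m2.rsquared:.3f}  AIC={m2.aic:.1f}  BIC={m2.bic:.1f}")
```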
Random Forest Algorithm
• Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both classification and regression problems in ML. It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
• "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset."
• A greater number of trees in the forest leads to higher accuracy and helps prevent overfitting.
• The working process can be explained in the steps below:
• Step 1: Select K random data points from the training set.
• Step 2: Build the decision trees associated with the selected data points (subsets).
• Step 3: Choose the number N of decision trees that you want to build.
• Step 4: Repeat steps 1 and 2.
• Step 5: For new data points, find the prediction of each decision tree, and assign the new data points to the category that wins the majority of votes.
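Tying this back to model-based feature selection, the sketch below fits a random forest classifier with scikit-learn and ranks features by the impurity-based importances it exposes. The synthetic dataset and the choice of 100 trees are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification data standing in for a real dataset (hypothetical).
X, y = make_classification(n_samples=500, n_features=12, n_informative=5, random_state=0)

# N = 100 decision trees, each trained on a bootstrap sample of the data.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Rank features by impurity-based importance (higher = more useful to the trees).
ranking = np.argsort(forest.feature_importances_)[::-1]
print("Features ranked by importance:", ranking)
```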