Module-3 - DS (Autosaved)
Extracting Meaning from Data: Motivating application: user (customer) retention. Feature
Generation (brainstorming, role of domain expertise, and place for imagination), Feature Selection
algorithms. Filters; Wrappers; Decision Trees; Random Forests. Recommendation Systems:
Building a User-Facing Data Product, Algorithmic ingredients of a Recommendation Engine,
Dimensionality Reduction, Singular Value Decomposition, Principal Component Analysis, Exercise:
build your own recommendation system.
Feature Selection
• Feature selection methods can be broadly categorized into three
types: filter, wrapper, and embedded methods.
Wrapper Method: Evaluates candidate feature subsets by training a specific machine
learning algorithm on each one. The feature subset that gives the best performance is selected.
Embedded Method: Performs feature selection as part of the training process of the machine
learning algorithm itself.
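The wrapper idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is a 1-nearest-neighbour classifier scored by leave-one-out accuracy, every feature subset is tried exhaustively (practical only for a handful of features), and the toy data and the names `loo_accuracy` and `wrapper_select` are made up for this example.

```python
from itertools import combinations


def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the feature indices in `subset`."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never compare a row with itself
            dist = sum((row[k] - other[k]) ** 2 for k in subset)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        if best_label == labels[i]:
            correct += 1
    return correct / len(rows)


def wrapper_select(rows, labels):
    """Score every non-empty feature subset with the model; keep the best."""
    n_features = len(rows[0])
    best_subset, best_score = None, -1.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            score = loo_accuracy(rows, labels, subset)
            if score > best_score:  # strict '>' so ties keep the smaller subset
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
rows = [(1.0, 50.0), (1.2, 10.0), (0.9, 30.0),
        (3.0, 12.0), (3.1, 48.0), (2.9, 31.0)]
labels = [0, 0, 0, 1, 1, 1]

best_subset, best_score = wrapper_select(rows, labels)
print(best_subset, best_score)  # (0,) 1.0
```

Because the model is retrained for every subset, a wrapper is costly; real implementations usually replace the exhaustive loop with a greedy forward or backward search.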
Filter-based feature selection:
• The filter-based approach selects the best features from the original
feature set using statistical criteria.
• The selection of significant features is independent of the ML
algorithm that will later be used to build the model.
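A minimal sketch of a filter, assuming the statistical criterion is the absolute Pearson correlation between each feature and the target (one possible criterion among many). No model is trained at any point. The customer-retention columns and the helper names `pearson` and `filter_select` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    return cov / (sx * sy)


def filter_select(columns, target, k):
    """Rank features by |correlation with target| and keep the top k."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy retention data (made up): target is 1 if the customer was retained.
columns = {
    "signup_weekday":  [4, 7, 1, 5, 2, 6],
    "logins_per_week": [9, 8, 7, 2, 1, 3],
    "support_tickets": [0, 1, 0, 4, 5, 3],
}
retained = [1, 1, 1, 0, 0, 0]

selected, scores = filter_select(columns, retained, k=2)
print(selected)  # ['logins_per_week', 'support_tickets']
```

The scores depend only on the data, so the same ranking can be reused with any downstream model.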
The outline of the filter approach is as shown below:
Feature selection techniques
• Two strategies are used by feature selection techniques:
• eliminate the correlated/redundant features
• select features which impact the target variable
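The first strategy, eliminating correlated or redundant features, can be sketched as follows. This is an assumption-laden illustration, not a standard recipe: redundancy is measured by absolute Pearson correlation between feature pairs, the 0.9 threshold is arbitrary, and the first feature of each redundant group is the one kept. The column names and the helper `drop_redundant` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def drop_redundant(columns, threshold=0.9):
    """Keep a feature only if it is not strongly correlated
    with any feature that has already been kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data (made up): monthly logins are just weekly logins times four.
columns = {
    "logins_per_week":  [9, 8, 7, 2, 1, 3],
    "logins_per_month": [36, 32, 28, 8, 4, 12],
    "support_tickets":  [0, 3, 1, 4, 2, 3],
}

kept = drop_redundant(columns)
print(kept)  # ['logins_per_week', 'support_tickets']
```

The second strategy, selecting features that impact the target, works the same way except that each feature is scored against the target variable rather than against the other features.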
• The following are filter-based feature selection methods: