AGENDA
Problem Statement
Domain Study
Problem defining
Feature Engineering
Data Frame Creation
Data set preprocessing
What is a Problem Statement?
DOMAIN STUDY
Domain study aims to identify the features common to
a domain of applications by selecting and
abstracting the objects and operations
that characterize those features.
FEATURE ENGINEERING
Feature engineering is the process of using domain
knowledge of the data to create features that help
machine learning algorithms work well.
Done correctly, it increases the predictive power of
machine learning algorithms by creating features from
raw data that make patterns easier to learn.
Feature engineering is often considered one of the most
important skills in machine learning, and it can make the
difference between a good model and a bad one.
STEPS THAT ARE INVOLVED WHILE SOLVING ANY
PROBLEM IN MACHINE LEARNING ARE AS FOLLOWS:
FEATURE SELECTION
Feature selection is the method of reducing the number of
input variables to your model by keeping only the
relevant ones and getting rid of noise in the data.
It can be done automatically, choosing the
relevant features for your machine learning
model based on the type of problem you are trying
to solve.
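As a rough illustration of automatic selection, the sketch below keeps only the features whose Pearson correlation with the target is strong enough. The data, the column names, and the 0.5 threshold are hypothetical, and correlation filtering is just one of several possible selection criteria.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold=0.5):
    """Return the names of columns whose |correlation| with the target
    is at least `threshold` (an assumed, tunable cut-off)."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical toy data: one informative, one constant, one noisy feature.
columns = {
    "area_sqm": [50, 70, 90, 110, 130],
    "constant": [1, 1, 1, 1, 1],
    "noise": [3, 1, 4, 1, 5],
}
price = [100, 140, 185, 220, 265]

selected = select_features(columns, price)
print(selected)  # ['area_sqm']
```

The constant column is dropped because it has zero variance, and the noisy column because its correlation with the target falls below the threshold.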
BENEFITS OF FEATURE SELECTION
FEATURE TRANSFORMATION
Feature transformation refers to the family of algorithms
that create new features from the existing features.
These new features may not have the same
interpretation as the original features, but they may
have more explanatory power in the transformed space
than in the original space. Transformation can also be used
for feature reduction.
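One well-known member of this family is principal component analysis (PCA). The sketch below is a minimal two-feature version, chosen here only as an example: it replaces two correlated columns with a single new feature along their direction of greatest variance. The function name and the toy data are hypothetical.

```python
import math

def first_principal_component(xs, ys):
    """Project 2-D points onto their direction of greatest variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]  # centered first feature
    dy = [y - mean_y for y in ys]  # centered second feature
    var_x = sum(d * d for d in dx) / n
    var_y = sum(d * d for d in dy) / n
    cov_xy = sum(a * b for a, b in zip(dx, dy)) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    return [a * math.cos(theta) + b * math.sin(theta) for a, b in zip(dx, dy)]

# Hypothetical toy data: the second feature is exactly twice the first.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

pc1 = first_principal_component(xs, ys)
var_pc1 = sum(v * v for v in pc1) / len(pc1)
print(round(var_pc1, 6))  # 10.0, the combined variance of both inputs
```

The new feature has no direct physical meaning, yet in this toy case it carries all the variance of the two originals, which is why the same idea doubles as feature reduction.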
FEATURE CREATION
Creating features involves deriving new
variables that will be most helpful for our
model.
These derived features are then used by the learning
algorithm to improve its performance,
or in other words to get better results.
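A small sketch of feature creation, assuming hypothetical records with `weight_kg` and `height_m` fields: a body mass index column is derived from the two raw measurements. The choice of BMI is only an illustration of combining existing variables with domain knowledge.

```python
def add_bmi(row):
    """Return a copy of `row` with a derived `bmi` feature added."""
    new_row = dict(row)  # copy so the raw record stays untouched
    new_row["bmi"] = row["weight_kg"] / row["height_m"] ** 2
    return new_row

# Hypothetical raw records.
rows = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 90.0, "height_m": 1.80},
]

enriched = [add_bmi(r) for r in rows]
for r in enriched:
    print(r["weight_kg"], r["height_m"], round(r["bmi"], 2))
```

A model that only sees weight and height has to discover the ratio on its own; providing it as a feature makes that relationship directly available.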
FEATURE ENCODING
Feature encoding is the process of
transforming the categorical
values of the relevant features into
numerical ones.
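Two common ways to encode a categorical column are label encoding (one integer per category) and one-hot encoding (one 0/1 column per category). The sketch below shows both on a hypothetical `colors` column; the helper names are illustrative, not part of any library.

```python
def label_encode(values):
    """Map each category to an integer, in sorted category order."""
    categories = sorted(set(values))
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Return one 0/1 indicator list per value, plus the column order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

labels, mapping = label_encode(colors)
one_hot, categories = one_hot_encode(colors)

print(labels)      # [2, 1, 0, 1]
print(categories)  # ['blue', 'green', 'red']
print(one_hot)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Label encoding implies an ordering between categories, so one-hot encoding is usually the safer choice when the categories have no natural order.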
FEATURE BINNING / DISCRETIZATION
Feature binning (discretization) groups the values of a continuous
feature into a small number of intervals, or bins, so that each value
is represented by the bin it falls into.
Binning can reduce the effect of noise and outliers
and make the data easier to comprehend.
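In a machine learning setting, binning usually means replacing a continuous value with the interval it falls into. The sketch below does this with fixed cut points. The age data, the cut points, and the bin labels are all hypothetical.

```python
from bisect import bisect_right

def bin_values(values, edges, labels):
    """Assign each value to a labeled bin.

    `edges` are ascending interior cut points; a value equal to a cut
    point goes into the bin on its right. Requires
    len(labels) == len(edges) + 1.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect_right(edges, v)] for v in values]

# Hypothetical continuous feature and bin definitions.
ages = [5, 17, 18, 42, 65, 80]
edges = [18, 65]
labels = ["child", "adult", "senior"]

binned = bin_values(ages, edges, labels)
print(binned)
# ['child', 'child', 'adult', 'adult', 'senior', 'senior']
```

The resulting categories can then be passed through feature encoding like any other categorical column.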
FEATURE EXTRACTION
Feature extraction helps to reduce the amount of redundant
data in the data set.
It often yields better results than applying machine learning directly to
the raw data.
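As a minimal sketch of the idea, the example below condenses a raw sequence of readings into four summary statistics, so a model receives a handful of informative numbers instead of every raw sample. The signal values and the chosen statistics are hypothetical, and other extraction methods would suit other kinds of raw data.

```python
import statistics

def extract_features(signal):
    """Summarize a raw numeric sequence as a small feature dictionary."""
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.pstdev(signal),  # population standard deviation
        "min": min(signal),
        "max": max(signal),
    }

# Hypothetical raw readings from a single sensor window.
signal = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

features = extract_features(signal)
print(features)
```

Here eight raw values are reduced to four features; on longer windows the reduction is far larger.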