Data Mining Process
CRISP-DM process
The methodical discovery of useful relationships and patterns in data is enabled by
a set of iterative activities collectively known as the data science process.
[Figure: CRISP-DM process (Business Understanding, Data Understanding, Prepare Data, Modeling, Evaluation, Deploying and maintaining the model)]
[Figure: data science process (1. Prior Knowledge; 2. Preparation; 3. Modeling: building the model using algorithms on training data, then applying the model to test data and evaluating performance; 4. Application: deployment)]
1. Prior Knowledge
Example: the lending example uses a simple data set of ten data points
Terminology used
A data set
A data point
An attribute
A label
An identifier
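As a rough illustration of these terms, the sketch below builds a tiny data set in Python. The column names (`borrower_id`, `credit_score`, `interest_rate`) and all values are hypothetical placeholders chosen for the sketch; they are not the ten points of the lending example.

```python
# Hypothetical lending data set: column names and values are made up.
dataset = [
    {"borrower_id": "B01", "credit_score": 500, "interest_rate": 7.3},
    {"borrower_id": "B02", "credit_score": 600, "interest_rate": 6.7},
    {"borrower_id": "B03", "credit_score": 700, "interest_rate": 6.0},
]

attributes = list(dataset[0])   # every column of the data set is an attribute
identifier = "borrower_id"      # special attribute: locates a record, is not an input
label = "interest_rate"         # special attribute: the value to be predicted

# The remaining attributes are the inputs a model would learn from.
inputs = [a for a in attributes if a not in (identifier, label)]

data_point = dataset[0]         # one record (row) of the data set

print("attributes:", attributes)
print("input attributes:", inputs)
print("label of first data point:", data_point[label])
```

Keeping the identifier out of the inputs matters because an ID is unique per record and would not generalize to new data points.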
2. Data Preparation
Data Exploration
Data quality
Handling missing values
Data type conversion
Transformation
Outliers
Feature selection
Sampling
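A minimal sketch of a few of these preparation steps, using only the Python standard library. The raw values are hypothetical, and the specific choices (the 1.5 x IQR rule for outliers, mean imputation, min-max scaling) are assumptions made for the sketch, not methods prescribed by the slides.

```python
import random
import statistics

# Hypothetical raw values: stored as strings, one missing, one extreme.
raw_scores = ["500", "600", "", "700", "650", "9999", "720", "580"]

# Data type conversion: strings to numbers, None marks a missing value.
scores = [float(s) if s else None for s in raw_scores]

# Outliers: flag values further than 1.5 * IQR from the quartiles.
known = [s for s in scores if s is not None]
q1, _, q3 = statistics.quantiles(known, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
inliers = [s for s in known if low <= s <= high]

# Handling missing values: impute with the mean of the inliers.
# Outliers are treated like missing values here (a design choice).
fill = statistics.mean(inliers)
cleaned = [s if s is not None and low <= s <= high else fill for s in scores]

# Transformation: min-max normalization to the range [0, 1].
smallest, largest = min(cleaned), max(cleaned)
normalized = [(s - smallest) / (largest - smallest) for s in cleaned]

# Sampling: draw a reproducible random subset of the prepared values.
rng = random.Random(42)
sample = rng.sample(normalized, k=4)

print("cleaned:", cleaned)
print("sample:", sample)
```

Data exploration, data quality checks, and feature selection are left out of the sketch; in practice the order of the steps and the imputation strategy depend on the data and on the algorithm used later.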
3. Modeling
[Figure: training data is used to build the final model]
Splitting into training and test data sets
Training Data
Test Data
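The split, build, and evaluate steps could look roughly like the sketch below. The data values are hypothetical, the 70/30 split ratio is an assumption, and simple least-squares linear regression is only a stand-in algorithm; the slides do not name a specific one.

```python
import random

# Hypothetical (credit_score, interest_rate) pairs; the values are made up.
data = [(500, 7.5), (550, 7.1), (600, 6.8), (650, 6.4), (700, 6.0),
        (720, 5.9), (750, 5.6), (780, 5.4), (800, 5.2), (820, 5.0)]

# Split: shuffle reproducibly, then hold out 30% of the points as test data.
rng = random.Random(0)
shuffled = data[:]
rng.shuffle(shuffled)
n_test = len(shuffled) * 3 // 10
test_data, training_data = shuffled[:n_test], shuffled[n_test:]

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Build the model on the training data only.
slope, intercept = fit_line(training_data)

# Apply the model to the held-out test data and measure the error.
errors = [abs((slope * x + intercept) - y) for x, y in test_data]
mean_abs_error = sum(errors) / len(errors)

print(f"slope={slope:.4f} intercept={intercept:.2f} MAE={mean_abs_error:.3f}")
```

The test data plays no part in fitting, so the error measured on it is an estimate of how the model behaves on data points it has not seen.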
4. Application
Production readiness
Technical integration
Model response time
Remodeling
Assimilation
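Model response time can be estimated before deployment with a simple timing loop, as sketched below. The `predict` function and its coefficients are hypothetical stand-ins for a deployed model, and `mean_response_time` is a helper introduced only for this sketch.

```python
import time

def predict(credit_score):
    # Hypothetical stand-in for a deployed model; coefficients are made up.
    return -0.008 * credit_score + 11.5

def mean_response_time(model, inputs, repeats=1000):
    """Average wall-clock seconds per prediction over repeated calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            model(x)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(inputs))

latency = mean_response_time(predict, [500, 650, 800])
print(f"mean response time: {latency * 1e6:.2f} microseconds per prediction")
```

Averaging over many calls smooths out timer noise; a real integration would also have to account for network and data-access overhead, which this sketch does not measure.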
5. Knowledge
Posterior knowledge
Kotu, V., & Deshpande, B. (2014). Predictive analytics and data mining: Concepts and practice with RapidMiner. Morgan Kaufmann.