Metrics - Confusion Matrix
3 years ago · Harun · Source (index.ipynb)
Tags: confusion matrix (../../categories/confusion-matrix/)
In a typical data science project we try several models (logistic regression, SVM, tree classifiers, etc.) on our data.
Then we measure the predictive performance of these models to find the best performing one.
Finally we decide to implement the best performing model.
In this notebook we talk about one of the classification model evaluation tools: the confusion matrix.
It can help us see more deeply how reliable our models are.
We are going to look at the confusion matrices of a variety of Scikit-Learn models and compare them
using visual diagnostic tools from Yellowbrick in order to select the best model for our data.
In [4]: # Notebook setup
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
import category_encoders as ce
from yellowbrick.classifier import ConfusionMatrix
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline
Confusion Matrix
Since we know the labels of the test set, we can measure how successful the model's predictions are by comparing the actual labels with the predictions.
We can see whether our classifier identifies the samples successfully or whether it is "CONFUSED" with another label.
The confusion matrix shows the amount of confusion.
We use confusion matrices to understand which classes are most easily confused.
There are two sets of labels in a confusion matrix of binary (2 class) classification:
{POSITIVE, NEGATIVE} - first, the model makes the prediction. It returns the labels 1 (POSITIVE) or 0 (NEGATIVE).
{TRUE, FALSE} - then the model's prediction is evaluated as having been made correctly (TRUE) or incorrectly (FALSE) based on the actual known labels.
Tip: If you have difficulty remembering these terms because of their similarity, just insert the word "PREDICTED" in the middle.
For instance, if you are confused by the meaning of "false positive", read it as "false(ly) PREDICTED positive".
We create a matrix of predictions and actual labels for a binary classification.
One diagonal shows the successful predictions, the other diagonal the unsuccessful predictions.
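As a quick illustrative sketch (with made-up labels, not from the mushroom data used below), we can compute this 2 x 2 matrix directly with scikit-learn's confusion_matrix:

from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]   # known labels: 1 = POSITIVE, 0 = NEGATIVE
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]   # hypothetical model predictions

# With labels=[0, 1] the rows are the actual classes and the columns the
# predicted classes, so the layout is:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_actual, y_predicted, labels=[0, 1]))

Here the main diagonal (TN and TP) holds the successful predictions and the off-diagonal cells (FP and FN) hold the confusions.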
In multi-class classification, i.e. if there are more than 2 class labels (not just 1 or 0, positive or negative), the confusion matrix looks something like this.
We do not use terms like "true positive" with a confusion matrix of more than 2 classes.
The size of the confusion matrix is n x n, where n is the number of classes.
Different references may use different conventions for the axes, i.e. the actual and predicted classes can appear on different axes.
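To make the n x n shape concrete, here is a small sketch with three hypothetical class labels; note that scikit-learn's own convention puts the actual classes on the rows and the predicted classes on the columns:

from sklearn.metrics import confusion_matrix

labels = ['class_a', 'class_b', 'class_c']   # three hypothetical classes
y_actual    = ['class_a', 'class_b', 'class_c', 'class_a', 'class_b', 'class_c']
y_predicted = ['class_a', 'class_b', 'class_a', 'class_a', 'class_c', 'class_c']

# Rows = actual classes, columns = predicted classes, in the order given by `labels`
print(confusion_matrix(y_actual, y_predicted, labels=labels))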
Examples of confusion matrices
Let's demonstrate some confusion matrices.
We will utilize the tools from the Yellowbrick library (http://www.scikit-yb.org/en/latest/api/classifier/confusion_matrix.html). This is a nice library for machine learning visualization.
Loading and Exploring the Dataset
This tutorial uses a modified version of the mushroom dataset from the UCI Machine Learning
Repository.
Even though these toy datasets are no longer very interesting because of repetitive usage, our focus here is on the classification metrics, not on the other steps of data processing, so we take advantage of how quickly this dataset can be put to use.
Our objective is to predict if a mushroom is poisonous or edible based on its characteristics.
The data include descriptions of hypothetical samples corresponding to 23 species of mushrooms
Each species was identified as definitely edible or poisonous
In [5]: # Url of the dataset
url='https://raw.githubusercontent.com/rebeccabilbro/rebeccabilbro.github.io/master/data/agaricus-lepiota.txt'
# Column names list
column_names=['class', 'cap-shape', 'cap-surface', 'cap-color']
# Load the data
mushrooms=pd.read_csv(url, header=None, names= column_names)
mushrooms.head(3)
Out[5]:
class cap-shape cap-surface cap-color
0 edible convex smooth yellow
1 edible bell smooth white
2 poisonous convex scaly white
In [6]: mushrooms.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8123 entries, 0 to 8122
Data columns (total 4 columns):
class 8123 non-null object
cap-shape 8123 non-null object
cap-surface 8123 non-null object
cap-color 8123 non-null object
dtypes: object(4)
memory usage: 253.9+ KB
In [7]: ## Count the unique values in each column
mushrooms.nunique()
Out[7]: class 2
cap-shape 6
cap-surface 4
cap-color 10
dtype: int64
We see that target and feature columns contain different categorical values.
We need to encode them into numerical types in order to fit Sklearn models.
For this purpose, we will utilize the Category Encoders (http://contrib.scikit-learn.org/categorical-encoding/index.html) library, which provides scikit-learn-compatible categorical variable encoders.
All the transformers of Category Encoders can be used in Sklearn pipelines.
Later in a separate post we will analyse the encoders
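As a small illustration (assuming the mushrooms DataFrame loaded above), this is roughly how Category Encoders' OneHotEncoder turns the categorical feature columns into numeric ones; use_cat_names=True simply makes the generated column names readable:

import category_encoders as ce

# One-hot encode the three categorical feature columns of the mushrooms data
encoder = ce.OneHotEncoder(use_cat_names=True)
X_encoded = encoder.fit_transform(mushrooms[['cap-shape', 'cap-surface', 'cap-color']])
print(X_encoded.head(3))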
Target and Features Datasets
In [8]: # Create the features dataset (X) and target dataset (y)
features = ['cap-shape', 'cap-surface', 'cap-color']
target = 'class'
X = mushrooms[features]
y = mushrooms[target]
Classifiers Dictionary
Now, let's create a dictionary which contains the classifiers we want to use for our classification task.
Here we create the dictionary with instances of Sklearn estimators without hyperparameter tuning.
In reality we need to evaluate the performance of tuned classifiers.
In [10]: # Estimators dictionary
# We can add more classifiers to our dictionary
# This is just a sample
estimators_dct = {"Logistic Regression": LogisticRegression(),
                  "Linear SVC": LinearSVC(),
                  "Random Forest": RandomForestClassifier(n_estimators=8),
                  "SGD Classifier": SGDClassifier()}
confusion_matrices function
Let's define a function to get the confusion matrices of a given dictionary of models (like the one in the cell above) easily and without repetition.
Our function will
take X, y datasets and an estimator dictionary
return the confusion matrices produced by the predictions of each model in the dictionary
In [9]: # Set up the figure size for the confusion matrices
plt.rcParams['figure.figsize'] = (6, 4)
plt.rcParams['font.size'] = 15

def confusion_matrices(X, y, estimator_dict):
    """
    Takes X, y datasets and an estimator dictionary -> returns confusion matrices of the classifiers
    """
    # Split the data into train and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=11)
    # Loop over the estimator_dict keys to get each estimator
    for estimator in estimator_dict.keys():
        print(estimator)
        # In the pipeline we use OneHotEncoder from Category Encoders
        model = Pipeline([('encoder', ce.OneHotEncoder()),
                          ('estimator', estimator_dict[estimator])])
        # Fit the pipeline (encoder + classifier) on the training data
        model.fit(X_train, y_train)
        # The ConfusionMatrix visualizer takes a fitted model
        cm = ConfusionMatrix(model, fontsize=13, cmap='YlOrBr')
        # Score runs predict() on the test data and then
        # creates the confusion matrix from scikit-learn
        cm.score(X_test, y_test)
        cm.poof()
In [11]: # Call the confusion_matrices function to get the confusion matrices
confusion_matrices(X, y, estimators_dct)
Logistic Regression
Linear SVC
Random Forest
SGD Classifier
ConfusionMatrix from Yellowbrick
takes a fitted scikit-learn classifier and a set of X_test and y_test values and
returns a report showing how each test sample's predicted class compares to its actual class.
Confusion matrices provide similar information to what is available in a ClassificationReport (we will talk about it soon), but rather than top-level scores, they provide deeper insight into the classification of individual data points.
Creates a heatmap visualization of the sklearn.metrics.confusion_matrix().
We can choose between displaying values as the percent of true (cell value divided by sum of row) or
as direct counts.
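As a minimal sketch (reusing the X, y datasets from above and the same pipeline idea as in the confusion_matrices function), the percent display can be requested with percent=True, which divides each cell by the total of its row (the actual class):

from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
import category_encoders as ce
from yellowbrick.classifier import ConfusionMatrix

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=11)
model = Pipeline([('encoder', ce.OneHotEncoder()),
                  ('estimator', LogisticRegression())])
model.fit(X_train, y_train)

cm = ConfusionMatrix(model, percent=True, cmap='YlOrBr')   # percent=True -> row percentages
cm.score(X_test, y_test)   # runs predict() on X_test and fills the matrix
cm.poof()                  # renders the heatmap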
Conclusion
Even though confusion matrices give us deeper insight into the predictions of the classifiers, they are still not very practical for comparing the performance of several models with each other.
Since confusion matrices provide tables comparing actual and predicted labels, we still need some more metrics that interpret the results more directly so we can choose the best model.
So we will continue to work on classification metrics like precision, recall, ROC, AUC, etc. in the next posts.
Sources:
http://www.scikit-yb.org/en/latest/api/classifier/confusion_matrix.html
http://contrib.scikit-learn.org/categorical-encoding/index.html