Multiclass Receiver Operating Characteristic (ROC) in Scikit Learn
Last Updated : 26 Apr, 2025
The receiver operating characteristic (ROC) curve is used to measure the performance of classification models. It plots the true positive rate against the false positive rate as the decision threshold varies. The area under the curve (AUC) summarises the curve in a single score that ranges from 0 to 1: a score of 0.5 corresponds to random guessing, and the higher the AUC score, the better the model separates the classes. This article discusses how to use the ROC curve in scikit-learn.
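To make the relationship between thresholds, the two rates, and the AUC concrete, here is a minimal pure-Python sketch. The helper names `roc_points` and `trapezoid_auc` and the four-sample toy data are assumptions made for illustration; they are not part of scikit-learn, whose `roc_curve` and `auc` functions are used later in this article.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Start above every score so that no sample is predicted positive.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    fpr, tpr = [], []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        tpr.append(tp / pos)
        fpr.append(fp / neg)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
    return area


# Toy data (hypothetical): two negatives and two positives.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, scores)
roc_auc = trapezoid_auc(fpr, tpr)
print(fpr)      # [0.0, 0.0, 0.5, 0.5, 1.0]
print(tpr)      # [0.0, 0.5, 0.5, 1.0, 1.0]
print(roc_auc)  # 0.75
```

Each threshold yields one point on the curve, and the AUC is the area under the line joining those points.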
ROC for Multiclass Classification
Now, let us understand how to use ROC for a multiclass classifier. We will build a simple logistic regression model to predict the species of iris, using the iris dataset provided by sklearn. The iris dataset has 4 features and 3 target classes (Setosa, Versicolour, and Virginica).
Import Required Libraries
Python libraries make it easy to handle the data and perform both typical and complex tasks with very little code.
- Matplotlib/Seaborn – These libraries are used to draw visualisations. Only Matplotlib is used in this article.
- Sklearn – This package contains multiple modules with pre-implemented functions for tasks ranging from data preprocessing to model development and evaluation.
Python3
from sklearn.preprocessing import label_binarize
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import roc_curve, auc, RocCurveDisplay
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from itertools import cycle
import matplotlib.pyplot as plt
|
Load the Dataset
We will use the iris dataset, which is one of the most common benchmark datasets for classification models. Let's load this dataset using sklearn.datasets.
Python3
iris_data = datasets.load_iris()
features = iris_data.data
target = iris_data.target
|
In this case, we will use the one-vs-rest strategy: each class in turn is treated as the positive class and the other two as the negative class. So, there will be 3 cases.
Case 1:
Positive class - Setosa,
Negative Class - Versicolour and Virginica
Case 2:
Positive class - Versicolour,
Negative Class - Setosa and Virginica
Case 3:
Positive class - Virginica,
Negative Class - Versicolour and Setosa
Hence, we would have 3 ROC curves. We take the average of the scores from these 3 cases to report the final score of the model. Now, let's take a look at the first five rows of the features (features[:5]).
Output:
array([[5.1, 3.5, 1.4, 0.2],
[4.9, 3. , 1.4, 0.2],
[4.7, 3.2, 1.3, 0.2],
[4.6, 3.1, 1.5, 0.2],
[5. , 3.6, 1.4, 0.2]])
Now, let's see the first five target values (target[:5]).
Output:
array([0, 0, 0, 0, 0])
First, we need to binarize the target values so that each class gets its own 0/1 column, as shown below.
Python3
target = label_binarize(target, classes=[0, 1, 2])
target[:5]
|
Output:
array([[1, 0, 0],
[1, 0, 0],
[1, 0, 0],
[1, 0, 0],
[1, 0, 0]])
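The effect of the binarization step can be sketched in plain Python. The `binarize` helper below is a hypothetical stand-in written for illustration, not scikit-learn's `label_binarize`; it only shows how each column of the result becomes the 0/1 target for one of the three one-vs-rest cases.

```python
def binarize(labels, classes):
    """One row per sample, one column per class; 1 marks the sample's class."""
    return [[1 if label == c else 0 for c in classes] for label in labels]


# Toy labels (hypothetical): 0 = Setosa, 1 = Versicolour, 2 = Virginica.
labels = [0, 1, 2, 1]
rows = binarize(labels, classes=[0, 1, 2])

# Column i is the binary target for "class i vs the rest".
setosa_vs_rest = [row[0] for row in rows]
versicolour_vs_rest = [row[1] for row in rows]

print(rows)                 # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
print(setosa_vs_rest)       # [1, 0, 0, 0]
print(versicolour_vs_rest)  # [0, 1, 0, 1]
```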
Model Development and Training
We will create separate models for each case.
Python3
train_X, test_X, train_y, test_y = train_test_split(
    features, target, test_size=0.25, random_state=42
)

# One binary logistic regression per one-vs-rest case
model_1 = LogisticRegression(random_state=0).fit(train_X, train_y[:, 0])
model_2 = LogisticRegression(random_state=0).fit(train_X, train_y[:, 1])
model_3 = LogisticRegression(random_state=0).fit(train_X, train_y[:, 2])

print("Model Accuracy :")
print(f"model 1 - {model_1.score(test_X, test_y[:, 0])}")
print(f"model 2 - {model_2.score(test_X, test_y[:, 1])}")
print(f"model 3 - {model_3.score(test_X, test_y[:, 2])}")
|
Output:
Model Accuracy :
model 1 - 1.0
model 2 - 0.7368421052631579
model 3 - 1.0
If we take the average of these three binary accuracies, we get an overall accuracy of 91.2%.
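The 91.2% figure is the plain mean of the three accuracies printed above. As a quick check, with the values copied from that output:

```python
# Accuracies of the three one-vs-rest models, copied from the output above
accuracies = [1.0, 0.7368421052631579, 1.0]

mean_accuracy = sum(accuracies) / len(accuracies)
print(f"{mean_accuracy:.1%}")  # 91.2%
```

This is an average over three binary problems, so it should not be read as the accuracy of a single three-class prediction.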
Computing ROC – AUC Score
Now let's calculate the ROC AUC score for the predictions made by the model using the one-vs-rest method.
Python3
model = OneVsRestClassifier(LogisticRegression(random_state=0)).fit(train_X, train_y)
prob_test_vec = model.predict_proba(test_X)

n_classes = 3
fpr = [0] * n_classes
tpr = [0] * n_classes
thresholds = [0] * n_classes
auc_score = [0] * n_classes

# ROC curve and AUC for each class against the rest
for i in range(n_classes):
    fpr[i], tpr[i], thresholds[i] = roc_curve(test_y[:, i], prob_test_vec[:, i])
    auc_score[i] = auc(fpr[i], tpr[i])

auc_score
|
Output:
[1.0, 0.8047138047138047, 1.0]
The AUC score with Setosa as the positive class is 1, with Versicolour as the positive class it is 0.805, and with Virginica as the positive class it is 1.
Taking the average of the three gives a macro-average AUC of 0.9349. This is an AUC score, not an accuracy.
Python3
sum(auc_score) / n_classes
|
Output:
0.9349046015712682
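The same average can likely be obtained in one call with `sklearn.metrics.roc_auc_score`. The sketch below is self-contained and repeats the data preparation from this article; the variable names (`X`, `Y`, `clf`, `probs`, `per_class`, `macro`) are illustrative. It assumes that passing the binarized targets with `average=None` returns one AUC per class and that `average="macro"` returns their unweighted mean.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Same preparation as in the article: binarized targets, 25% test split
X, y = datasets.load_iris(return_X_y=True)
Y = label_binarize(y, classes=[0, 1, 2])
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=42
)

clf = OneVsRestClassifier(LogisticRegression(random_state=0)).fit(X_train, Y_train)
probs = clf.predict_proba(X_test)

# One AUC per class (one-vs-rest), then their unweighted mean
per_class = roc_auc_score(Y_test, probs, average=None)
macro = roc_auc_score(Y_test, probs, average="macro")

print(per_class)
print(macro)
```

Using the built-in averaging avoids the manual loop when only the summary score is needed, while the loop shown earlier is still required to keep the per-class `fpr` and `tpr` arrays for plotting.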
Visualizing ROC Curve
Now, using the predicted probabilities for the three classes, we will visualise the ROC curve of each class against the rest (one-vs-rest).
Python3
fig, ax = plt.subplots(figsize=(10, 10))
target_names = iris_data.target_names
colors = cycle(["aqua", "darkorange", "cornflowerblue"])

# Draw one one-vs-rest ROC curve per class on the same axes
for class_id, color in zip(range(n_classes), colors):
    RocCurveDisplay.from_predictions(
        test_y[:, class_id],
        prob_test_vec[:, class_id],
        name=f"ROC curve for {target_names[class_id]}",
        color=color,
        ax=ax,
    )
|
Output:

ROC Curve for each of the classes.
In the above graph only two lines appear to be visible, but all three curves are drawn. The Setosa and Virginica classes both have a ROC AUC score of 1.0, so their ROC curves overlap with each other.