Voting Classifier using Sklearn - ML
Last Updated :
07 May, 2025
A Voting Classifier is an ensemble learning technique that combines multiple individual models to make predictions. Instead of relying on a single model, it trains several models and makes the final prediction by aggregating their outputs, either by majority vote or by averaging predicted probabilities. It supports two types of voting:
1. Hard Voting
In hard voting each classifier casts a "vote" for a class. The class that gets the most votes is the final prediction. For example:
- Classifier 1 predicts: Class A
- Classifier 2 predicts: Class A
- Classifier 3 predicts: Class B
Here Class A gets two votes and Class B gets one vote so the final prediction is Class A.
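The counting in this example can be sketched in a few lines of plain Python. This is only an illustration of the majority rule, not how scikit-learn implements it internally; the `votes` list is a hypothetical stand-in for the three classifiers' predictions on one sample.

```python
from collections import Counter

# Hypothetical predictions from three classifiers for a single sample,
# matching the example above.
votes = ["Class A", "Class A", "Class B"]

# Count the votes and pick the label that occurs most often.
vote_counts = Counter(votes)
winner, n_votes = vote_counts.most_common(1)[0]

print(winner, n_votes)  # Class A 2
```

This sketch does not define a tie-breaking rule. In scikit-learn's VotingClassifier, ties are resolved by choosing the class that comes first in ascending sort order.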
2. Soft Voting
In soft voting, instead of choosing the class with the most votes, we take the average of the predicted probabilities for each class. The class with the highest average probability is the final prediction. For example, suppose three models predict the following probabilities for classes A and B (each model's two values do not sum to 1, so read A and B as two classes of a larger multi-class problem):
- Class A: [0.30, 0.47, 0.53]
- Class B: [0.20, 0.32, 0.40]
The average probability for Class A is $\frac{0.30 + 0.47 + 0.53}{3} \approx 0.43$ and for Class B is $\frac{0.20 + 0.32 + 0.40}{3} \approx 0.31$. Since Class A has the highest average probability, it is chosen as the final prediction. Voting ensembles tend to work best when the base models are diverse, because errors made by one model can then be outvoted by the others.
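The same averaging can be reproduced with NumPy. This is a minimal sketch of the arithmetic only; the array layout (one row per model, one column per class) and the variable names are assumptions made for illustration.

```python
import numpy as np

# One row per model, one column per class (A, B).
# Values are copied from the example above.
probas = np.array([
    [0.30, 0.20],
    [0.47, 0.32],
    [0.53, 0.40],
])
classes = ["Class A", "Class B"]

# Average over the models (axis 0), then pick the class with the largest mean.
avg = probas.mean(axis=0)
winner = classes[int(np.argmax(avg))]

print(np.round(avg, 2), winner)
```

The averages come out at roughly 0.43 and 0.31, so Class A is selected, which agrees with the hand calculation.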
Python Implementation of Voting Classifier
Step 1: Import Required Libraries
We first import the libraries needed for the classifiers, the dataset and model evaluation.
Python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
Step 2: Load the Dataset
We will use the Iris dataset which is a popular dataset for classification tasks. The load_iris() function provides the dataset and we will extract features and labels.
Python
iris = load_iris()
X = iris.data    # all four features (the original slice [:, :4] selects every column)
Y = iris.target  # class labels 0, 1, 2
Step 3: Split the Data into Training and Testing Sets
We need to divide the data into training and testing sets. We'll use 80% of the data for training and 20% for testing, using the train_test_split() function.
Python
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.20, random_state=42)
Step 4: Create Ensemble of Models
We will create a list of different classifiers to combine into our Voting Classifier. Here we are using Logistic Regression, Support Vector Classifier (SVC) and Decision Tree Classifier.
Python
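models = [
    # multi_class is deprecated since scikit-learn 1.5; lbfgs already fits a
    # multinomial model for multi-class targets
    ('LR', LogisticRegression(solver='lbfgs', max_iter=200)),
    # probability=True enables predict_proba, which soft voting needs
    ('SVC', SVC(gamma='auto', probability=True)),
    ('DTC', DecisionTreeClassifier())
]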
Step 5: Initialize and Train the Voting Classifier with Hard Voting
We will first create a Voting Classifier that uses Hard Voting. This means each classifier votes for a class and the class with the most votes wins. After initializing the classifier, we fit it to the training data.
Python
vot_hard = VotingClassifier(estimators=models, voting='hard')
vot_hard.fit(X_train, y_train)
Step 6: Making Predictions and Evaluating
We use the trained Hard Voting classifier to predict the test set and calculate the accuracy.
Python
y_pred = vot_hard.predict(X_test)
score = accuracy_score(y_test, y_pred)
print(f"Hard Voting Accuracy: {score:.2f}")  # %d would truncate the float to an integer
Output:
Hard Voting Accuracy: 1.00
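To see how the ensemble compares with its members, the fitted base models can be scored individually. The sketch below rebuilds the pipeline from the previous steps so that it runs on its own. The `random_state` passed to the decision tree and the names `base_scores` and `ensemble_score` are assumptions added for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

models = [
    ("LR", LogisticRegression(max_iter=200)),
    ("SVC", SVC(gamma="auto", probability=True)),
    ("DTC", DecisionTreeClassifier(random_state=42)),  # seed is an assumption
]
vot_hard = VotingClassifier(estimators=models, voting="hard")
vot_hard.fit(X_train, y_train)

# named_estimators_ maps each name to its fitted clone.
# Iris labels are already 0, 1, 2, so the base models' predictions
# can be compared with y_test directly.
base_scores = {
    name: accuracy_score(y_test, est.predict(X_test))
    for name, est in vot_hard.named_estimators_.items()
}
ensemble_score = accuracy_score(y_test, vot_hard.predict(X_test))

for name, score in base_scores.items():
    print(f"{name}: {score:.2f}")
print(f"Hard voting: {ensemble_score:.2f}")
```

If every base model already scores highly on its own, as tends to happen on Iris, the ensemble has little room to improve on them. The benefit of voting is usually clearer on harder datasets.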
Step 7: Initialize and Train the Voting Classifier with Soft Voting
Next, we will create a Soft Voting classifier. Soft voting takes the average probability of each class from all the classifiers and selects the class with the highest average probability.
Python
vot_soft = VotingClassifier(estimators=models, voting='soft')
vot_soft.fit(X_train, y_train)
Step 8: Making Predictions and Evaluating
Finally we will use the Soft Voting classifier to predict the test set and calculate its accuracy.
Python
y_pred = vot_soft.predict(X_test)
score = accuracy_score(y_test, y_pred)
print(f"Soft Voting Accuracy: {score:.2f}")  # %d would truncate the float to an integer
Output:
Soft Voting Accuracy: 1.00
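We can also check that soft voting really is an average of the base models' probabilities. The sketch below is self-contained and assumes equal weights (the default when `weights` is not set). The names `manual_proba` and `ensemble_proba` and the fixed random seeds are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

models = [
    ("LR", LogisticRegression(max_iter=200)),
    ("SVC", SVC(gamma="auto", probability=True, random_state=0)),
    ("DTC", DecisionTreeClassifier(random_state=0)),
]
vot_soft = VotingClassifier(estimators=models, voting="soft")
vot_soft.fit(X_train, y_train)

# Probabilities reported by the ensemble itself.
ensemble_proba = vot_soft.predict_proba(X_test)

# Manual average over the fitted base models (equal weights).
manual_proba = np.mean(
    [est.predict_proba(X_test) for est in vot_soft.estimators_], axis=0
)

print(np.allclose(ensemble_proba, manual_proba))

# The predicted class is the one with the highest average probability.
manual_pred = vot_soft.classes_[manual_proba.argmax(axis=1)]
print(np.array_equal(manual_pred, vot_soft.predict(X_test)))
```

Both checks should print True. Each row of `ensemble_proba` also sums to 1, because every base model outputs a full probability distribution over the three Iris classes.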
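Both the Hard and Soft Voting classifiers classified every test sample correctly. Hard Voting used majority votes, while Soft Voting averaged the predicted probabilities. Note that the test set contains only 30 samples, so a perfect score here does not guarantee the same accuracy on other data.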
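Because a single 80/20 split of a 150-sample dataset leaves only a small test set, one way to get a steadier estimate is k-fold cross-validation. The following is a sketch rather than part of the original walkthrough: the choice of 5 folds, the fixed seeds and the `cv_results` dictionary are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

models = [
    ("LR", LogisticRegression(max_iter=200)),
    ("SVC", SVC(gamma="auto", probability=True, random_state=0)),
    ("DTC", DecisionTreeClassifier(random_state=0)),
]

cv_results = {}
for voting in ("hard", "soft"):
    clf = VotingClassifier(estimators=models, voting=voting)
    # cross_val_score clones the classifier for each fold and returns
    # one accuracy value per fold.
    scores = cross_val_score(clf, X, y, cv=5)
    cv_results[voting] = scores
    print(f"{voting} voting: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

The mean and standard deviation across folds give a better sense of how each voting mode is likely to perform on unseen data than one accuracy figure from a single split.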