Seminar Presentation
PREDICTION USING MACHINE LEARNING
1. Machine Learning
● Machine learning is a branch of artificial intelligence (AI) and computer science that focuses on the use of data and algorithms to imitate the way humans learn, gradually improving in accuracy.
● Machine learning is an important component of the growing
field of data science.
● Through the use of statistical methods, algorithms are trained
to make classifications or predictions, uncovering key insights
within data mining projects.
2. Ensemble Learning
● Ensemble learning helps improve machine learning results by
combining several models.
● This approach generally produces better predictive performance than any single model.
● The basic idea is to learn a set of classifiers (experts) and allow them to vote, as sketched below.
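A minimal sketch of the voting idea, using scikit-learn's VotingClassifier on a synthetic dataset; the base models and data below are illustrative assumptions, not the exact setup of this work:

# Illustrative only: combine several classifiers and let them vote.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each base model acts as an "expert"; hard voting takes the majority class.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=42)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
print("Ensemble accuracy:", ensemble.score(X_test, y_test))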
3. Basic Algorithms Used
3.1 Linear Regression
● Linear regression is one of the easiest and most
popular Machine Learning algorithms.
● It is a statistical method that is used for predictive
analysis.
● Linear regression makes predictions for
continuous/real or numeric variables such as sales,
salary, age, product price, etc.
● The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression.
● Because linear regression models a linear relationship, it finds how the value of the dependent variable changes with the value of the independent variable.
● The linear regression model provides a sloped straight line representing
the relationship between the variables.
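A minimal sketch of fitting a linear regression with scikit-learn; the synthetic data and coefficients below are illustrative assumptions:

# Illustrative only: fit a straight line y = w*x + b to noisy synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))              # independent variable
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 1, 100)  # dependent variable with noise

model = LinearRegression()
model.fit(X, y)

print("slope:", model.coef_[0])        # should be close to 3.0
print("intercept:", model.intercept_)  # should be close to 5.0
print("prediction at x=4:", model.predict([[4.0]])[0])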
3.2 Random Forest
● Random Forest is a popular machine learning algorithm that
belongs to the supervised learning technique.
● It can be used for both Classification and Regression problems
in ML.
● It is based on the concept of ensemble learning, which is a
process of combining multiple classifiers to solve a complex
problem and to improve the performance of the model.
● "Random Forest is a classifier that contains a number of
decision trees on various subsets of the given dataset and
takes the average to improve the predictive accuracy of
that dataset.“
● Instead of relying on a single decision tree, the random forest takes the prediction from each tree and predicts the final output based on the majority vote of those predictions.
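A minimal sketch of a random forest regressor in scikit-learn; the synthetic data and hyperparameters are illustrative assumptions, not this project's actual configuration:

# Illustrative only: random forest regression on a synthetic dataset.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 trees, each trained on a bootstrap sample of the data;
# the forest averages their individual predictions.
forest = RandomForestRegressor(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print("R2 on test set:", forest.score(X_test, y_test))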
3.3 Gradient Boost
● Gradient boosting is a machine learning technique used
in regression and classification tasks, among others.
● When a decision tree is the weak learner, the resulting algorithm is called
gradient-boosted trees; it usually outperforms random forest.
● It is similar to grid search, yet it has proven to yield comparatively better results.
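A minimal sketch of gradient-boosted trees with scikit-learn; the synthetic data and hyperparameters are illustrative assumptions:

# Illustrative only: gradient boosting adds shallow trees sequentially,
# each new tree fitting the residual errors of the current ensemble.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

booster = GradientBoostingRegressor(
    n_estimators=200,    # number of boosting stages (weak learners)
    learning_rate=0.05,  # shrinks each tree's contribution
    max_depth=3,         # shallow trees serve as weak learners
    random_state=42,
)
booster.fit(X_train, y_train)
print("R2 on test set:", booster.score(X_test, y_test))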
4.3 Matplotlib
● Matplotlib is an easy-to-use and powerful visualization library in Python.
● It is built on NumPy arrays and designed to work with the
broader SciPy stack and consists of several plots like line, bar,
scatter, histogram, etc.
● Matplotlib is a low-level graph plotting library in Python that serves as a visualization utility.
● Matplotlib is open source and can be used freely.
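A minimal sketch of a Matplotlib plot; the data below is made up purely for illustration:

# Illustrative only: a scatter plot with a trend line.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 50)
y = 2 * x + np.random.normal(0, 2, size=50)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(x, y, label="observed data")        # scatter plot of the points
ax.plot(x, 2 * x, color="red", label="trend")  # straight reference line
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
plt.show()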
4.4 Seaborn
● Seaborn is a data visualization library built on top of
matplotlib and closely integrated with pandas data
structures in Python.
● Visualization is the central part of Seaborn, which helps in the exploration and understanding of data.
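A minimal sketch of Seaborn working on a pandas DataFrame; the column names and values are illustrative assumptions:

# Illustrative only: scatter plot with a fitted regression line and confidence band.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6, 7, 8],
    "score": [52, 55, 61, 64, 70, 74, 79, 85],
})

sns.regplot(data=df, x="hours_studied", y="score")
plt.show()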
4.6 Sklearn
● Scikit-learn (Sklearn) is the most useful and robust library
for machine learning in Python.
● It provides a selection of efficient tools for machine learning and statistical modeling, including classification, regression, clustering and dimensionality reduction, via a consistent interface in Python.
● This library, which is largely written in Python, is built upon
NumPy, SciPy and Matplotlib.
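A minimal sketch of the consistent interface: different estimators can be swapped in behind the same fit/score calls. The synthetic data is an illustrative assumption:

# Illustrative only: every scikit-learn estimator follows the same fit/predict API.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LinearRegression(),
              RandomForestRegressor(random_state=0),
              GradientBoostingRegressor(random_state=0)):
    model.fit(X_train, y_train)  # identical call for every model
    print(type(model).__name__, model.score(X_test, y_test))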
4.6.1 Mean Squared Error
● The Mean Squared Error (MSE) or Mean Squared Deviation (MSD) of an estimator measures the average of the squared errors, i.e. the average squared difference between the estimated values and the true values.
● It is a risk function, corresponding to the expected value of the squared error loss.
● It is always non-negative, and values close to zero are better.
● The MSE is the second moment of the error (about the origin) and thus incorporates both the variance of the estimator and its bias.
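A minimal sketch of computing MSE both by hand and with scikit-learn's mean_squared_error; the example values are made up:

# Illustrative only: MSE = (1/n) * sum((y_true - y_pred) ** 2)
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

mse_manual = np.mean((y_true - y_pred) ** 2)
mse_sklearn = mean_squared_error(y_true, y_pred)
print(mse_manual, mse_sklearn)  # both print 0.375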
4.6.2 R2 Score
● The coefficient of determination, also called the R2 score, is used to evaluate the performance of a linear regression model.
● It is the proportion of the variation in the dependent (output) attribute that is predictable from the independent (input) variable(s).
● It is used to check how well the observed results are reproduced by the model, based on the proportion of the total variation that the model explains.
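A minimal sketch of the R2 score computed by hand and with scikit-learn's r2_score; the example values are made up:

# Illustrative only: R2 = 1 - SS_res / SS_tot
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
print(1 - ss_res / ss_tot, r2_score(y_true, y_pred))  # both are about 0.88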
5. Results
Plots
R2 value for various models