project
BATCH NO: 15
UNDER THE GUIDANCE OF: MR CH.RAJESH
V.DURGAPRASAD-20N31A05N1
P.SAIKIRAN-20N31A05J5
S.GAFAR ALI-20N31A05L7
AGENDA
• Introduction
• Abstract
• Existing system
• Proposed system
• Software and Hardware Requirements
• Literature survey
• System architecture(Modules Explanation)
• UML Diagrams
• Algorithms and technologies
• Output screens/ Results
• Conclusion
• Future scope
INTRODUCTION
• Car price prediction is an interesting and popular problem. According to the Agency for Statistics of BiH,
921,456 vehicles were registered in 2014, of which 84% are cars for personal use [1]. This number has
increased by 2.7% since 2013, and the trend will likely continue, so the number of cars will keep growing.
This adds significance to the problem of car price prediction. Accurate car price prediction requires
expert knowledge, because price usually depends on many distinctive features and factors. Typically, the
most significant ones are brand and model, age, horsepower, and mileage.
• Predicting the price of preowned cars is a challenging task due to the multitude of factors that influence
the value of a vehicle. Machine learning algorithms offer a powerful solution to this problem by leveraging
historical data and patterns to make accurate predictions. This introduction provides an overview of the
significance of predicting preowned car prices and outlines the role of machine learning in addressing this
challenge.
• Pricing a preowned car requires background knowledge, because its value is dynamically influenced by
factors such as model, year, mileage, condition, and location.
ABSTRACT
• Car price prediction has been a high-interest research area, as it requires noticeable
effort and expert knowledge of the field. A considerable number of distinct attributes are
examined for reliable and accurate prediction.
• To build a model for predicting the price of used cars in Bosnia and Herzegovina, we applied
three machine learning techniques (Artificial Neural Network, Support Vector Machine, and
Random Forest).
• The respective performances of the different algorithms were then compared to find the one that best suits
the available data set. The final prediction model was integrated into a Java application.
Furthermore, the model was evaluated using test data, and an accuracy of 87.38% was obtained.
• Machine learning models, including linear regression, decision trees, random forests, and
neural networks, are trained on the prepared dataset. The evaluation process, employing metrics
such as Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE), ensures the model's
ability to generalize to new, unseen data. Iterative fine-tuning and optimization enhance model
performance, addressing challenges like overfitting and improving robustness.
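The two evaluation metrics named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's evaluation code: the function names and the sample prices are hypothetical, and a library such as Scikit-learn offers equivalent metric functions.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; penalizes large misses more."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical prices, for illustration only.
actual_prices = [10000, 15000, 20000, 25000]
predicted_prices = [11000, 14000, 23000, 25000]

mae = mean_absolute_error(actual_prices, predicted_prices)
rmse = root_mean_squared_error(actual_prices, predicted_prices)
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

Because RMSE squares each error before averaging, the single 3000 miss in this sample pulls RMSE above MAE. Comparing the two is a quick way to see whether a model makes a few large errors.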
PROBLEM STATEMENT
• The market for preowned cars is vast and diverse, with numerous factors influencing the pricing of these vehicles.
Determining an accurate valuation for a used car involves considering variables such as make, model, year of
manufacture, mileage, condition, location, and various other features. This complexity presents a challenge for both
• In this context, the problem statement revolves around developing a machine-learning model capable of predicting
the price of preowned cars accurately. The primary goal is to leverage historical data on car sales, encompassing
various attributes, to build a predictive model that can estimate the market value of a used vehicle with minimal error.
EXISTING SYSTEM
• The existing system of the project involves the application of machine learning techniques, specifically
Artificial Neural Network, Support Vector Machine, and Random Forest, as an ensemble, to predict the
price of used cars in Bosnia and Herzegovina. The data for the prediction was collected from the web
portal autopijaca.ba using a web scraper written in PHP. The researchers compared the performances of
different algorithms to find the one that best suits the available dataset. The final prediction model was
integrated into a Java application and evaluated using test data, achieving an accuracy of 87.38%. The
system also involved data preprocessing, attribute ranking, and the creation of nominal classes for
classification. The ensemble method combined multiple machine learning classifiers to strengthen the
overall classification performance.
• Online Car Marketplaces: Several online platforms and car marketplaces employ machine learning
algorithms to provide estimated prices for preowned cars. These estimates often consider factors such as
make, model, year, mileage, and condition.
• Automotive Valuation Tools: Some automotive industry services and websites offer valuation tools that
utilize machine learning to predict preowned car prices. These tools often incorporate historical sales data
and market trends to generate estimates.
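The existing system described above turns prices into nominal classes and combines several classifiers into an ensemble. The sketch below illustrates both ideas with a simple majority vote. Everything in it is an assumption made for illustration: the price bands, the function names, and the three rule-based "classifiers", which stand in for trained models such as an Artificial Neural Network, a Support Vector Machine, and a Random Forest. It is not the method of the reference study.

```python
from collections import Counter

# Hypothetical price bands (upper bound, label); not the classes used in the reference study.
PRICE_BANDS = [(5000, "low"), (15000, "medium"), (float("inf"), "high")]


def price_to_class(price):
    """Map a numeric price to a nominal class label."""
    for upper_bound, label in PRICE_BANDS:
        if price < upper_bound:
            return label
    raise ValueError(f"cannot classify price: {price!r}")


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical rule-based stand-ins for trained classifiers.
def classify_by_age(car):
    return "high" if car["age"] <= 3 else "medium" if car["age"] <= 10 else "low"


def classify_by_mileage(car):
    km = car["mileage_km"]
    return "high" if km < 50_000 else "medium" if km < 200_000 else "low"


def classify_by_power(car):
    hp = car["horsepower"]
    return "high" if hp >= 120 else "medium" if hp >= 75 else "low"


def ensemble_predict(car, classifiers):
    """Collect one vote per classifier and return the majority class."""
    return majority_vote([classify(car) for classify in classifiers])


car = {"age": 2, "mileage_km": 150_000, "horsepower": 150}
classifiers = [classify_by_age, classify_by_mileage, classify_by_power]
predicted_class = ensemble_predict(car, classifiers)
print(predicted_class)
```

With an odd number of voters and few classes, ties are rare. A real ensemble might instead weight each vote by the classifier's validation accuracy.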
PROPOSED SYSTEM
The proposed system for predicting the price of preowned cars using machine learning algorithms offers
several advantages:
1. Increased Accuracy: Machine learning algorithms can analyze complex relationships and patterns within
large datasets, leading to more accurate price predictions. The system's ability to consider multiple factors
simultaneously enhances accuracy compared to traditional pricing methods.
2. Adaptability to Market Dynamics: The system can adapt to changing market conditions and trends.
Machine learning models continuously learn from new data, ensuring that the predictions remain relevant
and reflective of the current preowned car market.
3. Comprehensive Feature Consideration: The machine learning models consider a wide range of features,
including mileage, age, model specifications, and regional market trends. This comprehensive feature
consideration allows for a more holistic and nuanced understanding of the factors influencing preowned car
prices.
4. Time and Cost Savings: Once trained, machine learning models can quickly generate predictions, saving
time compared to manual pricing methods.
5. Reduced Human Bias: Traditional pricing methods may be influenced by subjective judgments or biases.
Machine learning algorithms rely on data-driven insights, minimizing human biases in the pricing process.
LIMITATIONS OF EXISTING SYSTEM
• The existing system for the car price prediction project faces several challenges. One of the primary
issues is the limited accuracy of single machine learning algorithms when applied to the dataset. The
document highlights that the application of a single machine learning algorithm resulted in an accuracy
of less than 50%, indicating the inadequacy of this approach for achieving accurate car price
predictions. Additionally, the ensemble of multiple machine learning algorithms, while significantly
improving accuracy to 92.38%, was noted to consume more computational resources than a single
machine learning algorithm.
SOFTWARE REQUIREMENT SPECIFICATION
Functional Requirements:
1. Programming Language: Python, for its extensive machine learning libraries such as Scikit-learn, TensorFlow,
or PyTorch.
2. Machine Learning Libraries: Scikit-learn, TensorFlow, or PyTorch for building and training machine
learning models.
3. Data Processing Libraries: Pandas and NumPy for data manipulation and analysis.
4. Visualization Libraries: Matplotlib and Seaborn for visualizing data and model outputs.
5. Integrated Development Environment (IDE): Jupyter Notebook, PyCharm, or VS Code for code
development.
6. Web Development (if applicable): Flask or Django for creating a web interface to interact with the
prediction model.
Literature Survey:
• The first paper is Predicting the Price of Used Car Using Machine Learning
Techniques. In this paper, they investigate the application of supervised machine
learning techniques to predict the price of used cars in Mauritius. The predictions
are based on historical data collected from daily newspapers. Different
techniques like multiple linear regression analysis, k-nearest neighbors, naïve
Bayes, and decision trees have been used to make the predictions.
• The second paper is Car Price Prediction Using Machine Learning Techniques. A
considerable number of distinct attributes are examined for reliable and accurate
prediction. To build a model for predicting the price of used cars in Bosnia and
Herzegovina, they have applied three machine learning techniques (Artificial
Neural Network, Support Vector Machine, and Random Forest).
• The third paper is the Price Evaluation model in second-hand car systems based
on BP neural networks. In this paper, the price evaluation model based on big
data analysis is proposed, which takes advantage of widely circulated vehicle
data and a large number of vehicle transaction data to analyze the price data for
each type of vehicle by using the optimized BP neural network algorithm. It aims
to establish a second-hand car price evaluation model to get the price that best
matches the car.
USER REQUIREMENTS
• User requirements for a car price prediction system typically revolve around
functionality, usability, accuracy, and convenience. Here's a breakdown of these
requirements specific to car price prediction:
• Accurate Predictions: Users expect the system to provide precise estimates of car
prices based on various parameters like model, make, year, mileage, condition, and
optional features.
• Customization Options: Users might desire the ability to fine-tune predictions by
adjusting weightage or importance given to certain features or parameters based on
personal knowledge or preferences.
• Ease of Use: The system should have an intuitive interface that allows users to input
car details effortlessly and receive accurate price predictions without complexity or
confusion.
• Real-time Updates: For users interested in current market values, the system should
ideally provide real-time or up-to-date predictions to reflect the current market trends.
• Accuracy Validation: Users might seek a way to validate the accuracy of predictions
through historical data or benchmarks, ensuring the system's reliability.
SOFTWARE AND HARDWARE REQUIREMENTS
• SOFTWARE REQUIREMENTS:
• Operating system: Windows 7, 8, or 10 (Ultimate), Linux, or macOS.
• Front-End: Python.
• Coding Language: Python.
• Software Environment: Visual Studio.
• HARDWARE REQUIREMENTS:
• System: Intel Core i3, i5, or i7 processor.
• Hard Disk: 500 GB.
• Floppy Drive: 1.44 MB.
• Monitor: 14-inch color monitor.
• Mouse: Optical Mouse.
SYSTEM ARCHITECTURE
SYSTEM MODULES
Car price prediction involves multiple modules to process, analyze, and forecast prices accurately. Here are the key
modules typically involved in building a car price prediction system:
• Data Collection: Gathering comprehensive data is crucial. This includes historical car prices, features, mileage,
year of manufacture, model, brand, location, and market trends. This data can be sourced from various databases,
APIs, or scraped from websites.
• Data Preprocessing: Raw data often requires cleaning, normalization, and transformation. Handling missing
values, outlier detection, and feature engineering are essential here. This step prepares the data for further analysis.
• Feature Selection: Identifying the most relevant features that significantly impact the car price. Some features
might have a more pronounced effect on price, such as mileage, brand reputation, model year, engine size, etc.
• Model Selection: Choosing an appropriate machine learning or statistical model for prediction. Common models
used include linear regression, decision trees, random forests, support vector machines (SVM), neural networks, or
ensemble methods. The selection depends on the dataset size, complexity, and accuracy requirements.
• Training the Model: Using historical data to train the selected model. This involves splitting the data into training
and validation sets, feeding it to the model, and tuning parameters to achieve the best performance.
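The preprocessing and splitting steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the listings, column names, and helper functions are hypothetical, and in practice Pandas and Scikit-learn provide equivalent tools.

```python
import random
from statistics import median

# Hypothetical listings, for illustration only; one mileage value is missing.
raw_listings = [
    {"brand": "VW", "year": 2012, "mileage": 150000, "price": 9000},
    {"brand": "Audi", "year": 2015, "mileage": None, "price": 16000},
    {"brand": "VW", "year": 2008, "mileage": 220000, "price": 5000},
    {"brand": "Skoda", "year": 2016, "mileage": 90000, "price": 12000},
    {"brand": "Audi", "year": 2010, "mileage": 180000, "price": 8500},
    {"brand": "Skoda", "year": 2013, "mileage": 130000, "price": 9500},
    {"brand": "VW", "year": 2017, "mileage": 60000, "price": 15000},
    {"brand": "Audi", "year": 2006, "mileage": 250000, "price": 4000},
]


def fill_missing(rows, key):
    """Replace None in `key` with the median of the known values."""
    fallback = median(r[key] for r in rows if r[key] is not None)
    return [{**r, key: fallback if r[key] is None else r[key]} for r in rows]


def min_max_scale(rows, key):
    """Rescale a numeric column to the range [0, 1]."""
    values = [r[key] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid division by zero for constant columns
    return [{**r, key: (r[key] - low) / span} for r in rows]


def one_hot(rows, key):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[key] for r in rows})
    encoded_rows = []
    for r in rows:
        encoded = {k: v for k, v in r.items() if k != key}
        for category in categories:
            encoded[f"{key}_{category}"] = 1 if r[key] == category else 0
        encoded_rows.append(encoded)
    return encoded_rows


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle reproducibly, then hold out a fraction for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


rows = fill_missing(raw_listings, "mileage")
rows = min_max_scale(min_max_scale(rows, "mileage"), "year")
rows = one_hot(rows, "brand")
train_rows, validation_rows = train_validation_split(rows)
print(len(train_rows), "training rows,", len(validation_rows), "validation rows")
```

For brevity, the sketch scales the whole data set before splitting. A stricter setup would compute the median and the min/max on the training rows only and reuse them for the validation rows, so that no validation information leaks into training.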
ALGORITHMS/MODULES
• Predicting the price of preowned cars involves regression, and several machine-learning algorithms can
be used for this task. Here are some commonly used algorithms for predicting car prices:
Linear Regression:
• Simple and interpretable.
• Assumes a linear relationship between features and the target variable.
Decision Trees:
• Can capture non-linear relationships.
• Prone to overfitting, often used within ensemble methods.
Random Forest:
• Ensemble of decision trees that can improve generalization.
• Handles overfitting better than individual decision trees.
Support Vector Machines (SVM):
• Effective in high-dimensional spaces.
• Requires proper feature scaling.
K-Nearest Neighbors (KNN):
• Predicts from the k nearest neighbors: their average price in regression, or a majority vote in classification.
• Sensitive to outliers.
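As one concrete instance of the algorithms listed above, here is a minimal K-Nearest Neighbors sketch for a regression target such as price: the prediction is the average price of the k closest training cars. The function name, the feature values (assumed to be already scaled to [0, 1]), and the prices are hypothetical. Scikit-learn's `KNeighborsRegressor` is the usual library choice.

```python
import math


def knn_predict(train_features, train_prices, query, k=3):
    """Average the prices of the k training points closest to `query`."""
    if not 1 <= k <= len(train_features):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each Euclidean distance with its price, then sort by distance.
    by_distance = sorted(
        (math.dist(features, query), price)
        for features, price in zip(train_features, train_prices)
    )
    nearest = by_distance[:k]
    return sum(price for _, price in nearest) / k


# Hypothetical (age, mileage) features, already scaled to [0, 1].
train_features = [(0.1, 0.1), (0.2, 0.15), (0.3, 0.3), (0.8, 0.9), (0.9, 0.8)]
train_prices = [20000, 18000, 15000, 5000, 4000]

query_car = (0.15, 0.12)
estimate = knn_predict(train_features, train_prices, query_car, k=3)
print(f"Estimated price: {estimate:.2f}")
```

Scaling matters because the distance treats all features equally: with raw values, mileage in the hundreds of thousands would dominate age in years. The averaging step is also why a single mispriced neighbor can distort the estimate when k is small.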
UML Diagrams
1. USE CASE DIAGRAM
2. CLASS DIAGRAM
3. SEQUENCE DIAGRAM
4. COLLABORATION DIAGRAM
5. ACTIVITY DIAGRAM
Algorithms and Technologies
• Linear Regression:
CONCLUSION
In conclusion, our study demonstrates the efficacy of machine learning algorithms in predicting prices for pre-owned cars.
Through the implementation of various algorithms and rigorous evaluation, we have shown that these models can offer accurate
and reliable price estimations, thereby enhancing transparency and efficiency in the pre-owned car market. Our findings indicate
that gradient boosting emerges as the top-performing algorithm, outperforming other methods in terms of prediction accuracy.
This highlights the importance of leveraging ensemble methods and advanced techniques in capturing the complex relationships
between car attributes and prices.
Furthermore, our analysis underscores the significance of features such as mileage, year of manufacture, and geographical
location in influencing pre-owned car prices. Understanding the impact of these factors is crucial for both buyers and sellers to
negotiate fair deals and optimize their transactions. Additionally, our study sheds light on the potential of machine learning
models to transform pricing strategies in the automotive industry, empowering stakeholders with data-driven insights to make
informed decisions.