Car Price Prediction
Name:
Section: Q2145
Reg No:
Roll No:
Group No: 10
Submitted To: Mr. Tanveer Kajla
Date of Submission: 14/10/2021
Peer Rating:
Learning Outcomes:
Declaration:
I declare that this Assignment is my individual work. I have not copied it from any
other student’s work or from any other source except where due acknowledgement is
made explicitly in the text, nor has any part been written for me by any other person.
Student's Signature:
car. The most important factors in predicting the price of a car are its brand, model,
horsepower and mileage.
The type of fuel the car uses also affects its price, because fuel prices fluctuate. Features of
the car such as the exterior, color, doors, etc. also influence the price. In this report, various
methods and techniques are used to predict the car price.
Research Questionnaire:
How do the given predictors affect the car price, and what does an individual look for when
purchasing a car that affects the price?
Predicting the car price is a difficult and challenging task because many factors affect it.
The data clearly show that car prices are higher in the USA than in Asia and Europe. In this
report we try to predict the car price from features such as the model, mileage, horsepower
and engine size. We also consider the type of car an individual generally prefers (small car,
sedan, SUV, etc.) and whether buyers prefer a high, low or moderate price.
Make2
TYPE_CAT
ORIGIN_CAT
DRIVE_CAT
EngineSize
Cylinders
Horsepower
MPG_City
MPG_Highway
Weight
Wheelbase
Length
Correlation Matrix
A correlation matrix is a table that displays the correlation coefficients between variables. It
is best used with variables that have a linear relationship with each other. Each cell in the
table shows the correlation between two variables.
Variable Invoice Make2 TYPE_CAT ORIGIN_CAT DRIVE_CAT EngineSize Cylinders Horsepower MPG_City MPG_Highway Weight Wheelbase Length
Invoice 1.0000
Make2 -0.0658 1.0000
TYPE_CAT 0.0201 0.0020 1.0000
ORIGIN_CAT 0.3962 0.2586 -0.0078 1.0000
DRIVE_CAT -0.4689 0.1069 -0.2998 -0.2156 1.0000
EngineSize 0.5660 -0.2238 0.0400 -0.1721 -0.3904 1.0000
Cylinders 0.6452 -0.2022 -0.0094 0.0159 -0.4291 0.9080 1.0000
Horsepower 0.8241 -0.1429 0.0726 0.2006 -0.4671 0.7932 0.8103 1.0000
MPG_City -0.4713 0.1643 -0.1094 -0.0103 0.3747 -0.7179 -0.6844 -0.6770 1.0000
MPG_Highway -0.4355 0.1169 -0.0878 0.0100 0.3858 -0.7259 -0.6761 -0.6474 0.9410 1.0000
Weight 0.4419 -0.1554 -0.0570 -0.0605 -0.2519 0.8087 0.7422 0.6318 -0.7404 -0.7936 1.0000
Wheelbase 0.1480 -0.2057 0.1117 -0.2764 -0.1484 0.6389 0.5467 0.3876 -0.5080 -0.5255 0.7609 1.0000
Length 0.1656 -0.1455 0.1115 -0.3440 -0.0734 0.6360 0.5478 0.3824 -0.5042 -0.4688 0.6892 0.8898 1.0000
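As an illustration of how a lower-triangular correlation matrix like the one above can be produced, the following is a minimal sketch in plain Python. The report does not say which tool was used, so the helper names `pearson` and `correlation_matrix` and the toy values are assumptions made for this example; they are not the report's data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical toy values, NOT taken from the report's dataset.
toy = {
    "Horsepower": [100, 150, 200, 250, 300],
    "MPG_City": [32, 27, 22, 19, 16],
    "Weight": [2400, 2900, 3300, 3800, 4300],
}

matrix = correlation_matrix(toy)

# Print only the lower triangle, in the same layout as the table above.
names = list(toy)
for i, row in enumerate(names):
    cells = [f"{matrix[row][col]:7.4f}" for col in names[: i + 1]]
    print(f"{row:<11}", " ".join(cells))
```

Because the matrix is symmetric and every variable correlates perfectly with itself, printing the lower triangle with 1.0000 on the diagonal loses no information.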
REGRESSION ANALYSIS
Regression analysis is a set of statistical processes used to estimate the relationship between a
dependent variable (often called the outcome or response variable) and one or more
independent variables (often called predictors, covariates or features). The most common
form of regression is linear regression, in which one finds the line that best fits the data
according to a specific criterion, such as least squares.
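To make the idea of fitting a line concrete, here is a minimal sketch of simple linear regression with one predictor, assuming the least-squares criterion. The function names `fit_line` and `r_squared` and the invoice and price values are hypothetical and chosen only for illustration; the report's own model uses 13 predictors.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical toy values, NOT taken from the report's dataset.
invoice = [15000, 20000, 25000, 30000, 35000]
price = [16900, 22300, 27600, 33400, 38600]

intercept, slope = fit_line(invoice, price)
fit_quality = r_squared(invoice, price, intercept, slope)
print(f"price = {intercept:.2f} + {slope:.4f} * invoice, R^2 = {fit_quality:.5f}")
```

With several predictors the same criterion applies, but the coefficients are found by solving a system of equations rather than with these two closed-form expressions.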
Regression Statistics
Multiple R 0.999222148
R Square 0.998444901
Adjusted R Square 0.998395832
Standard Error 779.9122765
Observations 426
ANOVA
df SS MS F Significance F
Regression 13 1.609E+11 12376885083 20347.91175 0
Residual 412 250604421.5 608263.159
Total 425 1.6115E+11
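The regression statistics can be cross-checked against the ANOVA table. The sketch below recomputes R Square, Adjusted R Square, F and the Standard Error from the sums of squares, mean squares and degrees of freedom reported above. The variable names are chosen for this example, and because the total sum of squares is printed in rounded form (1.6115E+11), the results agree with the reported figures only to several decimal places.

```python
import math

# Values copied from the ANOVA table above.
ss_residual = 250604421.5
ss_total = 1.6115e11          # printed in rounded form in the report
df_residual = 412
df_total = 425
ms_regression = 12376885083
ms_residual = 608263.159

# R Square: share of the total variation explained by the model.
r_square = 1 - ss_residual / ss_total

# Adjusted R Square: penalises the number of predictors via degrees of freedom.
adjusted_r_square = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# F statistic: ratio of the regression and residual mean squares.
f_statistic = ms_regression / ms_residual

# Standard Error of the regression: square root of the residual mean square.
standard_error = math.sqrt(ms_residual)

print(f"R Square          {r_square:.6f}")
print(f"Adjusted R Square {adjusted_r_square:.6f}")
print(f"F                 {f_statistic:.2f}")
print(f"Standard Error    {standard_error:.4f}")
```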
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 606.817528 814.3168092 0.745186052 0.456584138 -993.9164475 2207.551504 -993.9164475 2207.551504
Invoice 1.089888441 0.004448559 244.9980712 0 1.081143737 1.098633146 1.081143737 1.098633146
Make2 -10.12748716 3.575433827 -2.832519815 0.004844944 -17.15585541 -3.099118916 -17.15585541 -3.099118916
TYPE_CAT 22.80957907 43.51100967 0.52422546 0.600403641 -62.72169168 108.3408498 -62.72169168 108.3408498
ORIGIN_CAT -30.77138349 71.98614262 -0.427462597 0.669265745 -172.2773219 110.7345549 -172.2773219 110.7345549
DRIVE_CAT -98.7633829 58.60803924 -1.685150778 0.092716709 -213.9714673 16.44470145 -213.9714673 16.44470145
EngineSize 200.1718922 118.5400758 1.688643193 0.092044486 -32.84690919 433.1906936 -32.84690919 433.1906936
Cylinders -97.49311515 66.58883209 -1.464106098 0.143927526 -228.389352 33.40312169 -228.389352 33.40312169
Horsepower 1.28423358 1.390545691 0.923546481 0.356263045 -1.449215745 4.017682906 -1.449215745 4.017682906
MPG_City 21.94430475 24.49238937 0.895964229 0.370795048 -26.20133018 70.08993968 -26.20133018 70.08993968
MPG_Highway -7.354320198 24.47645848 -0.300465045 0.76397401 -55.46863916 40.75999876 -55.46863916 40.75999876
Weight 0.296651326 0.137571472 2.156343336 0.031634957 0.026221776 0.567080877 0.026221776 0.567080877
Wheelbase -46.40918566 12.36057829 -3.754612817 0.000198573 -70.70685137 -22.11151994 -70.70685137 -22.11151994
Length 16.63710955 6.832798312 2.434889601 0.015319873 3.20561424 30.06860487 3.20561424 30.06860487
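One way to read the coefficient table is to check which predictors are statistically significant. The sketch below copies the coefficient, standard error and P-value of each row and keeps those with a P-value below 0.05. The 5% threshold is an assumption made for this example, chosen to match the 95% confidence intervals in the table; the names `rows`, `ALPHA` and `significant` are illustrative.

```python
# (coefficient, standard error, P-value) copied from the table above.
rows = {
    "Intercept":   (606.817528, 814.3168092, 0.456584138),
    "Invoice":     (1.089888441, 0.004448559, 0.0),
    "Make2":       (-10.12748716, 3.575433827, 0.004844944),
    "TYPE_CAT":    (22.80957907, 43.51100967, 0.600403641),
    "ORIGIN_CAT":  (-30.77138349, 71.98614262, 0.669265745),
    "DRIVE_CAT":   (-98.7633829, 58.60803924, 0.092716709),
    "EngineSize":  (200.1718922, 118.5400758, 0.092044486),
    "Cylinders":   (-97.49311515, 66.58883209, 0.143927526),
    "Horsepower":  (1.28423358, 1.390545691, 0.356263045),
    "MPG_City":    (21.94430475, 24.49238937, 0.370795048),
    "MPG_Highway": (-7.354320198, 24.47645848, 0.76397401),
    "Weight":      (0.296651326, 0.137571472, 0.031634957),
    "Wheelbase":   (-46.40918566, 12.36057829, 0.000198573),
    "Length":      (16.63710955, 6.832798312, 0.015319873),
}

ALPHA = 0.05  # assumed significance level, matching the 95% intervals

# The t Stat column is the coefficient divided by its standard error.
t_stats = {name: coef / se for name, (coef, se, _) in rows.items()}

# Keep the predictors whose P-value falls below the threshold.
significant = [name for name, (_, _, p) in rows.items() if p < ALPHA]

print("Significant at 5%:", ", ".join(significant))
```

Under this assumption, Invoice, Make2, Weight, Wheelbase and Length are significant, while the remaining predictors have 95% confidence intervals that include zero.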
The R-squared value is 0.998, which means the regression model explains about 99.8% of the
variance in the dependent variable. Taken together, the independent variables account for
99.8% of the variation in the predicted value.
Interpretation: From the above figure, the regression model explains 99.8% of the variance in
the dependent variable.
Interpretation: From the above table, the regression model explains 99.8% of the variance in
the dependent variable.
Interpretation: In the above figure of predicted price and residuals, there is not much
difference between the predicted and actual prices, which is consistent with the model's
R-squared of 0.998.
Interpretation: There is not much fluctuation between the price and the wheelbase, which is
consistent with the model's R-squared of 0.998.
Interpretation: From the above figure, the plotted variables contribute to the prediction of the
dependent variable in a model that explains 99.8% of its variance.
Interpretation: The length of the car contributes to the prediction of the dependent variable in
a model that explains 99.8% of its variance.
Interpretation: The above figure shows the residual plot for invoice, which contributes to a
model that explains 99.8% of the variance in the dependent variable.
Interpretation: The figure shows that the plotted variable contributes to the prediction of the
dependent variable in a model that explains 99.8% of its variance.
Interpretation: In both of the above figures, drive category and engine size contribute to the
prediction of the dependent variable and are fit for a role in predicting the car price, in a
model that explains 99.8% of the variance.
Interpretation: From the above figure, length and weight contribute to the prediction of the
dependent variable and are considered a good fit for the regression model, which explains
99.8% of the variance.
Interpretation: The above figure shows that the wheelbase of the car contributes to the
prediction of the dependent variable and is fit for a role in the regression model, which
explains 99.8% of the variance.
The R-squared value is 0.998, which means the regression model fits the data well: it explains
99.8% of the variance in the car price. The independent variables therefore jointly account for
99.8% of the variation in the dependent variable.
REFERENCES
https://towardsdatascience.com/predicting-used-car-prices-with-machine-learning-techniques-8a9d8313952
https://towardsdatascience.com/introduction-to-regression-analysis-9151d8ac14b3