Journal of Information and Computational Science ISSN: 1548-7741

Prediction of medical costs using regression algorithms

A. Lakshmanarao, Chandra Sekhar Koppireddy, G. Vijay Kumar

1 Assistant Professor, Department of CSE, Raghu Engineering College, Dakamarri, Visakhapatnam

2 Assistant Professor, Department of CSE, Pragati Engineering College, Surampalem, Andhra Pradesh, India

3 Assistant Professor, Department of CSE, Pragati Engineering College, Surampalem, Andhra Pradesh, India

Abstract
Health care costs are increasing day by day. As a growing number of new viruses infect
people, there is a need to predict health charges. This type of prediction helps
governments make decisions regarding health issues. People are also aware of the
importance of health care costs. Machine learning is a field that has an impact on
every other field, and health care systems also use machine learning models for several
health-related applications. In this paper, we perform predictive analysis on medical
health insurance charges. We build a model to predict the medical insurance cost of a
person based on attributes such as gender. We collected the dataset from Kaggle; it
contains 1338 rows of data with the features age, gender, smoker, BMI, children, region,
and insurance charges. The data contains medical information and the costs billed by
health insurance companies. We applied various regression algorithms to this dataset to
predict medical costs. For implementation, we used the Python programming language.

Keywords: Medical insurance costs, Kaggle, Machine Learning

1. Introduction

According to the World Bank, total expenditure on health care as a proportion of GDP in
2015 was 3.89%. Of this 3.89%, government health expenditure accounts for just 1% of
GDP, and out-of-pocket expenditure made up 65.06% of current health expenditure in 2015.
Over the last few decades, advances in medical technology have made it possible to cure
illnesses that were once regarded as serious. However, the cost of treatment is so high
that it is practically impossible for a middle-class individual to afford it. According
to statistics, a Rs 5 lakh family floater policy that covers self, spouse, and one child
costs anywhere between Rs 10,000 and Rs 17,000 per year, whereas a Rs 5 lakh individual
health plan costs Rs 4,000-7,000 per year.

Volume 10 Issue 5 - 2020 751 www.joics.org

2. Literature Survey

Machine learning is a technology in which machines learn from previous data and predict
outcomes for new samples. Machine learning models are applicable in all fields, and the
medical field is no exception: it has used ML models in different situations for several
years. Many researchers have applied machine learning techniques to medical cost
prediction. B. Nithya et al. [1] applied machine learning models to predictive analytics
in health care, using various supervised and unsupervised models for predictive
analysis. They also suggested that machine learning tools and techniques are decisive in
the health care domain and are used extensively in the diagnosis and prediction of
various types of cancer. Anuja Tike et al. [2] applied hierarchical decision trees in a
medical price prediction system. Their experiments showed that the price prediction
system achieves high accuracy. Moran et al. [3] used linear regression techniques to
predict Intensive Care Unit (ICU) costs, with patient demographics, DRG (Diagnosis
Related Group), length of hospital stay, and a few others as features. Gregori et al. [4]
applied various regression models to analyze medical costs in health care systems. They
concentrated mainly on reducing the bias in the cost estimates to achieve good results.
Dimitris Bertsimas et al. [5] applied different data mining techniques, which provided
accurate predictions of medical costs and represent a powerful tool for predicting
health-care costs.

3. Proposed Method

The dataset used for the experiments was collected from the Kaggle [6] machine learning
repository. This dataset was inspired by the book Machine Learning with R by Brett
Lantz. The data contains medical information and the costs billed by health insurance
companies. It contains 1338 rows of data and the following columns: age, gender, BMI,
children, smoker, region, and insurance charges. Of these features, insurance charges is
the dependent variable and the remaining features are the independent variables. In
regression analysis, we predict the value of the dependent variable from the independent
variables. First, we collected the dataset and applied various data preprocessing
methods. Data preprocessing includes removing missing values from the data, because
most machine learning algorithms cannot be applied while values are missing. After
removing missing values, we apply label encoding or one-hot encoding to the categorical
features, which are features whose values are labels rather than numbers. After that, we
apply standardization or normalization to the data. This step is used when the attribute
values are not all on the same scale.
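
The preprocessing steps described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the rows are made-up values, the column names (sex, charges, and so on) are assumptions about how the Kaggle file is laid out, and pandas is assumed to be installed.

```python
import pandas as pd

# Tiny made-up sample; column names are assumed, values are illustrative only.
df = pd.DataFrame({
    "age": [21, 35, 28, 52],
    "sex": ["female", "male", "male", "male"],
    "bmi": [26.5, 31.2, None, 23.8],
    "children": [0, 1, 3, 0],
    "smoker": ["yes", "no", "no", "no"],
    "region": ["southwest", "southeast", "southeast", "northwest"],
    "charges": [17000.0, 4200.0, 4500.0, 9800.0],
})

# 1. Remove rows with missing values.
df = df.dropna().reset_index(drop=True)

# 2. One-hot encode the categorical features (drop_first avoids a redundant column).
encoded = pd.get_dummies(
    df, columns=["sex", "smoker", "region"], drop_first=True, dtype=float
)

# 3. Standardize the numeric features to zero mean and unit variance.
numeric = ["age", "bmi", "children"]
encoded[numeric] = (encoded[numeric] - encoded[numeric].mean()) / encoded[numeric].std(ddof=0)

# Independent variables (X) and dependent variable (y).
X = encoded.drop(columns="charges")
y = encoded["charges"]
```

Label encoding would instead map each category to an integer; one-hot encoding is used in this sketch because region has no natural order.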

Figure 1: Proposed model. Flowchart: (1) collection of the dataset from the Kaggle
store; (2) splitting of the data into a training set and a testing set, and applying the
ML methods to the training set; (3) applying the Multiple Linear Regressor, Support
Vector Regressor, Decision Tree Regressor, and Random Forest Regressor; (4) applying
each model to the testing data; (5) comparing the models and selecting the best one.

We applied the following four regression models to the dataset.

i) Multiple Linear Regression
ii) Support Vector Regression
iii) Decision Tree Regression
iv) Random Forest Regression
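
A hedged sketch of the workflow in Figure 1 with these four models is given below. It assumes scikit-learn and NumPy are available, uses default hyperparameters (the paper does not state the settings used), and runs on synthetic stand-in data, so its scores are not comparable to the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the insurance data (NOT the Kaggle dataset).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 65, n)
bmi = rng.normal(30, 6, n)
smoker = rng.integers(0, 2, n)
X = np.column_stack([age, bmi, smoker])
y = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 3000, n)

# Split the data into a training set and a testing set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Multiple Linear Regression": LinearRegression(),
    "Support Vector Regression": SVR(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the training set and score it on the testing set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Compare the models and select the one with the highest R-squared.
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: R-squared = {score:.2f}")
print("Best model:", best)
```

The 80/20 split and the fixed random seeds are choices made for this sketch only.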

3.1 Multiple Linear Regression:

Multiple linear regression (MLR) is a basic machine learning regression model in which
there is one dependent variable and multiple independent variables. The value of the
dependent variable is calculated from the independent variables. In this dataset, the
dependent variable is medical charges and the independent variables are age, gender,
smoker, BMI, children, and region.
Multiple linear regression uses the ordinary least-squares (OLS) method to find the
best-fitting line (a hyperplane when there are several independent variables).
The formula for multiple linear regression is as follows:
$Y = b_0 + b_1 X_1 + \dots + b_k X_k + \alpha$

Here, $Y$ is the dependent variable, $X_i$ ($i = 1, \dots, k$) are the independent
variables, $b_0$ is the y-intercept (constant term), $b_i$ is the slope coefficient of
the independent variable $X_i$, and $\alpha$ is the model error term.
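
As an illustration of how ordinary least squares recovers the coefficients in this formula, the sketch below fits made-up, noise-free data generated from known coefficients. NumPy is assumed to be available; the data and variable names are hypothetical.

```python
import numpy as np

# Made-up data generated from known coefficients: Y = 2 + 3*X1 - 1*X2 (no noise).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

# Prepend a column of ones so that the first coefficient is the intercept b0.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise the sum of squared residuals.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
b0, b1, b2 = coef

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```

Because the data contain no error term, the fitted coefficients match the generating ones up to floating-point rounding; with real data the residuals would be non-zero.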

3.2 Support Vector Regression:

Support Vector Regression (SVR) is the regression variant of the Support Vector Machine,
which can be used for both regression and classification. In Support Vector Regression,
a hyperplane is fitted to the data together with a margin of tolerance around it, and
errors that fall inside this margin are ignored. The fitted hyperplane is then used to
predict the continuous value of the dependent variable.
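
The margin of tolerance can be made concrete with the epsilon-insensitive loss, which is the standard formulation of SVR. The paper does not state a loss function, so both the loss and the value of epsilon below are assumptions of this sketch.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Mean loss that ignores errors smaller than epsilon (the margin of tolerance)."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Made-up values: errors of 5, 30 and 0 with a tolerance of 10.
y_true = [100.0, 200.0, 300.0]
y_pred = [105.0, 230.0, 300.0]
loss = epsilon_insensitive_loss(y_true, y_pred, epsilon=10.0)
```

Only the second prediction lies outside the margin, so it alone contributes to the loss (30 - 10 = 20, averaged over three samples).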

3.3 Decision Tree Regression:

Decision tree regression is one of the most widely used regression models. It is a
tree-structured machine learning model. In this model, the mean squared error (MSE) is
used at each step to choose the split, starting with the root node, and this is applied
recursively to build the tree. It breaks the dataset down into smaller and smaller
subsets while incrementally developing the decision tree. The final tree contains
decision nodes and leaf nodes. A decision node has two or more child nodes, which denote
values of the attribute tested. A leaf node represents a decision on the numerical
target. Decision trees are capable of handling both numerical and categorical data.
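
The MSE-based choice of a split can be sketched for a single numeric feature as follows. The helper names and the data are hypothetical, and a real implementation would repeat the search over every feature and recurse into each subset.

```python
def mse(values):
    """Mean squared error of predicting the mean of `values`."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(xs, ys):
    """Return (threshold, weighted_mse) of the best split `x <= threshold`."""
    best_threshold, best_score = None, float("inf")
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * mse(left) + len(right) * mse(right)) / len(ys)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Made-up data: charges jump sharply between age 30 and age 50.
ages = [20, 25, 30, 50, 55, 60]
charges = [2000.0, 2200.0, 2100.0, 9000.0, 9500.0, 9200.0]
threshold, score = best_split(ages, charges)
```

On this data the search selects the threshold that separates the low-charge group from the high-charge group, since that split leaves the least variance within each side.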

3.4 Random Forest Regression:

Random forest is a combination of more than one model, which is also called an ensemble
approach. In the ensemble technique, we combine the predictions from more than one
decision tree to predict the value of the dependent variable. It can be treated as a
bagging method, where the average of the individual trees' predictions is used as the
final prediction.
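
A minimal bagging sketch is shown below, assuming NumPy and scikit-learn are available. It uses synthetic one-feature data and gives every tree equal weight, which is how scikit-learn's forest regressor combines its trees. A full random forest typically also restricts each split to a random subset of features; that part is omitted here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic one-feature data (not the insurance dataset).
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1.0, size=200)

# Bagging: fit each tree on a bootstrap sample (rows drawn with replacement).
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeRegressor(max_depth=4, random_state=0)
    trees.append(tree.fit(X[idx], y[idx]))

# The ensemble prediction is the average of the individual tree predictions.
X_new = np.array([[2.0], [5.0], [8.0]])
per_tree = np.array([tree.predict(X_new) for tree in trees])
ensemble_prediction = per_tree.mean(axis=0)
```

The number of trees (25) and the depth limit (4) are arbitrary choices for the sketch, not settings taken from the paper.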

4. Experimentation and Results

We conducted all experiments in Python. Python has a library named "scikit-learn",
which provides a large number of functions and classes for machine learning models. The
results of the experiments are tabulated. The regression analysis is based on the
following measures.
MAE (Mean Absolute Error): the average of the absolute differences between the original
and predicted values over the data set.
MSE (Mean Squared Error): the average of the squared differences between the original
and predicted values over the data set.
RMSE (Root Mean Squared Error): the square root of the Mean Squared Error.

R-squared (Coefficient of determination): R-squared measures how well the predicted
values fit the original values. Its value usually lies between 0 and 1, although it can
be negative when a model fits worse than always predicting the mean. The best possible
score is 1.0; the higher the value, the better the model.
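
The four measures can be written out directly from their definitions. The sketch below uses plain Python and made-up values; the function name is hypothetical, and scikit-learn's metrics module offers equivalent functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)                # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up true and predicted charges.
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3200.0, 3800.0]
metrics = regression_metrics(y_true, y_pred)
```

For these values the errors are 100, 100, 200 and 200 in magnitude, which gives MAE = 150, MSE = 25000, RMSE of about 158.1, and R-squared = 0.98.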
Using the above four measures, we compared the different models; the results are
tabulated below:

ML model                     R-squared   Mean Absolute Error   Mean Squared Error   Root Mean Squared Error
Multiple Linear Regression   0.78        4008                  33571665             5794
Support Vector Regression    0.26        5678                  1734584581           13175
Decision Tree Regression     0.68        3401                  50404643             7099
Random Forest Regression     0.85        2760                  23294452             4826

Table 1: Results of regression models

Comparison of models based on R-squared values:

R-squared is the most widely used measure. After applying the four models, random forest
performed best on this dataset.

Figure 2: R-squared values of models

Comparison of models based on MAE:

MAE is another measure of the performance of a machine learning model. After applying
the four machine learning algorithms, random forest performs best on the dataset.

Figure 3: MAE values comparison

So, after applying the four algorithms, Random Forest Regression gives the best results.

We also applied Multiple Linear Regression with the backward elimination technique to
find the most predominant variables for determining medical charges. In the backward
elimination method, we start with all independent variables and then remove the
variables with high p-values until we are left with the best independent variables.

Steps for implementing MLR with backward elimination:

Step 1: Start with all independent variables.
Step 2: Identify the variable with the highest p-value and remove it.
Step 3: Identify the variable with the next highest p-value and remove it.
Step 4: Repeat Step 3 until one or two variables remain.
Step 5: The remaining features are the valuable features for regression analysis.
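
The steps above can be sketched as follows, assuming NumPy is available. Two points are assumptions of this sketch and not taken from the paper: the loop stops once every remaining p-value is at or below a threshold of 0.05 (rather than at a fixed number of variables), and p-values are computed with a normal approximation to the t distribution, which is reasonable only for large samples. The data are synthetic.

```python
import math

import numpy as np


def ols_p_values(X, y):
    """Fit OLS with an intercept; return approximate two-sided p-values per column of X."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    dof = n - design.shape[1]
    sigma2 = residuals @ residuals / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    t_stats = coef / np.sqrt(np.diag(cov))
    # Normal approximation to the t distribution (an assumption; fine for large n).
    p = np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])
    return p[1:]  # skip the intercept


def backward_elimination(X, y, names, threshold=0.05):
    """Drop the feature with the highest p-value until all are at or below `threshold`."""
    keep = list(range(X.shape[1]))
    while keep:
        p = ols_p_values(X[:, keep], y)
        worst = int(np.argmax(p))
        if p[worst] <= threshold:
            break
        keep.pop(worst)
    return [names[i] for i in keep]


rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 4))
# Only the first two features influence y; the other two are pure noise.
y = 5.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0, 1.0, n)
selected = backward_elimination(X, y, ["x1", "x2", "x3", "x4"])
print("Selected features:", selected)
```

Libraries such as statsmodels report exact t-based p-values and could replace the approximation used here.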

After Step 1, we obtained the following values.

            coef     std err         t     P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const   -102.5428     472.699    -0.217     0.828   -1029.859     824.773
x1        13.0485     286.203     0.046     0.964    -548.409     574.506
x2      -115.5913     292.189    -0.396     0.692    -688.793     457.610
x3     -1.196e+04     294.329   -40.645     0.000   -1.25e+04   -1.14e+04
x4      1.186e+04     331.935    35.731     0.000    1.12e+04    1.25e+04
x5       257.7350      11.904    21.651     0.000     234.383     281.087
x6       322.3642      27.419    11.757     0.000     268.576     376.153
x7       474.4111     137.856     3.441     0.001     203.973

5. Conclusion
In this paper, we proposed a machine learning model for predicting medical costs. We
applied four regression techniques: Multiple Linear Regression, Support Vector
Regression, Decision Tree Regression, and Random Forest Regression. We also applied MLR
with the backward elimination technique and observed that age and BMI are the features
that determine the dependent variable. Of all the experiments, the Random Forest model
gave the best result.

References
1) B. Nithya and V. Ilango, "Predictive Analytics in Health Care Using Machine Learning
Tools and Techniques," International Conference on Intelligent Computing and Control
Systems (ICICCS), IEEE, 2017.
2) A. Tike and S. Tavarageri, "A Medical Price Prediction System using Hierarchical
Decision Trees," IEEE Big Data Conference, IEEE, 2017.
3) Lahiri and N. Agarwal, "Predicting healthcare expenditure increase for an individual
from medicare data," in Proceedings of the ACM SIGKDD Workshop on Health Informatics,
2014.
4) Gregori, M. Petrinco, S. Bo, A. Desideri, F. Merletti, and E. Pagano, "Regression
models for analyzing costs and their determinants in health care: an introductory
review," International Journal for Quality in Health Care, vol. 23, no. 3, pp. 331–341,
2011.
5) Bertsimas, M. V. Bjarnadóttir, M. A. Kane, J. C. Kryder, R. Pandey, S. Vempala, and
G. Wang, "Algorithmic prediction of health-care costs," Operations Research, vol. 56,
no. 6, pp. 1382–1392, 2008.
6) https://www.kaggle.com/mirichoi0218/insurance
