
Random Forest Algorithm, Support Vector Machine for Regression Analysis

Tushar Yadav, Nabha Varshey, Shanu Khare

Computer Science and Engineering, Chandigarh University, India
[email protected], [email protected], [email protected]

Abstract—Random Forest is one of the most versatile and powerful machine learning algorithms and has gained considerable traction in many fields. In this research study, we explore the fundamental ideas behind Random Forest, its practical applications, and its predictive modeling performance indicators. Bagging, decision trees, and ensemble learning are some of the fundamental concepts behind the Random Forest algorithm. The first part of the paper describes how Random Forest combines many decision trees to minimize overfitting and variability while increasing forecast accuracy. Its versatility can be seen in a number of industries, from banking to healthcare, image processing to natural language processing. This study also looks at several situations and use cases in which Random Forest has shown exceptional forecasting power. The paper provides an empirical analysis based on real-world data sets to evaluate the method's performance and compares Random Forest to other machine learning algorithms. The research also highlights key elements to consider when implementing Random Forest, including hyperparameter tuning and feature selection techniques to improve model performance, as well as how to interpret Random Forest models to understand the importance of features.
The concept and original model of the Support Vector Machine was introduced by Vapnik and is categorized into two types: Support Vector Classification and Support Vector Regression. SVM is a learning model that operates in a multi-dimensional feature space and produces prediction functions based purely on a subset of support vectors. The technique can effectively compress complex gray-level structures using only a few support vectors. This paper provides an overview of SVMs for regression and function estimation, including algorithms for training SVMs and modifications to the standard SV algorithm. It also discusses regularization and capacity control from an SV perspective.

Keywords: SVM (Support Vector Machine), SVR (Support Vector Regression), SVC (Support Vector Classification), VM (Vector Machine), Pred Fn (Prediction Function).

I. INTRODUCTION
The Random Forest algorithm is a highly versatile and potent machine learning technique that has gained significant popularity across various domains [1]. This research study delves into the core principles behind Random Forest, its practical applications, and how to gauge its predictive modeling effectiveness. Key concepts like bagging, decision trees, and ensemble learning underpin the Random Forest algorithm [2]. The initial part of the paper elucidates how Random Forest amalgamates multiple decision trees to reduce overfitting and variability while enhancing forecasting accuracy. Random Forest's adaptability is evident in its wide-ranging applications, spanning from industries such as banking and healthcare to image processing and natural language understanding. An examination of different scenarios and use cases reveals the remarkable predictive capabilities of Random Forest [3]. The study conducts an empirical analysis using real-world datasets to assess the algorithm's performance and compare it with other machine learning methods. Various performance metrics, such as precision, recall, and F1-scores, are employed to evaluate the effectiveness of the algorithm in diverse scenarios. The research also emphasizes critical factors to consider when implementing Random Forest, such as fine-tuning hyperparameters and employing feature selection techniques to enhance model performance [4]. Additionally, it elucidates methods for interpreting Random Forest models to comprehend feature importance. This research paper comprehensively covers the fundamentals, applications, and performance evaluation of the Random Forest algorithm, serving as a valuable resource for both novice and experienced researchers in the field of predictive modeling and ML. To harness the full potential of the Random Forest algorithm across real-world scenarios, it is imperative to grasp its strengths and weaknesses.

The Support Vector Machine was originally introduced by Vapnik and is further classified into two
types: Support Vector Classification and Support Vector Regression [5]. SVM is a type of learning
system that operates in a multi-dimensional feature space that helps in generating prediction functions
based on a subset of support vectors. This technique can effectively compress complex gray-level
structures using only a few support vectors [6]. This paper presents an overview of SVMs for regression
and function estimation, including algorithms for training SVMs and modifications to the standard SV
algorithm. It also discusses regularization and capacity control from an SV perspective.

II. LITERATURE REVIEW


Bernard et al. [7] (2007) provide practical insights and recommendations for parameter settings when applying Random Forest algorithms to real-world pattern recognition tasks.

Diaz-Valera et al. [8] (2010) describe Fuzzy Random Forest as a multiple classifier system, leveraging the robustness of ensemble methods, randomness for diversity, and fuzzy logic for handling imperfect data.

Adam S. et al. [9] (2009) describe the mechanisms that drive cooperation among trees in a Random Forest ensemble. They emphasize the significance of the "Correlation/Strength" ratio in explaining the performance of sub-forests, potentially offering valuable insights for optimizing Random Forest models.

The authors of [10] (Springer-Verlag, 2008) describe artificial intelligence as a multidisciplinary field that automates tasks requiring human intelligence, with the aim of educating readers on its applications in data analysis and decision-making across diverse disciplines.

The authors of [11] (2009) describe the limitations of classical Random Forest induction, which involves a fixed number of randomized decision trees. They highlight two main drawbacks: the need to predefine the number of trees and the loss of interpretability due to randomization.

Heutte L. et al. [12] (2010) describe the Dynamic Random Forest induction algorithm, which adapts tree induction to enhance ensemble complementarity. It achieves this through data resampling, inspired by boosting techniques, and by incorporating traditional RF randomization processes.

Angelis A. et al. [13] (2006) acknowledge the significance of Leo Breiman's Random Forests in machine learning for their robustness and improved classification results.

Christine Diwe et al. [14] (2019) describe the significance of feature selection in datasets with numerous variables and highlight Random Forest as a robust tool for this purpose, particularly in regression tasks.

Gibbs Y. Kanyongo et al. [15] (2006) examine the correlation between home environment factors and reading performance in Zimbabwe using data from SACMEQ. Linear regression analysis conducted through structural equation modeling with AMOS 4.0 revealed that a socioeconomic-status proxy emerged as the most influential predictor of reading achievement.

III. METHODOLOGY
3.1 Random Forest. The Random Forest is made up of a collection of selected trees. We advance the classification performance of a single-tree classifier by combining bootstrap aggregation, also known as a bagging strategy, with randomness in the selection of splitting nodes during the construction of the decision trees [16].
A decision tree with M leaves divides the feature space into M regions R_m, 1 <= m <= M. The prediction function f(x) of each tree is defined as in Equations (1) and (2):

f(x) = \sum_{m=1}^{M} c_m \, \Pi(x \in R_m)   (1)

\Pi(x \in R_m) = 1 if x \in R_m, and 0 otherwise   (2)

where M is the number of regions in the feature space, R_m is the region corresponding to m, and c_m is a constant corresponding to m.
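As a concrete illustration, the following is a minimal sketch in R using the randomForest package, assuming a hypothetical training data frame train_df with a numeric response column y; these names are placeholders, not objects from the paper:

library(randomForest)

set.seed(42)                      # make the bootstrap samples reproducible
rf_model <- randomForest(
  y ~ .,                          # each tree realizes a piecewise-constant f(x)
  data  = train_df,
  ntree = 500                     # forest size, as used later in Section 4.2
)

# The forest prediction averages the per-tree functions over all trees
pred <- predict(rf_model, newdata = train_df)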

3.2 Important Features Study. The utilization of Random Forest for variable importance analysis has garnered significant attention from researchers. However, there are still unresolved issues that require satisfactory resolution, as outlined in [17]. In particular, the R version of this technique delivers two importance measurements for each explanatory variable. The first metric, %IncMSE, evaluates the average reduction in precision, i.e., the amount by which the predictions degrade when the values of the variable are permuted. This measure is computed by recording the prediction error of each tree on the OOB data and repeating the same process after permuting each predictor in turn; the generated difference is averaged over every tree and normalized by the spread of the differences [18]. If the variation in the difference for a given variable is 0, no division is done and the mean will often be zero. The magnitude of the difference indicates the importance of the variable, with larger differences indicating greater importance. This approach leverages the concept of Out-of-Bag (OOB) samples over a collection of regression trees. Specifically, we utilize the OOB subset excluded during the building of every tree to determine the mean squared error according to Formula (3):

\mathrm{MSE}_{\mathrm{OOB}} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2   (3)
where y_i is the real observed value, \hat{y}_i is the predicted one, and n is the amount of data utilized in the OOB set. For each tree b and each variable j that was utilized in producing the tree, variable j is randomly permuted in the OOB set, a fresh mean squared error is formed, and an indication of the importance of the parameter can then be determined from Formula (4) [19]:

\delta_{bj} = \mathrm{MSE}^{\mathrm{perm}}_{bj} - \mathrm{MSE}_{b}   (4)

The raw importance of variable j is the average of \delta_{bj} over all B trees in the random forest in which variable j was used, Formula (5):

\bar{\delta}_j = \frac{1}{B} \sum_{b=1}^{B} \delta_{bj}   (5)

and the overall significance value is developed by standardizing with the standard error:

\%\mathrm{IncMSE}_j = \bar{\delta}_j \, / \, \sigma_{\delta_{bj}}

where the standard deviation of \delta_{bj} is denoted by \sigma_{\delta_{bj}}. A larger percentage increase in the mean squared error (MSE), expressed as %IncMSE, indicates a variable's greater importance. The other important measure, which cannot be neglected, is IncNodePurity: the reduction in the loss function determined from the optimum split. For regression, the loss function is MSE, while for classification it is the Gini impurity. More useful variables lead to increased node purity; this is detected through splits with large between-segment variance and small within-segment variance.
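Continuing the sketch above, both measures can be read off a forest fitted with importance = TRUE; the code below only assumes the same hypothetical rf_model and train_df:

rf_model <- randomForest(y ~ ., data = train_df,
                         ntree = 500,
                         importance = TRUE)  # enables the permutation test

imp <- importance(rf_model)  # matrix with %IncMSE and IncNodePurity columns
imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]  # rank as in Figures 2-5
varImpPlot(rf_model)         # quick visual check of both rankings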

3.3 Support Vector Machines and Support Vector Regression are ML algorithms that have been extensively researched in recent years and have been introduced as powerful methods for classification. A comprehensive overview of SVM can be found in the references [20]. SVM utilizes a high-dimensional space to identify a hyperplane that minimizes the error rate for binary classification, as described in the references. The initial data format and the result range of SVM are specified by Equation (6):

(x_1, y_1), \ldots, (x_n, y_n), \quad x_i \in \mathbb{R}^m, \quad y_i \in \{-1, +1\}   (6)

where (x_1, y_1), ..., (x_n, y_n) represents the learning information, n represents the number of samples, m is the dimension of the input vectors, and y corresponds with the class 1 or -1.
The border between the classes is established by a hyperplane that is worked out as a weighted combination of a subset of the data points, known as Support Vectors (SVs). In regression scenarios, the purpose is to anticipate a quantity, and the regression might involve real-valued or discrete input variables. A problem with numerous input variables is commonly referred to as a multivariate regression case. The SVM approach has been extended to regression cases, resulting in SVR, as described in the references.
The output of SVR is computed using Formula (7):

Y_{\mathrm{svr}}(x) = \sum_{i=1}^{n} \beta_i \, k(x; x_i) + b   (7)

where Y_{svr}(x) represents the output, k(x; x_i) is the kernel function corresponding to support vector x_i, and b is a constant bias term. The variables \beta_i and x_i represent the weight and position of each support vector, respectively, and n denotes the number of SVs. The standard approach employs a single kernel function characterized by a set of parameters.
To avoid over-fitting, the support vector regression (SVR) formulation penalizes the regression via a cost function. The SVR approach is adaptable with regard to the maximum permissible error and the penalty cost, permitting both parameters to be changed to carry out sensitivity analysis as well as to improve the model. This sensitivity analysis involves training numerous models with varying permitted errors and cost parameters. Tuning the model is achieved via a grid-based search using the generic function tune(), which tunes hyperparameters of statistical techniques [21]. The SVR model's flexibility with regard to maximum error and penalty cost makes it an attractive choice for tweaking.
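A minimal sketch of such an SVR in R with the e1071 package follows; train_df, test_df, and y are again hypothetical placeholders, and the epsilon and cost values are illustrative rather than the paper's:

library(e1071)

svr_model <- svm(y ~ ., data = train_df,
                 type    = "eps-regression",  # SVR variant used in this study
                 kernel  = "radial",          # RBF kernel
                 cost    = 1,                 # penalty for errors outside the tube
                 epsilon = 0.1)               # half-width of the zero-penalty tube

pred <- predict(svr_model, newdata = test_df)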

3.4 Classification and regression training package. The Caret package offers multiple features that simplify model development and evaluation. It provides services for optimizing the training of models for challenging regression and classification tasks [22]. The package uses the capabilities of multiple R packages but makes an effort not to load them all at package startup; by eliminating formal package dependencies, the package start-up time can be greatly shortened. The package's dependency field lists around 30 packages, which Caret loads as needed, assuming they are already installed [23]. Its functionality includes data splitting, pre-processing, feature selection, model tuning via resampling, and variable importance estimation, among other tasks.
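For example, a minimal caret-based version of the same training loop might look as follows; note that method = "svmRadial" is caret's wrapper around the kernlab RBF SVM rather than e1071, so this is an assumption about one reasonable setup, not the paper's exact code:

library(caret)

ctrl <- trainControl(method = "cv", number = 10)  # 10-fold cross-validation

svm_fit <- train(y ~ ., data = train_df,
                 method     = "svmRadial",
                 trControl  = ctrl,
                 tuneLength = 5)  # evaluate 5 candidate values per parameter

svm_fit$bestTune   # the selected hyperparameters (sigma and C)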
3.5 Research workflow. Figure 1 depicts the full working of the research. The study comprises multiple phases. In this study, the Random Forest approach is utilized to pick out critical characteristics from each dataset. Moreover, independent ML models are created using SVM, RF, and the tuned combination of SVM and RF. Different models possess different capabilities in predicting data [24].

Fig. 1. Diagrammatic Representation of Methodology

An effort was put forth to amalgamate the benefits of the RF, SVM, and tuned-SVM regression algorithms to improve accuracy, and the final phase involved comparing the findings. RF was applied to determine the most significant characteristics for each dataset. This method is relatively recent compared to the other techniques included in this study and was created by Breiman to produce more exact predictions without overfitting [25]. RF applies a randomized subset of predictors to split each tree, leading to several alternative trees and a more accurate prediction. In this study, 500 trees were employed with RF. Previous work has shown that RF can deliver correct results in terms of evaluating critical variables and prediction accuracy, while additionally minimizing the instability arising from changes in the training sample. The relevant characteristics identified by RF were subsequently used to create the SVM model. SVM is an effective approach for classification and regression analysis, as explained in the references. Other studies have revealed that SVM leverages a high-dimensional space to identify a hyperplane for binary classification with minimum error rates.
It is imperative to verify the parameters of the SVM algorithm; as with many ML algorithms, SVM requires parameter tuning to achieve optimal performance. This is of utmost importance because SVM is highly sensitive to parameter selection: even slight variations in parameter values can result in vastly different outcomes. To address this issue, we conduct tests using various parameter values and utilize the svm() and tune.svm() functions within the e1071 package of the R language to construct the SVM model. We employ the RBF kernel function, for which two parameters, cost and gamma, must be determined, and we utilize the tune.svm() function to identify the optimal cost and gamma values [26]. Additionally, the tuned SVM regression method can effectively address issues of nonlinearity, small sample size, and high dimensionality, thereby enhancing regression analysis accuracy.
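A minimal sketch of this grid search with tune.svm(), using the gamma grid reported in Section 4.2; the cost grid is an illustrative assumption, since the paper does not list its cost values:

library(e1071)

set.seed(42)
tuned <- tune.svm(y ~ ., data = train_df,
                  type   = "eps-regression",
                  kernel = "radial",
                  gamma  = c(0.0001, 0.001, 0.01, 0.1, 1, 5, 10),  # Section 4.2 grid
                  cost   = c(0.1, 1, 10, 100))                     # illustrative values

summary(tuned)                # cross-validated error for each grid cell
best_svr <- tuned$best.model  # model refit with the best (gamma, cost) pair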

3.6 Model Performance Evaluation

The performance evaluation of the model is carried out using specified statistical indicators chosen to measure the efficacy of the suggested models. Given that our suggested model is based on regression analysis, we have adopted regression assessment measures. Notably, the Root Mean Squared Error (RMSE) is the most reliable measure for quantifying accuracy in this field. The square root guarantees that errors are scaled in the same units as the targets [27]:

\mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 }

Lower values of RMSE suggest a more favorable result. The Pearson correlation coefficient (r), whose square gives the coefficient of determination, is also used; when the value of r is 1, the regression predictions match the data completely:

r = \frac{ \sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}}) }{ \sqrt{ \sum_{i=1}^{n} (y_i - \bar{y})^2 } \, \sqrt{ \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 } }
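In R, both measures can be computed directly from held-out predictions, for example (y_test and pred being hypothetical vectors of observed and predicted values):

rmse <- sqrt(mean((y_test - pred)^2))  # root mean squared error
r    <- cor(y_test, pred)              # Pearson correlation coefficient
cat("RMSE:", rmse, " r:", r, "\n")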

IV. CONCLUSION
4.1 Dataset descriptions. These working simulations deploy four publicly accessible datasets from the UCI ML repository. All of the datasets concern regression information and have varying numbers of cases and characteristics. The details of each dataset are given in Table 1 below.

Table 1. Datasets used in the study

Dataset                    Year   Instances   Features
Wisconsin Breast Cancer    1991     699          10
Forest Fire                2008     517          13
Wine Quality               2009    4898          12
Bike Sharing               2013   17379          15

As presented in Table 1, the datasets used in the underlying study belong to the category of regression data. We utilize the Wisconsin Breast Cancer (WBC) Dataset, released in 1991 with a total of 699 patients and 10 characteristics; the Forest Fire Dataset, released in 2008 with 517 occurrences and 13 characteristics; the Wine Quality Dataset, published in 2009 with 4898 instances and 12 features; and the Bike Sharing Dataset, published in 2013 with 17379 instances and 15 features. In addition, Figures 2 and 3 highlight the crucial RF measurements for every variable in the first two datasets.

Fig. 2. Cancer Dataset

Fig. 3. Forest Dataset

Figures 2 and 3 show the variables of the Wisconsin Breast Cancer Dataset and the Forest Fire Dataset, ordered by the two crucial measures, %IncMSE and IncNodePurity, in decreasing order. These two metrics are the ones assigned by the RF. According to %IncMSE, the Wisconsin Breast Cancer Dataset is ranked as follows: bare nuclei, uniformity of cell size, uniformity of cell shape, and clump thickness are the four most important factors, followed by bland chromatin, normal nucleoli, and marginal adhesion; a grid search is then used for parameter modification of the functions. The score based on %IncMSE is used to enhance the accuracy of forecasts. In the Forest Fire Dataset, the wind speed, varying from 0.40 to 9.40 km/h, is the most relevant variable, followed by the temperature, which varies from 2.2 to 33.30; the DC index of the FWI system, which rises from 7.89 to 860.61; the FFMC index, which ranges from 18.69 to 96.24; the DMC index, which ranges from 1.11 to 291.31; and the month, which ranges from 1 (January) to 12 (December).

Fig. 4. Wine Database

Fig. 5. Bike Database

According to %IncMSE and IncNodePurity, Figures 4 and 5 show the crucial value of every variable in the Wine Quality Dataset and the Bike Sharing Dataset, respectively. Volatile acidity scores as the most significant element of the Wine Quality Dataset. The Bike Sharing Dataset's primary features are displayed in Figure 5 based on the %IncMSE ranking: the number of registered users is the most crucial attribute, followed by the year (0: 2011, 1: 2012), the number of casual users, the hour (0 to 22), and the working-day flag (1 if the chosen day is not a holiday, 0 otherwise). The atemp attribute is the normalized feeling temperature in Celsius.

4.2 Experiment results. Numerous tests have been carried out. After using RF to determine the key characteristics, we employ these chosen features for the SVM model. The total number of trees used in RF is 500. The SVM algorithm is run with eps-regression parameters and an RBF kernel as the kernel function. We simulate the findings with the following gamma values for each dataset: 0.0001, 0.001, 0.01, 0.1, 1, 5, and 10. To get the optimal parameter values for gamma and use them for tuning, the model uses tune.svm().
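Putting the pieces together, a hedged sketch of the full experimental pipeline (RF feature selection followed by a tuned SVR) might look as follows; train_df, test_df, y, and the cost grid are illustrative assumptions, and keeping the top six features mirrors the six-feature results reported below:

library(randomForest)
library(e1071)

set.seed(42)

# Step 1: rank features with a 500-tree Random Forest
rf_fit <- randomForest(y ~ ., data = train_df, ntree = 500, importance = TRUE)
imp    <- importance(rf_fit)
top6   <- names(sort(imp[, "%IncMSE"], decreasing = TRUE))[1:6]

# Step 2: tune an RBF SVR on the selected features only
sub_df <- train_df[, c(top6, "y")]
tuned  <- tune.svm(y ~ ., data = sub_df, type = "eps-regression",
                   kernel = "radial",
                   gamma = c(0.0001, 0.001, 0.01, 0.1, 1, 5, 10),
                   cost  = c(0.1, 1, 10))   # cost grid is illustrative

# Step 3: evaluate RMSE and r on held-out data
pred <- predict(tuned$best.model, newdata = test_df[, top6])
rmse <- sqrt(mean((test_df$y - pred)^2))
r    <- cor(test_df$y, pred)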

The SVM model can be tuned since the approach allows flexibility that helps to reduce the error and penalty costs. The model needs to be tuned in order to optimize the parameters for the most accurate prediction. Tables II, III, IV, and V and Figures 3 and 4 illustrate the evaluation results for each experiment using a different dataset. The regression evaluation using the Wisconsin Cancer Dataset is shown in Table II. This dataset comprises 10 features and 699 instances in total. In comparison to previous methods, our suggested model has the optimum values of RMSE and r. With six features, the RF, SVM, and updated SVM regression techniques could decrease the value of RMSE.

The result of the following experiment, which makes use of the Forest Fire Dataset, is shown in Table III. Here too, our suggested approach performs better than the alternative approaches: with six features, the RMSE value falls and the r value rises. The evaluation of the regression on the Wine Quality Dataset is shown in Table IV. With six features, the optimum RMSE value is 0.4909776 and the optimum value of r is 0.6974942. Table V presents an analysis of the regression approach using the Bike Sharing Dataset. When compared to other methods, our suggested model, which includes 7 features, has the greatest possible r value of 0.999314 and the lowest possible RMSE value of 4.813466. A lower RMSE is preferable to a greater one; conversely, the larger the value of r, the better. If the value of r is 1, the predicted data are exact. The results of the underlying study clearly demonstrate that in every experiment, for all the datasets taken into consideration, the RMSE value trends downward and the r value rises. This result is depicted in Figure 6. Based on the findings of the evaluation, the suggested model performs better than the other methods on every dataset. We reach the conclusion that the RF approach is reliable for choosing the key features, while the SVM method works excellently with small amounts of data. In each experiment, we could see that using fewer features allowed us to reduce RMSE values and increase r values.

Fig. 6. Evaluation of the regression method

According to the recommended model, the combination of RF, SVM, and tuned SVM regression may reduce the RMSE value and enhance the value of r, based on the evaluation of the regression approach in Tables II, III, IV, and V. For instance, with 6 features, the Wine Quality Dataset's RMSE value decreases from 0.6104656 to 0.4909776 and its r value grows from 0.5343693 to 0.6974942. The deduction from Figure 6 is that, across every experiment and dataset, the RMSE value tends to decline while the r value rises, showing that the regression's predictions for the data were accurate. Additionally, we extract important traits using the Random Forest technique. Figures 2 and 3 demonstrate the most important measures for each variable in each dataset according to %IncMSE and IncNodePurity. The performance of the model can potentially be improved by carefully picking the most important features, as we demonstrate in our study. With just a few features, we were still able to obtain respectable RMSE and r values. For enhanced efficiency, SVM has a few parameters that need to be modified.
The use of tune.svm() significantly boosts efficiency over other approaches, and it is well suited to small datasets.
For instance, in Table V, the tuning plan reduces the value of RMSE. Tuning the model optimizes the parameters for the most accurate prediction. The results generated by the simulation experiments indicate the importance and precision of the suggested technique. Combined prediction techniques are an important problem currently, so the combination algorithms discussed in this research may generalize to other fields as well. Further study may incorporate a comparison of additional unique models, kernels, methodologies, databases, and affecting aspects.

ACKNOWLEDGMENT

First of all, we would like to thank the Almighty for bestowing his blessings upon us during the successful completion of our project and for keeping us healthy throughout. Secondly, we would like to express our heartiest gratitude towards our project mentor, Ms. Shanu Khare, who helped and guided us at each and every step in completing our project. Without her guidance, we would not have succeeded in our project. Her dynamism, vision, and exquisite efforts have deeply inspired us. She taught us the methodology to carry out the project and to present the project work as clearly as possible. It was a great privilege for us to study and work under her guidance. Finally, we would like to thank our institution, Chandigarh University, for giving us such a fortunate opportunity to showcase our talent through this project; we have gained a lot of knowledge about the various tools used throughout the making of this project.

REFERENCES

[1] Yaqi Li, Chun Yan, Wei Liu, "Research and Application of Random Forest Model in Mining Automobile Insurance Fraud"; Maozhen Li, Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, UB8 3PH, UK.
[2] J. Ilham Cahya Suherman, "Implementation of Random Forest Regression for COCOMO II Effort Estimation".
[3] Na Liu, Yanzhu Hu, and Xinbo Ai, "Research on Power Load Forecasting Based on Random Forest Regression", School of Automation, Beijing University of Posts and Telecommunications, Beijing, 100876, China.
[4] Bernard, Heutte, and Adam, randomly selected forests for handwritten digit recognition, in the International Conference on Document Analysis and Recognition, 2007.
[5] P. Bonissone, J. Cadenas, M. Garrido, and R. Diaz-Valera, "A Fuzzy Random Forest", International Journal of Approximate Reasoning 51, 729-747, 2010.
[6] S. Bernard, L. Heutte, and S. Adam, "Enhancing the Understanding of Random Forests through Strength and Correlation Studies", ICIC Proceedings of Intelligent Computing, 2009.
[7] "A new method of random forest induction", Springer-Verlag, 2008.
[8] "On the Selection of Decision Trees in Random Forest", International Joint Conference on Neural Networks Proceedings, Atlanta, Georgia, USA, June 14-19, 302-307, 2009.
[9] S. Bernard, L. Heutte, and S. Adam, "Dynamic Random Forest", Pattern Recognition Letters 33, 1580-1586, 2012.
[10] P. Boinee, A. Angelis, and G. Foresti, "Coherent Random Forest", International Journal of Computational Intelligence 2, 2006.
[11] R. Diaz, P. Bonissone, J. Cadenas, and M. Garrido, "A Fuzzy Random Forest: Fundamental for Design and Construction", Studies in Fuzziness and Soft Computing, Vol. 249, 23-42, 2010.
[12] P. Bartlett and J. Shawe-Taylor, "Generalization performance of support vector machines and other pattern classifiers", in C. Burges and B. Schölkopf, editors, "Advances in Kernel Methods - Support Vector Learning", MIT Press, 1998.
[13] L. Bottou and V. Vapnik, "Local learning algorithms", Neural Computation, 4(6): 888-900, November 1992.
[14] C. Burges, "A tutorial on support vector machines for pattern recognition", in "Data Mining and Knowledge Discovery", Volume 2, Kluwer Academic Publishers, Boston, 1998.
[15] E. Osuna, R. Freund, and F. Girosi, "Support Vector Machines: Training and Applications", A.I. Memo No. 1602, Artificial Intelligence Laboratory, MIT, 1997.
[16] M. Pontil, S. Mukherjee, and F. Girosi, "On the noise model of Support Vector Machine regression", A.I. Memo, MIT Artificial Intelligence Laboratory, 1998.
[17] E. Osuna, R. Freund, and F. Girosi, "Support Vector Machines: Training and Applications", A.I. Memo No. 1602, Artificial Intelligence Laboratory, MIT, 1997.
[18] V. N. Vapnik, Statistical Learning Theory, Springer, New York, 1995.
[19] Y. Zhang, Y. Zhu, S. Lin, and X. Liu, "Application of Least Squares Support Vector Machine in Fault Diagnosis", in C. Liu, J. Chang, A. Yang (eds.), ICICA 2011, Part II, CCIS, vol. 244, pp. 192-200, Springer, Heidelberg, 2014.
[20] T. Trafalis, "Primal-dual optimization methods in neural networks and support vector machines training", 1990.
[21] Levente Kriston-Vizi, "Gradient Descent for Linear Regression", 2019.
