Random Forests

Uploaded by

AhmedPasha


RANDOM FORESTS
Ensemble methods

• A single decision tree often does not perform well on its own.
• But it is very fast to train.
• What if we learn multiple trees?
• We need to make sure they do not all just learn the same thing.
Bagging

• If we split the data in different random ways, decision trees give different results: they have high variance.
• Bagging (bootstrap aggregating) is a method that results in lower variance.
• If we had multiple realizations of the data (or multiple samples), we could calculate the predictions multiple times and take the average, exploiting the fact that averaging multiple noisy estimates produces a less uncertain result.
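The variance-reduction claim can be checked with a small simulation. This is only a sketch under simplifying assumptions: the estimates are independent Gaussian draws, and the names sigma, B, single and averaged are illustrative, not from the slides.

```r
set.seed(1)
sigma <- 2    # standard deviation of one noisy estimate (illustrative)
B <- 25       # number of estimates that are averaged (illustrative)

# 10,000 repetitions of a single noisy estimate
single <- rnorm(10000, mean = 0, sd = sigma)

# 10,000 repetitions of the average of B independent noisy estimates
averaged <- replicate(10000, mean(rnorm(B, mean = 0, sd = sigma)))

var_single <- var(single)      # close to sigma^2
var_averaged <- var(averaged)  # close to sigma^2 / B
c(single = var_single, averaged = var_averaged)
```

With independent estimates the variance shrinks by roughly a factor of B. Bootstrap samples are not independent, so the gain from bagging is smaller in practice.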
Bagging

[Figure: Bootstrap]

Bootstrap

• Construct B (hundreds of) trees, with no pruning.
• Learn a classifier for each bootstrap sample and average them.
• Very effective.

Bagging for classification: Majority vote
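A majority vote over bagged classifiers could be sketched as follows. The votes matrix is hypothetical data made up for illustration, and majority_vote is an illustrative helper name.

```r
# Rows are observations, columns are the predictions of B = 3 bagged
# classifiers (hypothetical votes, for illustration only).
votes <- matrix(c("yes", "yes", "no",
                  "no",  "no",  "yes",
                  "yes", "yes", "yes"),
                nrow = 3, byrow = TRUE)

# Return the most frequent class label; ties go to the label that sorts first.
majority_vote <- function(v) names(which.max(table(v)))

bagged_class <- apply(votes, 1, majority_vote)
bagged_class
```

Using an odd number of classifiers avoids ties in the two-class case.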
Bagging decision trees

Out-of-Bag Error Estimation

• No cross-validation?
• Remember, in bootstrapping we sample with replacement, and therefore not all observations are used in each bootstrap sample. On average, about 1/3 of them are not used!
• We call them out-of-bag (OOB) samples.
• We can predict the response for the i-th observation using each of the trees in which that observation was OOB, and do this for all n observations.
• Calculate the overall OOB MSE or classification error.
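The "about 1/3" figure can be checked numerically. The probability that a given observation is left out of a bootstrap sample of size n is (1 - 1/n)^n, which approaches exp(-1), about 0.368, as n grows. The sketch below compares that value with a simulation; n and B are illustrative choices.

```r
set.seed(42)
n <- 1000   # number of training observations (illustrative)
B <- 200    # number of bootstrap samples (illustrative)

oob_fraction <- replicate(B, {
  in_bag <- sample(n, replace = TRUE)   # bootstrap sample of size n
  mean(!(seq_len(n) %in% in_bag))       # share of observations never drawn
})

mean_oob <- mean(oob_fraction)          # simulated OOB share
limit <- (1 - 1 / n)^n                  # exact probability of being OOB
c(simulated = mean_oob, exact = limit, exp_minus_1 = exp(-1))
```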
Bagging

• Reduces overfitting (variance)
• Normally uses one type of classifier
• Decision trees are popular
• Easy to parallelize
Variable Importance Measures

• Bagging results in improved accuracy over prediction using a single tree.
• Unfortunately, the resulting model is difficult to interpret. Bagging improves prediction accuracy at the expense of interpretability.
• To measure variable importance, calculate the total amount by which the RSS (for regression) or the Gini index (for classification) is decreased by splits on a given predictor, averaged over all B trees.
Bagging

• Each tree is identically distributed (i.d.).
• The expectation of the average of B such trees is the same as the expectation of any one of them.
• The bias of bagged trees is the same as that of the individual trees.
• The trees are i.d. but not i.i.d.: they are not independent.
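Why the lack of independence matters can be made precise with a standard result, given here as a supplement and not taken from the slides. Assume B identically distributed trees T_1, ..., T_B, each with variance σ² and pairwise correlation ρ (the symbols T_b, σ² and ρ are notation introduced for this sketch). The variance of their average is then:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b\right)
  = \frac{1}{B^2}\left[B\,\sigma^2 + B(B-1)\,\rho\,\sigma^2\right]
  = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
```

As B grows the second term vanishes, but the first term ρσ² remains, so the correlation between trees limits how much averaging can help.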
Why does bagging generate correlated trees?

• Suppose that there is one very strong predictor in the data set, along with a number of other moderately strong predictors.
• Then all bagged trees will select the strong predictor at the top of the tree, and therefore all trees will look similar.
• How do we avoid this?
• What if we consider only a subset of the predictors at each split?
• We will still get correlated trees unless ... we randomly select the subset!
Random Forest, Ensemble Model

• The random forest (Breiman, 2001) is an ensemble approach that can also be thought of as a form of nearest neighbor predictor.
• Ensembles are a divide-and-conquer approach used to improve performance. The main principle behind ensemble methods is that a group of “weak learners” can come together to form a “strong learner”.
Trees and Forests

• The random forest starts with a standard machine learning technique called a “decision tree” which, in ensemble terms, corresponds to our weak learner. In a decision tree, an input is entered at the top, and as it traverses down the tree the data gets bucketed into smaller and smaller sets.
Random Forest

• As in bagging, we build a number of decision trees on bootstrapped training samples. But each time a split in a tree is considered, a random sample of m predictors is chosen as split candidates from the full set of p predictors.
• Note that if m = p, then this is bagging.
Trees and Forests

• In this example, the tree advises us, based upon weather conditions, whether to play ball. For example, if the outlook is sunny and the humidity is less than or equal to 70, then it’s probably OK to play.

Trees and Forests

• The random forest takes this notion to the next level by combining trees with the notion of an ensemble. Thus, in ensemble terms, the trees are weak learners and the random forest is a strong learner.
Random Forest Algorithm

• For b = 1 to B:
  (a) Draw a bootstrap sample Z* of size N from the training data.
  (b) Grow a random-forest tree on the bootstrapped data by recursively repeating the following steps for each terminal node of the tree, until the minimum node size nmin is reached:
    i. Select m variables at random from the p variables.
    ii. Pick the best variable/split point among the m.
    iii. Split the node into two daughter nodes.
• Output the ensemble of trees.
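The steps above can be sketched in base R for the regression case. This is a teaching sketch, not the implementation in the randomForest package: the function names (grow_tree, predict_tree, random_forest, predict_forest) are made up here, predictors are assumed to be numeric, splits minimise the residual sum of squares, and the mtcars columns are an arbitrary illustration.

```r
# Grow one regression tree; at each node only m randomly chosen predictors
# are considered as split candidates.
grow_tree <- function(X, y, m, nmin = 5) {
  if (length(y) <= nmin || length(unique(y)) == 1) {
    return(list(leaf = TRUE, value = mean(y)))
  }
  best <- NULL
  for (j in sample(ncol(X), m)) {            # step i: m candidate variables
    for (s in unique(X[, j])) {              # step ii: best variable/split point
      left <- X[, j] <= s
      if (all(left) || !any(left)) next      # skip splits with an empty side
      rss <- sum((y[left] - mean(y[left]))^2) +
             sum((y[!left] - mean(y[!left]))^2)
      if (is.null(best) || rss < best$rss) {
        best <- list(rss = rss, var = j, split = s)
      }
    }
  }
  if (is.null(best)) return(list(leaf = TRUE, value = mean(y)))
  left <- X[, best$var] <= best$split        # step iii: two daughter nodes
  list(leaf = FALSE, var = best$var, split = best$split,
       left  = grow_tree(X[left, , drop = FALSE],  y[left],  m, nmin),
       right = grow_tree(X[!left, , drop = FALSE], y[!left], m, nmin))
}

# Send one observation x down a tree and return the leaf mean.
predict_tree <- function(tree, x) {
  while (!tree$leaf) {
    tree <- if (x[tree$var] <= tree$split) tree$left else tree$right
  }
  tree$value
}

# Steps (a) and (b) repeated B times; the result is the ensemble of trees.
random_forest <- function(X, y, B = 50, m = max(1, floor(ncol(X) / 3)), nmin = 5) {
  lapply(seq_len(B), function(b) {
    idx <- sample(nrow(X), replace = TRUE)   # step (a): bootstrap sample of size N
    grow_tree(X[idx, , drop = FALSE], y[idx], m, nmin)
  })
}

# Regression prediction: average the predictions of all trees.
predict_forest <- function(forest, X) {
  apply(X, 1, function(x) mean(sapply(forest, predict_tree, x = x)))
}

set.seed(7)
X <- as.matrix(mtcars[, c("cyl", "disp", "hp", "wt")])
y <- mtcars$mpg
forest <- random_forest(X, y, B = 25, m = 2)
preds <- predict_forest(forest, X)
head(round(preds, 1))
```

Setting m equal to ncol(X) in this sketch turns it into plain bagging, since every predictor is then a candidate at every split.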
Random Forest Algorithm

• To make a prediction at a new point x:
  • For regression: average the results.
  • For classification: take a majority vote.
Training the algorithm

• For some number of trees T:
  • Sample N cases at random with replacement to create a bootstrap sample of the data. The sample will contain about 63% (roughly two thirds) of the distinct cases in the total set.
  • At each node:
    • For some number m (see below), m predictor variables are selected at random from all the predictor variables.
    • The predictor variable that provides the best split, according to some objective function, is used to do a binary split on that node.
    • At the next node, choose another m variables at random from all predictor variables and do the same.
• Depending upon the value of m, there are three slightly different systems:
  • Random splitter selection: m = 1
  • Breiman’s bagger: m = total number of predictor variables
  • Random forest: m << number of predictor variables. With p predictor variables, Breiman suggests three possible values for m: ½√p, √p, and 2√p
Running a Random Forest

• When a new input is entered into the system, it is run down all of the trees. The result may either be an average or weighted average of all of the terminal nodes that are reached or, in the case of a categorical response, a majority vote.

Note that:
• With a large number of predictors, the eligible predictor set will be quite different from node to node.
• The greater the inter-tree correlation, the greater the random forest error rate, so one pressure on the model is to have the trees as uncorrelated as possible.
• As m goes down, both inter-tree correlation and the strength of individual trees go down. So some optimal value of m must be discovered.
Differences to standard tree

• Train each tree on a bootstrap resample of the data. (Bootstrap resample of a data set with N samples: make a new data set by drawing N samples with replacement, i.e., some samples will probably occur multiple times in the new data set.)
• For each split, consider only m randomly selected variables.
• Don’t prune.
• Fit B trees in this way and use averaging or majority voting to aggregate the results.
Random Forests Tuning

• The inventors make the following recommendations:
  • For classification, the default value for m is √p and the minimum node size is one.
  • For regression, the default value for m is p/3 and the minimum node size is five.
• In practice the best values for these parameters will depend on the problem, and they should be treated as tuning parameters.
• As with bagging, we can use the OOB error, so a random forest can be fit in one sequence, with validation being performed along the way. Once the OOB error stabilizes, the training can be terminated.
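The recommended defaults can be written as a small helper. The name default_mtry is made up for this sketch. Rounding down and enforcing a minimum of 1 are assumptions that go beyond the slides, chosen because they mirror how the randomForest package sets its default mtry.

```r
# Default number of split candidates m for p predictors.
default_mtry <- function(p, type = c("classification", "regression")) {
  type <- match.arg(type)
  if (type == "classification") {
    max(floor(sqrt(p)), 1)   # about sqrt(p)
  } else {
    max(floor(p / 3), 1)     # about p / 3
  }
}

m_class <- default_mtry(3, "classification")  # 3 predictors  -> 1
m_reg   <- default_mtry(13, "regression")     # 13 predictors -> 4
c(classification = m_class, regression = m_reg)
```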
Why Random Forests work

Advantages of Random Forest

• No need for pruning trees
• Accuracy and variable importance generated automatically
• Overfitting is rarely a problem: adding more trees does not cause the forest to overfit
• Not very sensitive to outliers in training data
• Easy to set parameters
• Good performance
Example

• 4,718 genes measured on tissue samples from 349 patients.
• Each gene has a different expression level.
• Each of the patient samples has a qualitative label with 15 different levels: either normal or 1 of 14 different types of cancer.
• Use random forests to predict cancer type based on the 500 genes that have the largest variance in the training set.
R Example

• We will use the readingSkills data set (from the party package) to create a random forest. It lets us predict whether a person is a native speaker if we know the variables "age", "shoeSize" and "score".
• Here is the sample data.
> library(randomForest)
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
> data("readingSkills", package = "party") # load the data set
> # Print some records from data set readingSkills.
> print(head(readingSkills))
  nativeSpeaker age shoeSize    score
1           yes   5 24.83189 32.29385
2           yes   6 25.95238 36.63105
3            no  11 30.42170 49.60593
4           yes   7 28.66450 40.28456
5           yes  11 31.88207 55.46085
6           yes  10 30.07843 52.83124
# Create the forest.
output.forest <- randomForest(nativeSpeaker ~ age + shoeSize + score, data = readingSkills)

# View the forest results.
print(output.forest)
Number of trees: 500
No. of variables tried at each split: 1

OOB estimate of error rate: 1%

Confusion matrix:
    no yes class.error
no  99   1        0.01
yes  1  99        0.01
# Importance of each predictor.
print(importance(output.forest,type = 2))
         MeanDecreaseGini
age              14.13397
shoeSize         18.48703
score            57.52747
plot(output.forest, type="l")
varImpPlot(output.forest)

• Gini Impurity measures how often a randomly chosen record from the data set used to train the model will be incorrectly labeled if it was randomly labeled according to the distribution of labels in the subset. Gini Impurity reaches zero when all records in a group fall into a single category. This measure is essentially the probability of a new record being incorrectly classified at a given node in a Decision Tree, based on the training data.
• Because Random Forests are an ensemble of individual Decision Trees, Gini Importance can be leveraged to calculate Mean Decrease in Gini, which is a measure of variable importance for estimating a target variable.
• Mean Decrease in Gini is the average of a variable’s total decrease in node impurity, weighted by the proportion of samples reaching that node in each individual decision tree in the random forest.
• This is a measure of how important a variable is for estimating the value of the target variable across all of the trees that make up the forest. A higher Mean Decrease in Gini indicates higher variable importance.
• Variables are sorted and displayed in the Variable Importance Plot created for the Random Forest by this measure. The most important variables to the model will be highest in the plot and have the largest Mean Decrease in Gini values; conversely, the least important variables will be lowest in the plot and have the smallest Mean Decrease in Gini values.
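Gini impurity and the decrease produced by one split can be computed by hand. The sketch below uses a tiny made-up sample; the helper names gini and gini_decrease and the example values are illustrative, not part of the randomForest package.

```r
# Gini impurity of a set of class labels: 1 - sum of squared class proportions.
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Decrease in impurity from splitting a node into a left and a right child,
# with each child weighted by its share of the node's samples.
gini_decrease <- function(labels, go_left) {
  n <- length(labels)
  gini(labels) -
    (sum(go_left) / n)  * gini(labels[go_left]) -
    (sum(!go_left) / n) * gini(labels[!go_left])
}

labels <- c("yes", "yes", "yes", "no", "no", "no")  # hypothetical node
score  <- c(50, 55, 52, 30, 35, 33)                 # hypothetical predictor

pure  <- gini(c("yes", "yes"))               # 0: a single category
mixed <- gini(labels)                        # 0.5: two equally likely classes
dec   <- gini_decrease(labels, score <= 40)  # 0.5: the split separates the classes
c(pure = pure, mixed = mixed, decrease = dec)
```

Summing such decreases over every split that uses a given variable, and averaging over trees, gives the kind of quantity reported as MeanDecreaseGini.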
pred= predict(output.forest, readingSkills)
library(e1071)
library(caret)

# Create Confusion Matrix
confusionMatrix(data=pred, reference=readingSkills$nativeSpeaker, positive='yes')

Confusion Matrix and Statistics

          Reference
Prediction  no yes
       no  100   0
       yes   0 100

               Accuracy : 1
                 95% CI : (0.9817, 1)
    No Information Rate : 0.5
    P-Value [Acc > NIR] : < 2.2e-16
                  Kappa : 1
 Mcnemar's Test P-Value : NA
            Sensitivity : 1.0
            Specificity : 1.0
         Pos Pred Value : 1.0
         Neg Pred Value : 1.0
             Prevalence : 0.5
         Detection Rate : 0.5
   Detection Prevalence : 0.5
      Balanced Accuracy : 1.0
       'Positive' Class : yes

Note that these predictions are made on the same data the forest was trained on, so the perfect accuracy is optimistic. The OOB error rate of 1% reported above is the more realistic estimate.

Random Forest Implementation in R (prediction)

• Let’s use the Boston dataset
require(randomForest)
require(MASS) #Package which contains the Boston housing dataset
attach(Boston)
set.seed(101)
?Boston #open the help page for the dataset
crim: per capita crime rate by town.
zn: proportion of residential land zoned for lots over 25,000 sq.ft.
indus: proportion of non-retail business acres per town.
chas: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise).
nox: nitrogen oxides concentration (parts per 10 million).
rm: average number of rooms per dwelling.
age: proportion of owner-occupied units built prior to 1940.
dis: weighted mean of distances to five Boston employment centres.
rad: index of accessibility to radial highways.
tax: full-value property-tax rate per $10,000.
ptratio: pupil-teacher ratio by town.
black: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town.
lstat: lower status of the population (percent).
medv: median value of owner-occupied homes in $1000s.
We are going to use the variable 'medv', the median housing value, as the response variable.
We will fit 500 trees.
dim(Boston)
[1] 506 14
#training Sample with 300 observations
train=sample(1:nrow(Boston),300)
Fitting the Random Forest
We will use all the Predictors in the dataset.
Boston.rf=randomForest(medv ~ . , data = Boston , subset = train)
Boston.rf
Number of trees: 500
No. of variables tried at each split: 4

Mean of squared residuals: 12.07361

% Var explained: 85.91
• The mean squared error and variance explained above are calculated using out-of-bag error estimation: each tree is trained on a bootstrap sample containing about 2/3 of the training observations, and the remaining 1/3 are used to validate that tree. Also, the number of variables randomly selected at each split is 4.
• Plotting the error against the number of trees:
plot(Boston.rf)

This plot shows the error against the number of trees. We can easily see how the error drops as we keep adding more trees and averaging them.
• Now we can compare the out-of-bag sample errors with the error on the test set.
• The random forest above considered 4 randomly chosen variables at each split. We can now try every possible value of mtry, from 1 up to all 13 predictors.

oob.err=double(13)
test.err=double(13)

#mtry is the number of variables randomly chosen at each split
for(mtry in 1:13)
{
  rf=randomForest(medv ~ . , data = Boston , subset = train,mtry=mtry,ntree=400)
  oob.err[mtry] = rf$mse[400] #OOB error once all 400 trees are fitted
  pred<-predict(rf,Boston[-train,]) #predictions of the whole forest on the test set
  test.err[mtry]= with(Boston[-train,], mean( (medv - pred)^2)) #mean squared test error
  cat(mtry," ") #print progress to the console
}
> test.err
[1] 20.89175 14.95446 13.03060 12.78799 12.03247 11.73759 11.50418 11.59229
[9] 12.23918 11.89928 11.91245 12.36971 12.41598
> oob.err # Out of Bag Error Estimation
[1] 21.17376 13.68955 12.59845 11.83516 11.72935 11.07857 11.77836 11.61401
[9] 12.39642 12.78779 12.10131 12.82391 12.44966
We grow 400 trees 13 times, once for each value of mtry from 1 to 13.
Plotting both the test error and the out-of-bag error:
matplot(1:mtry , cbind(oob.err,test.err), pch=19 , col=c("red","blue"),type="b",ylab="Mean Squared Error",xlab="Number of Predictors Considered at each Split")
legend("topright",legend=c("Out of Bag Error","Test Error"),pch=19, col=c("red","blue"))
The red line is the out-of-bag error estimate and the blue line is the error calculated on the test set. The two curves follow a similar pattern, so the error estimates are somewhat correlated. In the values printed above, the OOB error is lowest at mtry = 6 and the test error at mtry = 7, although both curves are fairly flat from about mtry = 4 onward.
Parameter Tuning with an Algorithm

> # medv, the response, is column 14 of Boston
> bestmtry <- tuneRF(Boston[,-14], Boston[,14], stepFactor=1.5, improve=1e-5, ntree=500)
mtry = 4 OOB error = 10.15204
Searching left ...
mtry = 3 OOB error = 10.58809
-0.04295218 1e-05
Searching right ...
mtry = 6 OOB error = 10.47271
-0.0315876 1e-05
> print(bestmtry)
mtry OOBError
3 3 10.58809
4 4 10.15204
6 6 10.47271
> Boston.rf2=randomForest(medv ~ . , data = Boston , subset = train, mtry=4)
> Boston.rf2

Type of random forest: regression

Number of trees: 500
No. of variables tried at each split: 4

Mean of squared residuals: 11.68897

% Var explained: 86.36
Boston.lm=lm(medv ~ . , data = Boston)
summary(Boston.lm)
error <- Boston.lm$residuals
lm_error <- mean(error^2) #training mean squared error of the linear model
lm_error
[1] 21.89483
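The lm_error above is a training error computed on all 506 observations, whereas the forest errors are out-of-bag or test-set estimates, so the two are not directly comparable. A like-for-like comparison could be sketched as follows, assuming the MASS package is installed. The names lm_fit, lm_pred and lm_test_mse are illustrative, and the split is redrawn here, so it need not match the one used earlier.

```r
library(MASS)  # provides the Boston data set

set.seed(101)
train <- sample(1:nrow(Boston), 300)   # 300 training rows, the rest are test rows

# Fit the linear model on the training rows only
lm_fit <- lm(medv ~ ., data = Boston, subset = train)

# Mean squared error on the held-out rows
lm_pred <- predict(lm_fit, newdata = Boston[-train, ])
lm_test_mse <- mean((Boston$medv[-train] - lm_pred)^2)
lm_test_mse
```

This value can then be set against the forest's test.err for the same held-out rows.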
