A caret call I use frequently. Here x is the training data and y is the response:
library(caret)
library(doMC)
registerDoMC(cores=6)  # run the resampling on 6 cores
tc <- trainControl(method="repeatedcv", number=10, repeats=1,
                   returnData=TRUE, savePredictions="all",
                   verboseIter=TRUE, classProbs=TRUE)
mod <- train(x=x, y=y, trControl=tc, method="rf",
             tuneGrid=data.frame(mtry=500))
- library(doMC) and registerDoMC allow me to use more than one processor
- repeatedcv: if more than one repeat of k-fold cross-validation is requested, the
repeats parameter should be increased. repeatedcv must be used instead of cv, because repeats only applies to repeatedcv
- savePredictions: if we want to evaluate predictions on our own
- verboseIter: to see the progress
- classProbs: to report class probabilities, so we can use them to calculate ROC post factum
- tuneGrid: if not specified, caret will tune the model parameters (here mtry) over a default grid, refitting for every candidate value. Normally, we don’t want that
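
To illustrate what "calculate ROC post factum" can look like, here is a minimal sketch in base R. It does not call caret; instead it builds a simulated data frame with the same columns that mod$pred has when savePredictions and classProbs are switched on (obs, pred, one probability column per class, rowIndex, the tuning parameter, Resample). The class names "case" and "control", the simulated probabilities, and the helper auc() are assumptions made up for this example.

```r
# Hypothetical stand-in for mod$pred: one row per held-out sample.
# In real use, replace this block with: pred <- mod$pred
set.seed(1)
n <- 200
obs <- factor(rep(c("case", "control"), each = n / 2),
              levels = c("case", "control"))
# Simulated probability of class "case": higher on average for true cases.
p_case <- ifelse(obs == "case", rbeta(n, 4, 2), rbeta(n, 2, 4))
pred <- data.frame(
  obs      = obs,
  pred     = factor(ifelse(p_case > 0.5, "case", "control"),
                    levels = levels(obs)),
  case     = p_case,
  control  = 1 - p_case,
  rowIndex = seq_len(n),
  mtry     = 500,
  Resample = rep(sprintf("Fold%02d.Rep1", 1:10), length.out = n)
)

# AUC via the rank-sum (Mann-Whitney) identity; rank() averages ties.
auc <- function(score, is_positive) {
  r     <- rank(score)
  n_pos <- sum(is_positive)
  n_neg <- sum(!is_positive)
  (sum(r[is_positive]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# AUC pooled over all held-out predictions.
auc_pooled <- auc(pred$case, pred$obs == "case")

# AUC computed separately within each resample (fold).
auc_by_fold <- sapply(split(pred, pred$Resample),
                      function(d) auc(d$case, d$obs == "case"))

print(auc_pooled)
print(round(auc_by_fold, 3))
```

With a real model and more than one row in tuneGrid, mod$pred holds predictions for every candidate value, so it would first need to be filtered to the rows matching mod$bestTune. A dedicated package such as pROC could replace the hand-written auc() helper.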