Why is the alternative to GridSearchCV in RandomForestClassifier documented to be OOB?

by shadi   Last Updated January 14, 2018 12:19 PM

The sklearn documentation for grid search (link) puts Out of Bag Estimates in a subsection under 3.2.4. Alternatives to brute force parameter search. I understand grid search and OOB individually, but I don't understand how OOB is an alternative to grid search. For example, if I need to determine the ideal max_features parameter to use with RandomForestClassifier, how would I use OOB instead of GridSearchCV? I can imagine, for example, using GridSearchCV with the scoring parameter being a callable that returns the OOB score, but that makes OOB a complementary feature rather than an alternative.

Answers 1

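Both OOB and CV try to provide honest estimates of out-of-sample performance. OOB is basically "for free": each tree is evaluated on the training rows that were left out of its bootstrap sample, so no extra model fits are needed. CV has to refit the model once per fold, but it is generally more accurate.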

So instead of picking the optimal column subsampling proportion (max_features) by CV, you could fit the forest once for each candidate value and pick the one with the best OOB score.
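A minimal sketch of that idea, assuming a synthetic dataset from make_classification and an arbitrary candidate list for max_features (both are placeholders for illustration, not part of the question). Each candidate is fitted once with oob_score=True, and the fitted forest's oob_score_ attribute is used in place of a cross-validated score:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration (an assumption, not the asker's data).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Hypothetical candidate values for max_features.
candidates = [2, 4, 8, 16]

oob_scores = {}
for m in candidates:
    forest = RandomForestClassifier(
        n_estimators=100,
        max_features=m,
        bootstrap=True,   # OOB estimates require bootstrap sampling
        oob_score=True,   # compute the OOB accuracy during fit
        random_state=0,
    )
    forest.fit(X, y)      # one fit per candidate, no cross-validation folds
    oob_scores[m] = forest.oob_score_

# Pick the candidate with the highest OOB accuracy.
best_max_features = max(oob_scores, key=oob_scores.get)
print(oob_scores)
print("best max_features by OOB:", best_max_features)
```

This costs one fit per candidate, whereas GridSearchCV with k folds would fit k models per candidate (plus a final refit by default).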

Michael M
January 14, 2018 17:08 PM
