The sklearn documentation for grid search (link) puts
the section on Out of Bag Estimates in a subsection under
3.2.4. Alternatives to brute force parameter search. I understand both grid search and OOB individually, but I don't understand how OOB is an
alternative. For example, if I need to determine the ideal
max_features parameter to use with
RandomForestClassifier, how would I use OOB instead of GridSearch? I can imagine, for example, using GridSearch with the
scoring parameter set to a callable that returns the OOB score, but that's not really an alternative as much as a
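Both OOB and CV try to provide honest estimates of performance. OOB is basically "for free": each tree is scored on the samples left out of its own bootstrap sample, so a single fit gives you the estimate. CV needs one refit per fold, but it is more accurate.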
So instead of picking the optimal column subsampling proportion (max_features) by CV, you could fit the forest once per candidate value and pick the one with the best OOB score.
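
As a rough sketch of that idea, the loop below fits one RandomForestClassifier per candidate max_features value with oob_score=True and keeps the value with the highest oob_score_. The synthetic dataset, the candidate list, and n_estimators=200 are assumptions made up for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, purely for illustration (an assumption, not from the question).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

# Candidate fractions of features considered at each split (illustrative values).
candidates = [0.1, 0.25, 0.5, 0.75, 1.0]

oob_scores = {}
for max_features in candidates:
    forest = RandomForestClassifier(
        n_estimators=200,
        max_features=max_features,
        bootstrap=True,   # OOB estimates only exist with bootstrap sampling
        oob_score=True,   # compute accuracy on out-of-bag samples during fit
        random_state=0,
    )
    forest.fit(X, y)  # one fit per candidate, no cross-validation folds
    oob_scores[max_features] = forest.oob_score_

# Pick the candidate with the highest OOB accuracy.
best_max_features = max(oob_scores, key=oob_scores.get)
print(oob_scores)
print("best max_features by OOB score:", best_max_features)
```

Each candidate costs a single fit here, whereas a grid search with k-fold CV would fit k forests per candidate (plus a final refit), which is where the "for free" saving comes from.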