Random Forest

Optimal performance with Random Forests: does feature selection beat tuning?

This blog post demonstrates that the presence of irrelevant variables can reduce the performance of the Random Forest algorithm (as implemented in R by `ranger()`). The solution is either to tune one of the algorithm's parameters, OR to remove irrelevant features using a procedure called Recursive Feature Elimination (RFE).

Improving a parametric regression model using machine learning

In this post, I explore how we can improve a parametric regression model by comparing its predictions to those of a Random Forest model. This might informs us in what ways the OLS model fails to capture all non-linearities and interactions between the predictors.