Hey everyone,
We all know that the good old plus/minus is flawed. Most notably, it doesn’t take into account the quality of your teammates or that of the opposition. So here is my idea: why not use a Poisson regression?
Poisson regression is a type of regression in which the response variable is a count. For example, ecologists model the number of fish in lake samples of varying volume, accounting for characteristics of each sample, and actuaries model the number of claims a driver makes over a car insurance policy of varying duration, accounting for characteristics of the driver.
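To make the "varying duration" part concrete, here is a minimal sketch of a Poisson regression with an exposure term, fitted by Newton's method in plain Python. The data are made up for illustration, and the names `fit_poisson`, `risky`, `claims` and `duration` are hypothetical; in practice you would reach for a GLM routine from a statistics library rather than hand-rolling this.

```python
import math

def fit_poisson(x, y, exposure, n_iter=25):
    """Fit y_i ~ Poisson(exposure_i * exp(b0 + b1 * x_i)) by Newton's method."""
    # Start from the intercept-only fit: overall events per unit of exposure.
    b0 = math.log(sum(y) / sum(exposure))
    b1 = 0.0
    for _ in range(n_iter):
        # Expected counts: exposure enters as a multiplier (a log offset).
        mu = [e * math.exp(b0 + b1 * xi) for xi, e in zip(x, exposure)]
        # Score: gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information: the 2x2 negative Hessian of the log-likelihood.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up policies: duration in years, a binary driver characteristic, claim counts.
duration = [1.0, 2.0, 0.5, 1.5, 3.0, 1.0]
risky = [0, 0, 0, 1, 1, 1]
claims = [2, 3, 1, 6, 11, 5]

b0, b1 = fit_poisson(risky, claims, duration)
rate_baseline = math.exp(b0)    # claims per year when risky == 0
rate_risky = math.exp(b0 + b1)  # claims per year when risky == 1
print(rate_baseline, rate_risky)
```

With a single binary covariate the fitted rates are simply total claims divided by total duration within each group, which makes the result easy to check by hand.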
UPDATE 2020: skimr v2 now produces nice html in rmarkdown, so skimr::kable() has been deprecated. https://www.r-bloggers.com/reintroducing-skimr-v2-a-year-in-the-life-of-an-open-source-r-project/
Introduction
Ratemaking models in insurance routinely use Poisson regression to model the frequency of auto insurance claims. These models are usually GLMs, but some insurers are moving towards GBMs, such as xgboost.
xgboost has multiple hyperparameters that can be tuned to improve predictive power, and there are several ways to tune them. In order of increasing efficiency, these are grid search, random search and Bayesian optimization.
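As a rough illustration of the first two strategies, the sketch below runs a grid search and a random search with the same evaluation budget. It does not call xgboost: `validation_loss` is a hypothetical stand-in for a cross-validated loss, and the ranges for `eta` and `max_depth` are assumptions chosen for the example. Bayesian optimization is left out because it needs a surrogate model that would not fit in a short snippet.

```python
import itertools
import math
import random

def validation_loss(eta, max_depth):
    """Hypothetical stand-in for a cross-validated loss (lowest at eta=0.1, max_depth=6)."""
    return (math.log10(eta) + 1) ** 2 + 0.05 * (max_depth - 6) ** 2

# Grid search: evaluate every combination of a few hand-picked values.
etas = [0.01, 0.1, 0.3]
depths = [2, 4, 6, 8]
grid_results = [
    (validation_loss(eta, depth), eta, depth)
    for eta, depth in itertools.product(etas, depths)
]
best_grid = min(grid_results)  # tuples compare on the loss first

# Random search: same budget, but each trial samples fresh values from the ranges.
rng = random.Random(0)  # fixed seed so the run is reproducible
random_results = []
for _ in range(len(grid_results)):
    eta = 10 ** rng.uniform(-2, math.log10(0.3))  # log-uniform between 0.01 and 0.3
    depth = rng.randint(2, 8)                     # integer depth, both ends inclusive
    random_results.append((validation_loss(eta, depth), eta, depth))
best_random = min(random_results)

print("grid  :", best_grid)
print("random:", best_random)
```

The grid only ever tries three distinct values of `eta`, while the random search tries a different one on every trial. That is the usual argument for random search when only a few hyperparameters really matter.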