- [[overfitting]], [[bagging]]
# Idea
[[random forests]], [[bagging]], [[stacking]], [[XGboost]]
In Kaggle competitions, XGBoost is often more popular than random forests.
[[error is bias plus variance]]
Boosting is built from weak learners (high bias, low variance), whereas [[random forests]] use fully grown trees (low bias, high variance)^[https://stats.stackexchange.com/questions/173390/gradient-boosting-tree-vs-random-forest]. Each boosting round fits a new weak learner to the errors of the ensemble so far, so boosting mainly reduces bias; a forest averages many deep, independently grown trees, so it mainly reduces variance.
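A minimal sketch of the contrast with scikit-learn, on a synthetic dataset chosen only for illustration: the boosting model uses shallow trees (`max_depth=3`, the sklearn default), while the forest grows its trees out fully (`max_depth=None`).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data, just to have something to fit.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Boosting: many shallow (weak) trees fit sequentially -> attacks bias.
gbt = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=0)

# Random forest: deep, unpruned trees averaged together -> attacks variance.
rf = RandomForestClassifier(n_estimators=200, max_depth=None, random_state=0)

for name, model in [("boosting", gbt), ("random forest", rf)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```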
# References
- https://www.kaggle.com/cerberus4229/voting-regressor-with-pipelines
- https://scikit-learn.org/stable/modules/calibration.html
- https://stats.stackexchange.com/questions/173390/gradient-boosting-tree-vs-random-forest
- https://fastml.com/what-is-better-gradient-boosted-trees-or-random-forest/
- https://github.com/dmlc/xgboost/blob/master/demo/guide-python/sklearn_examples.py
- https://www.kaggle.com/search
- https://towardsdatascience.com/getting-started-with-xgboost-in-scikit-learn-f69f5f470a97