- [[overfitting]], [[bagging]]

# Idea

[[random forests]], [[bagging]], [[stacking]], [[XGboost]]

In Kaggle competitions, XGBoost is often more popular than random forests.

[[error is bias plus variance]]

Boosting is built on weak learners (high bias, low variance), whereas [[random forests]] use fully grown trees (low bias, high variance)^[https://stats.stackexchange.com/questions/173390/gradient-boosting-tree-vs-random-forest]. Boosting tries to reduce bias, whereas forests try to reduce variance. A minimal comparison sketch follows the references.

# References

- https://www.kaggle.com/cerberus4229/voting-regressor-with-pipelines
- https://scikit-learn.org/stable/modules/calibration.html
- https://stats.stackexchange.com/questions/173390/gradient-boosting-tree-vs-random-forest
- https://fastml.com/what-is-better-gradient-boosted-trees-or-random-forest/
- https://github.com/dmlc/xgboost/blob/master/demo/guide-python/sklearn_examples.py
- https://www.kaggle.com/search
- https://towardsdatascience.com/getting-started-with-xgboost-in-scikit-learn-f69f5f470a97
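A minimal sketch of the contrast above, assuming scikit-learn and xgboost are installed; the dataset, hyperparameters, and cross-validation setup are illustrative, not tuned:

```python
# Toy comparison: boosting (many shallow, weak trees) vs. a random
# forest (fully grown trees averaged to reduce variance).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Illustrative synthetic dataset.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Boosting: shallow trees, each one correcting the errors of the previous ones.
boosted = XGBClassifier(n_estimators=300, max_depth=3, learning_rate=0.1)

# Forest: deep trees whose predictions are averaged.
forest = RandomForestClassifier(n_estimators=300, max_depth=None, random_state=0)

for name, model in [("xgboost", boosted), ("random forest", forest)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```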