- other hyperparameters: [[decision tree minimum number of samples to split]]
# Idea
The maximum depth of a decision tree is the length of the longest path from the root to a leaf. In `sklearn` the parameter is `max_depth`.
Because each split is binary (every internal node has exactly two children), a tree of maximum depth $k$ can have at most $2^k$ leaves.
The deeper the tree, the more complex the decision rules and the more closely the model fits the training data (see [sklearn docs](https://scikit-learn.org/stable/modules/tree.html)), which can lead to [[overfitting]].
Choosing a lower `max_depth` reduces the number of splits and can help to curb overfitting; the sketch below illustrates the depth and leaf-count bounds.
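As a quick sanity check, a minimal sketch (assuming a toy dataset from `sklearn.datasets.make_classification`) fits trees at increasing `max_depth` and confirms that the fitted depth never exceeds $k$ and the leaf count never exceeds $2^k$:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

for k in range(1, 5):
    tree = DecisionTreeClassifier(max_depth=k, random_state=0).fit(X, y)
    # The fitted depth can be smaller than max_depth, never larger,
    # and a binary tree of depth k has at most 2**k leaves.
    assert tree.get_depth() <= k
    assert tree.get_n_leaves() <= 2 ** k
    print(k, tree.get_depth(), tree.get_n_leaves())
```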
Depth = 1, max leaves = $2^1 = 2$
![[Pasted image 20210515204952.png]]
Depth = 2, max leaves = $2^2 = 4$
![[Pasted image 20210515205008.png]]
Depth = 3, max leaves = $2^3 = 8$
![[Pasted image 20210515205017.png]]
Depth = 4, max leaves = $2^4 = 16$
![[Pasted image 20210515205033.png]]
![[Pasted image 20210125221118.png|610]]
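To see the overfitting effect directly, one option (a sketch on the same assumed toy data, using a train/test split) is to compare train and test accuracy as the depth grows:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in [1, 2, 4, 8, 16, None]:  # None = grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=k, random_state=0)
    tree.fit(X_train, y_train)
    # Training accuracy keeps climbing with depth; test accuracy
    # typically peaks at a moderate depth and then degrades (overfitting).
    print(k, tree.score(X_train, y_train), tree.score(X_test, y_test))
```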
# References
- https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
- https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
- [udacity](https://classroom.udacity.com/nanodegrees/nd229/parts/a7ab8516-6980-4c4e-87f7-b19a975d809e/modules/f47944d6-4ded-4acb-8f52-531d85697932/lessons/7bf3146d-1583-4e02-96ac-325b275892a7/concepts/a750d064-6240-47e7-87de-6e41dab807c5)