- other hyperparameters: [[decision tree minimum number of samples to split]]
# Idea
The maximum depth of a decision tree is the length of the longest path from the root to a leaf. In `sklearn` the parameter is `max_depth`.
Because each split is binary (every internal node has exactly two children), a tree of maximum depth $k$ can have at most $2^k$ leaves.
The deeper the tree, the more complex the decision rules and the more closely the model fits the training data (see [sklearn docs](https://scikit-learn.org/stable/modules/tree.html)), which can lead to [[overfitting]].
Choosing a lower `max_depth` reduces the number of splits and can help to curb overfitting; the sketch below illustrates the depth and leaf-count bounds.
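As a quick sanity check, a minimal sketch (assuming a toy dataset from `sklearn.datasets.make_classification`) fits trees at increasing `max_depth` and confirms that the fitted depth never exceeds $k$ and the leaf count never exceeds $2^k$:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

for k in range(1, 5):
    tree = DecisionTreeClassifier(max_depth=k, random_state=0).fit(X, y)
    # The fitted depth can be smaller than max_depth, never larger,
    # and a binary tree of depth k has at most 2**k leaves.
    assert tree.get_depth() <= k
    assert tree.get_n_leaves() <= 2 ** k
    print(k, tree.get_depth(), tree.get_n_leaves())
```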
Depth = 1, max leaves = $2^1 = 2$
![[Pasted image 20210515204952.png]]
Depth = 2, max leaves = $2^2 = 4$
![[Pasted image 20210515205008.png]]
Depth = 3, max leaves = $2^3 = 8$
![[Pasted image 20210515205017.png]]
Depth = 4, max leaves = $2^4 = 16$
![[Pasted image 20210515205033.png]]
![[Pasted image 20210125221118.png|610]]
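To see the overfitting effect directly, one option (a sketch on the same assumed toy data, using a train/test split) is to compare train and test accuracy as the depth grows:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for k in [1, 2, 4, 8, 16, None]:  # None = grow until leaves are pure
    tree = DecisionTreeClassifier(max_depth=k, random_state=0)
    tree.fit(X_train, y_train)
    # Training accuracy keeps climbing with depth; test accuracy
    # typically peaks at a moderate depth and then degrades (overfitting).
    print(k, tree.score(X_train, y_train), tree.score(X_test, y_test))
```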
# References
- https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
- https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
- [udacity](https://classroom.udacity.com/nanodegrees/nd229/parts/a7ab8516-6980-4c4e-87f7-b19a975d809e/modules/f47944d6-4ded-4acb-8f52-531d85697932/lessons/7bf3146d-1583-4e02-96ac-325b275892a7/concepts/a750d064-6240-47e7-87de-6e41dab807c5)