decision tree minimum number of samples per leaf

- other hyperparameters: [[decision tree maximum depth]], [[decision tree minimum number of samples to split]]

# Idea

When we split a node, we can set a minimum number of samples that each resulting leaf (child) must contain. The `sklearn` parameter is `min_samples_leaf`; if a `float` is provided, it is interpreted as a fraction of the total number of training samples.

> This number can be specified as an integer or as a float. If it's an integer, it's the minimum number of samples allowed in a leaf. If it's a float, it's the minimum percentage of samples allowed in a leaf. For example, 0.1, or 10%, implies that a particular split will not be allowed if one of the leaves that results contains less than 10% of the samples in the dataset.

This hyperparameter prevents leaves from having too few samples. For example, splitting a node with 100 samples into two leaves with 99 and 1 samples respectively isn't useful: the one-sample leaf tends to overfit, and the other leaf is almost unchanged from its parent.

![[Pasted image 20210125221422.png|610]]

# References

- [udacity](https://classroom.udacity.com/nanodegrees/nd229/parts/a7ab8516-6980-4c4e-87f7-b19a975d809e/modules/f47944d6-4ded-4acb-8f52-531d85697932/lessons/7bf3146d-1583-4e02-96ac-325b275892a7/concepts/a750d064-6240-47e7-87de-6e41dab807c5)
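
# Example

A minimal sketch of the check described under Idea, written in plain Python rather than taken from `sklearn` itself. The helper names `min_leaf_count` and `split_allowed` are hypothetical and exist only for illustration; the rounding-up of the float case follows how the `sklearn` documentation describes `min_samples_leaf`.

```python
import math


def min_leaf_count(min_samples_leaf, n_samples):
    """Translate min_samples_leaf into an absolute number of samples."""
    if isinstance(min_samples_leaf, int):
        # Integer: used directly as the minimum count per leaf.
        return min_samples_leaf
    # Float: a fraction of the training set size, rounded up.
    return math.ceil(min_samples_leaf * n_samples)


def split_allowed(n_left, n_right, min_samples_leaf, n_samples):
    """Return True if both children keep at least the minimum count."""
    threshold = min_leaf_count(min_samples_leaf, n_samples)
    return n_left >= threshold and n_right >= threshold


# Training set of 100 samples, all sitting in the node being split.
print(split_allowed(99, 1, 5, 100))    # False: one leaf would hold 1 sample
print(split_allowed(60, 40, 5, 100))   # True: both leaves hold at least 5
print(split_allowed(95, 5, 0.1, 100))  # False: 5 samples is below 10%
```

In `sklearn` the same constraint is set when constructing the estimator, for example `DecisionTreeClassifier(min_samples_leaf=5)` or `DecisionTreeClassifier(min_samples_leaf=0.1)`.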