# Idea

The Classification and Regression Trees (CART) algorithm is used in [[scikit-learn]] to train [[decision trees]]. It is a [[greedy algorithm]]: it searches for an optimum split at each depth level and doesn't consider whether a particular split will lead to better results (lower impurity) at greater depths/lower levels.

Algorithm

- split the training set into two subsets using a *single feature* $k$ and a *threshold* $t_k$ (e.g., feature `petal_length` with threshold $t_k = 2.45$, i.e. `petal_length < 2.45`)
- search for the pair $(k, t_k)$ that produces the purest subsets, weighted by their size (see [[impurity measure]], [[Gini impurity]])
- repeat the above two steps for each child node recursively

CART cost function for classification (from [[Geron 2019 hands-on machine learning with scikit-learn keras tensorflow]]):

$$
J\left(k, t_{k}\right)=\frac{m_{\mathrm{left}}}{m} G_{\mathrm{left}}+\frac{m_{\mathrm{right}}}{m} G_{\mathrm{right}}
$$

- $G_{\mathrm{left/right}}$: impurity of the left/right subset
- $m_{\mathrm{left/right}}$: number of instances/samples in the left/right subset

# References
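The split search and cost function can be sketched as a brute-force NumPy implementation. This is a minimal illustration, not scikit-learn's actual optimized code; the helper names `gini` and `best_split` are my own.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array: 1 - sum_i p_i^2."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Exhaustive search for the (k, t_k) pair minimizing the CART cost
    J(k, t_k) = (m_left/m) * G_left + (m_right/m) * G_right."""
    m, n_features = X.shape
    best_k, best_t, best_cost = None, None, np.inf
    for k in range(n_features):
        # candidate thresholds: midpoints between consecutive sorted values
        values = np.unique(X[:, k])
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, k] < t
            right = ~left
            cost = (left.sum() / m) * gini(y[left]) \
                 + (right.sum() / m) * gini(y[right])
            if cost < best_cost:
                best_k, best_t, best_cost = k, t, cost
    return best_k, best_t, best_cost

# toy data: two well-separated classes on one feature
X = np.array([[1.0], [2.0], [8.0], [9.0]])
y = np.array([0, 0, 1, 1])
k, t, cost = best_split(X, y)  # splits feature 0 at the midpoint 5.0, cost 0
```

A full CART trainer would call `best_split` recursively on each child subset until a stopping criterion (max depth, min samples, zero impurity) is met.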