- simplest logistic regression classifier

# medium vs substack
- 1050 headlines per source

```python
# 5-fold cross validation accuracies (logistic regression)
[0.5714, 0.5405, 0.5929, 0.5667, 0.5667]
```

- medium/substack coded 0, 1
- negative coef = more predictive of medium

![[feature_imp__popular150monthlyjune__sentiment.png|900]]

# economist vs dailymail
- 93 headlines per source

```python
# 5-fold cross validation accuracies (logistic regression)
array([0.5789, 0.6216, 0.5405, 0.6216, 0.7027])
```

- dailymail/economist coded 0, 1
- negative coef = more predictive of dailymail

![[feature_imp__economistdailymail__sentiment.png|900]]

Decision tree classifier feature importance

![[Pasted image 20220205081014.png]]
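
Rough sketch for reading the numbers above: average the fold accuracies and compare them to chance. Assumptions (not from the original runs): chance is taken as 0.5 because both sources contribute the same number of headlines, and `summarize` / `leans_toward` are hypothetical helper names. The coefficient value at the end is made up and only illustrates the sign convention.

```python
from statistics import mean, stdev

# Fold accuracies copied from the cross-validation outputs in this note.
cv_scores = {
    "medium vs substack": [0.5714, 0.5405, 0.5929, 0.5667, 0.5667],
    "economist vs dailymail": [0.5789, 0.6216, 0.5405, 0.6216, 0.7027],
}

# Assumption: equal headlines per source, so always guessing one class scores 0.5.
CHANCE = 0.5


def summarize(scores):
    """Return (mean accuracy, sample std across folds, margin over chance)."""
    avg = mean(scores)
    return avg, stdev(scores), avg - CHANCE


def leans_toward(coef, class0, class1):
    """Map a coefficient's sign to a source, with class0 coded 0 and class1 coded 1."""
    if coef == 0:
        return "neither"
    return class0 if coef < 0 else class1


summaries = {pair: summarize(scores) for pair, scores in cv_scores.items()}
for pair, (avg, sd, margin) in summaries.items():
    print(f"{pair}: mean={avg:.4f} sd={sd:.4f} over_chance={margin:+.4f}")

# Made-up coefficient value, only to show the sign convention.
print(leans_toward(-0.3, "medium", "substack"))
```

With 93 headlines per source, each fold holds out only 37 or 38 headlines, so the larger spread in the economist vs dailymail scores is probably mostly small-sample noise.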