- simplest logistic regression classifier
# medium vs substack
- 1050 headlines per source
```python
# 5-fold cross validation accuracies (logistic regression)
[ 0.5714, 0.5405, 0.5929, 0.5667, 0.5667]
```
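A quick way to summarise the fold accuracies above, sketched with the standard library only. The 0.5 chance baseline is an assumption: it holds if both sources are equally represented, which the 1050-per-source split suggests but these notes do not confirm for each fold.

```python
from statistics import mean, stdev

# fold accuracies copied from the cross-validation output above
medium_substack_acc = [0.5714, 0.5405, 0.5929, 0.5667, 0.5667]

# assumption: balanced classes, so always guessing one source scores 0.5
chance = 0.5

avg = mean(medium_substack_acc)
spread = stdev(medium_substack_acc)  # sample standard deviation across folds

print(f"mean accuracy: {avg:.4f}")
print(f"std across folds: {spread:.4f}")
print(f"margin over chance: {avg - chance:.4f}")
```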
- labels: medium = 0, substack = 1
- negative coef = more predictive of medium
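Why the sign reads this way, as a minimal one-feature sketch: with substack coded 1, a negative coefficient on a positive feature value pushes the predicted probability of class 1 below 0.5, i.e. towards medium. The coefficient and feature value here are hypothetical, not taken from the fitted model.

```python
import math

def prob_class1(coef, x, intercept=0.0):
    """P(label = 1) under a one-feature logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * x)))

# hypothetical coefficient and feature value, for illustration only
coef = -0.8
x = 1.0

p1 = prob_class1(coef, x)  # probability of class 1 (substack)
p0 = 1.0 - p1              # probability of class 0 (medium)
print(f"P(substack) = {p1:.3f}, P(medium) = {p0:.3f}")
```

Note the reading flips for negative feature values: a negative coefficient times a negative value raises the probability of class 1.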
![[feature_imp__popular150monthlyjune__sentiment.png|900]]
# economist vs dailymail
- 93 headlines per source
```python
# 5-fold cross validation accuracies (logistic regression)
[0.5789, 0.6216, 0.5405, 0.6216, 0.7027]
```
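With 93 headlines per source there are only 186 in total, so each test fold holds about 37 headlines. A rough sketch of the binomial standard error per fold; the fold size of 37 is an assumption from 186 / 5 (one fold would hold 38), and the formula treats headlines as independent.

```python
import math

# fold accuracies copied from the cross-validation output above
fold_acc = [0.5789, 0.6216, 0.5405, 0.6216, 0.7027]

n_total = 2 * 93       # 93 headlines per source
n_test = n_total // 5  # assumed test-fold size

# binomial standard error of an accuracy estimated on n_test samples
std_errors = [math.sqrt(acc * (1 - acc) / n_test) for acc in fold_acc]

for acc, se in zip(fold_acc, std_errors):
    print(f"acc = {acc:.4f}  +/- {se:.3f}")
```

Each fold comes out with a standard error of roughly 0.08, so the spread between folds here may be largely sampling noise rather than a real difference between splits.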
- labels: dailymail = 0, economist = 1 (reverse of the heading order)
- negative coef = more predictive of dailymail
![[feature_imp__economistdailymail__sentiment.png|900]]
- decision tree classifier feature importance
![[Pasted image 20220205081014.png]]
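For reading the decision tree plot above: assuming it shows impurity-based importances (the default behind scikit-learn's `feature_importances_`, though these notes do not say which library was used), a feature's score is built from the drop in Gini impurity at each split on it. A toy sketch of one such split, with made-up labels:

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 1.0 - p1**2 - (1.0 - p1)**2

def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity from splitting parent into left/right."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children

# toy labels, for illustration only (two sources coded 0 and 1)
parent = [0, 0, 0, 0, 1, 1, 1, 1]
left = [0, 0, 0, 1]   # headlines where a hypothetical feature is below a threshold
right = [0, 1, 1, 1]  # headlines where it is above

decrease = impurity_decrease(parent, left, right)
print(decrease)
```

Unlike logistic regression coefficients, these scores are non-negative and carry no direction, so they show which features the tree split on but not which source a feature favours.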