- Same analyses as [[220117_163635 medium vs substack - popular 150 monthly june 2021 to dec 2021]], but focused on the **least popular 150 per month**
- Why focus on the least popular headlines? The most popular headlines are perhaps written by authors who already know how and what to write to appeal to their audience, whereas the least popular headlines could be written by newcomers to the platforms who haven't built up a following yet. That would make them "purer" writers: their headlines reflect either their personal style or the style/content they think would work for each platform.
# Model 1
- 8 input columns (headline text, headline length, polarity [-1, 1], subjectivity [0, 1], 4 [[VADER]] sentiment columns)
- transformed into 4098 features
- mean prediction accuracy: 64.62%
![[feature_imp__unpopular150monthlyjune__headline-headline_len-polarity_subjectivity 1.png|700]]
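The jump from 8 input columns to thousands of features is what happens when a text column is expanded into one column per token while the numeric columns pass through unchanged. The note does not say which encoder was used, so the following is only a minimal sketch under that assumption: a plain bag-of-words built with the standard library. The helper names (`build_vocabulary`, `to_feature_row`), the toy headlines, and the VADER column order (neg, neu, pos, compound) are all hypothetical.

```python
# Hypothetical sketch: how 8 input columns can expand into many features.
# Assumption: headline text becomes bag-of-words counts (one column per token)
# and the 7 numeric columns are appended as-is. The real pipeline may differ.


def build_vocabulary(headlines):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for headline in headlines:
        for token in headline.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def to_feature_row(headline, numeric_columns, vocab):
    """Concatenate bag-of-words counts with the numeric columns."""
    counts = [0] * len(vocab)
    for token in headline.lower().split():
        if token in vocab:
            counts[vocab[token]] += 1
    return counts + list(numeric_columns)


# Toy data, made up for illustration (not taken from the dataset).
headlines = ["why i left my job", "my notes on writing"]

# 7 numeric columns per headline: length, polarity, subjectivity,
# then 4 VADER scores (assumed order: neg, neu, pos, compound).
numeric = [
    [17, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [19, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]

vocab = build_vocabulary(headlines)
rows = [to_feature_row(h, n, vocab) for h, n in zip(headlines, numeric)]

# Total feature count = vocabulary size + 7 numeric columns.
n_features = len(rows[0])
print(len(vocab), n_features)
```

With a real corpus the vocabulary term dominates, which is consistent with a feature count in the thousands from a single text column plus a handful of numeric ones.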
# Model 2 (excludes headline text)
- 7 input columns/features (headline length, polarity [-1, 1], subjectivity [0, 1], 4 [[VADER]] sentiment columns)
- mean prediction accuracy: 55.85%
![[feature_imp__unpopular150monthlyjune__headline_len-polarity_subjectivity 1.png|700]]
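To put the two accuracies side by side, the sketch below computes how far each model sits above chance and how much accuracy is lost when headline text is dropped. The 50% chance level is an assumption (two equally sized classes, Medium vs. Substack, which the note does not state explicitly), and `points_above` is a hypothetical helper.

```python
# Accuracies reported in this note, in percent.
model_1_accuracy = 64.62  # Model 1: includes headline text
model_2_accuracy = 55.85  # Model 2: excludes headline text

# Assumption: binary, balanced classes, so random guessing scores 50%.
chance_accuracy = 50.0


def points_above(accuracy, baseline):
    """Difference in percentage points, rounded to 2 decimals."""
    return round(accuracy - baseline, 2)


lift_model_1 = points_above(model_1_accuracy, chance_accuracy)
lift_model_2 = points_above(model_2_accuracy, chance_accuracy)
text_contribution = points_above(model_1_accuracy, model_2_accuracy)

print(f"Model 1 above chance: {lift_model_1} points")
print(f"Model 2 above chance: {lift_model_2} points")
print(f"Drop when headline text is removed: {text_contribution} points")
```

If the classes are not balanced, `chance_accuracy` should be replaced by the majority-class share before reading anything into the lift values.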