- [[20210622_140058 meet Gord Dave]]
# new sources
- NewsGuard, fact-checkers, [[IFFY quotient]]
# problems
- tradeoff: block size, match rate, recency, tweet frequency
# covariates for blocking
- retweet/tweet link quality since pull time
- no. of tweets/retweets since pull time
- mean no. of daily tweets/retweets since pull time
- no. of links shared since pull time (all links, not just those that match our lists)
- overall likes, favorites, tweets, followers, following (all time)
- account age
- others: language toxicity
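
A minimal sketch of how the activity covariates above (tweet count, mean daily tweets, links shared since pull time) might be computed per user. The record fields `user_id`, `date`, and `n_links`, and the function name `blocking_covariates`, are hypothetical; the actual data layout isn't specified in these notes.

```python
from collections import defaultdict
from datetime import date


def blocking_covariates(tweets, pull_date, end_date):
    """Per-user activity covariates over the window [pull_date, end_date].

    `tweets` is an iterable of dicts with hypothetical keys:
    user_id, date (datetime.date), n_links (all links, matched or not).
    """
    n_days = (end_date - pull_date).days + 1  # inclusive window length
    counts = defaultdict(lambda: {"n_tweets": 0, "n_links": 0})
    for t in tweets:
        # Only count activity since pull time.
        if pull_date <= t["date"] <= end_date:
            counts[t["user_id"]]["n_tweets"] += 1
            counts[t["user_id"]]["n_links"] += t["n_links"]
    return {
        user: {**c, "mean_daily_tweets": c["n_tweets"] / n_days}
        for user, c in counts.items()
    }


# Toy records, made up for illustration only.
toy_tweets = [
    {"user_id": "a", "date": date(2021, 6, 14), "n_links": 1},
    {"user_id": "a", "date": date(2021, 6, 15), "n_links": 0},
    {"user_id": "b", "date": date(2021, 6, 23), "n_links": 2},
    {"user_id": "b", "date": date(2021, 6, 1), "n_links": 5},  # before pull time
]
cov = blocking_covariates(toy_tweets, date(2021, 6, 14), date(2021, 6, 23))
```

All-time covariates (followers, following, account age) would come from the user profile rather than from the tweet window, so they are left out here.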
# covariates for modeling/predicting match rate
```python
```
# 20210623_205849
- 10 days: 14 June to 23 June
- searched 77 news sources for retweets/tweets containing links to these sources
- 874637 tweets with links
- 219733 unique users with at least 1 tweet
- 127715 users shared only 1 tweet with link
- 90185 shared **2 to 50** tweets
- 1000 random samples of audience (size = 500) uploaded (see batch 67 in [[20210405_114424 clustering match rates]]): match rates are between .3 and .4
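
A rough sketch of the sampling step, plus a check on the user counts above. It assumes the audiences were drawn without replacement from the users who shared 2 to 50 tweets. That pool choice is an assumption, not something these notes confirm, and `draw_audiences` and the placeholder integer IDs are hypothetical.

```python
import random


def draw_audiences(user_ids, n_samples=1000, size=500, seed=0):
    """Draw n_samples audiences, each `size` users sampled without replacement."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    pool = list(user_ids)
    return [rng.sample(pool, size) for _ in range(n_samples)]


# Counts taken from the notes above.
total_users = 219733
one_tweet = 127715
two_to_fifty = 90185
# Users in neither bucket, i.e. those who shared more than 50 tweets.
over_fifty = total_users - one_tweet - two_to_fifty

# Placeholder IDs stand in for the real user IDs (assumed pool: 2 to 50 tweets).
audiences = draw_audiences(range(two_to_fifty))
```

Users can appear in more than one audience, since each of the 1000 draws is independent; only within a single audience are users unique.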