- `credible_descriptive.Rmd`
# urls and domains
- 8,937,779 urls (excluding many popular domains like youtube, wiki, amazon)
- unique urls: 1,709,701
- 1,030,725 appeared only once
- unique domains: 126,648
- excluding urls that were shared only 1 time, we have 54,944 unique domains
- excluding urls that were shared only 2 times, we have 36,062 unique domains
- excluding urls that were shared only 10 times, we have 12,225 unique domains
- excluding urls that were shared only 100 times, we have 12,225 unique domains
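A minimal sketch of how counts like these could be produced. It uses base R so it runs without extra packages, although the queries in these notes use data.table. The toy `urls` table, the `url` column name, and the helper `domains_above()` are assumptions for illustration, and the sketch reads "shared only k times" as "shared k times or fewer", which is also an assumption.

```r
# Toy stand-in for the `urls` table; values are made up for illustration.
urls <- data.frame(
  url    = c("a.com/1", "a.com/2", "b.com/1", "c.com/1", "c.com/2"),
  domain = c("a.com", "a.com", "b.com", "c.com", "c.com"),
  count  = c(1, 5, 1, 2, 12),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of unique domains that remain after
# dropping urls shared `k` times or fewer.
domains_above <- function(d, k) {
  length(unique(d$domain[d$count > k]))
}

thresholds <- c(1, 2, 10, 100)
n_domains <- sapply(thresholds, function(k) domains_above(urls, k))
names(n_domains) <- thresholds
n_domains
```

Because the filter only gets stricter as `k` grows, the domain counts can never increase from one threshold to the next, which gives a quick sanity check on the figures.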
```r
# most popular domains
> urls[, .(count = sum(count)), domain][order(-count)][1:40]
domain count
1: foxnews.com 151189
2: nytimes.com 151170
3: cnn.com 137781
4: dailymail.co.uk 121566
5: washingtonpost.com 112468
6: rumble.com 107211
7: nypost.com 103684
8: thehill.com 101458
9: rawstory.com 100188
10: reuters.com 91152
11: theguardian.com 79198
12: msnbc.com 72801
13: rt.com 71629
14: thepostmillennial.com 66962
15: breitbart.com 66215
16: nbcnews.com 65640
17: thegatewaypundit.com 64518
18: yahoo.com 59014
19: newsmax.com 57942
20: msn.com 53991
21: conservativebrief.com 52810
22: apnews.com 51430
23: zerohedge.com 49789
24: babylonbee.com 47728
25: bloomberg.com 47499
26: dailywire.com 45550
27: theepochtimes.com 45135
28: justthenews.com 44554
29: bbc.com 43297
30: wsj.com 43221
31: thedailybeast.com 42218
32: news.yahoo.com 42092
33: opindia.com 38541
34: politico.com 37035
35: rsbnetwork.com 32755
36: bbc.co.uk 32726
37: telegraph.co.uk 32527
38: businessinsider.com 32470
39: go.com 32344
40: cbsnews.com 31714
domain count
```
# [Is this Credible?](https://www.isthiscredible.com) - headline-level ratings
- [[220129_142818 credible scores correlate with pre-test|preliminary analyses: correlations with pre-test and other rated domains]]
- managed to check 571,355 unique urls
- 212,078 unique urls with scores
- 1763 unique domains scored
- 20829 `Questionable News Source` + 4394 `Questionable News` urls
  - recoded to `score = 0`
  - 270 unique domains
- ~2k rated domains: 1763 (rated) + 270 (unrated but clearly questionable)
- [domain avg scores csv](https://www.dropbox.com/s/2zpczbtfx1wai1f/credible_scores_domain_avg.csv?dl=0)
- fc: 41 domains of 60 are rated
- ng: 145 of 207
- mbfc: 1364 of 3216
- afm: 236 of 283
- misinfome/iffy: 177 of 471
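The recoding and domain averaging described in the list above can be sketched as follows, in base R. The toy `scores` rows are made up, and the contents of `recode0` are an assumption based on the two "questionable" labels mentioned in these notes.

```r
# Toy stand-in for `scores`; column names follow the printed output,
# values are made up for illustration.
scores <- data.frame(
  domain  = c("good.org", "good.org", "bad.com", "bad.com", "other.net"),
  quality = c("see score", "see score", "Questionable News Source",
              "Questionable News", "Not a rated news source"),
  score   = c(80, 90, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Labels treated as clearly questionable (assumed contents of `recode0`).
recode0 <- c("Questionable News Source", "Questionable News")

# Recode questionable rows to a score of 0; other unrated rows stay NA.
scores$score[scores$quality %in% recode0] <- 0

# Average score per domain, using only rows that have a score.
rated <- scores[!is.na(scores$score), ]
domain_score_avg <- aggregate(score ~ domain, data = rated, FUN = mean)
domain_score_avg <- domain_score_avg[order(-domain_score_avg$score), ]
domain_score_avg
```

In this sketch a domain that only ever returns "Not a rated news source" drops out of the averages entirely, while a questionable domain stays in with an average of 0.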
```r
> scores[, .N, quality][order(-N)]
quality N
1: see score 212078
2: Not a rated news source 121457
3: Not Able to Analyze 64017
4: Story is too short to be rated (< 250 words). 48046
5: Unable to fetch content 36568
6: Questionable News Source 20829
7: error_no_response 18810
8: Not an English language site 15942
9: ""Story is too short to be rated (< 250 words)."" 8338
10: Mixed Content/Stories detected on page 7192
11: Questionable News 4394
12: Error analyzing content 4295
13: ""Unable to fetch content"" 2218
14: No Content on Page 1196
15: ""Not an English language site"" 1142
16: ""This article appears to be behind a paywall. Cannot see citations."" 780
17: Unable to analyze 736
18: ""Error analyzing content"" 621
19: Forbidden 581
20: This article appears to be behind a paywall. Cannot see citations. 563
21: Unable to get content 411
22: Error Analyzing content 336
23: Not Found 319
24: ""Unable to get content"" 115
25: ""No Content on Page"" 77
26: ""Error Analyzing content"" 70
27: Content could not be analyzed. 46
28: Not Analyzable"" 41
29: ""Forbidden"" 33
30: ""Not Able to Analyze"" 31
31: Gone 10
32: ""Content could not be analyzed."" 10
33: ""Not Found"" 8
34: ""Story has less than 3 citations and may be an opinion piece or breaking news."" 7
35: Multiple Stories in Article 7
36: Bad Request 6
37: Not Analyzable 5
38: Internal Server Error 3
39: Error 3
40: Service Unavailable 3
41: ""Service Temporarily Unavailable"" 2
42: not rated site 2
43: Not Modified 2
44: ""Gateway Time-out"" 1
45: Bad Gateway 1
46: Story has less than 3 citations and may be an opinion piece or breaking news. 1
47: {}"" 1
48: ""Internal Server Error"" 1
quality N
```
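Several labels in the table above differ only by stray double quotes or capitalization. One possible cleanup, which these notes do not say was applied, is to normalize the labels before counting. The sketch below uses six rows copied from the printed output; `normalize_label()` is a hypothetical helper.

```r
# Six label/count pairs copied from the printed `quality` table.
quality_counts <- data.frame(
  quality = c('Unable to fetch content', '""Unable to fetch content""',
              'Error analyzing content', '""Error analyzing content""',
              'Error Analyzing content', '""Error Analyzing content""'),
  N = c(36568, 2218, 4295, 621, 336, 70),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip double quotes, trim whitespace, and
# lower-case so that variants of the same label collapse.
normalize_label <- function(x) {
  x <- gsub('"', '', x, fixed = TRUE)
  tolower(trimws(x))
}

quality_counts$label <- normalize_label(quality_counts$quality)
merged <- aggregate(N ~ label, data = quality_counts, FUN = sum)
merged[order(-merged$N), ]
```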
```r
# top 20 and bottom 21 RATED domains (based on average score; `(.N-20):.N` selects 21 rows)
# score: average score
# n: times shared
> domain_score_avg[c(1:20, (.N-20):.N)]
domain score n
1: maplight.org 89.00000 1
2: polygraph.info 87.60000 5
3: factcheck.org 87.41538 65
4: climatefeedback.org 87.00000 1
5: quantamagazine.org 84.40000 110
6: the-scientist.com 84.21538 65
7: limacharlienews.com 84.00000 1
8: sciencealert.com 83.83645 214
9: bluestemprairie.com 83.50000 2
10: smithsonianmag.com 82.74059 478
11: sunlightfoundation.com 82.00000 2
12: climatecentral.org 82.00000 2
13: cpp.edu 82.00000 1
14: placesjournal.org 81.33333 3
15: thebulletin.org 81.30645 62
16: mongabay.com 81.00000 4
17: tmc.edu 81.00000 3
18: wilsonquarterly.com 81.00000 1
19: tamhsc.edu 81.00000 1
20: undark.org 80.10000 40
---
21: emirates247.com 27.00000 1
22: atlasnetwork.org 27.00000 1
23: utsystem.edu 27.00000 1
24: miami.edu 27.00000 1
25: dailysurge.com 27.00000 1
26: renewedright.com 26.25000 40
27: post-gazette.com 26.00000 1
28: msn.com 26.00000 1
29: citylab.com 26.00000 4
30: couriermail.com.au 26.00000 1
31: jhuapl.edu 26.00000 1
32: newsela.com 26.00000 1
33: wnyc.org 26.00000 2
34: theodysseyonline.com 24.25000 4
35: legitgov.org 24.00000 1
36: mrc.org 21.00000 1
37: apple.com 18.00000 3
38: prageru.com 17.00000 4
39: fcnp.com 17.00000 1
40: newsgru.com 17.00000 1
41: livestrong.com 17.00000 1
domain score n
```
```r
# fc and credible ratings
> ratings4[, .(domain, fc, credible)][!is.na(fc)][order(-fc)]
domain fc credible
1: washingtonpost.com 1.0000 68.67840
2: nytimes.com 1.0000 69.27448
3: cnn.com 0.9231 54.94118
4: bbc.co.uk 0.8901 58.31250
5: bostonglobe.com 0.8242 56.57615
6: latimes.com 0.8242 67.01519
7: wsj.com 0.7912 64.12860
8: msnbc.com 0.7253 54.30508
9: cbsnews.com 0.7253 64.22920
10: usatoday.com 0.7253 64.84035
11: sfchronicle.com 0.6484 62.88475
12: abcnews.go.com 0.6154 NA
13: chicagotribune.com 0.5824 61.58108
14: huffingtonpost.com 0.5165 56.56522
15: dailymail.co.uk 0.4835 42.36571
16: foxnews.com 0.4835 54.59372
17: aol.com/news 0.4505 NA
18: nypost.com 0.4176 52.06535
19: nydailynews.com 0.3736 47.22960
20: breitbart.com 0.1758 53.28260
21: dailywire.com 0.1758 52.96334
22: dailykos.com 0.1758 48.68154
23: newsmax.com 0.1429 NA
24: crooksandliars.com 0.1429 44.55621
25: dailycaller.com 0.1429 53.89865
26: thedailysheeple.com 0.0989 NA
27: rawstory.com 0.0989 53.49583
28: ijr.com 0.0989 70.70370
29: westernjournal.com 0.0659 54.44807
30: newspunch.com 0.0659 NA
31: redstate.com 0.0659 44.51575
32: yournewswire.com 0.0659 0.00000
33: channel24news.com 0.0659 NA
34: infowars.com 0.0330 0.00000
35: thepoliticalinsider.com 0.0330 54.99548
36: commondreams.org 0.0330 67.50730
37: conservativetribune.com 0.0330 NA
38: freedomdaily.com 0.0330 0.00000
39: whatdoesitmean.com 0.0000 0.00000
40: blacklistednews.com 0.0000 0.00000
41: angrypatriotmovement.com 0.0000 NA
42: bb4sp.com 0.0000 NA
43: beforeitsnews.com 0.0000 0.00000
44: clashdaily.com 0.0000 0.00000
45: conservativedailypost.com 0.0000 0.00000
46: patriotpost.us 0.0000 44.58378
47: dailysignal.com 0.0000 50.62857
48: antiwar.com 0.0000 50.27097
49: now8news.com 0.0000 NA
50: downtrend.com 0.0000 NA
51: activepost.com 0.0000 NA
52: dailybuzzlive.com 0.0000 0.00000
53: notallowedto.com 0.0000 NA
54: onepoliticalplaza.com 0.0000 NA
55: newsbreakshere.com 0.0000 NA
56: americannews.com 0.0000 NA
57: thenewyorkevening.com 0.0000 NA
58: socialeverythings.com 0.0000 NA
59: react365.com 0.0000 NA
60: realnewsrightnow.com 0.0000 0.00000
domain fc credible
```
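The coverage figure "fc: 41 domains of 60 are rated" corresponds to the number of rows in this table with a non-missing `credible` value. A minimal base R sketch of that count, using five rows copied from the table (so the toy result is 3 of 5, not 41 of 60):

```r
# Five rows copied from the fc / credible table above.
ratings4 <- data.frame(
  domain   = c("washingtonpost.com", "nytimes.com", "abcnews.go.com",
               "newsmax.com", "infowars.com"),
  fc       = c(1.0000, 1.0000, 0.6154, 0.1429, 0.0330),
  credible = c(68.67840, 69.27448, NA, NA, 0),
  stringsAsFactors = FALSE
)

# Domains on the fc list, and those among them with a credible score.
has_fc  <- !is.na(ratings4$fc)
covered <- sum(has_fc & !is.na(ratings4$credible))
sprintf("fc: %d domains of %d are rated", covered, sum(has_fc))
```

A score of 0 counts as rated here, which matches how the questionable domains recoded to `score = 0` appear in the table.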
```r
# best/worst RATED urls
> scored[order(-score)][c(1:20, (.N-20):.N)]
url domain bias cred score quality
1: https://www.smithsonianmag.com/smart-news/hilma-af-klint-new-works-watercolors-tree-of-knowledge-sale-180978920 smithsonianmag.com CENTER VERY_HIGH 99 see score
2: https://www.smithsonianmag.com/smart-news/giant-ram-head-statues-found-in-egypt-180978914 smithsonianmag.com CENTER VERY_HIGH 99 see score
3: https://www.smithsonianmag.com/smart-news/missouri-cave-picture-osage-nation-auction-180978627 smithsonianmag.com CENTER VERY_HIGH 99 see score
4: https://www.smithsonianmag.com/smart-news/rediscovered-medieval-manuscript-offers-new-twist-on-arthurian-legend-180978705 smithsonianmag.com CENTER VERY_HIGH 99 see score
5: https://www.factcheck.org/2021/07/scicheck-covid-19-vaccine-generated-spike-protein-is-safe-contrary-to-viral-claims factcheck.org CENTER VERY_HIGH 99 see score
6: https://www.smithsonianmag.com/smart-news/mexico-documents-colonial-cortes-stolen-recovered-180978767 smithsonianmag.com CENTER VERY_HIGH 99 see score
7: https://www.smithsonianmag.com/smart-news/window-kerry-marshall-washington-cathedral-confederacy-stonewall-lee-180978743 smithsonianmag.com CENTER VERY_HIGH 99 see score
8: https://www.smithsonianmag.com/smart-news/trove-of-unseen-photos-documents-indigenous-culture-in-1920s-alaska-180978713 smithsonianmag.com CENTER VERY_HIGH 99 see score
9: https://www.quantamagazine.org/the-cartoon-picture-of-magnets-that-has-transformed-science-20200624 quantamagazine.org CENTER VERY_HIGH 99 see score
10: https://www.smithsonianmag.com/smart-news/tiles-fit-for-the-emperor-found-at-remains-of-roman-building-beneath-english-cricket-club-180978763 smithsonianmag.com CENTER VERY_HIGH 99 see score
11: https://www.smithsonianmag.com/smart-news/stonehenge-repairs-decades-monument-180978630 smithsonianmag.com CENTER VERY_HIGH 99 see score
12: https://www.smithsonianmag.com/smart-news/first-edition-of-mary-shelleys-frankenstein-breaks-records-at-auction-180978724 smithsonianmag.com CENTER VERY_HIGH 99 see score
13: https://www.smithsonianmag.com/smart-news/how-bags-became-ultimate-fashion-accessory-180976690 smithsonianmag.com CENTER VERY_HIGH 99 see score
14: https://www.smithsonianmag.com/smart-news/mexico-exhibition-showcases-prehispanic-artifacts-recovered-from-abroad-180978801 smithsonianmag.com CENTER VERY_HIGH 99 see score
15: https://www.smithsonianmag.com/smart-news/diver-finds-crusader-sword-off-israels-coast-180978884 smithsonianmag.com CENTER VERY_HIGH 99 see score
16: https://www.smithsonianmag.com/smart-news/la-palma-island-volcano-eruption-sends-lava-flowing-to-residential-buildings-180978761 smithsonianmag.com CENTER VERY_HIGH 98 see score
17: https://www.newscientist.com/article/2237475-covid-19-news-28-million-years-of-life-lost-globally-to-covid/#echobox=1634813602 newscientist.com CENTER VERY_HIGH 98 see score
18: https://www.newscientist.com/article/2237475-covid-19-news-model-predicts-uk-cases-will-fall-even-without-plan-b newscientist.com CENTER VERY_HIGH 98 see score
19: https://www.bbc.com/future/article/20210524-the-reason-wild-forests-beat-plantations bbc.com LEAN_LEFT VERY_HIGH 98 see score
20: https://www.quantamagazine.org/dna-has-four-bases-some-viruses-swap-in-a-fifth-20210712 quantamagazine.org CENTER VERY_HIGH 98 see score
---
21: https://www.beliefnet.com/entertainment/celebrities/galleries/8-celebrities-that-passionately-love-jesus-christ.aspx beliefnet.com RIGHT MIXED 12 see score
22: https://www.forbes.com/sites/amyblaschka/2021/10/15/6-ways-to-maximize-creativity-when-you-dont-think-youre-creative forbes.com LEAN_RIGHT MIXED 12 see score
23: https://raiderswire.usatoday.com/lists/10-worst-personnel-decisions-during-jon-gruden-second-stint-with-raiders usatoday.com LEAN_LEFT HIGH 12 see score
24: https://www.buzzfeed.com/southerncomfort/signs-love-might-be-right-under-your-nose buzzfeed.com LEAN_LEFT MIXED 12 see score
25: https://www.webmd.com/osteoarthritis/ss/slideshow-hand-finger-exercises webmd.com CENTER HIGH 12 see score
26: https://www.beliefnet.com/faiths/galleries/top-10-bible-foods-that-heal.aspx beliefnet.com RIGHT MIXED 12 see score
27: https://www.dailymail.co.uk/news/article-5781735/un-knew-decade-sex-food-scandal-involving-15-charities.html dailymail.co.uk RIGHT MIXED 12 see score
28: https://www.houstonchronicle.com/food-culture/restaurants-bars/article/alison-cook-the-5-best-things-i-ate-last-week-16510674.php houstonchronicle.com LEAN_LEFT HIGH 12 see score
29: https://www.beliefnet.com/entertainment/celebrities/7-stars-who-give-glory-to-god.aspx beliefnet.com RIGHT MIXED 12 see score
30: https://www.beliefnet.com/faiths/christianity/galleries/5-important-things-the-bible-teaches-about-heaven-that-we-often-forget.aspx beliefnet.com RIGHT MIXED 12 see score
31: https://www.forbes.com/sites/qai/2021/10/06/6-reasons-stock-picking-is-risky forbes.com LEAN_RIGHT MIXED 11 see score
32: https://www.forbes.com/sites/kimberlywhitler/2021/10/09/10-attributes-of-successful-digital-business-transformation forbes.com LEAN_RIGHT MIXED 11 see score
33: https://sputniknews.com/20211008/top-10-halloween-movies-1089780943.html sputniknews.com LEAN_RIGHT MIXED 10 see score
34: https://www.beliefnet.com/love-family/relationships/marriage/9-things-a-husband-needs-from-his-wife.aspx beliefnet.com RIGHT MIXED 9 see score
35: https://theathletic.com/2897176/2021/10/18/the-top-5-breaking-down-the-texas-nascar-playoff-race theathletic.com MIXED 9 see score
36: https://www.dailywire.com/news/11-books-every-boy-should-read dailywire.com RIGHT MIXED 6 see score
37: https://www.forbes.com/sites/forbes-personal-shopper/2021/08/27/best-september-birthstone-jewelry-gifts forbes.com LEAN_RIGHT MIXED 5 see score
38: https://www.forbes.com/sites/jillgriffin/2020/05/18/6-ways-to-win-back-your-post-pandemic-customers forbes.com LEAN_RIGHT MIXED 3 see score
39: https://www.beliefnet.com/inspiration/angels/galleries/7-things-you-should-know-about-guardian-angels.aspx beliefnet.com RIGHT MIXED 2 see score
40: https://www.rappler.com/technology/gaming/ps5-xbox-series-x-games-120hz rappler.com LEFT MIXED 2 see score
41: https://www.beliefnet.com/inspiration/angels/galleries/7-signs-you-may-be-an-angel-on-earth.aspx beliefnet.com RIGHT MIXED 2 see score
url domain bias cred score quality
```
```r
# unrated but questionable domains
> scores[quality %in% recode0, .N, domain][order(-N)]
domain N
1: theepochtimes.com 5163
2: opindia.com 2110
3: zerohedge.com 2012
4: thegatewaypundit.com 1684
5: dailystar.co.uk 1131
---
266: cernovich.com 1
267: stormcloudsgathering.com 1
268: thinkingmomsrevolution.com 1
269: voiceofeurope.com 1
270: whydontyoutrythis.com 1
```
# correlations
```r
# Correlation Matrix (pearson-method)
Parameter | credible | misinfome | mbfc_bias | mbfc_fact | afm_rely | afm_bias | afm | ng
--------------------------------------------------------------------------------------------------
fc | 0.61*** | 0.69 | 0.72*** | 0.61*** | 0.79*** | 0.80*** | 0.82*** | 0.76***
ng | 0.78*** | 0.15 | 0.57*** | 0.84*** | 0.79*** | 0.69*** | 0.75*** |
afm | 0.58*** | 0.56 | 0.79*** | 0.72*** | 0.97*** | 0.98*** | |
afm_bias | 0.51*** | 0.53 | 0.79*** | 0.67*** | 0.90*** | | |
afm_rely | 0.63*** | 0.56 | 0.75*** | 0.74*** | | | |
mbfc_fact | 0.79*** | 0.01 | 0.62*** | | | | |
mbfc_bias | 0.50*** | 0.09 | | | | | |
misinfome | -0.04 | | | | | | |
p-value adjustment method: Holm (1979)
```
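A matrix like the one above can be approximated in base R with pairwise `cor.test()` calls followed by `p.adjust(method = "holm")`. This is a sketch on simulated data, not necessarily the function that produced the printed output; the column names and values in the toy `ratings` table are assumptions.

```r
# Simulated ratings: three noisy measures of one latent quality, plus noise.
set.seed(1)
n <- 50
latent <- rnorm(n)
ratings <- data.frame(
  credible = latent + rnorm(n, sd = 0.5),
  fc       = latent + rnorm(n, sd = 0.5),
  ng       = latent + rnorm(n, sd = 0.5),
  noise    = rnorm(n)
)
ratings$fc[1:10] <- NA  # rating lists cover different sets of domains

# One row per unordered pair of rating variables.
pairs <- t(combn(names(ratings), 2))
res <- data.frame(var1 = pairs[, 1], var2 = pairs[, 2],
                  r = NA_real_, p = NA_real_, n = NA_integer_,
                  stringsAsFactors = FALSE)

for (i in seq_len(nrow(res))) {
  x  <- ratings[[res$var1[i]]]
  y  <- ratings[[res$var2[i]]]
  ok <- complete.cases(x, y)  # pairwise deletion of missing ratings
  ct <- cor.test(x[ok], y[ok], method = "pearson")
  res$r[i] <- unname(ct$estimate)
  res$p[i] <- ct$p.value
  res$n[i] <- sum(ok)
}

# Holm (1979) adjustment across all pairwise tests.
res$p_holm <- p.adjust(res$p, method = "holm")
res
```

Reporting the pairwise `n` next to each `r` helps when reading entries that show a sizeable correlation but no significance stars, since each pair is computed only on the domains that both lists rate.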
![[Pasted image 20220225011126.png]]
# results
![[Pasted image 20220225125858.png]]
![[Pasted image 20220225125945.png]]