# fact-checker summed badness
- threshold: 70 (domains with an original fact-checker rating > 70 get badness = 0)
- x: original fact-checker ratings (good to bad)
- y: transformed fact-checker ratings, i.e. badness (higher = worse)
- good domains (original rating > 70) have a badness score (y-axis) of 0
- see also [[220310_101008 fact-checker summed badness - threshold 30|results when threshold is 30]]
![[dv_fc_badness_70.png|900]]
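
A minimal sketch of one transformation consistent with the description above, assuming badness is the distance below the threshold (`threshold - rating`) for ratings at or below it. The exact formula is not stated in these notes, so both the linear form and the helper name `fc_badness` are assumptions.

```r
# Hypothetical helper: the linear form is an assumption, not the confirmed transform.
fc_badness <- function(rating, threshold = 70) {
  # ratings above the threshold map to 0; lower ratings get larger badness
  pmax(threshold - rating, 0)
}

ratings <- c(95, 71, 70, 40, 0)   # toy ratings, not the study data
badness <- fc_badness(ratings)
badness
```

Under this reading, a user's summed badness would be the sum of these per-domain scores over the domains they engaged with.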
# descriptives
32888 obs, 17 cols
| | NUnique| PercentMissing| Mean| SD| Min| Median| Max| Histogram|
|:----------|-------:|--------------:|-------:|-------:|-------:|-------:|--------:|----------:|
|block | 5424| 0| 2897.31| 1577.02| 0| 2973.00| 5423| ▅▅▅▅▆▆▆▆▇▇|
|weight | 21| 0| 2.00| 0.22| 1.67| 2.00| 2.50| ▂▁ ▇ ▁▁▁|
|sum_t0 | 5091| 0| 400.41| 931.10| 0.00| 93.41| 40279.17| ▇|
|sum_t1 | 4477| 0| 352.66| 1016.88| 0.00| 51.65| 44935.39| ▇|
|count_t0 | 220| 0| 11.46| 22.45| 0| 3.00| 522| ▇|
|count_t1 | 273| 0| 10.18| 25.48| 0| 2.00| 652| ▇|
|conditionC | 2| 0| 0.00| 0.50| -0.50| -0.50| 0.50| ▇▇|
|sum_t0C | 5305| 0| 0.00| 931.10| -400.41| -307.00| 39878.76| ▇|
|count_t0L | 220| 0| 1.56| 1.34| 0.00| 1.39| 6.26| ▇▅▄▃▃▂▁|
|count_t0LC | 220| 0| 0.00| 1.34| -1.56| -0.18| 4.70| ▇▅▄▃▃▂▁|
|sum_t0L | 5091| 0| 3.51| 2.96| 0.00| 4.55| 10.60| ▇▁▃▃▂▁|
|sum_t0LC | 5091| 0| 0.00| 2.96| -3.51| 1.04| 7.10| ▇▁▃▃▂▁|
|count_t1L | 273| 0| 1.35| 1.34| 0.00| 1.10| 6.48| ▇▄▃▃▂▁▁|
|count_t1LC | 273| 0| 0.00| 1.34| -1.35| -0.25| 5.14| ▇▄▃▃▂▁▁|
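
The column suffixes appear to encode transformations: `L` matches `log(x + 1)` (the `count_t0` median of 3 gives log(4) ≈ 1.39, the `count_t0L` median) and `C` matches mean-centering (all `C` columns have mean 0). A minimal base-R sketch on toy data, assuming that reading; the toy vector and the `center` helper are hypothetical.

```r
# Toy counts (hypothetical), not the study data
count_t0 <- c(0, 1, 3, 10, 522)

center <- function(x) x - mean(x)   # hypothetical helper: mean-centering

count_t0L  <- log1p(count_t0)       # "L": log(x + 1), which is defined at 0
count_t0LC <- center(count_t0L)     # "LC": log-transformed, then centered

# Compare with the table's median and max of count_t0L
check <- round(log1p(c(3, 522)), 2)
check
```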
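After winsorizing (99th percentile). `sum_t0C` is identical to the table above, so it appears not to have been recomputed from the winsorized `sum_t0`.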
| | NUnique| PercentMissing| Mean| SD| Min| Median| Max| Histogram|
|:----------|-------:|--------------:|-------:|-------:|-------:|-------:|--------:|----------:|
|block | 5424| 0| 2897.31| 1577.02| 0| 2973.00| 5423| ▅▅▅▅▆▆▆▆▇▇|
|weight | 21| 0| 2.00| 0.22| 1.67| 2.00| 2.50| ▂▁ ▇ ▁▁▁|
|sum_t0 | 4764| 0| 375.56| 708.56| 0.00| 93.41| 4217.76| ▇▁|
|sum_t1 | 4149| 0| 321.28| 701.00| 0.00| 51.65| 4384.77| ▇▁|
|count_t0 | 108| 0| 11.01| 19.25| 0.00| 3.00| 106.13| ▇▁▁|
|count_t1 | 116| 0| 9.44| 19.18| 0.00| 2.00| 115.00| ▇▁|
|conditionC | 2| 0| 0.00| 0.50| -0.50| -0.50| 0.50| ▇▇|
|sum_t0C | 5305| 0| 0.00| 931.10| -400.41| -307.00| 39878.76| ▇|
|count_t0L | 108| 0| 1.56| 1.34| 0.00| 1.39| 4.67| ▇▃▄▂▃▂▂▂▁▁|
|count_t0LC | 108| 0| 0.00| 1.34| -1.56| -0.18| 3.11| ▇▃▄▂▃▂▂▂▁▁|
|sum_t0L | 4764| 0| 3.50| 2.95| 0.00| 4.55| 8.35| ▇▁▂▃▂▂▁|
|sum_t0LC | 4764| 0| 0.00| 2.95| -3.50| 1.05| 4.85| ▇▁▂▃▂▂▁|
|count_t1L | 116| 0| 1.34| 1.33| 0.00| 1.10| 4.75| ▇▂▃▂▂▂▁▁▁▁|
|count_t1LC | 116| 0| 0.00| 1.33| -1.34| -0.24| 3.41| ▇▂▃▂▂▂▁▁▁▁|
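
For reference, upper-tail winsorizing at the 99th percentile can be sketched in base R as capping values at that quantile. This is a stand-in under the assumption that only the upper tail is clipped (as `probs = c(0, 0.99)` in the model code suggests); `winsorize_upper` is a hypothetical helper, not the function used for these results.

```r
winsorize_upper <- function(x, p = 0.99) {
  cap <- quantile(x, probs = p, names = FALSE)  # default (type 7) interpolated quantile
  pmin(x, cap)                                  # values above the cap are set to the cap
}

set.seed(1)
x  <- rexp(1000, rate = 1 / 400)   # toy right-skewed data, not the study data
xw <- winsorize_upper(x)
c(max_before = max(x), max_after = max(xw))
```

An interpolated quantile would also explain why a count variable can end up with a non-integer maximum (e.g. 106.13 for `count_t0`) after winsorizing.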
# models
```r
> m <- feglm(sum_t1 ~ conditionC * sum_t0LC | block, dt1[domain_type == "overall"], family = "quasipoisson", vcov = "HC1")
NOTE: 462 fixed-effects (2,523 observations) removed because of only 0 outcomes.
> m
GLM estimation, family = quasipoisson, Dep. Var.: sum_t1
Observations: 30,365
Fixed-effects: block: 4,962
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error   t value  Pr(>|t|)
conditionC           0.027051   0.057180  0.473084   0.63616
sum_t0LC             0.166327   0.011313 14.701751 < 2.2e-16 ***
conditionC:sum_t0LC -0.012958   0.017661 -0.733692   0.46314
> # winsorize time0 and time1 summed badness, 99th percentile
> dt1[, sum_t0 := Winsorize(sum_t0, probs = c(0, 0.99))]
> dt1[, sum_t1 := Winsorize(sum_t1, probs = c(0, 0.99))]
> m <- feglm(sum_t1 ~ conditionC * sum_t0LC | block, dt1[domain_type == "overall"], family = "quasipoisson", vcov = "HC1")
NOTE: 462 fixed-effects (2,523 observations) removed because of only 0 outcomes.
> m
GLM estimation, family = quasipoisson, Dep. Var.: sum_t1
Observations: 30,365
Fixed-effects: block: 4,962
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error   t value  Pr(>|t|)
conditionC          -0.047660   0.039266 -1.213762   0.22485
sum_t0LC             0.194941   0.008117 24.016215 < 2.2e-16 ***
conditionC:sum_t0LC  0.011093   0.012059  0.919895   0.35764
```
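
Because the quasi-Poisson model uses a log link, the coefficients are on the log scale, and exponentiating gives multiplicative effects on expected `sum_t1`. A small sketch of that arithmetic using the `conditionC` estimates printed above, with approximate 95% intervals (the 1.96 multiplier is a normal-approximation assumption, not something taken from the model output). The last line checks that kept plus removed observations and blocks add back up to the totals in the descriptives.

```r
# conditionC estimates and robust SEs copied from the output above
est <- c(raw = 0.027051, winsorized = -0.047660)
se  <- c(raw = 0.057180, winsorized = 0.039266)

rate_ratio <- exp(est)               # effect of a one-unit change in conditionC (-0.5 to 0.5)
ci_lower   <- exp(est - 1.96 * se)   # approximate 95% interval, normal approximation
ci_upper   <- exp(est + 1.96 * se)
round(cbind(rate_ratio, ci_lower, ci_upper), 3)

# Bookkeeping: kept + removed should equal 32888 observations and 5424 blocks
totals <- c(obs = 30365 + 2523, blocks = 4962 + 462)
totals
```

Since `sum_t0LC` is centered, the `conditionC` rate ratio applies at the mean of log time0 summed badness. Both intervals include 1, in line with the non-significant p-values.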
# user CDFs
Plots show time1 summed badness only up to 5000.
- x: user's summed badness
- y: cumulative proportion of users (bottom panel: difference in proportions)
![[dv_fc_cdf_summedbadness70.png|800]]
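
A rough sketch of how a plot like this can be built: empirical CDFs per group on a shared grid, with their difference underneath. It assumes the two curves correspond to the two levels of `conditionC`, which these notes do not state; the data are simulated and all variable names are hypothetical.

```r
set.seed(42)
# Simulated summed badness for two hypothetical groups (not the study data)
badness_a <- rexp(500, rate = 1 / 350)
badness_b <- rexp(500, rate = 1 / 320)

grid     <- seq(0, 5000, by = 50)     # x-axis truncated at 5000, as in the figure
cdf_a    <- ecdf(badness_a)(grid)     # proportion of users at or below each value
cdf_b    <- ecdf(badness_b)(grid)
cdf_diff <- cdf_b - cdf_a             # bottom panel: difference in proportions

op <- par(mfrow = c(2, 1))
plot(grid, cdf_a, type = "s", xlab = "summed badness", ylab = "proportion")
lines(grid, cdf_b, type = "s", lty = 2)
plot(grid, cdf_diff, type = "s", xlab = "summed badness", ylab = "difference")
abline(h = 0, lty = 3)
par(op)
```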
After winsorizing (99th percentile)
![[dv_fc_cdf_summedbadness70_winsorize.png|800]]