# mbfc summed badness (3216)
- `mbfc = mean(mbfc_fact, mbfc_bias)`; see [[different measures of domain quality]]
- threshold: 80 (domains whose original mbfc_min rating is > 80 get badness = 0)
- so good domains (original rating > 80) have a new badness score [y-axis] of 0
- see also [[220310_140604 mbfc_min threshold 80|mbfc_min threshold 80]]
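
The thresholding rule above can be sketched in a few lines of R. The note only states that ratings above 80 map to a badness of 0; the 0-100 rating scale and the `100 - rating` mapping below the threshold are assumptions made for illustration, and `threshold_badness` is a hypothetical helper name.

```r
# ASSUMPTION: ratings lie on a 0-100 scale and raw badness is 100 - rating.
# The note only confirms that ratings above the threshold get badness 0.
threshold_badness <- function(rating, threshold = 80) {
  raw_badness <- 100 - rating            # assumed mapping, not from the note
  ifelse(rating > threshold, 0, raw_badness)
}

ratings <- c(95, 81, 80, 60, 20)         # made-up domain ratings for one user
badness <- threshold_badness(ratings)    # 95 and 81 are above 80, so they score 0
summed_badness <- sum(badness)           # summed badness over the user's visits
```

A rating of exactly 80 is not above the threshold, so it keeps a nonzero badness under this sketch.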
# descriptives
32888 obs 17 cols
| | NUnique| PercentMissing| Mean| SD| Min| Median| Max| Histogram|
|:----------|-------:|--------------:|-------:|-------:|--------:|-------:|--------:|----------:|
|block | 5424| 0| 2897.31| 1577.02| 0| 2973.00| 5423| ▅▅▅▅▆▆▆▆▇▇|
|weight | 21| 0| 2.00| 0.22| 1.67| 2.00| 2.50| ▂▁ ▇ ▁▁▁|
|sum_t0 | 8707| 0| 1163.38| 1730.45| 0.00| 575.00| 58000.00| ▇|
|sum_t1 | 7055| 0| 967.95| 1982.15| 0.00| 333.34| 65246.63| ▇|
|count_t0 | 460| 0| 41.57| 61.35| 0| 20.00| 2167| ▇|
|count_t1 | 569| 0| 36.08| 74.30| 0| 13.00| 2585| ▇|
|conditionC | 2| 0| 0.00| 0.50| -0.50| -0.50| 0.50| ▇▇|
|sum_t0C | 9435| 0| 0.00| 1730.45| -1163.38| -588.38| 56836.62| ▇|
|count_t0L | 460| 0| 2.98| 1.35| 0.00| 3.04| 7.68| ▂▂▅▇▇▅▃▁|
|count_t0LC | 460| 0| 0.00| 1.35| -2.98| 0.06| 4.70| ▂▂▅▇▇▅▃▁|
|sum_t0L | 8707| 0| 5.91| 2.18| 0.00| 6.36| 10.97| ▂ ▁▃▇▇▃|
|sum_t0LC | 8707| 0| 0.00| 2.18| -5.91| 0.44| 5.06| ▂ ▁▃▇▇▃|
|count_t1L | 569| 0| 2.54| 1.53| 0.00| 2.64| 7.86| ▆▃▆▇▆▄▂▁|
|count_t1LC | 569| 0| 0.00| 1.53| -2.54| 0.10| 5.31| ▆▃▆▇▆▄▂▁|
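
The suffixes in the table appear to follow a pattern: `L` is a `log(x + 1)` transform (the maxima match, e.g. `log1p(58000)` rounds to 10.97 and `log1p(2167)` to 7.68) and `C` is mean-centering. A minimal sketch on toy data, assuming the control arm is labeled `"c"` (the label `"t"` and its +0.5 coding are consistent with the model output further down):

```r
# Toy data standing in for dt1; values are made up except the two maxima.
d <- data.frame(
  condition = c("c", "t", "c", "t"),
  sum_t0    = c(0, 575, 1200, 58000),
  count_t0  = c(0, 20, 45, 2167)
)

# Effect-code the condition as -0.5 / +0.5 ("c" as the control label is assumed).
d$conditionC <- ifelse(d$condition == "t", 0.5, -0.5)

# log(x + 1) keeps zeros defined: log1p(0) is 0, matching the Min of 0.00.
d$sum_t0L   <- log1p(d$sum_t0)
d$count_t0L <- log1p(d$count_t0)

# Mean-center so every *C column averages to zero.
d$sum_t0C    <- d$sum_t0    - mean(d$sum_t0)
d$sum_t0LC   <- d$sum_t0L   - mean(d$sum_t0L)
d$count_t0LC <- d$count_t0L - mean(d$count_t0L)

round(max(d$sum_t0L), 2)   # compare with the Max of sum_t0L in the table
```

Centering the predictors means the `conditionC` coefficient in the models is evaluated at the average (log) baseline rather than at a baseline of zero.
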
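winsorize (99th percentile); note that `sum_t0C` is identical to the unwinsorized table (SD 1730.45, max 56836.62), so it was apparently not recomputed from the winsorized `sum_t0`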
| | NUnique| PercentMissing| Mean| SD| Min| Median| Max| Histogram|
|:----------|-------:|--------------:|-------:|-------:|--------:|-------:|--------:|----------:|
|block | 5424| 0| 2897.31| 1577.02| 0| 2973.00| 5423| ▅▅▅▅▆▆▆▆▇▇|
|weight | 21| 0| 2.00| 0.22| 1.67| 2.00| 2.50| ▂▁ ▇ ▁▁▁|
|sum_t0 | 8380| 0| 1130.58| 1490.63| 0.00| 575.00| 8150.44| ▇▂▁▁|
|sum_t1 | 6729| 0| 918.93| 1565.21| 0.00| 333.34| 9420.18| ▇▁▁|
|count_t0 | 287| 0| 40.52| 53.56| 0.00| 20.00| 286.00| ▇▂▁▁|
|count_t1 | 343| 0| 34.11| 57.58| 0.00| 13.00| 345.26| ▇▁▁|
|conditionC | 2| 0| 0.00| 0.50| -0.50| -0.50| 0.50| ▇▇|
|sum_t0C | 9435| 0| 0.00| 1730.45| -1163.38| -588.38| 56836.62| ▇|
|count_t0L | 287| 0| 2.98| 1.34| 0.00| 3.04| 5.66| ▃▁▂▄▆▇▆▅▃▂|
|count_t0LC | 287| 0| 0.00| 1.34| -2.98| 0.06| 2.68| ▃▁▂▄▆▇▆▅▃▂|
|sum_t0L | 8380| 0| 5.91| 2.18| 0.00| 6.36| 9.01| ▃ ▁▃▇▇▅▂|
|sum_t0LC | 8380| 0| 0.00| 2.18| -5.91| 0.45| 3.10| ▃ ▁▃▇▇▅▂|
|count_t1L | 343| 0| 2.54| 1.52| 0.00| 2.64| 5.85| ▆▅▄▇▇▇▆▄▂▂|
|count_t1LC | 343| 0| 0.00| 1.52| -2.54| 0.10| 3.31| ▆▅▄▇▇▇▆▄▂▂|
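
A minimal sketch of winsorizing at the 99th percentile, assuming only the upper tail is capped (the medians and the minimum of 0 are unchanged between the two tables, which is consistent with that but does not prove it). `winsorize_upper` is a hypothetical helper and the data are simulated.

```r
# Cap values above the p-th quantile at that quantile (upper tail only; assumed).
winsorize_upper <- function(x, p = 0.99) {
  cap <- quantile(x, probs = p, names = FALSE)
  pmin(x, cap)
}

set.seed(1)
x  <- c(rexp(999, rate = 1 / 1000), 58000)   # simulated skewed data plus one outlier
xw <- winsorize_upper(x)

c(raw_max = max(x), winsorized_max = max(xw))  # the outlier is pulled down to the cap
xwL <- log1p(xw)                               # log columns are then rebuilt from xw
```

Because `quantile()` interpolates between order statistics by default, the cap need not be an observed value, which would explain a non-integer maximum such as 345.26 for `count_t1`.
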
# models
```r
> m <- feglm(sum_t1 ~ conditionC * sum_t0LC | block, dt1[domain_type == "overall"], family = "quasipoisson", vcov = "HC1")
NOTE: 22 fixed-effects (109 observations) removed because of only 0 outcomes.
> m
GLM estimation, family = quasipoisson, Dep. Var.: sum_t1
Observations: 32,779
Fixed-effects: block: 5,402
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error   t value  Pr(>|t|)
conditionC          -0.003249   0.021044 -0.154409 0.8772885
sum_t0LC             0.018340   0.006302  2.910115 0.0036159 **
conditionC:sum_t0LC  0.006472   0.011458  0.564867 0.5721685
# winsorize
> m <- feglm(sum_t1 ~ conditionC * sum_t0LC | block, dt1[domain_type == "overall"], family = "quasipoisson", vcov = "HC1")
NOTE: 22 fixed-effects (109 observations) removed because of only 0 outcomes.
> m
GLM estimation, family = quasipoisson, Dep. Var.: sum_t1
Observations: 32,779
Fixed-effects: block: 5,402
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error  t value   Pr(>|t|)
conditionC          -0.026127   0.016582 -1.57564 1.1512e-01     #
sum_t0LC             0.041948   0.005155  8.13697 4.2234e-16 ***
conditionC:sum_t0LC  0.015824   0.008680  1.82308 6.8301e-02 .   #
```
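
Quasi-Poisson coefficients are on the log scale, so exponentiating them gives rate ratios. The sketch below does this for the winsorized model, with estimates and standard errors copied from the output above. The 95% intervals use a normal approximation, which is an assumption made here and not something the original output reports.

```r
# Estimates and robust SEs copied from the winsorized feglm output.
est <- c(conditionC = -0.026127, sum_t0LC = 0.041948,
         `conditionC:sum_t0LC` = 0.015824)
se  <- c(0.016582, 0.005155, 0.008680)

z <- qnorm(0.975)   # normal-approximation critical value (assumed)
rr <- data.frame(
  rate_ratio = exp(est),
  lower      = exp(est - z * se),
  upper      = exp(est + z * se)
)
round(rr, 3)
```

Since `conditionC` is coded -0.5 / +0.5, its rate ratio (roughly 0.97) compares the two arms at the mean log baseline (`sum_t0LC` = 0); the interval includes 1, in line with p = 0.115.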
# user CDFs
Only time-1 summed badness (`sum_t1`) values up to 10000 are shown (max is 65246.63).
![[dv_mbfc_cdf_summedbadness80.png|800]]
winsorize
- [[220311_105048 explore mbfc CDF results with regressions at each threshold]]
![[dv_mbfc_cdf_summedbadness80_winsorize.png|800]]
# 3 bins
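
The near-equal group sizes (10986, 10942, 10960) suggest that `sum_t0_bin` splits `sum_t0` into terciles. A minimal sketch of such a split on simulated data, assuming quantile-based cut points; the note does not show how the bins were built.

```r
set.seed(1)
sum_t0 <- rexp(3000, rate = 1 / 1000)   # made-up baseline values

# Cut points at the 1/3 and 2/3 quantiles (assumed binning rule).
breaks <- quantile(sum_t0, probs = c(0, 1/3, 2/3, 1), names = FALSE)
sum_t0_bin <- cut(sum_t0, breaks = breaks,
                  labels = c("_1", "_2", "_3"), include.lowest = TRUE)

table(sum_t0_bin)                   # bin sizes
tapply(sum_t0, sum_t0_bin, mean)    # mean baseline per bin
```

With heavily tied data (for example many zeros), quantile cut points can coincide or split unevenly, which may be why the real bin sizes differ slightly.
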
```r
# winsorize
# bin sum and size
> dt1[, .(sum_t0 = mean(sum_t0), n = .N), keyby = .(bin = sum_t0_bin)]
   bin    sum_t0     n
1:  _1  128.3601 10986
2:  _2  609.2274 10942
3:  _3 2655.6818 10960
> m <- feols(sum_t1 ~ conditionC * sum_t0_bin | block, dt1[domain_type == "overall"], vcov = "HC1")
> summary(m)
OLS estimation, Dep. Var.: sum_t1
Observations: 32,888
Fixed-effects: block: 5,424
Standard-errors: Heteroskedasticity-robust
                        Estimate Std. Error  t value  Pr(>|t|)
conditionC              -35.7561    20.2638 -1.76453  0.077654 .
sum_t0_bin_2             16.8885    15.6227  1.08102  0.279698
sum_t0_bin_3            749.3882    27.6176 27.13449 < 2.2e-16 ***
conditionC:sum_t0_bin_2  27.8610    24.8103  1.12296  0.261464
conditionC:sum_t0_bin_3  56.1588    37.0300  1.51657  0.129386
# model comparisons
> m101 <- feols(sum_t1 ~ conditionC * sum_t0_bin | block, dt1[domain_type == "overall"])
> m102 <- feols(sum_t1 ~ sum_t0_bin | block, dt1[domain_type == "overall"])
> test_wald(m102, m101)
Name | Model | df | df_diff | F | p
----------------------------------------------
m102 | fixest | 32886 | | |
m101 | fixest | 32883 | 3.00 | 1.31 | 0.268
# treatment effect for each bin
> m201 <- feols(sum_t1 ~ condition * sum_t0_bin | block, dt1[domain_type == "overall"], vcov = "HC1")
> m201
OLS estimation, Dep. Var.: sum_t1
Observations: 32,888
Fixed-effects: block: 5,424
Standard-errors: Heteroskedasticity-robust
            Estimate Std. Error   t value Pr(>|t|)
conditiont -35.75610    20.2638 -1.764531 0.077654 .  # bin 1
conditiont  -7.89507    13.0714 -0.603994 0.54585     # bin 2
conditiont  20.4027     31.1134  0.655754 0.51199     # bin 3
```
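
The per-bin treatment effects can be cross-checked against the interaction model: the effect in bin 1 is the `conditionC` coefficient, and the effects in bins 2 and 3 add the corresponding interaction term. A short arithmetic check using the estimates reported above (only the point estimates; the standard errors need the coefficient covariance, which the note does not show):

```r
# Coefficients copied from the conditionC * sum_t0_bin model.
b_condition <- -35.7561
b_int <- c(bin2 = 27.8610, bin3 = 56.1588)

# Simple effect of condition within each bin.
simple <- c(bin1 = b_condition, b_condition + b_int)
round(simple, 4)

# Per-bin estimates reported in the treatment-effect output.
reported <- c(bin1 = -35.75610, bin2 = -7.89507, bin3 = 20.4027)
max(abs(simple - reported))   # differences are rounding only
```

The sign flips from negative in bin 1 to positive in bin 3, but none of the three per-bin effects is significant at the 0.05 level, and the Wald test (p = 0.268) does not favor the model with condition terms.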