220303_135802 fact-checker analyses

- 60 fact-checker domains
- analyses below are for **retweets**, Oct 17 to Oct 24
- quality (range: 0 to 100, bad to good)
- time0: pre-campaign
- time1: Oct 17 to Oct 24
- see [[220301_145044 user ECDFs#fact-checker CDFs 60 domains|CDFs]]
- [[220305_133219 fact-checker analyses control for count]]

Distribution of fact-checker ratings
![[s20220310_174455.png]]

# Results

model: `time1_meanquality ~ condition[-0.5/0.5] * time0_meanquality`

- significant interaction effect

```r
# OLS (no blocking)
m1 <- feols(mean_t1 ~ conditionC * mean_t0C, dt1[domain_type == "overall"])
> m1
OLS estimation, Dep. Var.: mean_t1
Observations: 32,888
Standard-errors: IID
                     Estimate Std. Error    t value  Pr(>|t|)
(Intercept)         59.485700   0.091029 653.483391 < 2.2e-16 ***
conditionC           0.056347   0.182057   0.309503 0.7569407
mean_t0C             0.433655   0.004575  94.796289 < 2.2e-16 ***
conditionC:mean_t0C -0.027691   0.009149  -3.026641 0.0024748 **

# OLS (no blocking) with robust SEs (HC1)
t test of coefficients:
                      Estimate Std. Error  t value  Pr(>|t|)
(Intercept)         59.4856997  0.0910304 653.4705 < 2.2e-16 ***
conditionC           0.0563473  0.1820609   0.3095  0.756945
mean_t0C             0.4336551  0.0053724  80.7186 < 2.2e-16 ***
conditionC:mean_t0C -0.0276913  0.0107449  -2.5772  0.009966 **

# account for blocking and robust SE
> m1 <- feols(mean_t1 ~ conditionC * mean_t0C | block, dt1[domain_type == "overall"])
> summary(m1, vcov = "HC1")
OLS estimation, Dep. Var.: mean_t1
Observations: 32,888
Fixed-effects: block: 5,424
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error  t value  Pr(>|t|)
conditionC           0.035548   0.175312  0.20277 0.8393166
mean_t0C             0.328771   0.006132 53.61338 < 2.2e-16 ***
conditionC:mean_t0C -0.032726   0.010347 -3.16279 0.0015644 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 14.5     Adj. R2: 0.279248
             Within R2: 0.12341
```

# Divide users into 3 equally-sized bins based on pre-campaign quality

- about 11k users in each bin
- model: `time1_meanquality ~ condition[0/1] * time0_meanquality(3bins)`

```r
# bin size/quality mean
> dt1[, .(mean_t0 = mean(mean_t0), n = .N), keyby = .(bin = mean_t0_bin)]
   bin  mean_t0     n
1:  _1 36.60039 10964 # bin mean/size
2:  _2 57.06844 10974
3:  _3 79.26175 10950

# OLS (no blocking)
m3 <- feols(mean_t1 ~ conditionC * mean_t0_bin, dt1[domain_type == "overall"])
                          Estimate Std. Error   t value  Pr(>|t|)
(Intercept)              48.588858   0.157934 307.65331 < 2.2e-16 ***
conditionC                0.773480   0.315868   2.44875  0.014341 *
mean_t0_bin_2            11.739379   0.223302  52.57186 < 2.2e-16 ***
mean_t0_bin_3            20.962081   0.223428  93.82025 < 2.2e-16 ***
conditionC:mean_t0_bin_2 -0.788764   0.446603  -1.76614  0.077382 .
conditionC:mean_t0_bin_3 -1.262577   0.446856  -2.82547  0.004724 **

# OLS (no blocking) with robust SEs
                         Estimate Std. Error  z value  Pr(>|z|)
(Intercept)              48.58886    0.16252 298.9781 < 2.2e-16 ***
conditionC                0.77348    0.32503   2.3797  0.017327 *
mean_t0_bin_2            11.73938    0.21323  55.0562 < 2.2e-16 ***
mean_t0_bin_3            20.96208    0.23625  88.7272 < 2.2e-16 ***
conditionC:mean_t0_bin_2 -0.78876    0.42645  -1.8496  0.064371 .
conditionC:mean_t0_bin_3 -1.26258    0.47251  -2.6721  0.007538 **

# account for blocking with robust SEs
> m3.1 <- feols(mean_t1 ~ conditionC * mean_t0_bin | block, dt1[domain_type == "overall"])
> summary(m3.1, vcov = "HC1")
OLS estimation, Dep. Var.: mean_t1
Observations: 32,888
Fixed-effects: block: 5,424
Standard-errors: Heteroskedasticity-robust
                          Estimate Std. Error  t value   Pr(>|t|)
conditionC                0.845290   0.332034  2.54579 0.01090840 *
mean_t0_bin_2             9.155889   0.268567 34.09166  < 2.2e-16 ***
mean_t0_bin_3            15.549122   0.290281 53.56578  < 2.2e-16 ***
conditionC:mean_t0_bin_2 -0.796721   0.448877 -1.77492 0.07592217 .
conditionC:mean_t0_bin_3 -1.589052   0.478399 -3.32160 0.00089618 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 14.5     Adj. R2: 0.270377
             Within R2: 0.112685
```

![[dv_fc_retweets_oct17-oct24_model_interact_3bins 1.png|600]]

## Coefficient/effect of treatment for each bin

- each row is the coefficient for condition/treatment for one bin (rows/bins 1 to 3)
- account for blocking and robust SEs (HC1)

```r
# reparameterize to get condition/treatment effect for each bin
> m201 <- feols(mean_t1 ~ condition * mean_t0_bin | block, dt1[domain_type == "overall"])
> summary(m201, vcov = "HC1")
OLS estimation, Dep. Var.: mean_t1
Observations: 32,888
Fixed-effects: block: 5,424
Standard-errors: Heteroskedasticity-robust
            Estimate Std. Error  t value   Pr(>|t|)
conditiont  0.845290   0.332034  2.54579 0.01090840 *  # bin 1 (control vs treatment; treatment minus control)
conditiont  0.048569   0.287573 0.168893   0.865882    # bin 2
conditiont -0.743761   0.333629 -2.22931 0.02580158 *  # bin 3
```

## Model comparison

```r
# fit full and reduced (no condition) models (each model accounts for blocking)
> m101 <- feols(mean_t1 ~ conditionC * mean_t0_bin | block, dt1[domain_type == "overall"])
> m102 <- feols(mean_t1 ~ mean_t0_bin | block, dt1[domain_type == "overall"])

# compare the two models
> test_wald(m102, m101)
Name | Model  |    df | df_diff |    F |     p
----------------------------------------------
m102 | fixest | 32886 |         |      |
m101 | fixest | 32883 |    3.00 | 4.75 | 0.003  # model with condition is better
```

# Predict whether user quality is below 50

- if quality during campaign is < 50, assign 1, else 0

```r
dt1[, user_low_quality := ifelse(mean_t1 < 50, 1, 0)]

# logistic regression
Coefficients:
                      Estimate Std. Error z value Pr(>|z|)
(Intercept)         -1.1517369  0.0142815 -80.645   <2e-16 ***
conditionC           0.0107100  0.0285630   0.375   0.7077
mean_t0C            -0.0481450  0.0007932 -60.699   <2e-16 ***
conditionC:mean_t0C  0.0030506  0.0015864   1.923   0.0545 .  # interaction? see figure below

# robust SE
z test of coefficients:
                       Estimate  Std. Error  z value Pr(>|z|)
(Intercept)         -1.15173688  0.01413998 -81.4525  < 2e-16 ***
conditionC           0.01070998  0.02827996   0.3787  0.70490
mean_t0C            -0.04814497  0.00092372 -52.1209  < 2e-16 ***
conditionC:mean_t0C  0.00305064  0.00184743   1.6513  0.09868 .

# account for blocking and robust SEs
> m301 <- feglm(user_low_quality ~ conditionC * mean_t0C | block, dt1, family = 'binomial')
NOTE: 1,586 fixed-effects (8,677 observations) removed because of only 0 (or only 1) outcomes.
> summary(m301, vcov = "HC1")
GLM estimation, family = binomial, Dep. Var.: user_low_quality
Observations: 24,211
Fixed-effects: block: 3,838
Standard-errors: Heteroskedasticity-robust
                     Estimate Std. Error    t value  Pr(>|t|)
conditionC           0.018978   0.036431   0.520933  0.602413
mean_t0C            -0.042918   0.001240 -34.606586 < 2.2e-16 ***
conditionC:mean_t0C  0.003856   0.002221   1.735749  0.082608 .  # interaction?
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Log-Likelihood: -12,652.3   Adj. Pseudo R2: -0.043799
           BIC:  64,077.8     Squared Cor.:  0.247096
```

![[Pasted image 20220304092352.png]]
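
Note on the per-bin treatment effects in the 3-bin section: the note does not show which reparameterization produced the three `conditiont` rows, so the following is only a sketch of one common way to get them. It uses simulated stand-in data and base R `lm()` rather than `feols()`. The names `sim`, `cond`, `bin`, and `y`, and every simulated number, are assumptions for illustration and not part of the original analysis. The idea is that the treatment effect in bin k equals the `condition` coefficient plus the `condition:bin_k` interaction, and that a nested formula (`bin + bin:cond`) returns those same per-bin effects directly as coefficients.

```r
# Simulated stand-in data (hypothetical; not the study data).
set.seed(1)
n <- 3000
sim <- data.frame(
  bin  = factor(sample(c("_1", "_2", "_3"), n, replace = TRUE)),
  cond = sample(c(-0.5, 0.5), n, replace = TRUE)  # centered condition code
)

# Arbitrary illustrative bin means and per-bin treatment effects.
bin_mean   <- c(`_1` = 48, `_2` = 60, `_3` = 69)
bin_effect <- c(`_1` = 0.8, `_2` = 0.0, `_3` = -0.7)
key <- as.character(sim$bin)
sim$y <- unname(bin_mean[key] + bin_effect[key] * sim$cond) + rnorm(n, sd = 14)

# Parameterization 1: main effect plus interactions (bin _1 is the reference).
fit_int <- lm(y ~ cond * bin, data = sim)
b <- coef(fit_int)
effects_summed <- unname(c(
  b["cond"],                    # bin 1
  b["cond"] + b["cond:bin_2"],  # bin 2
  b["cond"] + b["cond:bin_3"]   # bin 3
))

# Parameterization 2: nested slopes, one treatment coefficient per bin.
fit_nested <- lm(y ~ bin + bin:cond, data = sim)
cf <- coef(fit_nested)
effects_nested <- unname(cf[grepl(":cond", names(cf), fixed = TRUE)])

# Both parameterizations describe the same fitted model,
# so the per-bin effects agree up to numerical tolerance.
print(rbind(effects_summed, effects_nested))
```

Applied to the blocked estimates reported above, the same sum gives 0.845290 - 0.796721 = 0.048569 for bin 2 and 0.845290 - 1.589052 = -0.743762 for bin 3, which matches the reported rows up to rounding. The nested form is convenient because each per-bin effect comes with its own standard error, whereas summing coefficients by hand also requires their covariance.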