250723_155756 fdr corrections

R1 comment/suggestion

> A. Multiplicity control – Report FDR-adjusted q-values for the entire family of confirmatory tests, or adopt a hierarchical modelling strategy.

We have 3 outcomes: candidate preference, vote likelihood, vote choice. Each one is treated as one family, so we perform 3 FDR adjustments per experiment?

Different FDR approaches

- standard FDR (i.e., "BH", the Benjamini–Hochberg correction)
- Storey's q-value to control FDR (Storey-Tibshirani procedure) ([wiki](https://en.wikipedia.org/wiki/Q-value_(statistics)))

## candidate preference

### standard FDR algorithm ("BH" procedure)

```
set threshold (q) to 0.05 (target false discovery rate)
rank the n p-values from smallest to largest (pval column)
compute a p-value threshold, pt_i, for each rank i, where pt_i = i/n * q (fdr_threshold column)
find the largest rank k with pval_k <= pt_k
reject the null hypothesis for all tests ranked 1..k (fdr_sig column)
```

- `estimate`, `ci_low`, `ci_high`, `pval`: original model output
- `fdr_threshold`: `i/n * q`, where `i` is the rank of the p-value, `n` is the number of tests, and `q` is the target false discovery rate
- `fdr_sig`: whether the p-value is significant after FDR correction
- `fdr_ci_lb`: lower bound of the confidence interval after FDR correction
- `fdr_ci_ub`: upper bound of the confidence interval after FDR correction

All the `(Intercept)` terms are from the pre-post change models. For the main model, `lean1 ~ condition*topic*lean`, I kept only the terms that involve `condition`, because the other terms are controls/covariates.

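The BH steps above can be sketched in base R as follows. This is a minimal sketch, not the original analysis code: the p-values are the rounded values from the candidate-preference table in this section, and `bh_table` and `p_adj` are illustrative names. `p.adjust(method = "BH")` is included only as a cross-check on the step-up rule.

```r
# Rounded p-values copied from the candidate-preference table (14 tests).
pval <- c(0.0000, 0.0000, 0.0000, 0.0002, 0.0044, 0.0195, 0.0200,
          0.0345, 0.0381, 0.0713, 0.3135, 0.6249, 0.7681, 0.8217)

# Hypothetical helper: builds the fdr_threshold and fdr_sig columns.
bh_table <- function(pval, q = 0.05) {
  n <- length(pval)
  ord <- order(pval)                     # rank from smallest to largest
  p_sorted <- pval[ord]
  fdr_threshold <- seq_len(n) / n * q    # pt_i = i/n * q
  passes <- p_sorted <= fdr_threshold
  # Step-up rule: reject every test up to the largest rank that passes.
  k <- if (any(passes)) max(which(passes)) else 0
  data.frame(
    pval          = p_sorted,
    fdr_threshold = round(fdr_threshold, 4),
    fdr_sig       = ifelse(seq_len(n) <= k, "*", ""),
    p_adj         = p.adjust(pval, method = "BH")[ord]  # BH-adjusted p-values
  )
}

res <- bh_table(pval)
print(res)
```

With these rounded inputs, the `fdr_threshold` and `fdr_sig` columns should match the table in this section, and `p_adj <= 0.05` should flag the same tests as `fdr_sig`.
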
```r
    model_spec                                             term                                        estimate  ci_low ci_high   pval fdr_threshold fdr_sig fdr_ci_lb fdr_ci_ub
    <char>                                                 <char>                                         <num>   <num>   <num>  <num>         <num> <char>      <num>     <num>
 1: us_lean1~condition*topic*lean0                         conditionproTrump                             2.8795  2.0190  3.7400 0.0000        0.0036 *          1.8580    3.9009
 2: us_diff_lean_bidentrump~1_proHarris_policy             (Intercept)                                   2.3390  1.4404  3.2375 0.0000        0.0071 *          1.2724    3.4055
 3: us_diff_lean_bidentrump~1_proHarris_trump_policy       (Intercept)                                   3.9033  2.2392  5.5675 0.0000        0.0107 *          1.9280    5.8786
 4: us_diff_lean_bidentrump~1_proTrump_policy              (Intercept)                                   1.5113  0.7115  2.3112 0.0002        0.0143 *          0.5619    2.4608
 5: us_diff_lean_bidentrump~1_proTrump_personality         (Intercept)                                   1.0572  0.3321  1.7824 0.0044        0.0179 *          0.1965    1.9180
 6: us_diff_lean_bidentrump~1_proHarris_trump_personality  (Intercept)                                   1.8425  0.3050  3.3800 0.0195        0.0214 *          0.0175    3.6675
 7: us_diff_lean_bidentrump~1_proHarris_harris_policy      (Intercept)                                   1.0280  0.1663  1.8898 0.0200        0.0250 *          0.0051    2.0509
 8: us_lean1~condition*topic*lean0                         conditionproTrump:topicZ                      0.9292  0.0681  1.7902 0.0345        0.0286           -0.0929    1.9512  # n.s. from here onward
 9: us_lean1~condition*topic*lean0                         conditionproTrump:lean_bidentrump_1Z          0.7739  0.0430  1.5048 0.0381        0.0321           -0.0937    1.6414
10: us_diff_lean_bidentrump~1_proHarris_personality        (Intercept)                                   0.9452 -0.0801  1.9705 0.0713        0.0357           -0.2718    2.1622
11: us_diff_lean_bidentrump~1_proTrump_harris_policy       (Intercept)                                   0.4960 -0.4665  1.4585 0.3135        0.0393           -0.6465    1.6385
12: us_diff_lean_bidentrump~1_proTrump_harris_personality  (Intercept)                                   0.2784 -0.8365  1.3934 0.6249        0.0429           -1.0450    1.6019
13: us_lean1~condition*topic*lean0                         conditionproTrump:lean_bidentrump_1Z:topicZ  -0.1100 -0.8408  0.6209 0.7681        0.0464           -0.9775    0.7575
14: us_diff_lean_bidentrump~1_proHarris_harris_personality (Intercept)                                   0.1576 -1.2118  1.5269 0.8217        0.0500           -1.4678    1.7829
```

### Storey q-value FDR procedure (provides more statistical power)

- estimates $\pi_0$ (a tuning parameter or scaling factor, see equation below): the proportion of tests that are true nulls
  - not estimated in the standard FDR algorithm, which implicitly assumes $\pi_0 = 1$
  - estimated empirically from the p-value distribution
    - if there are true effects, the distribution should be right-skewed (p-values pile up near 0)
    - if effects are mostly nulls, the distribution should be closer to uniform
  - if $\pi_0 = 1$: all tests are true nulls, and the procedure reduces to the standard BH FDR adjustment
  - if $\pi_0 \to 0$: fewer true nulls (i.e., more real effects)
- $\pi_0$ is then used to calculate the q-value ("adjusted p-value") for each test
  - **qval can be smaller than the (original) pvalue!**, especially when $\pi_0$ is small
  - the smaller $\pi_0$ is (fewer true nulls), the smaller the q-values will be for the same set of p-values (i.e., more expected discoveries)
- see [[Storey-Tibshirani FDR procedure]]; [q-value (statistics) - Wikipedia](https://en.wikipedia.org/wiki/Q-value_(statistics))

With $p_{(j)}$ the $j$-th smallest of the $n$ p-values, the q-value of the test ranked $i$ is

$$q_{(i)} = \min_{j \ge i} \frac{\hat{\pi}_0 \, n \, p_{(j)}}{j}$$

Columns

- `pval`: original p-value (sorted from smallest to largest)
- **`fdr_qval`: q-value for each test (i.e., "adjusted p-value")**
  - approximate minimum FDR incurred if we call this test and all tests with smaller q-values significant (i.e., if < .05, it's considered "significant")
- `lfdr`: local false discovery rate, i.e., the posterior probability p(null is true | observed test result)
  - it has a natural Bayesian interpretation!
- `bf10`: Bayes factor derived from $\pi_0$ (which is 0.416 here) and `lfdr`, as posterior odds over prior odds: $\mathrm{BF}_{10} = \frac{1 - \mathrm{lfdr}}{\mathrm{lfdr}} \cdot \frac{\pi_0}{1 - \pi_0}$
  - how much more likely the observed data/result is under the alternative hypothesis relative to the null

Unlike the BH table above, this table has 17 rows: it also includes the covariate-only terms `lean_bidentrump_1Z`, `topicZ`, and `lean_bidentrump_1Z:topicZ`, so $\pi_0$ and the q-values are computed over all 17 p-values.

```r
    model_name                                             term                                        estimate     se  ci_low ci_high   pval fdr_qval   lfdr         bf10
    <char>                                                 <char>                                         <num>  <num>   <num>   <num>  <num>    <num>  <num>        <num>
# low pval and low qval, low lfdr (probability that the null is true is almost zero), high BF10
 1: us_lean1~condition*topic*lean0                         lean_bidentrump_1Z                           40.9424 0.3003 40.3538 41.5309 0.0000   0.0000 0.0000 2003345.8449
 2: us_lean1~condition*topic*lean0                         conditionproTrump                             2.8795 0.4390  2.0190  3.7400 0.0000   0.0000 0.0000 2003345.8449
 3: us_diff_lean_bidentrump~1_proHarris_policy             (Intercept)                                   2.3390 0.4584  1.4404  3.2375 0.0000   0.0000 0.0000   62649.0114
 4: us_diff_lean_bidentrump~1_proHarris_trump_policy       (Intercept)                                   3.9033 0.8490  2.2392  5.5675 0.0000   0.0000 0.0001    5547.3033
 5: us_diff_lean_bidentrump~1_proTrump_policy              (Intercept)                                   1.5113 0.4081  0.7115  2.3112 0.0002   0.0003 0.0031     229.6810
 6: us_diff_lean_bidentrump~1_proTrump_personality         (Intercept)                                   1.0572 0.3700  0.3321  1.7824 0.0044   0.0052 0.0396      17.2677
# moderate pval and qval, lfdr suggests a moderate chance that the null is true
 7: us_diff_lean_bidentrump~1_proHarris_trump_personality  (Intercept)                                   1.8425 0.7844  0.3050  3.3800 0.0195   0.0177 0.1378       4.4567
 8: us_diff_lean_bidentrump~1_proHarris_harris_policy      (Intercept)                                   1.0280 0.4397  0.1663  1.8898 0.0200   0.0177 0.1404       4.3619
 9: us_lean1~condition*topic*lean0                         conditionproTrump:topicZ                      0.9292 0.4393  0.0681  1.7902 0.0345   0.0246 0.2197       2.5310
10: us_lean1~condition*topic*lean0                         conditionproTrump:lean_bidentrump_1Z          0.7739 0.3729  0.0430  1.5048 0.0381   0.0246 0.2377       2.2848
11: us_lean1~condition*topic*lean0                         topicZ                                       -0.7130 0.3439 -1.3871 -0.0390 0.0382   0.0246 0.2385       2.2744
12: us_diff_lean_bidentrump~1_proHarris_personality        (Intercept)                                   0.9452 0.5231 -0.0801  1.9705 0.0713   0.0420 0.3916       1.1067
# almost certainly not true discoveries (lfdr = 1)
13: us_diff_lean_bidentrump~1_proTrump_harris_policy       (Intercept)                                   0.4960 0.4911 -0.4665  1.4585 0.3135   0.1705 1.0000       0.0000
14: us_diff_lean_bidentrump~1_proTrump_harris_personality  (Intercept)                                   0.2784 0.5689 -0.8365  1.3934 0.6249   0.3157 1.0000       0.0000
15: us_lean1~condition*topic*lean0                         lean_bidentrump_1Z:topicZ                    -0.1202 0.3004 -0.7090  0.4687 0.6892   0.3250 1.0000       0.0000
16: us_lean1~condition*topic*lean0                         conditionproTrump:lean_bidentrump_1Z:topicZ  -0.1100 0.3729 -0.8408  0.6209 0.7681   0.3395 1.0000       0.0000
17: us_diff_lean_bidentrump~1_proHarris_harris_personality (Intercept)                                   0.1576 0.6986 -1.2118  1.5269 0.8217   0.3419 1.0000       0.0000
```

## vote likelihood

## vote choice