R1 comment/suggestion
> A. Multiplicity control – Report FDR-adjusted q-values for the entire family of confirmatory tests, or adopt a hierarchical modelling strategy.
We have 3 outcomes: candidate preference, vote likelihood, and vote choice. If each outcome is treated as its own family, do we perform 3 FDR adjustments per experiment?
different FDR approaches
- standard FDR (i.e., "BH" Benjamini–Hochberg correction)
- Storey's q-value to control FDR (Storey-Tibshirani procedure) ([wiki](https://en.wikipedia.org/wiki/Q-value_(statistics)))
## candidate preference
### standard FDR algorithm ("BH" procedure)
```
set threshold (q) to 0.05 (target false discovery/error rate)
rank n p-values from smallest to largest (pval column); i is the rank
generate p-value threshold, pt_i, for each test, where pt_i = i/n * q (fdr_threshold column)
find k, the largest rank i for which pval_i <= pt_i
reject the null hypothesis for every test with rank <= k, even if its own pval > pt_i (fdr_sig column)
```
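The steps above can be sketched in base R. This is a minimal illustration, not the code that produced the table: the p-values are the rounded `pval` entries from the table in this section, the variable names are chosen to mirror its columns, and `p.adjust()` from base R's `stats` package is used only as a cross-check.

```r
# Benjamini-Hochberg step-up procedure, written out by hand.
# p-values: rounded values copied from the pval column (assumption: rounding
# does not change which tests pass).
pval <- c(0.0000, 0.0000, 0.0000, 0.0002, 0.0044, 0.0195, 0.0200,
          0.0345, 0.0381, 0.0713, 0.3135, 0.6249, 0.7681, 0.8217)
q <- 0.05                              # target false discovery rate
n <- length(pval)

p_sorted <- sort(pval)                 # rank p-values, smallest first
fdr_threshold <- seq_len(n) / n * q    # pt_i = i/n * q

# Step-up rule: find the largest rank k whose p-value is at or below its
# threshold, then reject every hypothesis with rank <= k.
passing <- which(p_sorted <= fdr_threshold)
k <- if (length(passing) > 0) max(passing) else 0
fdr_sig <- seq_len(n) <= k

# Cross-check: rejecting where the BH-adjusted p-value is <= q is equivalent.
p_bh <- p.adjust(p_sorted, method = "BH")
stopifnot(identical(fdr_sig, p_bh <= q))

data.frame(pval = p_sorted,
           fdr_threshold = round(fdr_threshold, 4),
           fdr_sig = fdr_sig)
```

With these inputs the per-test comparison and the step-up rule agree, but they can differ in general: a p-value that misses its own threshold is still rejected if a larger p-value passes its threshold.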
- `estimate`, `ci_low`, `ci_high`, `pval`: original model output
- `fdr_threshold`: `i/n * q`, where `i` is the rank of the p-value, `n` is the number of tests, and `q` is the target false discovery rate
- `fdr_sig`: whether the test is significant after FDR correction (`*` = significant)
- `fdr_ci_lb`: lower bound of the confidence interval after FDR correction
- `fdr_ci_ub`: upper bound of the confidence interval after FDR correction
All the `(Intercept)` terms are from the pre-post change models. For the main model, `us_lean1 ~ condition*topic*lean0`, I included only the terms involving `condition`, because the other terms are controls/covariates.
```r
model_spec term estimate ci_low ci_high pval fdr_threshold fdr_sig fdr_ci_lb fdr_ci_ub
<char> <char> <num> <num> <num> <num> <num> <char> <num> <num>
1: us_lean1~condition*topic*lean0 conditionproTrump 2.8795 2.0190 3.7400 0.0000 0.0036 * 1.8580 3.9009
2: us_diff_lean_bidentrump~1_proHarris_policy (Intercept) 2.3390 1.4404 3.2375 0.0000 0.0071 * 1.2724 3.4055
3: us_diff_lean_bidentrump~1_proHarris_trump_policy (Intercept) 3.9033 2.2392 5.5675 0.0000 0.0107 * 1.9280 5.8786
4: us_diff_lean_bidentrump~1_proTrump_policy (Intercept) 1.5113 0.7115 2.3112 0.0002 0.0143 * 0.5619 2.4608
5: us_diff_lean_bidentrump~1_proTrump_personality (Intercept) 1.0572 0.3321 1.7824 0.0044 0.0179 * 0.1965 1.9180
6: us_diff_lean_bidentrump~1_proHarris_trump_personality (Intercept) 1.8425 0.3050 3.3800 0.0195 0.0214 * 0.0175 3.6675
7: us_diff_lean_bidentrump~1_proHarris_harris_policy (Intercept) 1.0280 0.1663 1.8898 0.0200 0.0250 * 0.0051 2.0509
8: us_lean1~condition*topic*lean0 conditionproTrump:topicZ 0.9292 0.0681 1.7902 0.0345 0.0286 -0.0929 1.9512 # n.s. from here onward
9: us_lean1~condition*topic*lean0 conditionproTrump:lean_bidentrump_1Z 0.7739 0.0430 1.5048 0.0381 0.0321 -0.0937 1.6414
10: us_diff_lean_bidentrump~1_proHarris_personality (Intercept) 0.9452 -0.0801 1.9705 0.0713 0.0357 -0.2718 2.1622
11: us_diff_lean_bidentrump~1_proTrump_harris_policy (Intercept) 0.4960 -0.4665 1.4585 0.3135 0.0393 -0.6465 1.6385
12: us_diff_lean_bidentrump~1_proTrump_harris_personality (Intercept) 0.2784 -0.8365 1.3934 0.6249 0.0429 -1.0450 1.6019
13: us_lean1~condition*topic*lean0 conditionproTrump:lean_bidentrump_1Z:topicZ -0.1100 -0.8408 0.6209 0.7681 0.0464 -0.9775 0.7575
14: us_diff_lean_bidentrump~1_proHarris_harris_personality (Intercept) 0.1576 -1.2118 1.5269 0.8217 0.0500 -1.4678 1.7829
```
### Storey q-value FDR procedure (provides more statistical power)
- estimates $\pi_0$, the proportion of tests that are true nulls (it acts as a scaling factor on the BH-adjusted p-values)
  - not estimated in the standard FDR algorithm, which implicitly assumes $\pi_0 = 1$
  - estimated empirically from the p-value distribution
    - if there are true effects, the distribution should be right-skewed (p-values pile up near 0)
    - if effects are mostly nulls, the distribution should be closer to uniform
  - if $\pi_0 = 1$: all tests are true nulls, and the procedure reduces to the standard BH FDR adjustment
  - if $\pi_0 \to 0$: fewer true nulls (i.e., more real effects)
- $\pi_0$ is then used to calculate the q-value ("adjusted p-value") for each test
  - **q-value can be smaller than the (original) p-value!**, especially when $\pi_0$ is small
  - the smaller $\pi_0$ is (fewer true nulls), the smaller the q-values will be for the same set of p-values (i.e., more expected discoveries)
- see [[Storey-Tibshirani FDR procedure]]; [q-value (statistics) - Wikipedia](https://en.wikipedia.org/wiki/Q-value_(statistics))
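To make the role of $\pi_0$ concrete: with the p-values sorted so that $p_{(1)} \le \dots \le p_{(n)}$, the q-value is

$$\hat{q}(p_{(i)}) = \min_{j \ge i} \frac{\hat{\pi}_0 \, n \, p_{(j)}}{j}$$

which is the BH-adjusted p-value multiplied by $\hat{\pi}_0$. The sketch below is a minimal base-R illustration under stated assumptions: the p-values are the rounded `pval` entries from the Storey table in this section, $\hat{\pi}_0 = 0.416$ is taken as given from these notes rather than re-estimated, and both helper functions are hypothetical names introduced here. The fixed-$\lambda$ estimator is only one simple way to estimate $\pi_0$ and is not expected to reproduce 0.416.

```r
# p-values (rounded) copied from the pval column of the Storey table.
pval <- c(0.0000, 0.0000, 0.0000, 0.0000, 0.0002, 0.0044, 0.0195, 0.0200,
          0.0345, 0.0381, 0.0382, 0.0713, 0.3135, 0.6249, 0.6892, 0.7681,
          0.8217)

# Hypothetical helper: fixed-lambda estimate of pi0. p-values above lambda
# are assumed to come mostly from true nulls, which are uniform on [0, 1].
# lambda = 0.5 is an arbitrary illustrative choice.
pi0_fixed_lambda <- function(p, lambda = 0.5) {
  min(1, mean(p > lambda) / (1 - lambda))
}

# Hypothetical helper: q-values for a given pi0. Assumes p is sorted ascending.
qvalue_given_pi0 <- function(p, pi0) {
  n <- length(p)
  raw <- pi0 * n * p / seq_len(n)
  rev(cummin(rev(raw)))   # running minimum from the largest p-value downward
}

pi0_reported <- 0.416     # value reported in these notes, taken as given
qval <- qvalue_given_pi0(pval, pi0_reported)
round(qval, 4)

# With pi0 = 1 the same formula gives the BH-adjusted p-values.
stopifnot(isTRUE(all.equal(qvalue_given_pi0(pval, 1),
                           p.adjust(pval, method = "BH"))))
```

Because the inputs are rounded, the output should agree with the `fdr_qval` column only to about three decimals.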
Columns
- `pval`: original p-value (sorted from smallest to largest)
- **`fdr_qval`: q-value for each test (i.e., "adjusted p-value")**
  - approximate minimum FDR incurred if we call this test and all tests with smaller q-values significant (i.e., if < .05, it's considered "significant")
- `lfdr`: local false discovery rate, i.e., the posterior probability p(null is true | observed test result)
  - it has a natural Bayesian interpretation!
- `bf10`: Bayes factor that can be derived from $\pi_0$ (which is 0.416 here) and `lfdr`
  - how much more likely the observed data/result is under the alternative hypothesis relative to the null
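One way to see how `bf10` follows from `lfdr` and $\pi_0$: the posterior odds of the alternative are $(1 - \text{lfdr})/\text{lfdr}$, the prior odds are $(1 - \pi_0)/\pi_0$, and the Bayes factor is their ratio. The sketch below assumes that this is the conversion behind the `bf10` column (it matches the rounded table values closely, but the notes do not state the formula); `bf10_from_lfdr` is a hypothetical helper name.

```r
# Hypothetical helper: convert a local FDR into a Bayes factor for H1 over H0.
bf10_from_lfdr <- function(lfdr, pi0) {
  posterior_odds_h1 <- (1 - lfdr) / lfdr   # p(H1 | data) / p(H0 | data)
  prior_odds_h1 <- (1 - pi0) / pi0         # p(H1) / p(H0)
  posterior_odds_h1 / prior_odds_h1
}

pi0 <- 0.416                                # value reported in these notes
# Rounded lfdr values copied from rows 6, 7, 12, and 13 of the table.
lfdr <- c(0.0396, 0.1378, 0.3916, 1.0000)
bf10 <- bf10_from_lfdr(lfdr, pi0)
round(bf10, 2)
```

The results land close to the table's `bf10` entries for those rows (17.2677, 4.4567, 1.1067, 0.0000); the small gaps come from using rounded `lfdr` and $\pi_0$ values. An `lfdr` of exactly 1 gives a Bayes factor of 0, and an `lfdr` near 0 gives a very large one.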
```r
model_name term estimate se ci_low ci_high pval fdr_qval lfdr bf10
<char> <char> <num> <num> <num> <num> <num> <num> <num> <num>
# low pval and low qval, low lfdr (probability that the null is true is almost zero), high BF10
1: us_lean1~condition*topic*lean0 lean_bidentrump_1Z 40.9424 0.3003 40.3538 41.5309 0.0000 0.0000 0.0000 2003345.8449
2: us_lean1~condition*topic*lean0 conditionproTrump 2.8795 0.4390 2.0190 3.7400 0.0000 0.0000 0.0000 2003345.8449
3: us_diff_lean_bidentrump~1_proHarris_policy (Intercept) 2.3390 0.4584 1.4404 3.2375 0.0000 0.0000 0.0000 62649.0114
4: us_diff_lean_bidentrump~1_proHarris_trump_policy (Intercept) 3.9033 0.8490 2.2392 5.5675 0.0000 0.0000 0.0001 5547.3033
5: us_diff_lean_bidentrump~1_proTrump_policy (Intercept) 1.5113 0.4081 0.7115 2.3112 0.0002 0.0003 0.0031 229.6810
6: us_diff_lean_bidentrump~1_proTrump_personality (Intercept) 1.0572 0.3700 0.3321 1.7824 0.0044 0.0052 0.0396 17.2677
# moderate pval and qval, lfdr suggests a moderate chance that the null is true
7: us_diff_lean_bidentrump~1_proHarris_trump_personality (Intercept) 1.8425 0.7844 0.3050 3.3800 0.0195 0.0177 0.1378 4.4567
8: us_diff_lean_bidentrump~1_proHarris_harris_policy (Intercept) 1.0280 0.4397 0.1663 1.8898 0.0200 0.0177 0.1404 4.3619
9: us_lean1~condition*topic*lean0 conditionproTrump:topicZ 0.9292 0.4393 0.0681 1.7902 0.0345 0.0246 0.2197 2.5310
10: us_lean1~condition*topic*lean0 conditionproTrump:lean_bidentrump_1Z 0.7739 0.3729 0.0430 1.5048 0.0381 0.0246 0.2377 2.2848
11: us_lean1~condition*topic*lean0 topicZ -0.7130 0.3439 -1.3871 -0.0390 0.0382 0.0246 0.2385 2.2744
12: us_diff_lean_bidentrump~1_proHarris_personality (Intercept) 0.9452 0.5231 -0.0801 1.9705 0.0713 0.0420 0.3916 1.1067
# very unlikely to be true discoveries (lfdr = 1)
13: us_diff_lean_bidentrump~1_proTrump_harris_policy (Intercept) 0.4960 0.4911 -0.4665 1.4585 0.3135 0.1705 1.0000 0.0000
14: us_diff_lean_bidentrump~1_proTrump_harris_personality (Intercept) 0.2784 0.5689 -0.8365 1.3934 0.6249 0.3157 1.0000 0.0000
15: us_lean1~condition*topic*lean0 lean_bidentrump_1Z:topicZ -0.1202 0.3004 -0.7090 0.4687 0.6892 0.3250 1.0000 0.0000
16: us_lean1~condition*topic*lean0 conditionproTrump:lean_bidentrump_1Z:topicZ -0.1100 0.3729 -0.8408 0.6209 0.7681 0.3395 1.0000 0.0000
17: us_diff_lean_bidentrump~1_proHarris_harris_personality (Intercept) 0.1576 0.6986 -1.2118 1.5269 0.8217 0.3419 1.0000 0.0000
```
## vote likelihood
## vote choice