220407_201028 fold0 nopooling vs partial pooling

# fold0 (5% of all data) results

Selected 5% of subjects from each study/country pseudorandomly, ensuring we have somewhat balanced proportions of subjects with different CRT scores.

- 384 subjects in this fold0 (5%); 7066 in remaining fold1
- see [[220325_134343 data split|data split for details]] (note ~9% of 8156 subjects were excluded for having < 15 trials, leaving us with 7450)

Didn't need too many subjects (we said 10% initially) for exploration since we ended up working with simulated data instead (e.g., [[220407_113556 parameter recovery - normally distributed simulated params|simulated data with 400 subjects, 18 trials each]]).

In simulations and parameter recoveries, **partial pooling tends to outperform no-pooling** very slightly in recovering both simulated parameters and simulated correlations.

**BUT interesting/opposite patterns of model fit for no-pooling and partial-pooling**

- **no-pooling approach** better at describing **individual data**, worse for country-level data
    - model fitted to each subject separately (i.e., 384 models)
- **partial pooling** better at describing **country-level data**, worse for individual data
    - likely due to shrinkage toward country level mean parameter estimate
    - 16 Bayesian hierarchical models, one for each country

See [[220408_121448 fold0 model fit|here]] for no-pooling and partial-pooling model fits. Both approaches fit the data quite well, with good correlations between model-predicted behavior and observed behavior.

Overall, partial pooling might be better because it also ensures we don't have outlier parameter estimates, which would bias correlations with CRT.
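
To make the shrinkage point concrete, here is a minimal base-R sketch of how partial pooling pulls noisy per-subject estimates toward a group mean. It is only an illustration under a simple normal-normal assumption, not the hierarchical DDM fitted in these notes; all values (`mu`, `tau`, `sigma`) and names (`no_pooling`, `partial_pooling`, `rmse`) are made up.

```r
# Hypothetical illustration of shrinkage (NOT the hierarchical DDM from these notes)
set.seed(1)

n_subj <- 24    # roughly the per-country sample size in fold0
mu     <- 0.2   # assumed country-level mean of some parameter
tau    <- 0.1   # assumed between-subject SD of the true parameter
sigma  <- 0.3   # assumed estimation noise per subject (few trials each)

true_par   <- rnorm(n_subj, mean = mu, sd = tau)
no_pooling <- true_par + rnorm(n_subj, mean = 0, sd = sigma)  # one noisy estimate per subject

# Partial pooling in the normal-normal case: weight each estimate by its
# reliability and pull the remainder toward the group mean.
w <- tau^2 / (tau^2 + sigma^2)
partial_pooling <- w * no_pooling + (1 - w) * mean(no_pooling)

rmse <- function(est, truth) sqrt(mean((est - truth)^2))

# Compare spread, extremes, and recovery of the (simulated) true values
print(c(sd_no_pooling = sd(no_pooling), sd_partial = sd(partial_pooling)))
print(rbind(no_pooling = range(no_pooling), partial = range(partial_pooling)))
print(c(rmse_no_pooling = rmse(no_pooling, true_par),
        rmse_partial    = rmse(partial_pooling, true_par)))
```

The group mean is unchanged by construction, while individual estimates move toward it. That fits the pattern above: country-level fit stays good, individual-level fit gets a bit worse, and extreme estimates are pulled in.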
```r
# subjects per country in fold0
> fit_ddm[, .N, keyby = .(Country)]
    Country  N
 1:       1 24
 2:       2 23
 3:       3 24
 4:       4 24
 5:       5 24
 6:       6 23
 7:       7 24
 8:       8 24
 9:       9 24
10:      10 23
11:      11 24
12:      12 24
13:      13 23
14:      14 24
15:      15 24
16:      16 23
```

# ddm model specification

- upper bound: correct response (share true news; don't share false news)
- lower bound: incorrect response (don't share true news; share false news)
- news veracity is based on `real` columns (coded 0 or 1)

# 3 different statistical models

Which one makes more sense? We could prereg one and fit the remaining as secondary robustness checks?

- CRT predicts parameter: `parameter ~ crtZ + (1 + crtZ | Country)`
- parameter predicts CRT: `crt ~ parameterZ + (1 + parameterZ | Country)`
- all parameters predict CRT in a single model: `crt ~ parameter1Z + parameter2Z + ... + (1 + parameter1Z + parameter2Z + ... | Country)`

If CRT is the outcome, maybe we should also fit ordinal models because CRT accuracy is ordered data. Though results shouldn't change too much vs regular regression model.

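
A rough sketch of how the three candidate models could be set up, with predictors z-scored within country. Everything here is a stand-in: the data frame `d` is simulated noise, the column names `drift` and `boundary`, the helper `z_within`, and the 0-3 CRT range are assumptions for illustration. The `lme4::lmer` call only runs if lme4 is installed.

```r
set.seed(2)

# Simulated stand-in data: 16 countries x 24 subjects (all values are made up)
n_country <- 16
n_per     <- 24
d <- data.frame(
  Country  = factor(rep(seq_len(n_country), each = n_per)),
  crt      = sample(0:3, n_country * n_per, replace = TRUE),  # assumed CRT range
  drift    = rnorm(n_country * n_per, mean = 0.18, sd = 0.4),
  boundary = rnorm(n_country * n_per, mean = 2.4, sd = 0.7)
)

# z-score a variable within each level of a grouping factor
z_within <- function(x, g) {
  ave(as.numeric(x), g, FUN = function(v) (v - mean(v)) / sd(v))
}
d$crtZ      <- z_within(d$crt, d$Country)
d$driftZ    <- z_within(d$drift, d$Country)
d$boundaryZ <- z_within(d$boundary, d$Country)

# The three model forms, written for drift (and boundary in the joint model)
f_param_on_crt <- drift ~ crtZ + (1 + crtZ | Country)
f_crt_on_param <- crt ~ driftZ + (1 + driftZ | Country)
f_crt_on_all   <- crt ~ boundaryZ + driftZ + (1 + boundaryZ + driftZ | Country)

# Optional fit; with pure-noise data, singular-fit messages are expected
if (requireNamespace("lme4", quietly = TRUE)) {
  fits <- lapply(list(f_param_on_crt, f_crt_on_param, f_crt_on_all),
                 function(f) lme4::lmer(f, data = d))
  print(lapply(fits, lme4::fixef))
}
```

If CRT ends up being treated as ordinal, a cumulative link mixed model (e.g., `ordinal::clmm`) with the same fixed and random terms might be one option. That is a suggestion, not something already run in these notes.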
- all **predictors** are z-scored within country
- for now, frequentist models, but will fit **bayesian** models eventually

## boundary - crt

`boundary ~ crt`

```r
# no-pooling
          term  results
1: (Intercept)  b = 2.40, SE = 0.04, t(362) = 66.99, p < .001, r = 0.96
2:        crtZ  b = 0.07, SE = 0.04, t(15) = 1.66, p = .117, r = 0.40 ##

# partial-pooling
          term  results
1: (Intercept)  b = 4.71, SE = 0.09, t(15) = 49.65, p < .001, r = 1.00
2:        crtZ  b = 0.08, SE = 0.06, t(15) = 1.36, p = .195, r = 0.33
```

`crt ~ boundary`

```r
# no-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(362) = 26.46, p < .001, r = 0.81
2:           B  b = 0.11, SE = 0.07, t(15) = 1.62, p = .127, r = 0.39

# partial-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(362) = 26.27, p < .001, r = 0.81
2:       alpha  b = 0.08, SE = 0.06, t(15) = 1.30, p = .214, r = 0.32
```

## bias - crt

`bias ~ crt`

```r
# no-pooling
          term  results
1: (Intercept)  b = 0.04, SE = 0.01, t(377) = 2.85, p = .005, r = 0.15
2:        crtZ  b = −0.01, SE = 0.01, t(377) = −0.78, p = .438, r = 0.04

# partial-pooling
          term  results
1: (Intercept)  b = 0.53, SE = 0.006, t(15) = 83.55, p < .001, r = 1.00
2:        crtZ  b = 0.003, SE = 0.002, t(15) = 1.85, p = .085, r = 0.43 ##
```

`crt ~ bias`

```r
# no-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(377) = 26.20, p < .001, r = 0.80
2:          x0  b = −0.07, SE = 0.06, t(377) = −1.21, p = .227, r = 0.06

# partial-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(377) = 26.25, p < .001, r = 0.80
2:        beta  b = 0.10, SE = 0.06, t(377) = 1.73, p = .084, r = 0.09 ##
```

## non-decision time - crt

`non-decision-time ~ crt`

```r
# no-pooling
          term  results
1: (Intercept)  b = 3.22, SE = 0.14, t(15) = 22.64, p < .001, r = 0.99
2:        crtZ  b = 0.07, SE = 0.09, t(15) = 0.80, p = .439, r = 0.20

# partial-pooling
          term  results
1: (Intercept)  b = 3.21, SE = 0.14, t(15) = 22.82, p < .001, r = 0.99
2:        crtZ  b = 0.10, SE = 0.09, t(15) = 1.06, p = .306, r = 0.27
```

`crt ~ non-decision-time`

```r
# no-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(362) = 26.29, p < .001, r = 0.81
2:  nondectime  b = 0.04, SE = 0.07, t(15) = 0.54, p = .596, r = 0.14

# partial-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(343) = 26.41, p < .001, r = 0.82
2:         tau  b = 0.06, SE = 0.07, t(15) = 0.83, p = .422, r = 0.21
```

## drift - crt

`drift ~ crt`

```r
# no-pooling
          term  results
1: (Intercept)  b = 0.18, SE = 0.02, t(377) = 8.80, p < .001, r = 0.41
2:        crtZ  b = 0.04, SE = 0.02, t(377) = 1.93, p = .054, r = 0.10 ##

# partial-pooling
          term  results
1: (Intercept)  b = 0.17, SE = 0.01, t(15) = 12.96, p < .001, r = 0.96
2:        crtZ  b = 0.01, SE = 0.005, t(15) = 2.94, p = .010, r = 0.60 ##
```

`crt ~ drift`

```r
# no-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(362) = 27.20, p < .001, r = 0.82
2:       drift  b = 0.25, SE = 0.07, t(15) = 3.32, p = .005, r = 0.65 ## much stronger effects than drift ~ crt

# partial-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.06, t(362) = 27.15, p < .001, r = 0.82
2:       delta  b = 0.29, SE = 0.06, t(15) = 4.62, p < .001, r = 0.77 ## much stronger effects than drift ~ crt
```

## crt ~ all parameters in one model

```r
# no-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.05, t(345) = 27.69, p < .001, r = 0.83
2:           B  b = 0.10, SE = 0.08, t(14) = 1.37, p = .193, r = 0.34      # bound
3:          x0  b = 0.01, SE = 0.06, t(80) = 0.20, p = .842, r = 0.02      # bias
4:  nondectime  b = −7e−04, SE = 0.06, t(27) = −0.01, p = .992, r = 0.002  # nondecision time
5:       drift  b = 0.28, SE = 0.07, t(15) = 3.77, p = .002, r = 0.70      # drift

# partial-pooling
          term  results
1: (Intercept)  b = 1.50, SE = 0.05, t(313) = 27.55, p < .001, r = 0.84
2:       alpha  b = 0.12, SE = 0.07, t(65) = 1.59, p = .116, r = 0.19      # bound
3:        beta  b = −0.02, SE = 0.06, t(90) = −0.32, p = .753, r = 0.03    # bias
4:         tau  b = −0.005, SE = 0.08, t(21) = −0.06, p = .949, r = 0.01   # nondecision time
5:       delta  b = 0.30, SE = 0.07, t(21) = 4.49, p < .001, r = 0.70      # drift
```

# correlations between parameters (pool across countries)

no-pooling

```r
Parameter  | B        | x0       | nondectime | drift
--------------------------------------------------------
B          | 1.00***  | -0.19*** | 0.24***    | -0.13*   # bound
x0         | -0.19*** | 1.00***  | 0.16**     | 0.45***  # bias
nondectime | 0.24***  | 0.16**   | 1.00***    | 0.05     # ndt
drift      | -0.13*   | 0.45***  | 0.05       | 1.00***  # drift
```

partial-pooling

```r
Parameter | alpha   | beta    | tau     | delta
-------------------------------------------------
alpha     | 1.00*** | 0.01    | 0.55*** | -0.07    # bound
beta      | 0.01    | 1.00*** | 0.10    | 0.07     # bias
tau       | 0.55*** | 0.10    | 1.00*** | -0.14*   # ndt
delta     | -0.07   | 0.07    | -0.14*  | 1.00***  # drift
```

together in one correlation matrix

```r
Parameter  | B       | x0      | nondectime | drift   | alpha   | beta    | tau     | delta
---------------------------------------------------------------------------------------------
# no pooling parameters
B          | 1.00*** | -0.19** | 0.24***    | -0.13   | 0.88*** | 0.04    | 0.38*** | 0.02     # bound
x0         | -0.19** | 1.00*** | 0.16*      | 0.45*** | 0.11    | 0.43*** | 0.14    | -0.05    # bias
nondectime | 0.24*** | 0.16*   | 1.00***    | 0.05    | 0.45*** | 0.07    | 0.97*** | -0.13    # ndt
drift      | -0.13   | 0.45*** | 0.05       | 1.00*** | 0.16*   | 0.03    | 0.06    | 0.39***  # drift
# partial pooling parameters
alpha      | 0.88*** | 0.11    | 0.45***    | 0.16*   | 1.00*** | 0.01    | 0.55*** | -0.07    # bound
beta       | 0.04    | 0.43*** | 0.07       | 0.03    | 0.01    | 1.00*** | 0.10    | 0.07     # bias
tau        | 0.38*** | 0.14    | 0.97***    | 0.06    | 0.55*** | 0.10    | 1.00*** | -0.14    # ndt
delta      | 0.02    | -0.05   | -0.13      | 0.39*** | -0.07   | 0.07    | -0.14   | 1.00***  # drift
```
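
For reference, a base-R sketch of how a starred correlation table like the ones above could be built. The data are simulated and the helpers `stars` and `cor_table` are hypothetical. The same r gets different stars in the 4x4 and 8x8 tables (e.g., -0.19*** vs -0.19**), which would be consistent with some multiplicity adjustment. These notes don't say which one was used, so the Holm default below is purely an assumption.

```r
set.seed(3)

# Simulated stand-in for the per-subject parameter estimates (values are made up)
n  <- 379
x0 <- rnorm(n)
params <- data.frame(
  B          = rnorm(n),
  x0         = x0,
  nondectime = rnorm(n),
  drift      = 0.5 * x0 + rnorm(n)  # built-in correlation so the stars have something to show
)

# Conventional significance stars
stars <- function(p) {
  ifelse(p < .001, "***", ifelse(p < .01, "**", ifelse(p < .05, "*", "")))
}

# Pairwise Pearson correlations with adjusted p-values, formatted as text
cor_table <- function(df, adjust = "holm") {
  vars  <- names(df)
  pairs <- combn(vars, 2)  # each unique pair once (2 x n_pairs character matrix)
  r <- apply(pairs, 2, function(v) cor(df[[v[1]]], df[[v[2]]]))
  p <- apply(pairs, 2, function(v) cor.test(df[[v[1]]], df[[v[2]]])$p.value)
  p <- p.adjust(p, method = adjust)  # adjustment method is an assumption

  out <- matrix("1.00", length(vars), length(vars), dimnames = list(vars, vars))
  for (k in seq_len(ncol(pairs))) {
    cell <- paste0(sprintf("%.2f", r[k]), stars(p[k]))
    out[pairs[1, k], pairs[2, k]] <- cell
    out[pairs[2, k], pairs[1, k]] <- cell
  }
  out
}

tab <- cor_table(params)
print(tab, quote = FALSE)
```

Passing `adjust = "none"` gives unadjusted stars. Running the function on the 4 no-pooling and 4 partial-pooling columns together would give the 8x8 layout.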