r - brms - hello

- [[Stan]], [[install rstan on personal Linux server]] - [[brms - fit multivariate models with multiple response variables]] - [[brms - set and sample priors]] - [[brms - variance structures for residuals]] # Problem ## Convergence issues [https://cran.r-project.org/web/packages/brms/brms.pdf](https://cran.r-project.org/web/packages/brms/brms.pdf) > > In addition to choosing the number of iterations, warmup draws, and chains, users can control the behavior of the NUTS sampler, by using the control argument. The most important reason to use control is to decrease (or eliminate at best) the number of divergent transitions that cause a bias in the obtained posterior draws. Whenever you see the warning "There were x divergent transitions after warmup." you should really think about increasing adapt_delta. To do this, write control = list(adapt_delta =<×>), where <×> should usually be value between 0.8 (current default) and 1. Increasing adapt_delta will slow down the sampler but will decrease the number of divergent transitions threatening the validity of your posterior draws. > Another problem arises when the depth of the tree being evaluated in each iteration is exceeded. This is less common than having divergent transitions, but may also bias the posterior draws. When it happens, Stan will throw out a warning suggesting to increase max_treedepth, which can be accomplished by writing control = list(max_treedepth = x) with a positive integer x that should usually be larger than the current default of 10. For more details on the control argument see stan. ```r # first step ctrl <- list(adapt_delta = 0.90) # default is 0.80 brm(..., control = ctrl) # next step ctrl <- list(adapt_delta = 0.90, max_treedepth = 15) brm(..., control = ctrl) ``` - `max_treedepth`: When the depth of the tree being evaluated in each iteration is exceeded, it may bias the posterior draws. Increase this value to fix the problem. It will increase sampling time significantly. Current default is 10? - [Runtime warnings and convergence problems](https://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded) - [Low ESS and High RHat for Random Intercept & Slope Simulation (rstan and rstanarm) - Modeling - The Stan Forums](https://discourse.mc-stan.org/t/low-ess-and-high-rhat-for-random-intercept-slope-simulation-rstan-and-rstanarm/9985/6) ## Extract model coefficients `coef(model)` ```r data(“sleepstudy”, package =“lme4”) df <- sleepstudy df$Days2 <- factor(ifelse(df$Days < 5, 0, 1)) fit <- brm(Reaction ~ Days2 + (Days2 | Subject), df) fe <- fixef(fit, summary = FALSE) # fixed effect re <- ranef(fit, summary = FALSE) # varying/random effects (res1 <- posterior_summary(fe[, 2] + re$Subject[, , 2])) (res2 <- coef(fit)$Subject[, , 2]) # coef returns fixed + random effects all.equal(res1, res2) ``` - [coef.brmsfit: Extract Model Coefficients in brms: Bayesian Regression Models using 'Stan'](https://rdrr.io/cran/brms/man/coef.brmsfit.html) - [Coef (versus fixef and ranef) - Interfaces / brms - The Stan Forums](https://discourse.mc-stan.org/t/coef-versus-fixef-and-ranef/3914) ## Summarize estimates and posterior Use `bayestestR` or `parameters` ```r library(brms); library(parameters) m <- brm(mpg ~ cyl, mtcars) as.data.frame(model_parameters(m)) library(bayestestR) describe_posterior(m, ci = c(0.89, 0.95), centrality = "all", ci_method = "HDI") ``` - [Parameters from Bayesian Models — model_parameters.data.frame • parameters](https://easystats.github.io/parameters/reference/model_parameters.stanreg.html) ## Incorporate measurement error in the outcome/response/criterion or predictor ```r # measurement error in the outcome brm(y | mi(y_sd) ~ x1 + x2, data = dat) # note # The me function is soft deprecated in favor of the more flexible and consistent mi function. # measurement error in predictor brm(y ~ me(x, sdx), data = dat) ``` - [Left side of the model formula with measurement error · Issue #643 · paul-buerkner/brms · GitHub](https://github.com/paul-buerkner/brms/issues/643) - [14 Missing Data and Other Opportunities | Statistical Rethinking with brms, ggplot2, and the tidyverse](https://bookdown.org/ajkurz/Statistical_Rethinking_recoded/missing-data-and-other-opportunities.html) - [Is the brms() measurement error model doing the 'right' thing? - Interfaces / brms - The Stan Forums](https://discourse.mc-stan.org/t/is-the-brms-measurement-error-model-doing-the-right-thing/9300/17) - [Regression with measurement error in brms - Interfaces / brms - The Stan Forums](https://discourse.mc-stan.org/t/regression-with-measurement-error-in-brms/24700) - [r - Linear regression with both x and y errors in package brms - Stack Overflow](https://stackoverflow.com/questions/68576294/linear-regression-with-both-x-and-y-errors-in-package-brms) - [https://cran.r-project.org/web/packages/brms/brms.pdf](https://cran.r-project.org/web/packages/brms/brms.pdf) - [15 Missing Data and Other Opportunities | Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition](https://bookdown.org/content/4857/missing-data-and-other-opportunities.html#measurement-error) ## Avoid recompiling same model ```r library(brms) # requires compilation on the first trial fit <- brm(count ~ Trt, epilepsy) # load the stanfit object in the global environment stanfit <- fit$fit # does not require compilation anymore fit2 <- brm(count ~ Trt, epilepsy) ``` - [Avoid recompiling the exact same Stan model with brms - Interfaces / brms - The Stan Forums](https://discourse.mc-stan.org/t/avoid-recompiling-the-exact-same-stan-model-with-brms/6129/16) ## Estimate monotonic effects ```r fit1 <- brm(ls ~ mo(income), data = dat) ``` - [Estimating Monotonic Effects with brms](https://cran.r-project.org/web/packages/brms/vignettes/brms_monotonic.html) ## Extract and visualize draws from brms ```r m <- brm(mpg ~ cyl * vs, mtcars) as_draws_df(m) # from posterior package as_draws_df(m, "b_cyl") ``` - [Function reference • posterior](https://mc-stan.org/posterior/reference/index.html) ```r library(brms) library(bayesplot) m <- brm(mpg ~ cyl, mtcars) tidybayes::get_variables(m) plot(m, variable = "b_cyl") ``` `bayesplot` can be used to make quick plots. ```r library(brms) library(bayesplot) m <- brm(mpg ~ cyl, mtcars) tidybayes::get_variables(m) # posterior <- as.array(m) # get posterior draws/samples # or as.matrix(m) mcmc_areas(as.array(m), pars = c("b_cyl")) mcmc_intervals(as.array(m), pars = c("b_cyl")) mcmc_areas_ridges(as.array(m), pars = c("b_cyl")) ``` - [Plot interval estimates from MCMC draws — MCMC-intervals • bayesplot](https://mc-stan.org/bayesplot/reference/MCMC-intervals.html) `tidybayes` can be used to make sophisticated and customized plots. ```r library(tidybayes); library(ggplot2) tidybayes::get_variables(m) spread_draws(m, b_cyl, b_Intercept) |> ggplot(aes(x = b_cyl)) + stat_halfeye() # also gather_draws(m, b_cyl, b_Intercept) ``` - [Extracting and visualizing tidy draws from brms models • tidybayes](https://mjskay.github.io/tidybayes/articles/tidy-brms.html) ## Interactions and conditional/marginal effects ```r library(brms) m <- brm(mpg ~ cyl * vs, mtcars) conditional_effects(m) # all effects # create conditions with brms conditions <- make_conditions(m, "vs") # default -1SD, mean, +1SD # manually create and name levels conditions <- data.frame(vs = c(-1, 1), cond__ = c("a", "b")) ce <- conditional_effects(m, effects = "cyl", conditions = conditions) plot(ce) # plot library(ggplot2) ggplot(ce$cyl, aes(x=cyl,y=estimate__, color=as.factor(vs))) + geom_ribbon(aes(ymax=upper__, ymin=lower__, fill = as.factor(vs)), alpha=0.2, col = NA) + geom_line(size=1) # with emmeans library(emmeans) mylist <- list(vs = c(-0.07, 0.44, 0.94)) # -1SD, mean, +1SD emtrends(m, ~vs, var = "cyl", at = mylist) emmeans(m, ~ cyl:vs, at = mylist) emmip(m, ~ cyl:vs, at = mylist) # plot mylist <- list(vs = c(-0.07, 0.44, 0.94), cyl = 4:8) datplot <- emmip(m, ~ cyl:vs, at = mylist, CIs = TRUE, plotit = FALSE) ggplot(data=datplot, aes(x=cyl,y=yvar, color=as.factor(vs))) + geom_ribbon(aes(ymax=UCL, ymin=LCL, fill = as.factor(vs)), alpha=0.2, col = NA) + geom_line(size=1) ``` ## predict vs fitted in marginal_effects function > “method = "predict"” accounts for the residual (observation-level) variance, the uncertainty in the fixed coefficients, and the uncertainty in the variance parameters for the grouping factors. “method = "fitted"” accounts only for the uncertainty in the fixed coefficients and the uncertainty in the variance parameters for the grouping factors. - [Difference between method = fitted vs. predict in marginal_effects function - Interfaces / brms - The Stan Forums](https://discourse.mc-stan.org/t/difference-between-method-fitted-vs-predict-in-marginal-effects-function/6901) - [Understanding predict function · Issue #82 · paul-buerkner/brms · GitHub](https://github.com/paul-buerkner/brms/issues/82#issuecomment-231440994) > Can you shed more light into what sources of uncertainty the predict function takes into account? As far as I understand, in a multilevel model, there are three sources of uncertainty: > 1. the residual (observation-level) variance > 2. the uncertainty in the fixed coefficient > 3. the uncertainty in the variance parameters for the grouping factors > Buerkner: To include all sources of uncertainty call predict(.). To exclude 1. call fitted(.). To exclude 3. (possible partially if you want), use the re_formula argument. Are all these uncertainties taken into account? ## Speed up sampling with stanc O1 optimization ```r # for cmdstan_model() add # stanc_options = list("O1") # for brm() add # backend = "cmdstanr", stan_model_args = list(stanc_options = list("O1")) brm(..., cores = 4, backend = "cmdstanr", stan_model_args = list(stanc_options = list("O1"))) # or add to CmdStan make/local # STANCFLAGS += --O1 ``` - [30%-40% drop in sampling time using stanc O1 optimizations - General - The Stan Forums](https://discourse.mc-stan.org/t/30-40-drop-in-sampling-time-using-stanc-o1-optimizations/28347?page=2) - [https://twitter.com/avehtari/status/1551459810469724161](https://twitter.com/avehtari/status/1551459810469724161) - [sampling speed improvements · Issue #1382 · paul-buerkner/brms · GitHub](https://github.com/paul-buerkner/brms/issues/1382) # Resources - [Bayesian Regression Models using Stan • brms](https://paul-buerkner.github.io/brms/) - [[Sorensen 2016 Bayesian linear mixed models using Stan]]