Tutorial 7.5b - Analysis of Covariance (Bayesian)

12 Jan 2018

Overview

Previous tutorials have concentrated on designs for either continuous (Regression) or categorical (ANOVA) predictor variables. Analysis of covariance (ANCOVA) models are essentially ANOVA models that incorporate one or more continuous and categorical variables (covariates). Although the relationship between a response variable and a covariate may itself be of substantial biological interest, typically covariate(s) are incorporated to reduce the amount of unexplained variability in the model (analogous to blocking - see tutorials ) and thereby increase the power of any treatment effects.

Linear model

The generic linear model is presented here purely for revisory purposes. If, it is unfamiliar to you or you are unsure about what parameters are to be estimated and tested, you are strongly advised to review the the tutorial on frequentist analysis of covariance.

The linear models for two and three factor design are: $$y_{ijk}=\mu+\alpha_i + \beta_{j} + \varepsilon_{ijk}$$ where $\mu$ is the overall mean, $\alpha$ is the effect of Factor A, $\beta$ is the effect of the covariate (Factor B), and $\varepsilon$ is the random unexplained or residual component.

ANCOVA in R

Data and scenario

Consider an experimental design aimed at exploring the effects of a categorical variable with three levels (Group A, Group B and Group C) on a response. From previous studies, we know that the response is influenced by another variable (covariate). Unfortunately, it was not possible to ensure that all sampling units were the same degree of the covariate. Therefore, in an attempt to account for this anticipated extra source of variability, we measured the level of the covariate for each sampling unit. Actually, in allocating treatments to the various treatment groups, we tried to ensure a similar mean and range of the covariate within each group.

Details of data generation

Random data incorporating the following trends (effect parameters)

the sample size per treatment=10
the categorical $x$ variable with 3 levels
the first treatment group has a population mean of 40.
the other two treatments reduce the mean by 15 and 20 units respectively
the data are drawn from normal distributions with a mean of 0 and standard deviation of 4 ($\sigma^2=16$)
the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15

set.seed(3)
n <- 10
p <- 3
A.eff <- c(40, -15, -20)
beta <- -0.45
sigma <- 4
B <- rnorm(n * p, 0, 15)
A <- gl(p, n, lab = paste("Group", LETTERS[1:3]))
mm <- model.matrix(~A + B)
data <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4))
data$B <- data$B + 20
head(data)

        A         B        Y
1 Group A  5.570999 50.09555
2 Group A 15.612114 45.38163
3 Group A 23.881823 41.16404
4 Group A  2.718022 50.72290
5 Group A 22.936742 37.26995
6 Group A 20.451859 42.61873

Assumptions

As ANCOVA designs are essentially regular ANOVA designs that are first adjusted (centered) for the covariate(s), ANCOVA designs inherit all of the underlying assumptions of the appropriate ANOVA design. Readers should also eventually consult Tutorial 7.6a, Tutorial 9.2a, Tutorial 9.3a and Tutorial 9.4a. Specifically, hypothesis tests assume that:

the appropriate residuals are normally distributed. Boxplots using the appropriate scale of replication (reflecting the appropriate residuals/F-ratio denominator, see the above tables) should be used to explore normality. Scale transformations are often useful.
the appropriate residuals are equally varied. Boxplots and plots of means against variance (using the appropriate scale of replication) should be used to explore the spread of values. Residual plots should reveal no patterns. Scale transformations are often useful.
the appropriate residuals are independent of one another.
the relationship between the response variable and the covariate should be linear. Linearity can be explored using scatterplots and residual plots should reveal no patterns.
for repeated measures and other designs in which treatment levels within blocks can not be be randomly ordered, the variance/covariance matrix is assumed to display sphericity.
for designs that utilize blocking, it is assumed that there are no block by within block interactions
homogeneity of slopes. Since a single slope is estimated for the covariate (rather than a separate slope per treatment), it is important that the trends between the response and the covariate is similar for each Group.

Exploratory data analysis

library(car)
scatterplot(Y ~ B | A, data = data)

boxplot(Y ~ A, data)

# OR via ggplot
library(ggplot2)
ggplot(data, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")

ggplot(data, aes(y = Y, x = A)) + geom_boxplot()

Conclusions:

there is no evidence of obvious non-normality
the assumption of linearity seems reasonable
the variability of the three groups seems approximately equal
the slopes (Y vs B trends) appear broadly similar for each treatment group

Homogeneity of slopes

We can explore inferential evidence of unequal slopes by examining estimated effects of the interaction between the categorical variable and the covariate. Note, pay no attention to the main effects - only the interaction. Even though I intend to illustrate Bayesian analyses here, for such a simple model, it is considerably simpler to use traditional OLS for testing for the presence of an interaction..

anova(lm(Y ~ B * A, data = data))

Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
B          1  354.80  354.80 23.9691 5.414e-05 ***
A          2 2772.56 1386.28 93.6531 4.609e-12 ***
B:A        2   55.08   27.54  1.8606    0.1773    
Residuals 24  355.26   14.80                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusions: there is very little evidence to suggest that the assumption of equal slopes will be inappropriate.

Model fitting or statistical analysis

Consistent with Tutorial 7.2b we will explore Bayesian modelling of ANCOVA using a variety of tools (such as MCMCpack, JAGS, RSTAN, RSTANARM and BRMS). Whilst JAGS and RSTAN are extremely flexible and thus allow models to be formulated that contain not only the simple model, but also additional derivatives, the other approaches are more restrictive. Consequently, I will mostly restrict models to just the minimum necessary and all derivatives will instead be calculated in R itself from the returned posteriors.

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

The observed response ($y_i$) are assumed to be drawn from a normal distribution with a given mean ($\mu$) and standard deviation ($\sigma$). The expected values ($\mu$) are themselves determined by the linear predictor ($\mathbf{X}\boldsymbol{\beta}$). In this case, $\boldsymbol{\beta}$ represents the vector of $\beta$'s - the intercept associated with the first group, the (effects) differences between this intercept and the intercepts for each other group as well as the slope associated with the continuous covariate. $\mathbf{X}$ is the model matrix.

MCMC sampling requires priors on all parameters. We will employ weakly informative priors. Specifying 'uninformative' priors is always a bit of a balancing act. If the priors are too vague (wide) the MCMC sampler can wander off into nonscence areas of likelihood rather than concentrate around areas of highest likelihood (desired when wanting the outcomes to be largely driven by the data). On the other hand, if the priors are too strong, they may have an influence on the parameters. In such a simple model, this balance is very forgiving - it is for more complex models that prior choice becomes more important.

For this simple model, we will go with zero-centered Gaussian (normal) priors with relatively large standard deviations (100) for both the intercept and the treatment effect and a wide half-cauchy (scale=5) for the standard deviation. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,100)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$ Note, exploratory data analysis suggests that while the intercept (intercept of Group A) and categorical predictor effects (differences between intercepts of each of the Group and Group A's intercept) could be drawn from a similar distribution (with mean in the 10's and variances in the 100's), the slope (effect associated with Group A linear relationship) is likely to be an order of magnitude less. We might therefore be tempted to provide different priors for the intercept, categorical effects and slope effect. For a simple model such as this, it is unlikely to be necessary. However, for more complex models, where prior specification becomes more critical, separate priors would probably be necessary.

library(MCMCpack)
data.mcmcpack <- MCMCregress(Y ~ A + B, data = data)

Define the model

modelString = "
  model {
  #Likelihood
  for (i in 1:n) {
  y[i]~dnorm(mean[i],tau)
  mean[i] <- inprod(beta[],X[i,])
  }
  #Priors
  for (i in 1:ngroups) {
  beta[i] ~ dnorm(0, 1.0E-6) 
  }
  sigma ~ dunif(0, 100)
  tau <- 1 / (sigma * sigma)
  }
  "

Define the data

Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:

the response variable (y)
the predictor model matrix (X)
the total number of observed items (n)
the number of predictor terms (nX)

This all needs to be contained within a list object. We will create two data lists, one for each of the hypotheses.

X <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = X, n = nrow(data), ngroups = ncol(X)))

Define the MCMC chain parameters

Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:

the nodes (estimated parameters) to monitor (return samples for)
the number of MCMC chains (3)
the number of burnin steps (1000)
the thinning factor (10)
the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains

params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter

[1] 53000

Fit the model

Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.

library(R2jags)

data.r2jags <- jags(data = data.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 30
   Unobserved stochastic nodes: 5
   Total graph size: 232

Initializing model

print(data.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   50.201   1.803  46.617  49.021  50.207  51.404  53.741 1.001  8400
beta[2]  -17.141   1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001  8400
beta[3]  -22.917   1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001  5900
beta[4]   -0.414   0.065  -0.543  -0.457  -0.414  -0.372  -0.285 1.001 15000
sigma      4.175   0.619   3.178   3.735   4.108   4.531   5.587 1.001  7200
deviance 169.232   3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.3 and DIC = 175.5
DIC is an estimate of expected predictive error (lower deviance is better).

data.mcmc.list <- as.mcmc(data.r2jags)

Whilst Gibbs sampling provides an elegantly simple MCMC sampling routine, very complex hierarchical models can take enormous numbers of iterations (often prohibitory large) to converge on a stable posterior distribution.

To address this, Andrew Gelman (and other collaborators) have implemented a variation on Hamiltonian Monte Carlo (HMC: a sampler that selects subsequent samples in a way that reduces the correlation between samples, thereby speeding up convergence) called the No-U-Turn (NUTS) sampler. All of these developments are brought together into a tool called Stan ("Sampling Through Adaptive Neighborhoods").

By design (to appeal to the vast BUGS users), Stan models are defined in a manner reminiscent of BUGS. Stan first converts these models into C++ code which is then compiled to allow very rapid computation.

Consistent with the use of C++, the model must be accompanied by variable declarations for all inputs and parameters.

One important difference between Stan and JAGS is that whereas BUGS (and thus JAGS) use precision rather than variance, Stan uses variance.

Stan itself is a stand-alone command line application. However, conveniently, the authors of Stan have also developed an R interface to Stan called Rstan which can be used much like R2jags.

Model matrix formulation

The minimum model in Stan required to fit the above simple regression follows. Note the following modifications from the model defined in JAGS:

the normal distribution is defined by variance rather than precision
rather than using a uniform prior for sigma, I am using a half-Cauchy

We now translate the likelihood model into STAN code.
$$\begin{align} y_{ij}&\sim{}N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \mathbf{X}\boldsymbol{\beta}\\ \beta_0&\sim{}N(0,100)\\ \beta&\sim{}N(0,10)\\ \sigma&\sim{}Cauchy(0,5)\\ \end{align} $$

Define the model

modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,100);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Define the data

Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:

the response variable (y)
the predictor model matrix (X)
the total number of observed items (n)
the number of predictor terms (nX)

This all needs to be contained within a list object. We will create two data lists, one for each of the hypotheses.

Xmat <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data)))

Fit the model

## load the rstan package
library(rstan)

data.rstan <- stan(data = data.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)

In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4,
                 from file4be61ee4e612.cpp:8:
/usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
 #  define BOOST_NO_CXX11_RVALUE_REFERENCES
 ^
<command-line>:0:0: note: this is the location of the previous definition

SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1).

Gradient evaluation took 1.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.062651 seconds (Warm-up)
               0.061975 seconds (Sampling)
               0.124626 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.053706 seconds (Warm-up)
               0.057327 seconds (Sampling)
               0.111033 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.054613 seconds (Warm-up)
               0.064711 seconds (Sampling)
               0.119324 seconds (Total)

print(data.rstan, par = c("beta", "sigma"))

Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  50.22    0.06 1.79  46.60  49.03  50.24  51.40  53.83  1027    1
beta[2] -17.12    0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19  1210    1
beta[3] -22.88    0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24  1407    1
beta[4]  -0.42    0.00 0.06  -0.54  -0.46  -0.42  -0.37  -0.29  1282    1
sigma     4.11    0.02 0.58   3.12   3.70   4.05   4.45   5.37  1353    1

Samples were drawn using NUTS(diag_e) at Fri Nov  3 15:05:11 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

The STAN team has put together pre-compiled modules (functions) to make specifying and applying STAN models much simpler. Each function offers a consistent interface that is Also reminiscent of major frequentist linear modelling routines in R.

Whilst it is not necessary to specify priors when using rstanarm functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors if for no other reason that it forces you to thing about what you re doing. Consistent with the pure STAN version, we will employ the following priors:

weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
weakly informative Gaussian prior for the treatment effect $\beta_1 \sim{} N(0, 100)$
half-cauchy prior for the variance $\sigma \sim{} Cauchy(0, 5)$

Note, I am using the refresh=0 option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.

library(rstanarm)
library(broom)
library(coda)

data.rstanarm = stan_glm(Y ~ A + B, data = data, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100),
    prior = normal(0, 100), prior_aux = cauchy(0, 2))

Gradient evaluation took 5.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.149605 seconds (Warm-up)
               0.181861 seconds (Sampling)
               0.331466 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.120154 seconds (Warm-up)
               0.198657 seconds (Sampling)
               0.318811 seconds (Total)


Gradient evaluation took 1.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.182133 seconds (Warm-up)
               0.174642 seconds (Sampling)
               0.356775 seconds (Total)

print(data.rstanarm)

stan_glm
 family:  gaussian [identity]
 formula: Y ~ A + B
------

Estimates:
            Median MAD_SD
(Intercept)  50.2    1.8 
AGroup B    -17.2    1.9 
AGroup C    -22.9    1.9 
B            -0.4    0.1 
sigma         4.1    0.6 

Sample avg. posterior predictive 
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 30.0    1.1  

------
For info on the priors used see help('prior_summary.stanreg').

tidyMCMC(data.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate std.error    conf.low   conf.high
1 (Intercept)  50.2286902 1.9365801  46.4428738  54.1263008
2    AGroup B -17.2114440 1.9386491 -21.0066484 -13.4338089
3    AGroup C -22.9037572 1.9679361 -26.6757211 -19.0189639
4           B  -0.4138748 0.0681432  -0.5451839  -0.2852376
5       sigma   4.1990238 0.6317351   3.0581470   5.4328442

The brms package serves a similar goal to the rstanarm package - to provide a simple user interface to STAN. However, unlike the rstanarm implementation, brms simply converts the formula, data, priors and family into STAN model code and data before executing stan with those elements.

Whilst it is not necessary to specify priors when using brms functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors if for no other reason that it forces you to thing about what you are doing. Consistent with the pure STAN version, we will employ the following priors:

weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
weakly informative Gaussian prior for the treatment effect $\beta_1 \sim{} N(0, 100)$
half-cauchy prior for the variance $\sigma \sim{} Cauchy(0, 5)$

Note, I am using the refresh=0. option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.

library(brms)
library(broom)
library(coda)

data.brms = brm(Y ~ A + B, data = data, iter = 2000, warmup = 500, chains = 3,
    thin = 2, refresh = 0, prior = c(prior(normal(0, 100), class = "Intercept"),
        prior(normal(0, 100), class = "b"), prior(cauchy(0, 5), class = "sigma")))

Gradient evaluation took 1.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.036403 seconds (Warm-up)
               0.043672 seconds (Sampling)
               0.080075 seconds (Total)


Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.033241 seconds (Warm-up)
               0.041044 seconds (Sampling)
               0.074285 seconds (Total)


Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.036154 seconds (Warm-up)
               0.041079 seconds (Sampling)
               0.077233 seconds (Total)

print(data.brms)

 Family: gaussian(identity) 
Formula: Y ~ A + B 
   Data: data (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    50.20      1.76    46.79    53.75       2177    1
AGroupB     -17.17      1.83   -20.81   -13.68       2158    1
AGroupC     -22.89      1.85   -26.43   -19.25       2127    1
B            -0.41      0.06    -0.54    -0.29       2129    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     4.11      0.59     3.14     5.41       1854    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

tidyMCMC(data.brms, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate  std.error    conf.low   conf.high
1 b_Intercept  50.1970335 1.75540856  46.6017008  53.5440980
2   b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289
3   b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870
4         b_B  -0.4122714 0.06370931  -0.5416812  -0.2961189
5       sigma   4.1081576 0.58644550   3.1075609   5.3375706

In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.

Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the likelihood. I will breifly outline the most important diagnostics, however, please refer to Tutorial 4.3, Secton 3.1: Markov Chain Monte Carlo sampling for a discussion of these diagnostics.

Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
Autocorrelation plot for each paramter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the Chain and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom autocorrelation plot.
Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.

Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.

Trace plots
View trace plots
library(MCMCpack) plot(data.mcmcpack)
Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.

Raftery diagnostic

View Raftery diagnostic

library(MCMCpack)
raftery.diag(data.mcmcpack)

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                   
             Burn-in  Total Lower bound  Dependence
             (M)      (N)   (Nmin)       factor (I)
 (Intercept) 2        3834  3746         1.020     
 AGroup B    2        3771  3746         1.010     
 AGroup C    2        3962  3746         1.060     
 B           2        3741  3746         0.999     
 sigma2      2        3771  3746         1.010

The Raftery diagnostics estimate that we would require about 3900 samples to reach the specified level of confidence in convergence. As we have 10,000 samples, we can be confidence that convergence has occurred.

Autocorrelation diagnostic

View autocorrelations

library(MCMCpack)
autocorr.diag(data.mcmcpack)

        (Intercept)      AGroup B     AGroup C            B       sigma2
Lag 0   1.000000000  1.000000e+00  1.000000000  1.000000000  1.000000000
Lag 1  -0.003346420 -4.064355e-03 -0.002480710  0.002480390  0.132659644
Lag 5   0.016500991  1.299476e-02 -0.003939996  0.001939800 -0.018030548
Lag 10 -0.005233676 -6.125242e-06 -0.013564685 -0.006005238  0.006407515
Lag 50  0.006330549 -9.343650e-03 -0.011855223 -0.002673821  0.007502186

A lag of 1 appears to be mainly sufficient to avoid autocorrelation.

Again, prior to examining the summaries, we should have explored the convergence diagnostics.

library(coda)
data.mcmc = as.mcmc(data.r2jags)

Trace plots
plot(data.mcmc)
Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as $\beta$s)
preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]") plot(as.mcmc(data.r2jags)[, preds])

Raftery diagnostic

raftery.diag(data.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       36200 3746          9.66     
 beta[2]  20       38660 3746         10.30     
 beta[3]  20       38030 3746         10.20     
 beta[4]  20       36200 3746          9.66     
 deviance 20       37410 3746          9.99     
 sigma    20       38030 3746         10.20     


[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       39300 3746         10.50     
 beta[2]  20       38660 3746         10.30     
 beta[3]  20       39950 3746         10.70     
 beta[4]  20       37410 3746          9.99     
 deviance 20       39300 3746         10.50     
 sigma    20       36800 3746          9.82     


[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       38030 3746         10.20     
 beta[2]  20       36800 3746          9.82     
 beta[3]  20       37410 3746          9.99     
 beta[4]  20       38030 3746         10.20     
 deviance 20       36800 3746          9.82     
 sigma    20       36200 3746          9.66

The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.

Autocorrelation diagnostic

autocorr.diag(data.mcmc)

              beta[1]       beta[2]      beta[3]      beta[4]      deviance         sigma
Lag 0    1.0000000000  1.0000000000  1.000000000  1.000000000  1.0000000000  1.0000000000
Lag 10  -0.0086789407  0.0002409218 -0.012834845 -0.014876564 -0.0155077394 -0.0009823887
Lag 50  -0.0133637002 -0.0146591155 -0.022266022  0.007955582 -0.0131845968 -0.0066436459
Lag 100  0.0007452402  0.0128511117 -0.001800657 -0.007853232  0.0008253937 -0.0101001901
Lag 500 -0.0029682531  0.0011976990  0.001848702 -0.005150388 -0.0016034261 -0.0118212907

A lag of 10 appears to be sufficient to avoid autocorrelation (poor mixing).

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda

Traceplots

library(coda)
s = as.array(data.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)

Autocorrelation

library(coda)
s = as.array(data.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)

            beta[1]       beta[2]      beta[3]
Lag 0   1.000000000  1.0000000000  1.000000000
Lag 1   0.064013692  0.0605531343  0.026397084
Lag 5   0.007321517  0.0009337118  0.006726571
Lag 10 -0.016376224 -0.0098527765 -0.009376572
Lag 50 -0.017683148 -0.0290390559  0.010863075

via rstan
- Traceplots
  stan_trace(data.rstan)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data.rstan)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data.rstan)
  A lag of 2 appears broadly sufficient to avoid autocorrelation (poor mixing).
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data.rstan)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data.rstan)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.
via bayesplot
- Trace plots and density plots
  library(bayesplot) mcmc_trace(as.matrix(data.rstan), regex_pars = "beta|sigma")
  library(bayesplot) mcmc_combo(as.matrix(data.rstan), regex_pars = "beta|sigma")
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Density plots
  library(bayesplot) mcmc_dens(as.matrix(data.rstan), regex_pars = "beta|sigma")
  Density plots sugggest mean or median would be appropriate to describe the fixed posteriors and median is appropriate for the sigma posterior.

via shinystan

library(shinystan)
launch_shinystan(data.rstan)

It is worth exploring the influence of our priors.

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STANARM model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda

Traceplots

library(coda)
s = as.array(data.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)

plot of chunk tut7.5bRSTANARMcodaTraceplots

Autocorrelation

library(coda)
s = as.array(data.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)

         (Intercept)     AGroup B     AGroup C            B
Lag 0   1.0000000000  1.000000000  1.000000000  1.000000000
Lag 1   0.0290559273  0.001379705 -0.023247646 -0.018452103
Lag 5  -0.0035230620 -0.003667234  0.004820692  0.008991159
Lag 10  0.0180539117 -0.016292304  0.028751812  0.024889288
Lag 50  0.0008045354 -0.013900306  0.008315425  0.009651236

via rstan
- Traceplots
  stan_trace(data.rstanarm)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data.rstanarm)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data.rstanarm)
  A lag of 2 appears broadly sufficient to avoid autocorrelation (poor mixing).
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data.rstanarm)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data.rstanarm)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.
via bayesplot
- Trace plots and density plots
  library(bayesplot) mcmc_trace(as.array(data.rstanarm), regex_pars = "Intercept|x|sigma")
  mcmc_combo(as.array(data.rstanarm))
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Density plots
  mcmc_dens(as.array(data.rstanarm))
  Density plots sugggest mean or median would be appropriate to describe the fixed posteriors and median is appropriate for the sigma posterior.

via rstanarm
The rstanarm package provides additional posterior checks.

Posterior vs Prior - this compares the posterior estimate for each parameter against the associated prior. If the spread of the priors is small relative to the posterior, then it is likely that the priors are too influential. On the other hand, overly wide priors can lead to computational issues.

library(rstanarm)
posterior_vs_prior(data.rstanarm, color_by = "vs", group_by = TRUE,
    facet_args = list(scales = "free_y"))

Gradient evaluation took 3.6e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.318686 seconds (Warm-up)
               0.056968 seconds (Sampling)
               0.375654 seconds (Total)


Gradient evaluation took 1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.371641 seconds (Warm-up)
               0.056263 seconds (Sampling)
               0.427904 seconds (Total)

plot of chunk tut7.5bRSTANARMposterorvsprior

via shinystan

						  library(shinystan) 
						  launch_shinystan(data.rstanarm))

Again, prior to examining the summaries, we should have explored the convergence diagnostics. Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda
- Traceplots
- Autocorrelation
via rstan
- Traceplots
  stan_trace(data.brms$fit)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data.brms)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data.brms$fit)
  A lag of 2 appears broadly sufficient to avoid autocorrelation (poor mixing).
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data.brms$fit)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data.brms$fit)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.

Model validation

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.

For more complex models (those that contain multiple effects, it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.

There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.

Unfortunately the definitions of standardized and studentized residuals appears to vary and the two terms get used interchangeably. I will adopt the following definitions:

Standardized residuals:		the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
Studentized residuals:		the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
Pearson residuals:		the raw residuals divided by the standard deviation of the response variable.

The mark of a good model is being able to predict well. In an ideal world, we would have sufficiently large sample size as to permit us to hold a fraction (such as 25%) back thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25%. Unfortunately, such a luxury is still rare in ecology.

The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within MCMCpack. However, we can calculate them manually form the posteriors.

mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
head(mcmc)

  (Intercept)  AGroup B  AGroup C          B   sigma2
1    50.18839 -18.23379 -22.87414 -0.4394640 14.11917
2    49.39556 -15.64877 -20.98086 -0.4528605 13.22004
3    49.44854 -16.36367 -23.66886 -0.3887300 15.57976
4    51.46891 -16.26642 -24.26313 -0.4709924 12.01597
5    50.55617 -17.54613 -22.44262 -0.4252965 16.35805
6    50.73340 -17.54078 -23.75846 -0.4598987 10.84731

coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = newdata
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))

ggplot(newdata) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = as.matrix(data.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], sqrt(mcmc[i, "sigma2"])))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))

Conclusions the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")

mcmc_areas(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within JAGS. However, we can calculate them manually form the posteriors.

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = newdata
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))

ggplot(newdata) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

mcmc_areas(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within RSTAN. However, we can calculate them manually form the posteriors.

mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = newdata
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))

ggplot(newdata) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.rstan), regex_pars = "beta|sigma")

mcmc_areas(as.matrix(data.rstan), regex_pars = "beta|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within RSTANARM. However, we can calculate them manually form the posteriors.

resid = resid(data.rstanarm)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

resid = resid(data.rstanarm)
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))

ggplot(dat) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

resid = resid(data.rstanarm)
sresid = resid/sd(resid)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

y_pred = posterior_predict(data.rstanarm)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value",
    -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill

ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))

Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")

mcmc_areas(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within BRMS. However, we can calculate them manually form the posteriors.

resid = resid(data.brms)[, "Estimate"]
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

resid = resid(data.brms)[, "Estimate"]
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))

ggplot(dat) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

resid = resid(data.brms)[, "Estimate"]
sresid = resid/sd(resid)
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

y_pred = posterior_predict(data.brms)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value",
    -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill

ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))

Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.brms), regex_pars = "b_|sigma")

mcmc_areas(as.matrix(data.brms), regex_pars = "b_|sigma")

Parameter estimates (posterior summaries)

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Although all parameters in a Bayesian analysis are considered random and are considered a distribution, rarely would it be useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions are do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.

library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }

}

Matrix model (MCMCpack)

summary(data.mcmcpack)

Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                Mean     SD Naive SE Time-series SE
(Intercept)  50.2015 1.7846 0.017846      0.0178458
AGroup B    -17.1433 1.8782 0.018782      0.0187821
AGroup C    -22.9133 1.8682 0.018682      0.0186817
B            -0.4138 0.0637 0.000637      0.0006543
sigma2       17.0079 5.0889 0.050889      0.0581569

2. Quantiles for each variable:

                2.5%      25%      50%      75%    97.5%
(Intercept)  46.7714  49.0274  50.2019  51.3437  53.7977
AGroup B    -20.8797 -18.3726 -17.1445 -15.8884 -13.4552
AGroup C    -26.6231 -24.1304 -22.9030 -21.6955 -19.1900
B            -0.5397  -0.4556  -0.4134  -0.3714  -0.2885
sigma2        9.7749  13.4022  16.1334  19.6823  29.3438

# OR
library(broom)
tidyMCMC(data.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  50.2014506 1.78458442  46.7660798  53.7766054
2    AGroup B -17.1433463 1.87821127 -20.7503649 -13.3485818
3    AGroup C -22.9132573 1.86817189 -26.6020778 -19.1852142
4           B  -0.4137879 0.06369625  -0.5369301  -0.2860994
5      sigma2  17.0078626 5.08890812   8.9411967  27.1577316

Conclusions:

the intercept of the first group (Group A) is 50.2014506
the mean of the second group (Group B) is -17.1433463 units greater than (A)
the mean of the third group (Group C) is -22.9132573 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4137879 units increase in Y

While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## since values are less than zero
mcmcpvalue(data.mcmcpack[, 2])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(data.mcmcpack[, 3])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(data.mcmcpack[, 4])  # effect of (slope = 0)

[1] 0

mcmcpvalue(data.mcmcpack[, 2:4])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups.

Matrix model (JAGS)

print(data.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   50.201   1.803  46.617  49.021  50.207  51.404  53.741 1.001  8400
beta[2]  -17.141   1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001  8400
beta[3]  -22.917   1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001  5900
beta[4]   -0.414   0.065  -0.543  -0.457  -0.414  -0.372  -0.285 1.001 15000
sigma      4.175   0.619   3.178   3.735   4.108   4.531   5.587 1.001  7200
deviance 169.232   3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.3 and DIC = 175.5
DIC is an estimate of expected predictive error (lower deviance is better).

# OR
library(broom)
tidyMCMC(as.mcmc(data.r2jags), conf.int = TRUE, conf.method = "HPDinterval")

      term    estimate  std.error    conf.low   conf.high
1  beta[1]  50.2007407 1.80285405  46.5994039  53.6923687
2  beta[2] -17.1407725 1.87934256 -20.8989661 -13.5038133
3  beta[3] -22.9165431 1.88243441 -26.5907610 -19.1480417
4  beta[4]  -0.4139048 0.06529536  -0.5434197  -0.2853424
5 deviance 169.2319386 3.54753309 163.9376050 176.1387130
6    sigma   4.1749569 0.61875758   3.0490746   5.3897463

Conclusions:

the intercept of the first group (Group A) is 50.2007407
the mean of the second group (Group B) is -17.1407725 units greater than (A)
the mean of the third group (Group C) is -22.9165431 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4139048 units increase in Y

## since values are less than zero
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[2]"])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[3]"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[4]"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, 2:4])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups.

Matrix model (RSTAN)

print(data.rstan, pars = c("beta", "sigma"))

Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  50.22    0.06 1.79  46.60  49.03  50.24  51.40  53.83  1027    1
beta[2] -17.12    0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19  1210    1
beta[3] -22.88    0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24  1407    1
beta[4]  -0.42    0.00 0.06  -0.54  -0.46  -0.42  -0.37  -0.29  1282    1
sigma     4.11    0.02 0.58   3.12   3.70   4.05   4.45   5.37  1353    1

Samples were drawn using NUTS(diag_e) at Fri Nov  3 15:05:11 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

# OR
library(broom)
tidyMCMC(data.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))

     term    estimate  std.error    conf.low   conf.high
1 beta[1]  50.2161150 1.78767398  46.4168478  53.5737947
2 beta[2] -17.1206004 1.91600172 -20.7216779 -13.0233814
3 beta[3] -22.8794576 1.92793878 -26.8184381 -19.3130173
4 beta[4]  -0.4157079 0.06337683  -0.5361929  -0.2891039
5   sigma   4.1094504 0.58022288   3.0779992   5.3039276

Conclusions:

the intercept of the first group (Group A) is 50.216115
the mean of the second group (Group B) is -17.1206004 units greater than (A)
the mean of the third group (Group C) is -22.8794576 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4157079 units increase in Y

## since values are less than zero
mcmcpvalue(as.matrix(data.rstan)[, "beta[2]"])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstan)[, "beta[3]"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstan)[, "beta[4]"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstan)[, 2:4])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups.

library(loo)
(full = loo(extract_log_lik(data.rstan)))

Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -87.3 3.5
p_loo         4.6 1.0
looic       174.5 7.1

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     29    96.7%
 (0.5, 0.7]   (ok)        1     3.3%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.

# now fit a model without main factor
modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,1000);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Xmat <- model.matrix(~1, data)
data.list <- with(data, list(y = Y, X = Xmat, n = nrow(data), nX = ncol(Xmat)))
data.rstan.red <- stan(data = data.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)

In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4,
                 from file12361ebf982d.cpp:8:
/usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
 #  define BOOST_NO_CXX11_RVALUE_REFERENCES
 ^
<command-line>:0:0: note: this is the location of the previous definition

SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1).

Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.018509 seconds (Warm-up)
               0.041875 seconds (Sampling)
               0.060384 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.017436 seconds (Warm-up)
               0.046029 seconds (Sampling)
               0.063465 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3).

Gradient evaluation took 6e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.012316 seconds (Warm-up)
               0.026582 seconds (Sampling)
               0.038898 seconds (Total)

(reduced = loo(extract_log_lik(data.rstan.red)))

Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo   -116.0 3.2
p_loo         1.6 0.4
looic       232.0 6.4

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

Error in discrete == discrete[1]: comparison of these types is not implemented

Conclusions: the expected out-of-sample predictive accuracy is substantially lower for the model that includes x. This might be used to suggest that the inferential evidence for a general effect of x on y.

Matrix model (RSTANARM)

summary(data.rstanarm)

Model Info:

 function:  stan_glm
 family:    gaussian [identity]
 formula:   Y ~ A + B
 algorithm: sampling
 priors:    see help('prior_summary')
 sample:    2250 (posterior sample size)
 num obs:   30

Estimates:
                mean   sd     2.5%   25%    50%    75%    97.5%
(Intercept)     50.2    1.9   46.4   49.0   50.2   51.5   54.1 
AGroup B       -17.2    1.9  -21.0  -18.5  -17.2  -15.9  -13.3 
AGroup C       -22.9    2.0  -26.6  -24.2  -22.9  -21.6  -19.0 
B               -0.4    0.1   -0.5   -0.5   -0.4   -0.4   -0.3 
sigma            4.2    0.6    3.2    3.8    4.1    4.5    5.6 
mean_PPD        30.0    1.1   27.8   29.3   30.0   30.7   32.2 
log-posterior  -98.3    1.8 -103.1  -99.1  -97.9  -96.9  -95.9 

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.0  1.0  1980 
AGroup B      0.0  1.0  2145 
AGroup C      0.0  1.0  2250 
B             0.0  1.0  2250 
sigma         0.0  1.0  1200 
mean_PPD      0.0  1.0  1720 
log-posterior 0.1  1.0   960 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

# OR
library(broom)
tidyMCMC(data.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")

           term    estimate std.error     conf.low   conf.high
1   (Intercept)  50.2286902 1.9365801   46.4428738  54.1263008
2      AGroup B -17.2114440 1.9386491  -21.0066484 -13.4338089
3      AGroup C -22.9037572 1.9679361  -26.6757211 -19.0189639
4             B  -0.4138748 0.0681432   -0.5451839  -0.2852376
5         sigma   4.1990238 0.6317351    3.0581470   5.4328442
6      mean_PPD  30.0266842 1.1171901   27.8860300  32.2422092
7 log-posterior -98.2605466 1.8364590 -101.7802647 -95.7415132

Conclusions:

the intercept of the first group (Group A) is 50.2286902
the mean of the second group (Group B) is -17.211444 units greater than (A)
the mean of the third group (Group C) is -22.9037572 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4138748 units increase in Y

## since values are less than zero
head(as.matrix(data.rstanarm))

          parameters
iterations (Intercept)  AGroup B  AGroup C          B    sigma
      [1,]    49.25161 -18.13674 -20.71374 -0.3963549 4.384061
      [2,]    50.28149 -20.18446 -23.60224 -0.3823662 4.601748
      [3,]    47.22865 -13.74348 -20.42610 -0.3439257 3.559008
      [4,]    50.36383 -18.72387 -23.75338 -0.3776768 4.211066
      [5,]    49.33028 -18.88190 -20.94786 -0.3634085 3.830714
      [6,]    49.08087 -14.40159 -22.87480 -0.4144611 3.793997

mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup B"])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup C"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstanarm)[, "B"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data.rstanarm)[, 2:4])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups. There is very little evidence that the response of group D differs from that of group A

library(loo)
(full = loo(data.rstanarm))

Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -87.2 3.3
p_loo         4.3 0.9
looic       174.4 6.7

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.

data.rstanarm.red = update(data.rstanarm, . ~ 1)

Gradient evaluation took 3.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.38 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.027383 seconds (Warm-up)
               0.039991 seconds (Sampling)
               0.067374 seconds (Total)


Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.0218 seconds (Warm-up)
               0.039113 seconds (Sampling)
               0.060913 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.02501 seconds (Warm-up)
               0.043676 seconds (Sampling)
               0.068686 seconds (Total)

(reduced = loo(data.rstanarm.red))

Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo   -116.0 3.1
p_loo         1.5 0.4
looic       232.0 6.1

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

elpd_diff        se 
    -28.8       4.5

Matrix model (BRMS)

summary(data.brms)

 Family: gaussian(identity) 
Formula: Y ~ A + B 
   Data: data (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    50.20      1.76    46.79    53.75       2177    1
AGroupB     -17.17      1.83   -20.81   -13.68       2158    1
AGroupC     -22.89      1.85   -26.43   -19.25       2127    1
B            -0.41      0.06    -0.54    -0.29       2129    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     4.11      0.59     3.14     5.41       1854    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

# OR
library(broom)
tidyMCMC(data.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate  std.error    conf.low   conf.high
1 b_Intercept  50.1970335 1.75540856  46.6017008  53.5440980
2   b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289
3   b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870
4         b_B  -0.4122714 0.06370931  -0.5416812  -0.2961189
5       sigma   4.1081576 0.58644550   3.1075609   5.3375706

Conclusions:

the intercept of the first group (Group A) is 50.1970335
the mean of the second group (Group B) is -17.1728328 units greater than (A)
the mean of the third group (Group C) is -22.8915861 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4122714 units increase in Y

## since values are less than zero
head(as.matrix(data.brms))

          parameters
iterations b_Intercept b_AGroupB b_AGroupC        b_B    sigma      lp__
      [1,]    46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789
      [2,]    50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635
      [3,]    49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988
      [4,]    52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522
      [5,]    49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716
      [6,]    51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362

mcmcpvalue(as.matrix(data.brms)[, "b_AGroupB"])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.brms)[, "b_AGroupC"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data.brms)[, "b_B"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data.brms)[, 2:4])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups. There is very little evidence that the response of group D differs from that of group A

library(loo)
(full = loo(data.brms))

  LOOIC   SE
 174.34 7.05

data.brms.red = update(data.brms, . ~ 1)

SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).

Gradient evaluation took 1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.019211 seconds (Warm-up)
               0.014546 seconds (Sampling)
               0.033757 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).

Gradient evaluation took 4e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.015666 seconds (Warm-up)
               0.017803 seconds (Sampling)
               0.033469 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).

Gradient evaluation took 4e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.016449 seconds (Warm-up)
               0.016509 seconds (Sampling)
               0.032958 seconds (Total)

(reduced = loo(data.brms.red))

  LOOIC   SE
 232.15 6.38

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

Error in discrete == discrete[1]: comparison of these types is not implemented

Graphical summaries

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.

Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with each group.

Matrix model (MCMCpack)

mcmc = data.mcmcpack
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bMCMCpackGraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bMCMCpackGraphicalSummaries2

Matrix model (JAGS)

mcmc = data.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bJAGSGraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bJAGSGraphicalSummaries2

Matrix model (RSTAN)

mcmc = as.matrix(data.rstan)
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANGraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANGraphicalSummaries2

Matrix model (RSTANARM)

## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B),
    max(data$B), len = 100))
fit = posterior_linpred(data.rstanarm, newdata = newdata)
newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANARMGraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata = data
pp = posterior_linpred(data.rstanarm, newdata = rdata)
fit = as.vector(apply(pp, 2, median))
resid = resid(data.rstanarm)
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANARMGraphicalSummaries2

Matrix model (BRMS)

Although we could calculated the fitted values via matrix multiplication of the coefficients and the model matrix (as for MCMCpack, RJAGS and RSTAN), for more complex models, it is more convenient to use the marginal_effects function that comes with brms.

plot(marginal_effects(data.brms), points = TRUE)

plot of chunk tut7.5bBRMSGraphicalSummaries.a

# OR
eff = plot(marginal_effects(data.brms), points = TRUE, plot = FALSE)
eff

$A

$B

## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B),
    max(data$B), len = 100))
fit = fitted(data.brms, newdata = newdata, summary = FALSE)
newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bBRMSGraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata = data
fit = fitted(data.brms, summary = TRUE)[, "Estimate"]
resid = resid(data.brms)[, "Estimate"]
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bBRMSGraphicalSummaries2

Posteriors

In frequentist statistics, when we have more than two groups, we are typically not only interested in whether there is evidence for an overall "effect" of a factor - we are also interested in how various groups compare to one another.

To explore these trends, we either compare each group to each other in a pairwise manner (controlling for family-wise Type I error rates) or we explore an independent subset of the possible comparisons. Although these alternate approaches can adequately address a specific research agenda, often they impose severe limitations and compromises on the scope and breadth of questions that can be asked of your data. The reason for these limitations is that in a frequentist framework, any single hypothesis carries with it a (nominally) 5% chance of a false rejection (since it is based on long-run frequency). Thus, performing multiple tests are likely to compound this error rate. The point is, that each comparison is compared to its own probability distribution (and each carries a 5% error rate).

By contrast, in Bayesian statistics, all comparisons (contrasts) are drawn from the one (hopefully stable and convergent) posterior distribution and this posterior is invariant to the type and number of comparisons drawn. Hence, the theory clearly indicates that having generated our posterior distribution, we can then query this distribution in any way that we wish thereby allowing us to explore all of our research questions simultaneously.

Bayesian "contrasts" can be performed either:

within the Bayesian sampling model or
construct them from the returned MCMC samples (they are drawn from the posteriors)

Only the latter will be demonstrated as it povides a consistent approach across all routines.

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

In order to allow direct comparison to the frequentist equivalents, I will explore the same set of planned and "Tukey's" test comparisons described here. For the "planned comparison" we defined two contrasts:

group B vs group C
group A vs the average of groups B and C

Of course each of these could be explored at multiple values of B, however, since we fit an additive model (which assumes that the slopes are homogeneous), the contrasts will be constant throughout the domain of B.

Lets start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix. This is a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter what level of B we estimate these efffects at - so lets use the mean B.

mcmc = data.mcmcpack
coefs <- as.matrix(mcmc)[, 1:4]
newdata = expand.grid(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
tuk.mat

	 Multiple Comparisons of Means: Tukey Contrasts

                  Group A Group B Group C
Group B - Group A      -1       1       0
Group C - Group A      -1       0       1
Group C - Group B       0      -1       1

Xmat <- model.matrix(~A + B, data = newdata)
Xmat

  (Intercept) AGroup B AGroup C        B
1           1        0        0 16.50175
2           1        1        0 16.50175
3           1        0        1 16.50175
attr(,"assign")
[1] 0 1 1 2
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"

pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.143346  1.878211 -20.750365 -13.348582
2 Group C - Group A -22.913257  1.868172 -26.602078 -19.185214
3 Group C - Group B  -5.769911  1.821653  -9.552934  -2.271914

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()

With a couple of modifications, we could also express this as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change to). Hence, we generate an additional mcmc matrix that represents the cell means for the divisor group (group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instread of the difference. We can then divide the original mcmc matrix above with this new divisor mcmc matrix to yeild a mcmc matrix of percentage change.

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get a
# mcmc matrix of ..
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat

                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175

comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.78138  9.862131  -84.99925 -46.617429
2 Group C - Group A -112.84960 15.200206 -144.80969 -84.547452
3 Group C - Group B  -28.70578 10.358636  -50.05632  -9.004512

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()

And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat

     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333

mcmc = data.mcmcpack
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat

     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292

(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

  term estimate std.error conf.low conf.high
1 var1 5.769911 1.8216528 2.271914  9.552934
2 var2 6.123330 0.9187204 4.344498  7.965326

mcmc = data.r2jags$BUGSoutput$sims.matrix
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.140773  1.879343 -20.898966 -13.503813
2 Group C - Group A -22.916543  1.882434 -26.590761 -19.148042
3 Group C - Group B  -5.775771  1.880763  -9.402033  -1.979244

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get a
# mcmc matrix of ..
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat

                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175

comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.77396  9.861029  -85.38307 -46.803881
2 Group C - Group A -112.95081 15.533036 -144.44128 -83.512153
3 Group C - Group B  -28.79007 10.740187  -50.07490  -8.241323

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat

     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333

mcmc = data.r2jags$BUGSoutput$sims.matrix
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat

     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292

(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

  term estimate std.error conf.low conf.high
1 var1 5.775771  1.880763 1.979244  9.402033
2 var2 6.124007  0.913995 4.278228  7.879558

mcmc = data.rstan
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.120600  1.916002 -20.721678 -13.023381
2 Group C - Group A -22.879458  1.927939 -26.818438 -19.313017
3 Group C - Group B  -5.758857  1.908307  -9.465031  -1.972144

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get a
# mcmc matrix of ..
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat

                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175

comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.68005  9.938618  -86.57138 -46.874022
2 Group C - Group A -112.65860 15.767824 -145.18904 -84.337691
3 Group C - Group B  -28.69047 10.825916  -48.80312  -6.685448

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat

     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333

mcmc = data.rstan
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat

     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292

(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

  term estimate std.error conf.low conf.high
1 var1 5.758857 1.9083068 1.972144  9.465031
2 var2 6.107318 0.9312412 4.373877  8.159077

mcmc = data.rstanarm
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.211444  1.938649 -21.006648 -13.433809
2 Group C - Group A -22.903757  1.967936 -26.675721 -19.018964
3 Group C - Group B  -5.692313  1.892174  -9.159957  -1.799432

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get a
# mcmc matrix of ..
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat

                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175

comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -66.16030  10.09871  -86.76507 -46.675359
2 Group C - Group A -112.70201  15.95929 -144.71321 -82.068118
3 Group C - Group B  -28.33604  10.75415  -50.15437  -8.304887

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat

     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333

mcmc = data.rstanarm
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat

     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292

(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

  term estimate std.error conf.low conf.high
1 var1 5.692313 1.8921744 1.799432  9.159957
2 var2 6.138562 0.9558204 4.235113  7.985014

mcmc = data.brms
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.172833  1.831351 -20.680311 -13.558229
2 Group C - Group A -22.891586  1.854493 -26.384632 -19.217887
3 Group C - Group B  -5.718753  1.846079  -9.169838  -2.065845

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get a
# mcmc matrix of ..
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat

                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175

comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))

               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.90339  9.650503  -84.05345 -46.582017
2 Group C - Group A -112.55948 15.341280 -143.66383 -84.530670
3 Group C - Group B  -28.43318 10.498519  -47.72283  -7.691681

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat

     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333

mcmc = data.brms
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat

     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292

(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

  term estimate std.error conf.low conf.high
1 var1 5.718753 1.8460785 2.065845  9.169838
2 var2 6.122501 0.8984429 4.344599  7.808558

Finite Population Standard Deviations

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Variance components, the amount of added variance attributed to each influence, are traditionally estimated for so called random effects. These are the effects for which the levels employed in the design are randomly selected to represent a broader range of possible levels. For such effects, effect sizes (differences between each level and a reference level) are of little value. Instead, the 'importance' of the variables are measured in units of variance components.

On the other hand, regular variance components for fixed factors (those whose measured levels represent the only levels of interest) are not logical - since variance components estimate variance as if the levels are randomly selected from a larger population. Nevertheless, in order to compare and contrast the scale of variability of both fixed and random factors, it is necessary to measure both on the same scale (sample or population based variance).

Finite-population variance components (Gelman, 2005) assume that the levels of all factors (fixed and random) in the design are all the possible levels available. In other words, they are assumed to represent finite populations of levels. Sample (rather than population) statistics are then used to calculate these finite-population variances (or standard deviations).

Since standard deviation (and variance) are bound at zero, standard deviation posteriors are typically non-normal. Consequently, medians and HPD intervals are more robust estimates.

library(broom)
mcmc = data.mcmcpack
head(mcmc)

Markov Chain Monte Carlo (MCMC) output:
Start = 1001 
End = 1007 
Thinning interval = 1 
     (Intercept)  AGroup B  AGroup C          B   sigma2
[1,]    50.18839 -18.23379 -22.87414 -0.4394640 14.11917
[2,]    49.39556 -15.64877 -20.98086 -0.4528605 13.22004
[3,]    49.44854 -16.36367 -23.66886 -0.3887300 15.57976
[4,]    51.46891 -16.26642 -24.26313 -0.4709924 12.01597
[5,]    50.55617 -17.54613 -22.44262 -0.4252965 16.35805
[6,]    50.73340 -17.54078 -23.75846 -0.4598987 10.84731
[7,]    51.16774 -17.41150 -22.41515 -0.3996571 17.57170

# A
wch = grep("^A", colnames(mcmc))
# Get the rowwise standard deviations between effects parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 4.080665 1.2858143 1.606486  6.754944
2     sd.B 5.030454 0.7743607 3.478134  6.527505
3 sd.resid 3.983878 0.1906139 3.761980  4.356994

# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 31.28801  7.123578 16.55312  44.15266
2     sd.B 38.61193  5.120548 28.29820  48.32862
3 sd.resid 30.05025  4.065060 24.60465  38.59845

## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()

plot of chunk tut7.5bMCMCpackFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is due to x.

library(broom)
mcmc = data.r2jags$BUGSoutput$sims.matrix
head(mcmc)

      beta[1]   beta[2]   beta[3]    beta[4] deviance    sigma
[1,] 52.60917 -18.82408 -23.38489 -0.4880702 167.0879 4.283265
[2,] 53.75233 -18.89312 -24.92630 -0.5191816 171.4824 5.124059
[3,] 51.15722 -16.81356 -23.25841 -0.5336914 170.4625 3.705732
[4,] 50.32853 -16.98682 -23.41316 -0.4149829 165.9603 4.545555
[5,] 48.78829 -15.18438 -24.19762 -0.4395482 172.1889 4.458376
[6,] 51.55891 -17.64261 -22.37656 -0.5040106 166.7893 4.155680

# A
wch = grep("beta.[2-3]", colnames(mcmc))
# Get the rowwise standard deviations between effects parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("beta.4", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
wch = grep("beta.[1-4]", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 4.085929 1.3242287 1.399537  6.648241
2     sd.B 5.031876 0.7938012 3.468931  6.606399
3 sd.resid 3.991447 0.1970424 3.762837  4.379866

# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 31.33381  7.376887 15.57164  43.97278
2     sd.B 38.65910  5.313671 27.75097  48.62132
3 sd.resid 29.97544  4.188131 24.58803  39.19370

## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()

plot of chunk tut7.5bR2JAGSFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is due to x.

library(broom)
mcmc = as.matrix(data.rstan)
head(mcmc)

          parameters
iterations  beta[1]   beta[2]   beta[3]    beta[4]    sigma    mu[1]    mu[2]    mu[3]    mu[4]
      [1,] 52.68797 -15.89902 -22.48872 -0.5325084 4.190521 49.72136 44.37439 39.97070 51.24060
      [2,] 51.04073 -14.86065 -23.63103 -0.4404722 3.869872 48.58686 44.16402 40.52145 49.84351
      [3,] 52.08091 -17.19621 -23.79866 -0.4785798 4.179663 49.41474 44.60927 40.65155 50.78012
      [4,] 51.54719 -19.52916 -24.99191 -0.3788141 4.283020 49.43681 45.63310 42.50041 50.51756
      [5,] 49.24231 -15.18013 -21.04165 -0.3906899 3.719325 47.06578 43.14281 39.91192 48.18041
      [6,] 49.65178 -16.39404 -19.21428 -0.4629938 3.691510 47.07244 42.42347 38.59465 48.39336
          parameters
iterations    mu[5]    mu[6]    mu[7]    mu[8]    mu[9]   mu[10]   mu[11]   mu[12]   mu[13]
      [1,] 40.47396 41.79718 41.35552 33.11874 51.77358 31.91453 32.08782 35.17453 31.86079
      [2,] 40.93773 42.03225 41.66692 34.85374 50.28437 33.85767 32.29147 34.84469 32.10367
      [3,] 41.10385 42.29307 41.89613 34.49351 51.25912 33.41126 30.65966 33.43378 30.45562
      [4,] 42.85842 43.79973 43.48554 37.62609 50.89671 36.76944 28.67375 30.86957 28.51224
      [5,] 40.28116 41.25197 40.92793 34.88479 48.57144 34.00129 30.61307 32.87772 30.44650
      [6,] 39.03221 40.18270 39.79869 32.63715 48.85676 31.59015 29.17031 31.85408 28.97291
          parameters
iterations   mu[14]   mu[15]   mu[16]   mu[17]   mu[18]   mu[19]   mu[20]   mu[21]   mu[22]
      [1,] 24.12069 24.92430 28.59623 33.75113 31.31670 16.35942 24.54276 24.16979 27.07583
      [2,] 25.70134 26.36605 29.40334 33.66730 31.65363 19.28149 26.05046 22.42234 24.82611
      [3,] 23.49939 24.22161 27.52168 32.15453 29.96664 16.52413 23.87872 22.86342 25.47515
      [4,] 23.00612 23.57779 26.18991 29.85699 28.12520 17.48493 23.30637 22.26606 24.33334
      [5,] 24.76775 25.35734 28.05136 31.83340 30.04731 19.07348 25.07742 23.77698 25.90907
      [6,] 22.24322 22.94193 26.13451 30.61648 28.49985 15.49512 22.61020 25.19514 27.72182
          parameters
iterations   mu[23]   mu[24]   mu[25]   mu[26]   mu[27]   mu[28]   mu[29]   mu[30] log_lik[1]
      [1,] 21.17639 32.86026 23.41873 25.46849 10.27852 11.46507 20.12482 28.62927  -2.355750
      [2,] 19.94630 29.61079 21.80108 23.49658 10.93196 11.91344 19.07648 26.11106  -2.348154
      [3,] 20.17316 30.67378 22.18841 24.03059 10.37895 11.44533 19.22809 26.87127  -2.362435
      [4,] 20.13662 28.44826 21.73177 23.18992 12.38413 13.22821 19.38856 25.43843  -2.385424
      [5,] 21.58078 30.15298 23.22594 24.72981 13.58525 14.45580 20.80927 27.04880  -2.564270
      [6,] 22.59250 32.75114 24.54212 26.32431 13.11726 14.14891 21.67820 29.07247  -2.560301
          parameters
iterations log_lik[2] log_lik[3] log_lik[4] log_lik[5] log_lik[6] log_lik[7] log_lik[8] log_lik[9]
      [1,]  -2.380651  -2.392311  -2.359395  -2.644059  -2.370981  -2.656101  -2.358963  -3.937789
      [2,]  -2.321658  -2.285946  -2.297979  -2.721302  -2.283643  -2.564277  -2.439362  -3.463785
      [3,]  -2.366243  -2.356686  -2.349263  -2.769866  -2.352204  -2.562282  -2.450072  -3.731235
      [4,]  -2.375320  -2.422274  -2.374746  -3.224846  -2.411614  -2.408977  -3.057791  -3.556057
      [5,]  -2.413647  -2.289148  -2.466128  -2.560216  -2.299999  -2.726477  -2.418549  -2.888813
      [6,]  -2.546048  -2.467202  -2.424089  -2.338922  -2.442708  -3.079584  -2.224991  -2.983441
          parameters
iterations log_lik[10] log_lik[11] log_lik[12] log_lik[13] log_lik[14] log_lik[15] log_lik[16]
      [1,]   -2.560221   -2.385313   -2.758740   -2.999655   -2.808351   -2.357355   -5.539627
      [2,]   -2.291578   -2.298123   -2.669736   -2.956470   -3.313616   -2.305454   -6.602203
      [3,]   -2.391007   -2.530008   -2.468268   -3.440631   -2.676762   -2.386746   -4.935821
      [4,]   -2.499489   -2.925431   -2.381090   -4.170168   -2.601230   -2.460896   -4.195091
      [5,]   -2.246331   -2.469401   -2.312064   -3.614919   -3.014530   -2.232484   -5.873203
      [6,]   -2.561867   -2.812913   -2.232745   -4.376755   -2.390970   -2.440833   -4.643851
          parameters
iterations log_lik[17] log_lik[18] log_lik[19] log_lik[20] log_lik[21] log_lik[22] log_lik[23]
      [1,]   -2.602428   -2.541644   -2.411276   -2.925312   -2.550560   -2.796703   -2.397480
      [2,]   -2.549710   -2.440502   -2.909103   -3.472430   -2.915533   -2.369028   -2.480363
      [3,]   -2.402926   -2.791773   -2.423397   -2.767720   -2.795438   -2.507550   -2.496696
      [4,]   -2.397022   -3.282272   -2.553796   -2.661806   -2.936898   -2.413541   -2.518648
      [5,]   -2.272281   -2.768729   -2.857917   -3.144357   -2.565447   -2.513088   -2.259383
      [6,]   -2.226004   -3.294590   -2.237378   -2.464606   -2.320908   -3.001039   -2.225789
          parameters
iterations log_lik[24] log_lik[25] log_lik[26] log_lik[27] log_lik[28] log_lik[29] log_lik[30]
      [1,]   -4.438382   -3.307734   -2.499837   -2.496503   -3.808212   -2.814545   -2.829593
      [2,]   -3.213944   -2.854593   -2.275338   -2.554512   -3.772549   -3.133717   -2.355333
      [3,]   -3.512033   -2.945362   -2.369487   -2.507912   -3.821285   -3.044316   -2.505695
      [4,]   -2.842706   -2.833433   -2.373597   -2.891791   -3.165134   -2.993189   -2.395956
      [5,]   -3.470845   -3.366605   -2.318398   -3.350387   -2.858380   -2.637408   -2.461301
      [6,]   -4.845740   -3.980841   -2.585876   -3.176836   -2.957508   -2.450304   -2.981159
          parameters
iterations      lp__
      [1,] -56.99328
      [2,] -56.17719
      [3,] -54.73138
      [4,] -55.42088
      [5,] -54.94832
      [6,] -56.99186

# A
wch = grep("beta.[2-3]", colnames(mcmc))
# Get the rowwise standard deviations between effects parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("beta.[4]", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 4.072432 1.3484539 1.394517  6.692788
2     sd.B 5.053796 0.7704775 3.514660  6.518542
3 sd.resid 3.993504 0.1978560 3.764060  4.389092

# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 31.42998  7.687620 14.88253  43.87950
2     sd.B 38.57359  5.482235 27.76693  48.93887
3 sd.resid 30.05401  4.027620 25.10582  39.01512

## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()

plot of chunk tut7.5bR2RSTANFinitePopulation

Conclusions: Approximately 31.4% of the total finite population standard deviation is due to x.

library(broom)
mcmc = as.matrix(data.rstanarm)
head(mcmc)

          parameters
iterations (Intercept)  AGroup B  AGroup C          B    sigma
      [1,]    49.25161 -18.13674 -20.71374 -0.3963549 4.384061
      [2,]    50.28149 -20.18446 -23.60224 -0.3823662 4.601748
      [3,]    47.22865 -13.74348 -20.42610 -0.3439257 3.559008
      [4,]    50.36383 -18.72387 -23.75338 -0.3776768 4.211066
      [5,]    49.33028 -18.88190 -20.94786 -0.3634085 3.830714
      [6,]    49.08087 -14.40159 -22.87480 -0.4144611 3.793997

# A
wch = grep("^A", colnames(mcmc))
# Get the rowwise standard deviations between effects parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 4.026941 1.3323358 1.272390  6.477067
2     sd.B 5.031510 0.8284226 3.467658  6.627846
3 sd.resid 4.004512 0.2119432 3.762475  4.420304

# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 31.14949  7.543440 13.88979  43.33205
2     sd.B 38.63719  5.482952 27.23552  49.17710
3 sd.resid 30.23369  4.336975 24.98450  40.19062

## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()

plot of chunk tut7.5bR2RSTANARMFinitePopulation

Conclusions: Approximately 31.1% of the total finite population standard deviation is due to x.

library(broom)
mcmc = as.matrix(data.brms)
head(mcmc)

          parameters
iterations b_Intercept b_AGroupB b_AGroupC        b_B    sigma      lp__
      [1,]    46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789
      [2,]    50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635
      [3,]    49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988
      [4,]    52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522
      [5,]    49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716
      [6,]    51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362

# A
wch = grep("^b_A", colnames(mcmc))
# Get the rowwise standard deviations between effects parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^b_B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
wch = c(grep("(b_Intercept)|^b_", colnames(mcmc)))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 4.043769 1.3053747 1.460773  6.484054
2     sd.B 5.012018 0.7745195 3.599942  6.585264
3 sd.resid 3.982644 0.1855119 3.761674  4.356373

# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))

      term estimate std.error conf.low conf.high
1     sd.A 31.29667  7.332669 15.34081  43.31182
2     sd.B 38.43989  5.149650 28.75407  49.05275
3 sd.resid 30.14396  4.349195 24.45494  39.31055

## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()

plot of chunk tut7.5bR2BRMSFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is due to x.

$R^2$

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

In a frequentist context, the $R^2$ value is seen as a useful indicator of goodness of fit. Whilst it has long been acknowledged that this measure is not appropriate for comparing models (for such purposes information criterion such as AIC are more appropriate), it is nevertheless useful for estimating the amount (percent) of variance explained by the model.

In a frequentist context, $R^2$ is calculated as the variance in predicted values divided by the variance in the observed (response) values. Unfortunately, this classical formulation does not translate simply into a Bayesian context since the equivalently calculated numerator can be larger than the an equivalently calculated denominator - thereby resulting in an $R^2$ greater than 100%. Gelman, Goodrich, Gabry, and Ali (2017) proposed an alternative formulation in which the denominator comprises the sum of the explained variance and the variance of the residuals.

So in the standard regression model notation of: $$ \begin{align} y_i \sim{}& N(\mu_i, \sigma)\\ \mu_i =& \mathbf{X}\boldsymbol{\beta} \end{align} $$ The $R^2$ could be formulated as: $$ R^2 = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_e} $$ where $\sigma^2_f = var(\mu)$, ($\mu = \mathbf{X}\boldsymbol{\beta})$) and for Gaussian models $\sigma^2_e = var(y-\mu)$

library(broom)
mcmc <- data.mcmcpack
Xmat = model.matrix(~A + B, data)
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.8712662 0.02114025 0.8316521 0.8958785

# for comparison with frequentist
summary(lm(Y ~ A + B, data))

Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

library(broom)
mcmc <- data.r2jags$BUGSoutput$sims.matrix
Xmat = model.matrix(~A + B, data)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.8708952 0.02144096 0.8290317 0.8960092

# for comparison with frequentist
summary(lm(Y ~ A + B, data))

Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

library(broom)
mcmc <- as.matrix(data.rstan)
Xmat = model.matrix(~A + B, data)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.8706419 0.02178395 0.8297436 0.8959157

# for comparison with frequentist
summary(lm(Y ~ A + B, data))

Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

library(broom)
mcmc <- as.matrix(data.rstanarm)
Xmat = model.matrix(~A + B, data)
wch = grep("Intercept|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate std.error  conf.low conf.high
1 var1 0.8699886 0.0241677 0.8256175 0.8960312

# for comparison with frequentist
summary(lm(Y ~ A + B, data))

Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

library(broom)
mcmc <- as.matrix(data.brms)
Xmat = model.matrix(~A + B, data)
wch = grep("b_", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.8712322 0.02007052 0.8322119 0.8956157

# for comparison with frequentist
summary(lm(Y ~ A + B, data))

Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

Dealing with heterogeneous slopes

Details of data generation

Random data incorporating the following trends (effect parameters)

the sample size per treatment=10
the categorical $x$ variable with 3 levels
the first treatment group has a population mean of 40.
the other two treatments reduce the mean by 15 and 20 units respectively
the data are drawn from normal distributions with a mean of 0 and standard deviation of 4 ($\sigma^2=16$)
the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15

set.seed(6)
n <- 10
p <- 3
A.eff <- c(40, -15, -20)
beta <- c(-0.45, -0.1, 0.5)
sigma <- 4
B <- rnorm(n * p, 0, 15)
A <- gl(p, n, lab = paste("Group", LETTERS[1:3]))
mm <- model.matrix(~A * B)
data1 <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4))
data1$B <- data1$B + 20
head(data1)

        A        B        Y
1 Group A 24.04409 35.49432
2 Group A 10.55022 46.22216
3 Group A 33.02990 29.41898
4 Group A 45.90793 24.10656
5 Group A 20.36281 44.38834
6 Group A 25.52038 36.87477

Exploratory data analysis

library(car)
scatterplot(Y ~ B | A, data = data1)

boxplot(Y ~ A, data1)

# OR via ggplot
library(ggplot2)
ggplot(data1, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")

ggplot(data1, aes(y = Y, x = A)) + geom_boxplot()

Conclusions:

there is no evidence of obvious non-normality
the assumption of linearity seems reasonable
the variability of the three groups seems approximately equal
the slopes (Y vs B trends) do appear to differ between treatment groups - in particular, Group C seems to portray a different trend to Groups A and B.

Homogeneity of slopes

anova(lm(Y ~ B * A, data = data1))

Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
B          1 1257.14 1257.14  98.520 5.685e-10 ***
A          2 2042.02 1021.01  80.015 2.420e-11 ***
B:A        2  510.02  255.01  19.985 7.778e-06 ***
Residuals 24  306.25   12.76                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusions: there is strong evidence to suggest that the assumption of equal slopes is violated.

Model fitting or statistical analysis

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

The multiplicative model is constructed and specified in the same way as the additive model. The difference is that the model matrix ($\mathbf{X}$) contains the interaction terms.

$$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,100)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$

library(MCMCpack)
data1.mcmcpack <- MCMCregress(Y ~ A * B, data = data1)

Define the model

modelString = "
  model {
  #Likelihood
  for (i in 1:n) {
  y[i]~dnorm(mean[i],tau)
  mean[i] <- inprod(beta[],X[i,])
  }
  #Priors
  for (i in 1:ngroups) {
  beta[i] ~ dnorm(0, 1.0E-6) 
  }
  sigma ~ dunif(0, 100)
  tau <- 1 / (sigma * sigma)
  }
  "

Define the data

Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:

the response variable (y)
the predictor model matrix (X)
the total number of observed items (n)
the number of predictor terms (nX)

This all needs to be contained within a list object. We will create two data lists, one for each of the hypotheses.

X <- model.matrix(~A * B, data1)
data1.list <- with(data1, list(y = Y, X = X, n = nrow(data1), ngroups = ncol(X)))

Define the MCMC chain parameters

Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:

the nodes (estimated parameters) to monitor (return samples for)
the number of MCMC chains (3)
the number of burnin steps (1000)
the thinning factor (10)
the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains

params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter

[1] 53000

Fit the model

library(R2jags)

data1.r2jags <- jags(data = data1.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 30
   Unobserved stochastic nodes: 7
   Total graph size: 298

Initializing model

print(data1.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   49.553   2.336  45.049  48.003  49.538  51.092  54.162 1.001 15000
beta[2]  -12.227   3.093 -18.248 -14.285 -12.228 -10.182  -6.142 1.001 11000
beta[3]  -28.270   2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000
beta[4]   -0.495   0.093  -0.680  -0.556  -0.494  -0.433  -0.313 1.001 15000
beta[5]   -0.121   0.115  -0.347  -0.197  -0.122  -0.046   0.105 1.001 15000
beta[6]    0.410   0.111   0.193   0.336   0.410   0.482   0.636 1.001 15000
sigma      3.772   0.585   2.827   3.354   3.707   4.106   5.105 1.001 15000
deviance 163.101   4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 9.6 and DIC = 172.7
DIC is an estimate of expected predictive error (lower deviance is better).

data1.mcmc.list <- as.mcmc(data1.r2jags)

Model matrix formulation

The minimum model in Stan required to fit the above simple regression follows. Note the following modifications from the model defined in JAGS:

the normal distribution is defined by variance rather than precision
rather than using a uniform prior for sigma, I am using a half-Cauchy

Define the model

modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,100);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Define the data

Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:

the response variable (y)
the predictor model matrix (X)
the total number of observed items (n)
the number of predictor terms (nX)

This all needs to be contained within a list object. We will create two data lists, one for each of the hypotheses.

Xmat <- model.matrix(~A * B, data1)
data1.list <- with(data1, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data1)))

Fit the model

## load the rstan package
library(rstan)

data1.rstan <- stan(data = data1.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)

SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1).

Gradient evaluation took 2.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.099447 seconds (Warm-up)
               0.151706 seconds (Sampling)
               0.251153 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.106983 seconds (Warm-up)
               0.116191 seconds (Sampling)
               0.223174 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.142543 seconds (Warm-up)
               0.116691 seconds (Sampling)
               0.259234 seconds (Total)

print(data1.rstan, par = c("beta", "sigma"))

Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  49.50    0.08 2.34  44.55  48.01  49.58  51.01  54.11   902    1
beta[2] -12.29    0.09 3.04 -18.60 -14.30 -12.17 -10.22  -6.70  1094    1
beta[3] -28.21    0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04   957    1
beta[4]  -0.49    0.00 0.09  -0.67  -0.55  -0.49  -0.44  -0.31   978    1
beta[5]  -0.12    0.00 0.11  -0.34  -0.19  -0.12  -0.05   0.11  1164    1
beta[6]   0.41    0.00 0.11   0.18   0.34   0.41   0.48   0.62   953    1
sigma     3.73    0.02 0.58   2.81   3.32   3.66   4.05   5.07  1271    1

Samples were drawn using NUTS(diag_e) at Sat Nov  4 07:45:04 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

library(rstanarm)
library(broom)
library(coda)

data1.rstanarm = stan_glm(Y ~ A * B, data = data1, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100),
    prior = normal(0, 100), prior_aux = cauchy(0, 2))

Gradient evaluation took 7.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.71 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.586557 seconds (Warm-up)
               0.34573 seconds (Sampling)
               0.932287 seconds (Total)


Gradient evaluation took 2.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.478284 seconds (Warm-up)
               0.353797 seconds (Sampling)
               0.832081 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.438293 seconds (Warm-up)
               0.338371 seconds (Sampling)
               0.776664 seconds (Total)

print(data1.rstanarm)

stan_glm
 family:  gaussian [identity]
 formula: Y ~ A * B
------

Estimates:
            Median MAD_SD
(Intercept)  49.6    2.3 
AGroup B    -12.3    3.1 
AGroup C    -28.3    3.0 
B            -0.5    0.1 
AGroup B:B   -0.1    0.1 
AGroup C:B    0.4    0.1 
sigma         3.7    0.6 

Sample avg. posterior predictive 
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 26.9    1.0  

------
For info on the priors used see help('prior_summary.stanreg').

tidyMCMC(data1.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  49.5438430 2.38328655  44.8579446  54.1707410
2    AGroup B -12.3097610 3.13510315 -18.8059933  -6.6018889
3    AGroup C -28.2751561 3.02810337 -33.8195233 -21.9065319
4           B  -0.4940998 0.09381214  -0.6922853  -0.3200262
5  AGroup B:B  -0.1204605 0.11724241  -0.3514843   0.1104638
6  AGroup C:B   0.4111006 0.11271108   0.1810014   0.6214744
7       sigma   3.7727701 0.59136185   2.7322401   4.9499110

library(brms)
library(broom)
library(coda)

data1.brms = brm(Y ~ A * B, data = data1, iter = 2000, warmup = 500, chains = 3,
    thin = 2, refresh = 0, prior = c(prior(normal(0, 100), class = "Intercept"),
        prior(normal(0, 100), class = "b"), prior(cauchy(0, 5), class = "sigma")))

Gradient evaluation took 2.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.094494 seconds (Warm-up)
               0.100503 seconds (Sampling)
               0.194997 seconds (Total)


Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.076777 seconds (Warm-up)
               0.104894 seconds (Sampling)
               0.181671 seconds (Total)


Gradient evaluation took 9e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.091052 seconds (Warm-up)
               0.09965 seconds (Sampling)
               0.190702 seconds (Total)

print(data1.brms)

 Family: gaussian(identity) 
Formula: Y ~ A * B 
   Data: data1 (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    49.60      2.32    44.97    54.25       1182    1
AGroupB     -12.33      3.06   -18.49    -6.37       1216    1
AGroupC     -28.29      2.97   -34.18   -22.34       1246    1
B            -0.49      0.09    -0.67    -0.32       1160    1
AGroupB:B    -0.12      0.11    -0.35     0.11       1186    1
AGroupC:B     0.41      0.11     0.19     0.62       1248    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     3.71      0.54     2.81     4.98       1423    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

tidyMCMC(data1.brms, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate std.error    conf.low   conf.high
1 b_Intercept  49.6027270 2.3173028  44.8624049  54.0547491
2   b_AGroupB -12.3262056 3.0553552 -18.7775957  -6.7594663
3   b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177
4         b_B  -0.4926334 0.0908813  -0.6696454  -0.3177777
5 b_AGroupB:B  -0.1212027 0.1126979  -0.3497945   0.1053154
6 b_AGroupC:B   0.4067397 0.1097094   0.1878784   0.6203292
7       sigma   3.7129053 0.5394484   2.6776034   4.7406944

MCMC diagnostics

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
Autocorrelation plot for each paramter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the Chain and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom autocorrelation plot.
Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.

Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.

Trace plots
View trace plots
library(MCMCpack) plot(data1.mcmcpack)
Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.

Raftery diagnostic

View Raftery diagnostic

library(MCMCpack)
raftery.diag(data1.mcmcpack)

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                   
             Burn-in  Total Lower bound  Dependence
             (M)      (N)   (Nmin)       factor (I)
 (Intercept) 2        3929  3746         1.050     
 AGroup B    2        3650  3746         0.974     
 AGroup C    2        3710  3746         0.990     
 B           2        3650  3746         0.974     
 AGroup B:B  2        3962  3746         1.060     
 AGroup C:B  2        3865  3746         1.030     
 sigma2      2        3929  3746         1.050

Autocorrelation diagnostic

View autocorrelations

library(MCMCpack)
autocorr.diag(data1.mcmcpack)

       (Intercept)     AGroup B     AGroup C            B   AGroup B:B    AGroup C:B      sigma2
Lag 0   1.00000000  1.000000000  1.000000000  1.000000000  1.000000000  1.0000000000 1.000000000
Lag 1  -0.01234075 -0.008277569 -0.011438936 -0.003198364 -0.011825247 -0.0003818285 0.217142614
Lag 5   0.01859622  0.019122855 -0.003823357  0.013434001  0.026688474  0.0020263629 0.005952516
Lag 10  0.01991699  0.021417875  0.011460174  0.013885559  0.022966909  0.0030683915 0.023030028
Lag 50 -0.00543317 -0.009493830 -0.007385782 -0.005974147 -0.008253629  0.0009582506 0.011223549

A lag of 1 appears to be mainly sufficient to avoid autocorrelation.

Again, prior to examining the summaries, we should have explored the convergence diagnostics.

library(coda)
data1.mcmc = as.mcmc(data1.r2jags)

Trace plots
plot(data1.mcmc)
Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as $\beta$s)
preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]") plot(as.mcmc(data1.r2jags)[, preds])

Raftery diagnostic

raftery.diag(data1.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       37410 3746          9.99     
 beta[2]  20       38660 3746         10.30     
 beta[3]  20       36800 3746          9.82     
 beta[4]  20       38660 3746         10.30     
 beta[5]  20       37410 3746          9.99     
 beta[6]  20       38030 3746         10.20     
 deviance 20       39000 3746         10.40     
 sigma    20       36200 3746          9.66     


[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       37410 3746          9.99     
 beta[2]  20       38030 3746         10.20     
 beta[3]  20       38050 3746         10.20     
 beta[4]  10       37410 3746          9.99     
 beta[5]  20       36800 3746          9.82     
 beta[6]  20       36800 3746          9.82     
 deviance 20       38030 3746         10.20     
 sigma    20       36800 3746          9.82     


[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta[1]  20       36200 3746          9.66     
 beta[2]  20       37410 3746          9.99     
 beta[3]  20       39300 3746         10.50     
 beta[4]  20       39300 3746         10.50     
 beta[5]  20       36800 3746          9.82     
 beta[6]  20       36800 3746          9.82     
 deviance 20       39300 3746         10.50     
 sigma    20       36810 3746          9.83

Autocorrelation diagnostic

autocorr.diag(data1.mcmc)

             beta[1]      beta[2]      beta[3]      beta[4]      beta[5]      beta[6]      deviance
Lag 0    1.000000000  1.000000000  1.000000000  1.000000000  1.000000000  1.000000000  1.0000000000
Lag 10   0.004437399  0.002545394  0.005457633  0.002408673  0.005032132  0.003931065 -0.0005090962
Lag 50  -0.005470768 -0.006470215 -0.008316906 -0.004031177 -0.009725834 -0.003744444  0.0069346156
Lag 100 -0.006510539 -0.018290658 -0.009018695  0.003875208 -0.005127466  0.004179197 -0.0033519319
Lag 500 -0.002091585 -0.006610883  0.001302301 -0.000777923 -0.003703581  0.005995650  0.0081995570
               sigma
Lag 0    1.000000000
Lag 10   0.002181234
Lag 50  -0.002582261
Lag 100  0.004125329
Lag 500  0.008170166

A lag of 10 appears to be sufficient to avoid autocorrelation (poor mixing).

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda

Traceplots

library(coda)
s = as.array(data1.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)

plot of chunk tut7.5bSTAN2codaTraceplots

Autocorrelation

library(coda)
s = as.array(data1.rstan)
wch = grep("beta", dimnames(s)$parameters)
s = s[, , wch]
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)

           beta[1]      beta[2]      beta[3]      beta[4]      beta[5]
Lag 0   1.00000000  1.000000000  1.000000000  1.000000000  1.000000000
Lag 1   0.19962255  0.134452951  0.162486507  0.181592704  0.125717018
Lag 5  -0.01473278  0.013954338 -0.020209597 -0.007642484  0.007016602
Lag 10 -0.01802321 -0.008248189 -0.007702265 -0.019093851 -0.002760047
Lag 50 -0.03711757 -0.035737496 -0.030485873 -0.020661680 -0.014791965

via rstan
- Traceplots
  stan_trace(data1.rstan)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data1.rstan)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data1.rstan)
  A lag of 2 appears borderline in a couple of cases suggesting some autocorrelation (poor mixing). It might be worth increasing the thining rate to 4. Feel free to do so, I am not going to bother for this demo.
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data1.rstan)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data1.rstan)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.
via bayesplot
- Trace plots and density plots
  library(bayesplot) mcmc_trace(as.matrix(data1.rstan), regex_pars = "beta|sigma")
  library(bayesplot) mcmc_combo(as.matrix(data1.rstan), regex_pars = "beta|sigma")
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Density plots
  library(bayesplot) mcmc_dens(as.matrix(data1.rstan), regex_pars = "beta|sigma")
  Density plots sugggest mean or median would be appropriate to describe the fixed posteriors and median is appropriate for the sigma posterior.

via shinystan

library(shinystan)
launch_shinystan(data1.rstan)

It is worth exploring the influence of our priors.

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STANARM model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda

Traceplots

library(coda)
s = as.array(data1.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
plot(mcmc)

plot of chunk tut7.5bRSTANARM2codaTraceplots

Autocorrelation

library(coda)
s = as.array(data1.rstanarm)
mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
autocorr.diag(mcmc)

        (Intercept)     AGroup B     AGroup C           B   AGroup B:B   AGroup C:B
Lag 0   1.000000000  1.000000000  1.000000000 1.000000000  1.000000000  1.000000000
Lag 1   0.051997304  0.022839434  0.049122699 0.029332271  0.018600498  0.039303026
Lag 5  -0.009030666 -0.023573121 -0.021521616 0.002608995 -0.008629405 -0.008541973
Lag 10  0.022740083  0.005983251 -0.006472736 0.009874178 -0.009109182 -0.041524169
Lag 50  0.023718877  0.022629378  0.043072824 0.029933371  0.029119110  0.025530832

via rstan
- Traceplots
  stan_trace(data1.rstanarm)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data1.rstanarm)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data1.rstanarm)
  A lag of 2 appears broadly sufficient to avoid autocorrelation (poor mixing).
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data1.rstanarm)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data1.rstanarm)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.
via bayesplot
- Trace plots and density plots
  library(bayesplot) mcmc_trace(as.array(data1.rstanarm), regex_pars = "Intercept|x|sigma")
  mcmc_combo(as.array(data1.rstanarm))
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Density plots
  mcmc_dens(as.array(data1.rstanarm))
  Density plots sugggest mean or median would be appropriate to describe the fixed posteriors and median is appropriate for the sigma posterior.

via rstanarm
The rstanarm package provides additional posterior checks.

library(rstanarm)
posterior_vs_prior(data1.rstanarm, color_by = "vs", group_by = TRUE,
    facet_args = list(scales = "free_y"))

Gradient evaluation took 6.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.69 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.403063 seconds (Warm-up)
               0.077008 seconds (Sampling)
               0.480071 seconds (Total)


Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.280175 seconds (Warm-up)
               0.104319 seconds (Sampling)
               0.384494 seconds (Total)

plot of chunk tut7.5bRSTANARM2posterorvsprior

via shinystan

						  library(shinystan) 
						  launch_shinystan(data1.rstanarm))

Again, prior to examining the summaries, we should have explored the convergence diagnostics. Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
use the numerous routines that come with the rstan package
use the routines that come with the bayesplot package
explore the diagnostics interactively via shinystan

We will explore all of these:

via coda
- Traceplots
- Autocorrelation
via rstan
- Traceplots
  stan_trace(data1.brms$fit)
  Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
- Raftery diagnostic
  raftery.diag(data1.brms)
  
  Quantile (q) = 0.025 Accuracy (r) = +/- 0.005 Probability (s) = 0.95 You need a sample size of at least 3746 with these values of q, r and s
  The Raftery diagnostics for each chain estimate that we would require no more than 5000 samples to reach the specified level of confidence in convergence. As we have 16,667 samples, we can be confidence that convergence has occurred.
- Autocorrelation diagnostic
  stan_ac(data1.brms$fit)
  A lag of 2 appears borderline in a couple of cases suggesting some autocorrelation (poor mixing). It might be worth increasing the thining rate to 4. Feel free to do so, I am not going to bother for this demo.
- Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentiall slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonscence parameter space.
  stan_rhat(data1.brms$fit)
  In this instance, all rhat values are well below 1.05 (a good thing).
- Another measure of sampling efficiency is Effective Sample Size (ess). ess indicate the number samples (or proportion of samples that the sampling algorithm deamed effective. The sampler rejects samples on the basis of certain criterion and when it does so, the previous sample value is used. Hence while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
  stan_ess(data1.brms$fit)
  In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate paramter properties.

Model validation

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.

Unfortunately the definitions of standardized and studentized residuals appears to vary and the two terms get used interchangeably. I will adopt the following definitions:

Standardized residuals:		the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
Studentized residuals:		the raw residuals divided by the standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
Pearson residuals:		the raw residuals divided by the standard deviation of the response variable.

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within MCMCpack. However, we can calculate them manually form the posteriors.

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
head(mcmc)

  (Intercept)   AGroup B  AGroup C          B AGroup B:B AGroup C:B    sigma2
1    48.70652 -11.241113 -25.43591 -0.4057086 -0.1929317  0.2966322  9.505405
2    43.17996  -8.274010 -22.66362 -0.3300430 -0.1959714  0.2629181 13.109720
3    49.48138 -10.480517 -27.46155 -0.4829662 -0.1457857  0.3635239 14.426140
4    47.75332 -11.387909 -27.18387 -0.3879251 -0.2055644  0.3181976 10.130435
5    49.43988 -11.918492 -29.20874 -0.4436741 -0.2074017  0.3961033  9.917436
6    46.51075  -6.677204 -22.78787 -0.3225533 -0.3883363  0.1657746 11.947826

coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = newdata1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))

ggplot(newdata1) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = as.matrix(data1.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], sqrt(mcmc[i, "sigma2"])))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")

mcmc_areas(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within JAGS. However, we can calculate them manually form the posteriors.

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = newdata1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))

ggplot(newdata1) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data1 = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], mcmc[i, "sigma"]))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

mcmc_areas(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within RSTAN. However, we can calculate them manually form the posteriors.

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = newdata1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))

ggplot(newdata1) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], mcmc[i, "sigma"]))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.rstan), regex_pars = "beta|sigma")

mcmc_areas(as.matrix(data1.rstan), regex_pars = "beta|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within RSTANARM. However, we can calculate them manually form the posteriors.

resid = resid(data1.rstanarm)
fit = fitted(data1.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

resid = resid(data1.rstanarm)
dat = data1 %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))

ggplot(dat) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

resid = resid(data1.rstanarm)
sresid = resid/sd(resid)
fit = fitted(data1.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

y_pred = posterior_predict(data1.rstanarm)
newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep",
    value = "Value", -A:-Y)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")

mcmc_areas(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")

Rather than dublicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within BRMS. However, we can calculate them manually form the posteriors.

resid = resid(data1.brms)[, "Estimate"]
fit = fitted(data1.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

Residuals against predictors

resid = resid(data1.brms)[, "Estimate"]
dat = data1 %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))

ggplot(dat) + geom_point(aes(y = resid, x = B))

And now for studentized residuals

resid = resid(data1.brms)[, "Estimate"]
sresid = resid/sd(resid)
fit = fitted(data1.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Lets see how well data simulated from the model reflects the raw data

y_pred = posterior_predict(data1.brms)
newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep",
    value = "Value", -A:-Y)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.brms), regex_pars = "b_|sigma")

mcmc_areas(as.matrix(data1.brms), regex_pars = "b_|sigma")

Parameter estimates (posterior summaries)

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }

}

Matrix model (MCMCpack)

summary(data1.mcmcpack)

Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                Mean      SD  Naive SE Time-series SE
(Intercept)  49.5353 2.29334 0.0229334      0.0226186
AGroup B    -12.2262 3.09179 0.0309179      0.0309179
AGroup C    -28.2439 2.93282 0.0293282      0.0293282
B            -0.4938 0.09038 0.0009038      0.0008764
AGroup B:B   -0.1211 0.11358 0.0011358      0.0011178
AGroup C:B    0.4089 0.10829 0.0010829      0.0010829
sigma2       13.8964 4.43200 0.0443200      0.0552651

2. Quantiles for each variable:

                2.5%      25%      50%       75%    97.5%
(Intercept)  45.0690  48.0206  49.5282  51.01720  54.0757
AGroup B    -18.3801 -14.2357 -12.1642 -10.19136  -6.1832
AGroup C    -34.0521 -30.1795 -28.2026 -26.30389 -22.5493
B            -0.6743  -0.5528  -0.4940  -0.43428  -0.3178
AGroup B:B   -0.3412  -0.1958  -0.1209  -0.04575   0.1030
AGroup C:B    0.1948   0.3382   0.4086   0.47902   0.6240
sigma2        7.7536  10.8320  13.0991  16.09205  24.8023

# OR
library(broom)
tidyMCMC(data1.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  49.5353232 2.29333768  45.1289511  54.1244471
2    AGroup B -12.2262443 3.09178506 -18.4166155  -6.2375566
3    AGroup C -28.2438752 2.93282206 -33.9185333 -22.4861115
4           B  -0.4938007 0.09037721  -0.6733914  -0.3173760
5  AGroup B:B  -0.1211118 0.11358441  -0.3414655   0.1027879
6  AGroup C:B   0.4088582 0.10828868   0.1931050   0.6219178
7      sigma2  13.8964411 4.43200264   7.0646504  22.9045959

Conclusions:

the intercept of the first group (Group A) is 49.5353232
the mean of the second group (Group B) is -12.2262443 units greater than (A)
the mean of the third group (Group C) is -28.2438752 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4938007 units increase in Y
difference in slope between Group B and Group A -0.1211118
difference in slope between Group C and Group A 0.4088582

The 95% confidence interval for the effects of Group B, Group C and the partial slope associated with B do not overlapp with 0 implying a significant difference between group A and groups B, C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A, however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A.

## since values are less than zero
mcmcpvalue(data1.mcmcpack[, 2])  # effect of (B-A = 0)

[1] 0.001

mcmcpvalue(data1.mcmcpack[, 3])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(data1.mcmcpack[, 4])  # effect of (slope = 0)

[1] 0

mcmcpvalue(data1.mcmcpack[, 5])  # effect of (slopeB - slopeA = 0)

[1] 0.2763

mcmcpvalue(data1.mcmcpack[, 6])  # effect of (slopeC - slopeA = 0)

[1] 7e-04

mcmcpvalue(data1.mcmcpack[, 2:6])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups.

Matrix model (JAGS)

print(data1.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   49.553   2.336  45.049  48.003  49.538  51.092  54.162 1.001 15000
beta[2]  -12.227   3.093 -18.248 -14.285 -12.228 -10.182  -6.142 1.001 11000
beta[3]  -28.270   2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000
beta[4]   -0.495   0.093  -0.680  -0.556  -0.494  -0.433  -0.313 1.001 15000
beta[5]   -0.121   0.115  -0.347  -0.197  -0.122  -0.046   0.105 1.001 15000
beta[6]    0.410   0.111   0.193   0.336   0.410   0.482   0.636 1.001 15000
sigma      3.772   0.585   2.827   3.354   3.707   4.106   5.105 1.001 15000
deviance 163.101   4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 9.6 and DIC = 172.7
DIC is an estimate of expected predictive error (lower deviance is better).

# OR
library(broom)
tidyMCMC(as.mcmc(data1.r2jags), conf.int = TRUE, conf.method = "HPDinterval")

      term    estimate  std.error    conf.low   conf.high
1  beta[1]  49.5534512 2.33556926  45.0234105  54.1328491
2  beta[2] -12.2268098 3.09333864 -18.3836687  -6.3089149
3  beta[3] -28.2700616 2.99256629 -34.2921382 -22.4881054
4  beta[4]  -0.4947167 0.09264266  -0.6838619  -0.3183081
5  beta[5]  -0.1214410 0.11480484  -0.3469758   0.1054090
6  beta[6]   0.4099855 0.11145489   0.1877435   0.6289609
7 deviance 163.1011311 4.37873657 156.1255607 171.7611813
8    sigma   3.7720815 0.58508086   2.7189187   4.9291545

Conclusions:

the intercept of the first group (Group A) is 49.5534512
the mean of the second group (Group B) is -12.2268098 units greater than (A)
the mean of the third group (Group C) is -28.2700616 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4947167 units increase in Y
difference in slope between Group B and Group A -0.121441
difference in slope between Group C and Group A 0.4099855

## since values are less than zero
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[2]"])  # effect of (B-A = 0)

[1] 0.0003333333

mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[3]"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[4]"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[5]"])  # effect of (slopeB - slopeA = 0)

[1] 0.28

mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[6]"])  # effect of (slopeC - slopeA = 0)

[1] 0.0007333333

mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, 2:6])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups.

Matrix model (RSTAN)

print(data1.rstan, pars = c("beta", "sigma"))

Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  49.50    0.08 2.34  44.55  48.01  49.58  51.01  54.11   902    1
beta[2] -12.29    0.09 3.04 -18.60 -14.30 -12.17 -10.22  -6.70  1094    1
beta[3] -28.21    0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04   957    1
beta[4]  -0.49    0.00 0.09  -0.67  -0.55  -0.49  -0.44  -0.31   978    1
beta[5]  -0.12    0.00 0.11  -0.34  -0.19  -0.12  -0.05   0.11  1164    1
beta[6]   0.41    0.00 0.11   0.18   0.34   0.41   0.48   0.62   953    1
sigma     3.73    0.02 0.58   2.81   3.32   3.66   4.05   5.07  1271    1

Samples were drawn using NUTS(diag_e) at Sat Nov  4 07:45:04 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

# OR
library(broom)
tidyMCMC(data1.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))

     term    estimate  std.error    conf.low    conf.high
1 beta[1]  49.5013545 2.34230542  44.9503609  54.24168440
2 beta[2] -12.2889364 3.04350709 -18.5232918  -6.66905473
3 beta[3] -28.2129289 2.96719159 -33.8515230 -22.12391688
4 beta[4]  -0.4929557 0.09152697  -0.6703121  -0.30402421
5 beta[5]  -0.1195490 0.11233322  -0.3536692   0.08719164
6 beta[6]   0.4082751 0.11091872   0.1750039   0.61796175
7   sigma   3.7255865 0.57520983   2.6288615   4.77664381

Conclusions:

the intercept of the first group (Group A) is 49.5013545
the mean of the second group (Group B) is -12.2889364 units greater than (A)
the mean of the third group (Group C) is -28.2129289 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4929557 units increase in Y
difference in slope between Group B and Group A -0.119549
difference in slope between Group C and Group A 0.4082751

## since values are less than zero
mcmcpvalue(as.matrix(data1.rstan)[, "beta[2]"])  # effect of (B-A = 0)

[1] 0.0006666667

mcmcpvalue(as.matrix(data1.rstan)[, "beta[3]"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data1.rstan)[, "beta[4]"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data1.rstan)[, "beta[5]"])  # effect of (slopeB - slopeA = 0)

[1] 0.278

mcmcpvalue(as.matrix(data1.rstan)[, "beta[6]"])  # effect of (slopeC - slopeA = 0)

[1] 0.0006666667

mcmcpvalue(as.matrix(data1.rstan)[, 2:6])  # effect of (model)

[1] 0

There is evidence that a model that includes an interaction is better than an additive model.

library(loo)
(full = loo(extract_log_lik(data1.rstan)))

Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -85.3 4.0
p_loo         6.3 1.6
looic       170.6 7.9

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     26    86.7%
 (0.5, 0.7]   (ok)        4    13.3%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.

# now fit a model without main factor
modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,1000);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Xmat <- model.matrix(~A + B, data1)
data1.list <- with(data1, list(y = Y, X = Xmat, n = nrow(data1), nX = ncol(Xmat)))
data1.rstan.red <- stan(data = data1.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)

SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1).

Gradient evaluation took 4.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.42 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.138805 seconds (Warm-up)
               0.122949 seconds (Sampling)
               0.261754 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2).

Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.106299 seconds (Warm-up)
               0.113143 seconds (Sampling)
               0.219442 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3).

Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.120129 seconds (Warm-up)
               0.110436 seconds (Sampling)
               0.230565 seconds (Total)

(reduced = loo(extract_log_lik(data1.rstan.red)))

Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -97.9 3.3
p_loo         5.0 1.1
looic       195.8 6.7

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     28    93.3%
 (0.5, 0.7]   (ok)        2     6.7%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

Error in discrete == discrete[1]: comparison of these types is not implemented

Conclusions: the expected out-of-sample predictive accuracy is substantially lower for the model that includes the interaction. This might be used to suggest that the inferential evidence for a general interaction effect.

Matrix model (RSTANARM)

summary(data1.rstanarm)

Model Info:

 function:  stan_glm
 family:    gaussian [identity]
 formula:   Y ~ A * B
 algorithm: sampling
 priors:    see help('prior_summary')
 sample:    2250 (posterior sample size)
 num obs:   30

Estimates:
                mean   sd     2.5%   25%    50%    75%    97.5%
(Intercept)     49.5    2.4   44.9   48.0   49.6   51.1   54.3 
AGroup B       -12.3    3.1  -18.3  -14.4  -12.3  -10.2   -6.1 
AGroup C       -28.3    3.0  -34.3  -30.3  -28.3  -26.3  -22.3 
B               -0.5    0.1   -0.7   -0.6   -0.5   -0.4   -0.3 
AGroup B:B      -0.1    0.1   -0.3   -0.2   -0.1    0.0    0.1 
AGroup C:B       0.4    0.1    0.2    0.3    0.4    0.5    0.6 
sigma            3.8    0.6    2.8    3.3    3.7    4.1    5.2 
mean_PPD        26.9    1.0   24.9   26.2   26.9   27.5   28.9 
log-posterior  -97.2    2.2 -102.5  -98.4  -96.8  -95.6  -94.1 

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.1  1.0  1938 
AGroup B      0.1  1.0  1974 
AGroup C      0.1  1.0  1902 
B             0.0  1.0  2126 
AGroup B:B    0.0  1.0  2068 
AGroup C:B    0.0  1.0  1989 
sigma         0.0  1.0  1226 
mean_PPD      0.0  1.0  1497 
log-posterior 0.1  1.0  1030 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

# OR
library(broom)
tidyMCMC(data1.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")

           term    estimate  std.error     conf.low   conf.high
1   (Intercept)  49.5438430 2.38328655   44.8579446  54.1707410
2      AGroup B -12.3097610 3.13510315  -18.8059933  -6.6018889
3      AGroup C -28.2751561 3.02810337  -33.8195233 -21.9065319
4             B  -0.4940998 0.09381214   -0.6922853  -0.3200262
5    AGroup B:B  -0.1204605 0.11724241   -0.3514843   0.1104638
6    AGroup C:B   0.4111006 0.11271108    0.1810014   0.6214744
7         sigma   3.7727701 0.59136185    2.7322401   4.9499110
8      mean_PPD  26.8713702 1.00446504   24.9737303  29.0017404
9 log-posterior -97.2208223 2.19450449 -101.6130115 -93.7209496

Conclusions:

the intercept of the first group (Group A) is 49.543843
the mean of the second group (Group B) is -12.309761 units greater than (A)
the mean of the third group (Group C) is -28.2751561 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4940998 units increase in Y
difference in slope between Group B and Group A -0.1204605
difference in slope between Group C and Group A 0.4111006

## since values are less than zero
head(as.matrix(data1.rstanarm))

          parameters
iterations (Intercept)  AGroup B  AGroup C          B  AGroup B:B AGroup C:B    sigma
      [1,]    49.10544 -12.03203 -30.24254 -0.5408720 -0.07177445  0.5097346 3.368864
      [2,]    49.57108 -10.46682 -27.74189 -0.5167232 -0.15566016  0.4159785 3.202435
      [3,]    52.79242 -16.30169 -31.83083 -0.5749994 -0.08438700  0.4699279 3.849197
      [4,]    50.85134 -14.46853 -28.92998 -0.5535872  0.03970759  0.4676661 4.339336
      [5,]    52.41561 -18.30582 -30.55315 -0.4583573 -0.10973607  0.3507038 4.801621
      [6,]    52.90121 -18.31527 -30.77699 -0.6350685  0.13775093  0.4858842 3.541416

mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B"])  # effect of (B-A = 0)

[1] 0

mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data1.rstanarm)[, "B"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B:B"])  # effect of (slopeB - slopeA = 0)

[1] 0.2951111

mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C:B"])  # effect of (slopeC - slopeA = 0)

[1] 0.001777778

mcmcpvalue(as.matrix(data1.rstanarm)[, 2:6])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups. There is very little evidence that the response of group D differs from that of group A

library(loo)
(full = loo(data1.rstanarm))

Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -85.9 4.1
p_loo         6.8 1.8
looic       171.9 8.2

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     24    80.0%
 (0.5, 0.7]   (ok)        5    16.7%
   (0.7, 1]   (bad)       1     3.3%
   (1, Inf)   (very bad)  0     0.0%
See help('pareto-k-diagnostic') for details.

data1.rstanarm.red = update(data1.rstanarm, . ~ A + B)

Gradient evaluation took 4.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.41 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.252022 seconds (Warm-up)
               0.130327 seconds (Sampling)
               0.382349 seconds (Total)


Gradient evaluation took 1.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.182518 seconds (Warm-up)
               0.152707 seconds (Sampling)
               0.335225 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.164992 seconds (Warm-up)
               0.158132 seconds (Sampling)
               0.323124 seconds (Total)

(reduced = loo(data1.rstanarm.red))

Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -97.7 3.1
p_loo         4.7 1.0
looic       195.5 6.3

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     27    90.0%
 (0.5, 0.7]   (ok)        3    10.0%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

elpd_diff        se 
    -11.8       4.2

Conclusions: the expected out-of-sample predictive accuracy is substantially lower for the model that includes the interaction (A:B). This might be used to suggest that the inferential evidence for a general interaction.

Matrix model (BRMS)

summary(data1.brms)

 Family: gaussian(identity) 
Formula: Y ~ A * B 
   Data: data1 (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    49.60      2.32    44.97    54.25       1182    1
AGroupB     -12.33      3.06   -18.49    -6.37       1216    1
AGroupC     -28.29      2.97   -34.18   -22.34       1246    1
B            -0.49      0.09    -0.67    -0.32       1160    1
AGroupB:B    -0.12      0.11    -0.35     0.11       1186    1
AGroupC:B     0.41      0.11     0.19     0.62       1248    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     3.71      0.54     2.81     4.98       1423    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

# OR
library(broom)
tidyMCMC(data1.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")

         term    estimate std.error    conf.low   conf.high
1 b_Intercept  49.6027270 2.3173028  44.8624049  54.0547491
2   b_AGroupB -12.3262056 3.0553552 -18.7775957  -6.7594663
3   b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177
4         b_B  -0.4926334 0.0908813  -0.6696454  -0.3177777
5 b_AGroupB:B  -0.1212027 0.1126979  -0.3497945   0.1053154
6 b_AGroupC:B   0.4067397 0.1097094   0.1878784   0.6203292
7       sigma   3.7129053 0.5394484   2.6776034   4.7406944

Conclusions:

the intercept of the first group (Group A) is 49.602727
the mean of the second group (Group B) is -12.3262056 units greater than (A)
the mean of the third group (Group C) is -28.2870152 units greater than (A)
a one unit increase in B in Group A is associated with a -0.4926334 units increase in Y
difference in slope between Group B and Group A -0.1212027
difference in slope between Group C and Group A 0.4067397

## since values are less than zero
head(as.matrix(data1.brms))

          parameters
iterations b_Intercept  b_AGroupB b_AGroupC        b_B b_AGroupB:B b_AGroupC:B    sigma      lp__
      [1,]    47.70303 -10.255010 -25.32093 -0.3742985 -0.17137635   0.2600049 3.092922 -116.2110
      [2,]    50.55262 -14.941232 -30.12870 -0.4643934 -0.13460334   0.4086522 3.230887 -114.2718
      [3,]    48.22410  -9.538243 -25.80487 -0.4858593 -0.10874505   0.4073568 4.423214 -116.0591
      [4,]    48.07606 -10.397199 -26.96966 -0.4404635 -0.16751208   0.3346061 2.849359 -113.2790
      [5,]    49.43749 -10.676002 -28.03187 -0.5099977 -0.11303887   0.4276380 3.777916 -113.2435
      [6,]    50.64586 -13.340054 -27.57699 -0.5712098 -0.02596519   0.4449288 2.659352 -115.2269

mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB"])  # effect of (B-A = 0)

[1] 0.0004444444

mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC"])  # effect of (C-A = 0)

[1] 0

mcmcpvalue(as.matrix(data1.brms)[, "b_B"])  # effect of (slope = 0)

[1] 0

mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB:B"])  # effect of (slopeB - slopeA = 0)

[1] 0.264

mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC:B"])  # effect of (slopeB - slopeA = 0)

[1] 0.001333333

mcmcpvalue(as.matrix(data1.brms)[, 2:6])  # effect of (model)

[1] 0

There is evidence that the reponse differs between the groups. There is very little evidence that the response of group D differs from that of group A

library(loo)
(full = loo(data1.brms))

  LOOIC   SE
 170.84 8.04

data1.brms.red = update(data1.brms, . ~ A + B)

SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).

Gradient evaluation took 2.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.059643 seconds (Warm-up)
               0.028751 seconds (Sampling)
               0.088394 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).

Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.05915 seconds (Warm-up)
               0.028504 seconds (Sampling)
               0.087654 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).

Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.056265 seconds (Warm-up)
               0.027707 seconds (Sampling)
               0.083972 seconds (Total)

(reduced = loo(data1.brms.red))

  LOOIC  SE
 195.67 6.7

par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)

compare_models(full, reduced)

Error in discrete == discrete[1]: comparison of these types is not implemented

Graphical summaries

General
MCMCpack
JAGS
STAN
RSTANARM
BRMS

Matrix model (MCMCpack)

mcmc = data1.mcmcpack
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bMCMCpack2GraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bMCMCpack2GraphicalSummaries2

Matrix model (JAGS)

mcmc = data1.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]",
    "beta[6]")]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bJAGSG2raphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bJAGS2GraphicalSummaries2

Matrix model (RSTAN)

mcmc = as.matrix(data1.rstan)
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]",
    "beta[6]")]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTAN2GraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTAN2GraphicalSummaries2

Matrix model (RSTANARM)

## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B),
    max(data1$B), len = 100))
fit = posterior_linpred(data1.rstanarm, newdata = newdata1)
newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANARM2GraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata1 = data1
pp = posterior_linpred(data1.rstanarm, newdata = rdata1)
fit = as.vector(apply(pp, 2, median))
resid = resid(data1.rstanarm)
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bRSTANARM2GraphicalSummaries2

Matrix model (BRMS)

plot(marginal_effects(data1.brms), points = TRUE)

plot of chunk tut7.5bBRMS2GraphicalSummaries.a

# OR
eff = plot(marginal_effects(data1.brms), points = TRUE, plot = FALSE)
eff

$A

$B

$`B:A`

## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B),
    max(data1$B), len = 100))
fit = fitted(data1.brms, newdata = newdata1, summary = FALSE)
newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bBRMS2GraphicalSummaries

As this is simple single factor ANOVA, we can simple add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata1 = data1
fit = fitted(data1.brms, summary = TRUE)[, "Estimate"]
resid = resid(data1.brms)[, "Estimate"]
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()

plot of chunk tut7.5bBRMS2GraphicalSummaries2

References

Gelman, A. (2005). “Analysis of Variance - Why it is More Important Than Ever”. In: The Annals of Statistics 33.1, pp. 1–53.

Gelman, A., B. Goodrich, J. Gabry, et al. (2017). “R-squared for Bayesian regression models”.

Worked Examples

Single factor classification (ANOVA) references

McCarthy (2007) - Chpt 6
Kery (2010) - Chpt 10
Gelman & Hill (2007) - Chpt 4
Logan (2010) - Chpt 12
Quinn & Keough (2002) - Chpt 9

Homogeneous slopes

To investigate the impacts of sexual activity on the fruitfly longevity, Partridge and Farquhar (1981), measured the longevity of male fruitflies with access to either one virgin female (potential mate), eight virgin females, one pregnant female (not a potential mate), eight pregnant females or no females. The pool of available male fruitflies varied in size and since size is known to impact longevity, the researchers also measured thorax length as a covariate.

Download Partridge1 data set

Format of partridge1.csv data file

TREATMENT	THORAX	LONGEV
Preg8	0.64	35
Preg8	0.68	37
Preg8	0.68	49
..	..	..

TREATMENT	Categorical listing of the number and sort of female partners(None - no female partners, Preg1 - one pregnant female partner, Preg8 - eight pregnant partners, Virg1 - one virgin female, Virg8 - eight virgin females)
THORAX	Continuous covariate, the length of the thorax (mm)
LONGEV	Longevity of male fruitflies. Response variable.

Open

the partridge1 data file.

Show code

partridge <- read.csv("../downloads/data/partridge1.csv", strip.white = T)
head(partridge)

  TREATMENT THORAX LONGEV
1     Preg8   0.64     35
2     Preg8   0.68     37
3     Preg8   0.68     49
4     Preg8   0.72     46
5     Preg8   0.72     63
6     Preg8   0.76     39

A subset of these data were analysed in Tutorial 7.4b Q3 (the same data, yet without the continuous covariate). On that occasion we decided that although the response were counts (and thus a good match for a Poisson distribution), the magnitude of the observations would probably mean that a Gaussian distribution would be adequate.

The addition of a continuous covariate is likely to complicate this somewhat. Now we not only assume homogeneity of variance between treatments, we also assume homogeneity of variance with thorax length.

Perform basic exploratory data analysis to examined.
- normality
- homogeneity of variance
- homogeneity of slopes (and equality of covariate ranges)
View full code
ggplot(partridge, aes(y = LONGEV, x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
We can also quickly explore these by plotting the residuals from a simple linear model
View full code
plot(lm(LONGEV ~ THORAX + TREATMENT, partridge), which = 1)
There is definitely evidence of non-homogeneity of variance and perhaps even non-linearity. As already indicated, this is perhaps not unsurprising. Arguably, the most appropriate starting point would be to begin with a Poisson distribution which would not only expect the non-homogeneity of variance, since the Poisson uses a log-link, the relationship on a response scale would be expected to be non-linear.

Nevertheless, we will researve the use of a Poisson distribution until Tutorial 9.4b and for now, simply use a log (base 10) transformation to be consistent with the original authors. Note, as there are no zeros in these data, and (perhaps more importantly), the response does not approach zero, a logarithmic transformation is likely to be reasonable.

View full code
ggplot(partridge, aes(y = log10(LONGEV), x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
plot(lm(log10(LONGEV) ~ THORAX + TREATMENT, partridge), which = 1)
That does appear more reasonable..

And now to explore the range of the covariate.

View full code
ggplot(partridge, aes(y = THORAX, x = TREATMENT)) + geom_boxplot()
summary(lm(THORAX ~ TREATMENT, partridge))
Call: lm(formula = THORAX ~ TREATMENT, data = partridge) Residuals: Min 1Q Median 3Q Max -0.1960 -0.0456 0.0144 0.0544 0.1344 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 0.83600 0.01543 54.194 <2e-16 *** TREATMENTPreg1 -0.01040 0.02182 -0.477 0.634 TREATMENTPreg8 -0.03040 0.02182 -1.393 0.166 TREATMENTVirg1 0.00160 0.02182 0.073 0.942 TREATMENTVirg8 -0.03600 0.02182 -1.650 0.102 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.07713 on 120 degrees of freedom Multiple R-squared: 0.04032, Adjusted R-squared: 0.008335 F-statistic: 1.261 on 4 and 120 DF, p-value: 0.2893
Conclusions - no evidence that the range of thorax length varies substantially between treatments. Furthermore, homogeneity of slopes seems reasonable and so we will use an additive model.

Fit the appropriate Bayesian model to explore the effect of treatment and thorax length on male fruitfly longenvity. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,10)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$

View MCMCpack code

library(MCMCpack)
partridge.mcmcpack = MCMCregress(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Show JAGS code

modelString = "
  model {
  #Likelihood
  for (i in 1:n) {
  y[i]~dnorm(mu[i],tau)
  mu[i] <- beta0 + inprod(beta[],X[i,])
  }
  #Priors
  beta0 ~ dnorm(0.01,1.0E-6)
  for (j in 1:nX) {
  beta[j] ~ dnorm(0.01,1.0E-6)
  }
  tau <- 1 / (sigma * sigma)
  sigma~dunif(0,100)
  }
  "

X = model.matrix(~TREATMENT + THORAX, data = partridge)
partridge.list <- with(partridge, list(y = log10(LONGEV), X = X[, -1],
    nX = ncol(X) - 1, n = nrow(partridge)))

params <- c("beta0", "beta", "sigma")
burnInSteps = 3000
nChains = 3
numSavedSteps = 15000
thinSteps = 10
nIter = ceiling((numSavedSteps * thinSteps)/nChains)

partridge.r2jags <- jags(data = partridge.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 125
   Unobserved stochastic nodes: 7
   Total graph size: 998

Initializing model

Show RSTAN code

					  modelString=" 
					  data { 
					  int n;   // total number of observations 
					  vector[n] Y;      // response variable 
					  int nX;  // number of effects 
					  matrix[n, nX] X;   // model matrix 
					  } 
					  transformed data { 
					  matrix[n, nX - 1] Xc;  // centered version of X 
					  vector[nX - 1] means_X;  // column means of X before centering 
					  
					  for (i in 2:nX) { 
					  means_X[i - 1] = mean(X[, i]); 
					  Xc[, i - 1] = X[, i] - means_X[i - 1]; 
					  }  
					  } 
					  parameters { 
					  vector[nX-1] beta;  // population-level effects 
					  real cbeta0;  // center-scale intercept 
					  real sigma;  // residual SD 
					  } 
					  transformed parameters { 
					  } 
					  model { 
					  vector[n] mu; 
					  mu = Xc * beta + cbeta0; 
					  // prior specifications 
					  beta ~ normal(0, 10); 
					  cbeta0 ~ normal(0, 10); 
					  sigma ~ cauchy(0, 5); 
					  // likelihood contribution 
					  Y ~ normal(mu, sigma); 
					  } 
					  generated quantities { 
					  real beta0;  // population-level intercept 
					  vector[n] log_lik;
					  beta0 = cbeta0 - dot_product(means_X, beta);
					  for (i in 1:n) {
					  log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma);
					  }
					  }
					  "

X = model.matrix(~TREATMENT + THORAX, data = partridge)
partridge.list <- with(partridge, list(Y = log10(LONGEV), X = X, nX = ncol(X), n = nrow(partridge)))

library(rstan)
partridge.rstan <- stan(data = partridge.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500,
    thin = 2)

In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4,
                 from file6f0539983e2.cpp:8:
/usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
 #  define BOOST_NO_CXX11_RVALUE_REFERENCES
 ^
<command-line>:0:0: note: this is the location of the previous definition

SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1).

Gradient evaluation took 3.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.31 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.077105 seconds (Warm-up)
               0.380042 seconds (Sampling)
               0.457147 seconds (Total)


SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2).

Gradient evaluation took 1.6e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.067367 seconds (Warm-up)
               0.394947 seconds (Sampling)
               0.462314 seconds (Total)


SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3).

Gradient evaluation took 1.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.067727 seconds (Warm-up)
               0.42144 seconds (Sampling)
               0.489167 seconds (Total)

Show RSTANARM code

partridge.rstanarm = stan_glm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge,
    iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0,
        10), prior = normal(0, 10), prior_aux = cauchy(0, 5))

Gradient evaluation took 0.000161 seconds
1000 transitions using 10 leapfrog steps per transition would take 1.61 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.109844 seconds (Warm-up)
               0.657854 seconds (Sampling)
               0.767698 seconds (Total)


Gradient evaluation took 2.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.08606 seconds (Warm-up)
               0.507298 seconds (Sampling)
               0.593358 seconds (Total)


Gradient evaluation took 3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.084726 seconds (Warm-up)
               0.50021 seconds (Sampling)
               0.584936 seconds (Total)

Show BRMS code

partridge.brm = brm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge,
    iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior = c(prior(normal(0,
        10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0,
        5), class = "sigma")))

Gradient evaluation took 2.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.060042 seconds (Warm-up)
               0.340743 seconds (Sampling)
               0.400785 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.061698 seconds (Warm-up)
               0.346451 seconds (Sampling)
               0.408149 seconds (Total)


Gradient evaluation took 1.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.064057 seconds (Warm-up)
               0.336682 seconds (Sampling)
               0.400739 seconds (Total)

Explore MCMC diagnostics

Show MCMCpack code

library(MCMCpack)
plot(partridge.mcmcpack)

raftery.diag(partridge.mcmcpack)

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                      
                Burn-in  Total Lower bound  Dependence
                (M)      (N)   (Nmin)       factor (I)
 (Intercept)    2        3865  3746         1.030     
 TREATMENTPreg1 2        3710  3746         0.990     
 TREATMENTPreg8 2        3741  3746         0.999     
 TREATMENTVirg1 2        3710  3746         0.990     
 TREATMENTVirg8 2        3771  3746         1.010     
 THORAX         2        3710  3746         0.990     
 sigma2         2        3994  3746         1.070

autocorr.diag(partridge.mcmcpack)

        (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8        THORAX
Lag 0   1.000000000    1.000000000   1.000000e+00    1.000000000    1.000000000  1.000000e+00
Lag 1  -0.006504213    0.005715657  -1.260043e-03   -0.008276112   -0.008425989 -6.216742e-03
Lag 5   0.020132882    0.012739945  -2.900389e-02   -0.001338169   -0.015262191  2.106973e-02
Lag 10  0.005066356    0.011293213  -8.019851e-03   -0.003100969   -0.013767490  5.865134e-03
Lag 50 -0.001937426   -0.024530392   8.454565e-05    0.004761843   -0.008212958 -3.677482e-05
              sigma2
Lag 0   1.000000e+00
Lag 1   5.041996e-02
Lag 5   7.127112e-05
Lag 10  1.653846e-02
Lag 50 -7.121217e-03

Show JAGS code

library(R2jags)
library(coda)
partridge.mcmc = as.mcmc(partridge.r2jags)
plot(partridge.mcmc)

raftery.diag(partridge.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    10       37660 3746         10.10     
 beta[1]  20       37020 3746          9.88     
 beta[2]  20       37020 3746          9.88     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       35750 3746          9.54     
 beta[5]  20       37020 3746          9.88     
 deviance 20       37020 3746          9.88     
 sigma    10       37660 3746         10.10     


[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    10       37660 3746         10.10     
 beta[1]  10       37660 3746         10.10     
 beta[2]  30       40390 3746         10.80     
 beta[3]  20       37020 3746          9.88     
 beta[4]  10       37660 3746         10.10     
 beta[5]  20       38330 3746         10.20     
 deviance 20       38330 3746         10.20     
 sigma    20       36380 3746          9.71     


[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       37020 3746          9.88     
 beta[1]  10       37660 3746         10.10     
 beta[2]  20       39000 3746         10.40     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       37020 3746          9.88     
 beta[5]  10       37660 3746         10.10     
 deviance 20       37020 3746          9.88     
 sigma    20       38330 3746         10.20

autocorr.diag(partridge.mcmc)

               beta0      beta[1]     beta[2]      beta[3]      beta[4]      beta[5]     deviance
Lag 0    1.000000000  1.000000000 1.000000000  1.000000000  1.000000000  1.000000000  1.000000000
Lag 10  -0.013921949  0.005149797 0.004975120 -0.001853710 -0.014382555 -0.012575832 -0.008892071
Lag 50  -0.003696594 -0.002835295 0.001318796  0.010156118 -0.011928061 -0.003560853  0.010943677
Lag 100  0.003584436 -0.001331310 0.003826374  0.009171373  0.008765503  0.003492416 -0.008883727
Lag 500  0.016409182  0.013260824 0.009091810  0.001029463 -0.003070906  0.015760118  0.002411228
                sigma
Lag 0    1.0000000000
Lag 10  -0.0092198712
Lag 50   0.0059702667
Lag 100 -0.0008803947
Lag 500  0.0077816203

Show RSTAN code

library(rstan)
library(coda)
s = as.array(partridge.rstan)
partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]", "beta[4]",
    "beta[5]", "sigma")], 2, as.mcmc))
plot(partridge.mcmc)

raftery.diag(partridge.mcmc)

$`1`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`2`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`3`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

autocorr.diag(partridge.mcmc)

               beta0      beta[1]      beta[2]     beta[3]      beta[4]     beta[5]         sigma
Lag 0   1.0000000000  1.000000000  1.000000000 1.000000000  1.000000000 1.000000000  1.0000000000
Lag 1   0.0362085746  0.091738066  0.093666654 0.065303202  0.083647790 0.029088441  0.0297993560
Lag 5   0.0075315830 -0.007179393 -0.013474037 0.019288850 -0.002217306 0.007947353 -0.0114935490
Lag 10  0.0154634582  0.004801029  0.020335742 0.010920588  0.025187625 0.014189952  0.0007989306
Lag 50 -0.0001803206 -0.001614632 -0.005646201 0.003760134 -0.019317695 0.003126408 -0.0024392021

library(rstan)
library(coda)
stan_ac(partridge.rstan, pars = c("beta0", "beta", "sigma"))

stan_rhat(partridge.rstan, pars = c("beta0", "beta", "sigma"))

stan_ess(partridge.rstan, pars = c("beta0", "beta", "sigma"))

# using Bayeseplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(partridge.rstan), regex_par = "beta|sigma")

mcmc_trace(as.array(partridge.rstan), regex_pars = "beta|sigma")

mcmc_dens(as.array(partridge.rstan), regex_par = "beta|sigma")

detach("package:reshape")
library(bayesplot)
mcmc_combo(as.array(partridge.rstan), regex_par = "beta|sigma")

Show RSTANARM code

library(rstanarm)
library(coda)
s = as.array(partridge.rstanarm)
colnames(as.matrix(partridge.rstanarm))

[1] "(Intercept)"    "TREATMENTPreg1" "TREATMENTPreg8" "TREATMENTVirg1" "TREATMENTVirg8"
[6] "THORAX"         "sigma"

partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("(Intercept)", "TREATMENTPreg1", "TREATMENTPreg8",
    "TREATMENTVirg1", "TREATMENTVirg8", "THORAX", "sigma")], 2, as.mcmc))
plot(partridge.mcmc)

raftery.diag(partridge.mcmc)

$`1`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`2`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`3`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

autocorr.diag(partridge.mcmc)

        (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8        THORAX
Lag 0   1.000000000    1.000000000    1.000000000   1.0000000000    1.000000000  1.0000000000
Lag 1   0.075353592    0.177654851    0.188350285   0.1751439722    0.185668547  0.0617834129
Lag 5   0.003490198    0.005324381    0.000440865  -0.0045602106   -0.002274830  0.0006088013
Lag 10 -0.003447542    0.012745177   -0.005106027   0.0115046730   -0.010358881 -0.0033071860
Lag 50 -0.004644673   -0.009315829    0.011216715  -0.0005416519   -0.007338707 -0.0043183473
              sigma
Lag 0   1.000000000
Lag 1   0.048851176
Lag 5  -0.011582830
Lag 10 -0.008910899
Lag 50  0.003726101

library(rstanarm)
library(coda)
stan_ac(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")

stan_rhat(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")

stan_ess(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")

# using Bayeseplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(partridge.rstanarm), regex_par = "Intercept|TREATMENT|THORAX|sigma")

mcmc_trace(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")

mcmc_dens(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")

detach("package:reshape")
library(bayesplot)
mcmc_combo(as.array(partridge.rstanarm), regex_par = "Intercept|TREATMENT|THORAX|sigma")

library(rstanarm)
posterior_vs_prior(partridge.rstanarm, color_by = "vs", group_by = TRUE,
    facet_args = list(scales = "free_y"))

Gradient evaluation took 4.7e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.47 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.064874 seconds (Warm-up)
               0.139432 seconds (Sampling)
               0.204306 seconds (Total)


Gradient evaluation took 4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.074223 seconds (Warm-up)
               0.240174 seconds (Sampling)
               0.314397 seconds (Total)

Show BRMS code

library(coda)
library(brms)
partridge.mcmc = as.mcmc(partridge.brm)
plot(partridge.mcmc)

raftery.diag(partridge.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

autocorr.diag(partridge.mcmc)

Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified

library(coda)
stan_ac(partridge.brm$fit)

stan_rhat(partridge.brm$fit)

stan_ess(partridge.brm$fit)

The chains appear well mixed and have converged on what is likely to be a stable posterior.

Perform model validation

Show MCMCpack code

library(MCMCpack)
partridge.mcmc = as.data.frame(partridge.mcmcpack)
# generate a model matrix
newdata = partridge
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
## get median parameter estimates
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = apply(partridge.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = log10(partridge$LONGEV) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against treatment level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

library(MCMCpack)
partridge.mcmc = as.matrix(partridge.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~TREATMENT + THORAX, partridge)
## get median parameter estimates
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
    fit[i, ], sqrt(partridge.mcmc[i, "sigma2"])))
newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
    yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
    x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
    group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
    aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")

mcmc_areas(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")

Show JAGS code

partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
newdata = partridge
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
## get median parameter estimates
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = apply(partridge.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = log10(partridge$LONGEV) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against treatment level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
Xmat = model.matrix(~TREATMENT + THORAX, partridge)
## get median parameter estimates
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
    fit[i, ], partridge.mcmc[i, "sigma"]))
newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
    yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
    x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
    group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
    aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

mcmc_areas(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

Show RSTAN code

partridge.mcmc = as.matrix(partridge.rstan)
# generate a model matrix
newdata = partridge
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
## get median parameter estimates
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = apply(partridge.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = log10(partridge$LONGEV) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against treatment level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

partridge.mcmc = as.matrix(partridge.rstan)
# generate a model matrix
Xmat = model.matrix(~TREATMENT + THORAX, partridge)
## get median parameter estimates
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
    fit[i, ], partridge.mcmc[i, "sigma"]))
newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
    yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
    x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
    group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
    aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(partridge.rstan), regex_pars = "beta|sigma")

mcmc_areas(as.matrix(partridge.rstan), regex_pars = "beta|sigma")

Show RSTANARM code

partridge.mcmc = as.matrix(partridge.rstanarm)
# generate a model matrix
newdata = partridge
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
## get median parameter estimates
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = apply(partridge.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = log10(partridge$LONGEV) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against treatment level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

partridge.mcmc = as.matrix(partridge.rstanarm)
# generate a model matrix
Xmat = model.matrix(~TREATMENT + THORAX, partridge)
## get median parameter estimates
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
    fit[i, ], partridge.mcmc[i, "sigma"]))
newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
    yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
    x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
    group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
    aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")

mcmc_areas(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")

Show BRMS code

partridge.mcmc = as.matrix(partridge.brm)
# generate a model matrix
newdata = partridge
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
## get median parameter estimates
wch = grep("b_", colnames(partridge.mcmc))
coefs = apply(partridge.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = log10(partridge$LONGEV) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against treatment level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

partridge.mcmc = as.matrix(partridge.brm)
# generate a model matrix
Xmat = model.matrix(~TREATMENT + THORAX, partridge)
## get median parameter estimates
wch = grep("b_", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
    fit[i, ], partridge.mcmc[i, "sigma"]))
newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
    yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
    x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
    color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
    group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
    aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(partridge.brm), regex_pars = "b_|sigma")

mcmc_areas(as.matrix(partridge.brm), regex_pars = "b_|sigma")

The model validation diagnostics all seem reasonable suggesting that the model is likely to be reliable.

Explore parameter estimates

Show MCMCpack code

library(MCMCpack)
summary(partridge.mcmcpack)

Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                   Mean        SD  Naive SE Time-series SE
(Intercept)     0.79153 0.0862441 8.624e-04      8.643e-04
TREATMENTPreg1  0.02226 0.0238689 2.387e-04      2.387e-04
TREATMENTPreg8  0.03650 0.0241417 2.414e-04      2.414e-04
TREATMENTVirg1 -0.05404 0.0240102 2.401e-04      2.401e-04
TREATMENTVirg8 -0.18198 0.0241462 2.415e-04      2.415e-04
THORAX          1.19335 0.1009666 1.010e-03      1.006e-03
sigma2          0.00713 0.0009452 9.452e-06      9.942e-06

2. Quantiles for each variable:

                    2.5%       25%       50%       75%     97.5%
(Intercept)     0.625379  0.732743  0.791865  0.849579  0.961182
TREATMENTPreg1 -0.025297  0.006360  0.022323  0.038120  0.068806
TREATMENTPreg8 -0.010526  0.020301  0.036510  0.052720  0.083886
TREATMENTVirg1 -0.101254 -0.069767 -0.053919 -0.038007 -0.006695
TREATMENTVirg8 -0.230098 -0.197744 -0.181983 -0.165952 -0.134440
THORAX          0.992453  1.124938  1.193252  1.262005  1.389023
sigma2          0.005491  0.006455  0.007057  0.007723  0.009211

library(broom)
tidyMCMC(partridge.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")

            term     estimate    std.error     conf.low    conf.high
1    (Intercept)  0.791534756 0.0862440749  0.627282417  0.962326127
2 TREATMENTPreg1  0.022262435 0.0238688705 -0.025800765  0.068161673
3 TREATMENTPreg8  0.036498778 0.0241417477 -0.012028238  0.082087102
4 TREATMENTVirg1 -0.054043317 0.0240102293 -0.101607483 -0.007550831
5 TREATMENTVirg8 -0.181978738 0.0241461559 -0.231333679 -0.135891212
6         THORAX  1.193347232 0.1009665978  0.995580301  1.390356975
7         sigma2  0.007129683 0.0009451969  0.005343271  0.008958414

mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg1"])

[1] 0.3491

mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg8"])

[1] 0.133

mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg1"])

[1] 0.0271

mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg8"])

[1] 0

mcmcpvalue(partridge.mcmcpack[, "THORAX"])

[1] 0

wch = grep("TREATMENT|THORAX", colnames(partridge.mcmcpack))
mcmcpvalue(partridge.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Show JAGS code

print(partridge.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 14100 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
beta[1]     0.022   0.024   -0.025    0.006    0.022    0.039    0.069 1.001  5000
beta[2]     0.037   0.024   -0.010    0.020    0.037    0.053    0.083 1.001 11000
beta[3]    -0.054   0.024   -0.101   -0.070   -0.054   -0.038   -0.007 1.001 14000
beta[4]    -0.182   0.024   -0.229   -0.198   -0.182   -0.166   -0.134 1.001  6500
beta[5]     1.194   0.100    0.995    1.128    1.194    1.262    1.389 1.001  7500
beta0       0.791   0.085    0.626    0.733    0.790    0.848    0.960 1.001  6800
sigma       0.085   0.006    0.074    0.081    0.084    0.088    0.096 1.001 14000
deviance -264.409   3.899 -269.889 -267.273 -265.103 -262.305 -254.937 1.002  2600

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 7.6 and DIC = -256.8
DIC is an estimate of expected predictive error (lower deviance is better).

library(broom)
tidyMCMC(partridge.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")

      term      estimate   std.error      conf.low     conf.high
1  beta[1]    0.02222087 0.024107038   -0.02387136  7.028915e-02
2  beta[2]    0.03655562 0.024131694   -0.01101267  8.252448e-02
3  beta[3]   -0.05390009 0.023948614   -0.09989605 -6.049349e-03
4  beta[4]   -0.18170503 0.024261555   -0.22861460 -1.338749e-01
5  beta[5]    1.19380034 0.100331683    1.00121995  1.392995e+00
6    beta0    0.79102584 0.085462563    0.62940375  9.630549e-01
7 deviance -264.40914257 3.898646906 -270.71869433 -2.567381e+02
8    sigma    0.08453005 0.005586101    0.07400793  9.562832e-02

partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
mcmcpvalue(partridge.mcmc[, "beta[1]"])

[1] 0.3587234

mcmcpvalue(partridge.mcmc[, "beta[2]"])

[1] 0.1301418

mcmcpvalue(partridge.mcmc[, "beta[3]"])

[1] 0.02468085

mcmcpvalue(partridge.mcmc[, "beta[4]"])

[1] 0

mcmcpvalue(partridge.mcmc[, "beta[5]"])

[1] 0

wch = grep("beta\\[", colnames(partridge.mcmc))
mcmcpvalue(partridge.mcmc[, wch])

[1] 0

Show RSTAN code

# summary(partridge.rstan)
library(broom)
partridge.mcmc = as.matrix(partridge.rstan)
tidyMCMC(partridge.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"),
    ess = TRUE, rhat = TRUE)

     term    estimate   std.error    conf.low    conf.high     rhat  ess
1   beta0  0.79128931 0.086061774  0.62566968  0.961482006 1.000007 5827
2 beta[1]  0.02299254 0.024083810 -0.02571685  0.067810084 1.000434 5184
3 beta[2]  0.03628671 0.024376240 -0.01049910  0.083787879 1.000117 5274
4 beta[3] -0.05368096 0.023662283 -0.10081880 -0.009464246 1.000751 5262
5 beta[4] -0.18108684 0.024518494 -0.22869622 -0.132604319 1.000599 5420
6 beta[5]  1.19318572 0.100771538  1.00337859  1.396651732 1.000058 6383
7   sigma  0.08455309 0.005593207  0.07383390  0.095473807 1.000092 6376

mcmcpvalue(partridge.mcmc[, "beta[1]"])

[1] 0.3402963

mcmcpvalue(partridge.mcmc[, "beta[2]"])

[1] 0.1362963

mcmcpvalue(partridge.mcmc[, "beta[3]"])

[1] 0.02237037

mcmcpvalue(partridge.mcmc[, "beta[4]"])

[1] 0

mcmcpvalue(partridge.mcmc[, "beta[5]"])

[1] 0

wch = grep("beta\\[", colnames(partridge.mcmc))
mcmcpvalue(partridge.mcmc[, wch])

[1] 0

Show RSTANARM code

summary(partridge.rstanarm)

Model Info:

 function:  stan_glm
 family:    gaussian [identity]
 formula:   log10(LONGEV) ~ TREATMENT + THORAX
 algorithm: sampling
 priors:    see help('prior_summary')
 sample:    6750 (posterior sample size)
 num obs:   125

Estimates:
                 mean   sd    2.5%   25%   50%   75%   97.5%
(Intercept)      0.8    0.1   0.6    0.7   0.8   0.9   1.0  
TREATMENTPreg1   0.0    0.0   0.0    0.0   0.0   0.0   0.1  
TREATMENTPreg8   0.0    0.0   0.0    0.0   0.0   0.1   0.1  
TREATMENTVirg1  -0.1    0.0  -0.1   -0.1  -0.1   0.0   0.0  
TREATMENTVirg8  -0.2    0.0  -0.2   -0.2  -0.2  -0.2  -0.1  
THORAX           1.2    0.1   1.0    1.1   1.2   1.3   1.4  
sigma            0.1    0.0   0.1    0.1   0.1   0.1   0.1  
mean_PPD         1.7    0.0   1.7    1.7   1.7   1.7   1.8  
log-posterior  122.3    1.9 117.6  121.3 122.6 123.7 125.0  

Diagnostics:
               mcse Rhat n_eff
(Intercept)    0.0  1.0  5726 
TREATMENTPreg1 0.0  1.0  4536 
TREATMENTPreg8 0.0  1.0  4695 
TREATMENTVirg1 0.0  1.0  4830 
TREATMENTVirg8 0.0  1.0  4579 
THORAX         0.0  1.0  5922 
sigma          0.0  1.0  6149 
mean_PPD       0.0  1.0  6728 
log-posterior  0.0  1.0  4744 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).

library(broom)
partridge.mcmc = as.matrix(partridge.rstanarm)
tidyMCMC(partridge.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)

            term     estimate   std.error     conf.low     conf.high      rhat  ess
1    (Intercept)   0.79293103 0.085741974   0.63442374   0.967936705 0.9998648 5726
2 TREATMENTPreg1   0.02234996 0.024074877  -0.02338140   0.070865192 0.9999204 4536
3 TREATMENTPreg8   0.03641433 0.024403496  -0.01013371   0.086076915 1.0000061 4695
4 TREATMENTVirg1  -0.05403215 0.023548978  -0.09925124  -0.006906492 0.9997467 4830
5 TREATMENTVirg8  -0.18226103 0.023984235  -0.22966723  -0.136666158 1.0002442 4579
6         THORAX   1.19165812 0.100358278   0.98759586   1.382129994 0.9999683 5922
7          sigma   0.08462483 0.005556366   0.07493118   0.096523655 0.9996408 6149
8       mean_PPD   1.73567551 0.010668382   1.71548076   1.757712481 1.0000841 6728
9  log-posterior 122.29436093 1.937479394 118.54507644 125.359663348 0.9998040 4744

mcmcpvalue(partridge.mcmc[, "TREATMENTPreg1"])

[1] 0.354963

mcmcpvalue(partridge.mcmc[, "TREATMENTPreg8"])

[1] 0.1346667

mcmcpvalue(partridge.mcmc[, "TREATMENTVirg1"])

[1] 0.02222222

mcmcpvalue(partridge.mcmc[, "TREATMENTVirg8"])

[1] 0

mcmcpvalue(partridge.mcmc[, "THORAX"])

[1] 0

wch = grep("TREATMENT|THORAX", colnames(partridge.mcmc))
mcmcpvalue(partridge.mcmc[, wch])

[1] 0

Show BRMS code

summary(partridge.brm)

 Family: gaussian(identity) 
Formula: log10(LONGEV) ~ TREATMENT + THORAX 
   Data: partridge (Number of observations: 125) 
Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 2; 
         total post-warmup samples = 6750
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept          0.79      0.09     0.63     0.96       6165    1
TREATMENTPreg1     0.02      0.02    -0.02     0.07       5774    1
TREATMENTPreg8     0.04      0.02    -0.01     0.08       5790    1
TREATMENTVirg1    -0.05      0.02    -0.10    -0.01       5633    1
TREATMENTVirg8    -0.18      0.02    -0.23    -0.13       5810    1
THORAX             1.19      0.10     1.00     1.39       6155    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.08      0.01     0.07      0.1       6265    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

library(broom)
partridge.mcmc = as.matrix(partridge.brm)
tidyMCMC(partridge.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)

              term    estimate   std.error     conf.low    conf.high      rhat  ess
1      b_Intercept  0.79087403 0.085559712  0.628005663  0.963051668 0.9999102 6165
2 b_TREATMENTPreg1  0.02240460 0.024011041 -0.024874580  0.068487474 0.9997021 5774
3 b_TREATMENTPreg8  0.03620256 0.023843389 -0.008463557  0.084372278 0.9999173 5790
4 b_TREATMENTVirg1 -0.05404743 0.023815863 -0.100910410 -0.008313408 0.9997183 5633
5 b_TREATMENTVirg8 -0.18198728 0.023869345 -0.228420724 -0.135256737 0.9997294 5810
6         b_THORAX  1.19414513 0.100754965  0.995884101  1.389117902 0.9998769 6155
7            sigma  0.08463568 0.005584061  0.074166370  0.095825344 1.0004425 6265

mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg1"])

[1] 0.3511111

mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg8"])

[1] 0.128

mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg1"])

[1] 0.024

mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg8"])

[1] 0

wch = grep("b_TREATMENT|b_THORAX", colnames(partridge.mcmc))
mcmcpvalue(partridge.mcmc[, wch])

[1] 0

Generate graphical summaries

Show MCMCpack code

library(MCMCpack)
partridge.mcmc = partridge.mcmcpack
## Calculate the fitted values
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
    max(THORAX), len = 100)))
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
    conf.method = "HPDinterval"))
newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
    conf.method = "HPDinterval"))
newdata.95 %>% head

  TREATMENT    THORAX estimate std.error conf.low conf.high
1      None 0.6400000 35.98025  2.167746 31.85706  40.31875
2     Preg1 0.6400000 37.86609  2.167869 33.72704  42.22615
3     Preg8 0.6400000 39.12340  2.159726 34.98930  43.45699
4     Virg1 0.6400000 31.77057  1.920246 28.06600  35.52507
5     Virg8 0.6400000 23.65556  1.281681 21.25357  26.25122
6      None 0.6430303 36.27993  2.166315 32.15499  40.61696

ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
        color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
    ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
    scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + scale_color_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + theme_classic()

# And now with partial residuals
fdata = rdata = partridge
fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
    t(rMat))
rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))

br <- log10(pretty(partridge$LONGEV))
ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
    alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
    fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
    scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
    values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
        "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
        "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
    guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
    values = c("black", "black", "black", "white", "white"), breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
            size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Show JAGS code

partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
    max(THORAX), len = 100)))
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
    conf.method = "HPDinterval"))
newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
    conf.method = "HPDinterval"))
newdata.95 %>% head

  TREATMENT    THORAX estimate std.error conf.low conf.high
1      None 0.6400000 35.96099  2.150108 31.84513  40.20482
2     Preg1 0.6400000 37.84543  2.204559 33.43408  42.08953
3     Preg8 0.6400000 39.10798  2.144250 34.96804  43.38181
4     Virg1 0.6400000 31.76434  1.908763 28.00014  35.46843
5     Virg8 0.6400000 23.65820  1.274451 21.22775  26.21792
6      None 0.6430303 36.26065  2.148823 32.15733  40.50511

ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
        color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
    ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
    scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + scale_color_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + theme_classic()

# And now with partial residuals
fdata = rdata = partridge
fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
    t(rMat))
rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))

br <- log10(pretty(partridge$LONGEV))
ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
    alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
    fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
    scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
    values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
        "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
        "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
    guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
    values = c("black", "black", "black", "white", "white"), breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
            size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Show RSTAN code

partridge.mcmc = as.matrix(partridge.rstan)
## Calculate the fitted values
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
    max(THORAX), len = 100)))
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
    conf.method = "HPDinterval"))
newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
    conf.method = "HPDinterval"))
newdata.95 %>% head

  TREATMENT    THORAX estimate std.error conf.low conf.high
1      None 0.6400000 35.95150  2.171163 32.04491  40.56730
2     Preg1 0.6400000 37.90240  2.225653 33.73523  42.38996
3     Preg8 0.6400000 39.07261  2.150087 34.97066  43.26534
4     Virg1 0.6400000 31.77152  1.919648 28.18076  35.68269
5     Virg8 0.6400000 23.68545  1.287215 21.32823  26.26502
6      None 0.6430303 36.25091  2.169872 32.34034  40.84279

ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
        color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
    ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
    scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + scale_color_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + theme_classic()

# And now with partial residuals
fdata = rdata = partridge
fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
    t(rMat))
rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))

br <- log10(pretty(partridge$LONGEV))
ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
    alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
    fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
    scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
    values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
        "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
        "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
    guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
    values = c("black", "black", "black", "white", "white"), breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
            size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Show RSTANARM code

partridge.mcmc = as.matrix(partridge.rstanarm)
## Calculate the fitted values
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
    max(THORAX), len = 100)))
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
    conf.method = "HPDinterval"))
newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
    conf.method = "HPDinterval"))
newdata.95 %>% head

  TREATMENT    THORAX estimate std.error conf.low conf.high
1      None 0.6400000 36.00583  2.159989 31.81425  40.09867
2     Preg1 0.6400000 37.90180  2.179165 33.48592  42.07839
3     Preg8 0.6400000 39.14320  2.141342 35.04090  43.33835
4     Virg1 0.6400000 31.79228  1.886028 27.94081  35.30833
5     Virg8 0.6400000 23.65624  1.258929 21.30229  26.22091
6      None 0.6430303 36.30531  2.158638 32.06754  40.35430

ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
        color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
    ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
    scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + scale_color_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + theme_classic()

# And now with partial residuals
fdata = rdata = partridge
fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
    t(rMat))
rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))

br <- log10(pretty(partridge$LONGEV))
ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
    alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
    fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
    scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
    values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
        "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
        "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
    guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
    values = c("black", "black", "black", "white", "white"), breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
            size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Show BRMS code

partridge.mcmc = as.matrix(partridge.brm)
## Calculate the fitted values
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
    max(THORAX), len = 100)))
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("b_", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = 10^(coefs %*% t(Xmat))
newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
    conf.method = "HPDinterval"))
newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
    conf.method = "HPDinterval"))
newdata.95 %>% head

  TREATMENT    THORAX estimate std.error conf.low conf.high
1      None 0.6400000 35.96536  2.125997 31.82643  40.13090
2     Preg1 0.6400000 37.86659  2.190934 33.72999  42.27928
3     Preg8 0.6400000 39.08282  2.151109 34.95758  43.39081
4     Virg1 0.6400000 31.75761  1.891817 27.93170  35.27973
5     Virg8 0.6400000 23.64734  1.288577 21.28669  26.30815
6      None 0.6430303 36.26515  2.124425 32.13702  40.42999

ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
    geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
        color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
    ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
    scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + scale_color_discrete("Substrate type",
    breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners")) + theme_classic()

# And now with partial residuals
fdata = rdata = partridge
fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
    t(rMat))
rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))

br <- log10(pretty(partridge$LONGEV))
ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
    alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
    fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
    scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
    values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
        "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
        "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
    guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
    values = c("black", "black", "black", "white", "white"), breaks = c("None",
        "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
        "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
        "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
    theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
        axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
            size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

We have established that the male fruitfly longevity varies across the treatments. The effects model directly compared each of the partner type to the control treatment with no partners. We might also be interested in describing the difference in male fruitfly longevity between other combinations of partner types. Lets compare each treatment group to each other in a pairwise manner. Note, as we elected to fit the model on logarithmic (base ten) transformed longevity, we have to be a bit careful in how we process the results. Without any back transforms, the comparisons will be on a log scale. If we wish to express them on the natural response scale (difference in days, we will need to do a bit more manipulation and remember our log laws ($log(A) - log(B) = log(A)/log(B)$). Thus if we back transform the contrasts, we will get results that reflect the factor change.

Show MCMCpack code

library(MCMCpack)
partridge.mcmc = partridge.mcmcpack
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

              (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
Preg1 - None            0              1              0              0              0      0
Preg8 - None            0              0              1              0              0      0
Virg1 - None            0              0              0              1              0      0
Virg8 - None            0              0              0              0              1      0
Preg8 - Preg1           0             -1              1              0              0      0
Virg1 - Preg1           0             -1              0              1              0      0
Virg8 - Preg1           0             -1              0              0              1      0
Virg1 - Preg8           0              0             -1              1              0      0
Virg8 - Preg8           0              0             -1              0              1      0
Virg8 - Virg1           0              0              0             -1              1      0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

            term    estimate  std.error    conf.low    conf.high
1   Preg1 - None  0.02226244 0.02386887 -0.02580077  0.068161673
2   Preg8 - None  0.03649878 0.02414175 -0.01202824  0.082087102
3   Virg1 - None -0.05404332 0.02401023 -0.10160748 -0.007550831
4   Virg8 - None -0.18197874 0.02414616 -0.23133368 -0.135891212
5  Preg8 - Preg1  0.01423634 0.02396522 -0.03512377  0.059624253
6  Virg1 - Preg1 -0.07630575 0.02391487 -0.12156718 -0.028062534
7  Virg8 - Preg1 -0.20424117 0.02371341 -0.25045812 -0.156732472
8  Virg1 - Preg8 -0.09054210 0.02433891 -0.13738602 -0.041521436
9  Virg8 - Preg8 -0.21847752 0.02396819 -0.26613085 -0.171726858
10 Virg8 - Virg1 -0.12793542 0.02426936 -0.17484823 -0.080668677

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

## With back transformed
mcmc_areas(10^(coefs %*% t(pairwise.mat)))

(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

            term  estimate  std.error  conf.low conf.high
1   Preg1 - None 1.0541883 0.05796784 0.9423218 1.1699348
2   Preg8 - None 1.0893557 0.06060224 0.9684462 1.2032111
3   Virg1 - None 0.8843419 0.04891830 0.7906505 0.9820142
4   Virg8 - None 0.6587068 0.03662892 0.5855194 0.7297431
5  Preg8 - Preg1 1.0348974 0.05712792 0.9223085 1.1471607
6  Virg1 - Preg1 0.8401419 0.04629954 0.7512269 0.9324124
7  Virg8 - Preg1 0.6257574 0.03418221 0.5595466 0.6945545
8  Virg1 - Preg8 0.8130922 0.04560181 0.7219419 0.9014199
9  Virg8 - Preg8 0.6055967 0.03343095 0.5375230 0.6689410
10 Virg8 - Virg1 0.7460068 0.04173700 0.6685775 0.8304841

ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

Show JAGS code

library(MCMCpack)
partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

              (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
Preg1 - None            0              1              0              0              0      0
Preg8 - None            0              0              1              0              0      0
Virg1 - None            0              0              0              1              0      0
Virg8 - None            0              0              0              0              1      0
Preg8 - Preg1           0             -1              1              0              0      0
Virg1 - Preg1           0             -1              0              1              0      0
Virg8 - Preg1           0             -1              0              0              1      0
Virg1 - Preg8           0              0             -1              1              0      0
Virg8 - Preg8           0              0             -1              0              1      0
Virg8 - Virg1           0              0              0             -1              1      0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

            term    estimate  std.error    conf.low    conf.high
1   Preg1 - None  0.02222087 0.02410704 -0.02387136  0.070289150
2   Preg8 - None  0.03655562 0.02413169 -0.01101267  0.082524482
3   Virg1 - None -0.05390009 0.02394861 -0.09989605 -0.006049349
4   Virg8 - None -0.18170503 0.02426155 -0.22861460 -0.133874873
5  Preg8 - Preg1  0.01433475 0.02402038 -0.03409306  0.060470474
6  Virg1 - Preg1 -0.07612096 0.02380513 -0.12371581 -0.030310117
7  Virg8 - Preg1 -0.20392590 0.02419627 -0.25216901 -0.157132792
8  Virg1 - Preg8 -0.09045571 0.02415687 -0.13813097 -0.043683249
9  Virg8 - Preg8 -0.21826065 0.02403122 -0.26479914 -0.171535168
10 Virg8 - Virg1 -0.12780494 0.02419583 -0.17496673 -0.080363292

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

## With back transformed
mcmc_areas(10^(coefs %*% t(pairwise.mat)))

(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

            term  estimate  std.error  conf.low conf.high
1   Preg1 - None 1.0541196 0.05855850 0.9400143 1.1681953
2   Preg8 - None 1.0894963 0.06054854 0.9744850 1.2087235
3   Virg1 - None 0.8846269 0.04881624 0.7922053 0.9837108
4   Virg8 - None 0.6591326 0.03686490 0.5880233 0.7316295
5  Preg8 - Preg1 1.0351401 0.05731642 0.9245000 1.1493981
6  Virg1 - Preg1 0.8404883 0.04613413 0.7521149 0.9325881
7  Virg8 - Preg1 0.6262506 0.03492673 0.5595398 0.6964135
8  Virg1 - Preg8 0.8132350 0.04527198 0.7270732 0.9038008
9  Virg8 - Preg8 0.6059049 0.03357668 0.5435016 0.6736973
10 Virg8 - Virg1 0.7462239 0.04162240 0.6678105 0.8304426

ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

Show RSTAN code

library(MCMCpack)
partridge.mcmc = as.matrix(partridge.rstan)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

              (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
Preg1 - None            0              1              0              0              0      0
Preg8 - None            0              0              1              0              0      0
Virg1 - None            0              0              0              1              0      0
Virg8 - None            0              0              0              0              1      0
Preg8 - Preg1           0             -1              1              0              0      0
Virg1 - Preg1           0             -1              0              1              0      0
Virg8 - Preg1           0             -1              0              0              1      0
Virg1 - Preg8           0              0             -1              1              0      0
Virg8 - Preg8           0              0             -1              0              1      0
Virg8 - Virg1           0              0              0             -1              1      0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

            term    estimate  std.error    conf.low    conf.high
1   Preg1 - None  0.02299254 0.02408381 -0.02571685  0.067810084
2   Preg8 - None  0.03628671 0.02437624 -0.01049910  0.083787879
3   Virg1 - None -0.05368096 0.02366228 -0.10081880 -0.009464246
4   Virg8 - None -0.18108684 0.02451849 -0.22869622 -0.132604319
5  Preg8 - Preg1  0.01329417 0.02395326 -0.03264648  0.062360678
6  Virg1 - Preg1 -0.07667351 0.02383856 -0.12247183 -0.030528752
7  Virg8 - Preg1 -0.20407938 0.02424689 -0.25291361 -0.156602516
8  Virg1 - Preg8 -0.08996768 0.02416122 -0.13611621 -0.041840729
9  Virg8 - Preg8 -0.21737355 0.02391817 -0.26345342 -0.170901602
10 Virg8 - Virg1 -0.12740587 0.02428197 -0.17411665 -0.079437471

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

## With back transformed
mcmc_areas(10^(coefs %*% t(pairwise.mat)))

(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

            term  estimate  std.error  conf.low conf.high
1   Preg1 - None 1.0559907 0.05858701 0.9383382 1.1646937
2   Preg8 - None 1.0888563 0.06114756 0.9758079 1.2124721
3   Virg1 - None 0.8850402 0.04819619 0.7928321 0.9784435
4   Virg8 - None 0.6600920 0.03724069 0.5898256 0.7359186
5  Preg8 - Preg1 1.0326541 0.05703137 0.9215931 1.1470933
6  Virg1 - Preg1 0.8394233 0.04615294 0.7542723 0.9321188
7  Virg8 - Preg1 0.6260333 0.03498437 0.5583422 0.6969889
8  Virg1 - Preg8 0.8141491 0.04529563 0.7309435 0.9081535
9  Virg8 - Preg8 0.6071349 0.03348412 0.5447600 0.6741852
10 Virg8 - Virg1 0.7469190 0.04185802 0.6680023 0.8307720

ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

Show RSTANARM code

library(MCMCpack)
partridge.mcmc = as.matrix(partridge.rstanarm)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

              (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
Preg1 - None            0              1              0              0              0      0
Preg8 - None            0              0              1              0              0      0
Virg1 - None            0              0              0              1              0      0
Virg8 - None            0              0              0              0              1      0
Preg8 - Preg1           0             -1              1              0              0      0
Virg1 - Preg1           0             -1              0              1              0      0
Virg8 - Preg1           0             -1              0              0              1      0
Virg1 - Preg8           0              0             -1              1              0      0
Virg8 - Preg8           0              0             -1              0              1      0
Virg8 - Virg1           0              0              0             -1              1      0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

            term    estimate  std.error    conf.low    conf.high
1   Preg1 - None  0.02234996 0.02407488 -0.02338140  0.070865192
2   Preg8 - None  0.03641433 0.02440350 -0.01013371  0.086076915
3   Virg1 - None -0.05403215 0.02354898 -0.09925124 -0.006906492
4   Virg8 - None -0.18226103 0.02398424 -0.22966723 -0.136666158
5  Preg8 - Preg1  0.01406437 0.02412565 -0.02968572  0.065601507
6  Virg1 - Preg1 -0.07638211 0.02392343 -0.12328881 -0.029989351
7  Virg8 - Preg1 -0.20461099 0.02415189 -0.25341139 -0.159071623
8  Virg1 - Preg8 -0.09044648 0.02433552 -0.13815607 -0.042642378
9  Virg8 - Preg8 -0.21867536 0.02387648 -0.26542574 -0.170323352
10 Virg8 - Virg1 -0.12822888 0.02393159 -0.17425325 -0.080769365

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

## With back transformed
mcmc_areas(10^(coefs %*% t(pairwise.mat)))

(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

            term  estimate  std.error  conf.low conf.high
1   Preg1 - None 1.0544289 0.05852064 0.9411695 1.1700444
2   Preg8 - None 1.0891800 0.06122090 0.9669837 1.2075192
3   Virg1 - None 0.8843145 0.04804840 0.7933975 0.9816892
4   Virg8 - None 0.6582655 0.03637684 0.5892950 0.7300185
5  Preg8 - Preg1 1.0345108 0.05759814 0.9221479 1.1490921
6  Virg1 - Preg1 0.8399958 0.04635241 0.7520730 0.9324562
7  Virg8 - Preg1 0.6252592 0.03476928 0.5579414 0.6933115
8  Virg1 - Preg8 0.8132717 0.04565201 0.7240084 0.9027741
9  Virg8 - Preg8 0.6053146 0.03332360 0.5402115 0.6728358
10 Virg8 - Virg1 0.7454694 0.04105743 0.6646395 0.8252720

ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

Show BRMS code

library(MCMCpack)
partridge.mcmc = as.matrix(partridge.brm)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("b_", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat

              (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
Preg1 - None            0              1              0              0              0      0
Preg8 - None            0              0              1              0              0      0
Virg1 - None            0              0              0              1              0      0
Virg8 - None            0              0              0              0              1      0
Preg8 - Preg1           0             -1              1              0              0      0
Virg1 - Preg1           0             -1              0              1              0      0
Virg8 - Preg1           0             -1              0              0              1      0
Virg1 - Preg8           0              0             -1              1              0      0
Virg8 - Preg8           0              0             -1              0              1      0
Virg8 - Virg1           0              0              0             -1              1      0

mcmc_areas(coefs %*% t(pairwise.mat))

(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))

            term    estimate  std.error     conf.low    conf.high
1   Preg1 - None  0.02240460 0.02401104 -0.024874580  0.068487474
2   Preg8 - None  0.03620256 0.02384339 -0.008463557  0.084372278
3   Virg1 - None -0.05404743 0.02381586 -0.100910410 -0.008313408
4   Virg8 - None -0.18198728 0.02386935 -0.228420724 -0.135256737
5  Preg8 - Preg1  0.01379796 0.02421139 -0.033271153  0.061604483
6  Virg1 - Preg1 -0.07645203 0.02403209 -0.123225559 -0.028926763
7  Virg8 - Preg1 -0.20439189 0.02427000 -0.250323885 -0.155967121
8  Virg1 - Preg8 -0.09024999 0.02394566 -0.136639855 -0.043432356
9  Virg8 - Preg8 -0.21818985 0.02415905 -0.265948399 -0.171503685
10 Virg8 - Virg1 -0.12793985 0.02412511 -0.176962850 -0.081758344

ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

## With back transformed
mcmc_areas(10^(coefs %*% t(pairwise.mat)))

(comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))

            term  estimate  std.error  conf.low conf.high
1   Preg1 - None 1.0545530 0.05836951 0.9443336 1.1708128
2   Preg8 - None 1.0885725 0.05986066 0.9797715 1.2133891
3   Virg1 - None 0.8843113 0.04850064 0.7926648 0.9810397
4   Virg8 - None 0.6586722 0.03629164 0.5909888 0.7323914
5  Preg8 - Preg1 1.0338855 0.05765372 0.9262513 1.1524033
6  Virg1 - Preg1 0.8398717 0.04652671 0.7490572 0.9312345
7  Virg8 - Preg1 0.6255845 0.03497534 0.5603392 0.6965585
8  Virg1 - Preg8 0.8135973 0.04483752 0.7239839 0.8980175
9  Virg8 - Preg8 0.6060138 0.03377457 0.5403503 0.6717646
10 Virg8 - Virg1 0.7459856 0.04150406 0.6613482 0.8240819

ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
    coord_flip() + theme_classic()

So (for example), the longevity of male fruitflies with 8 virgin partners is approximately 0.61 that of male fruitflies with 8 pregnant partners. Similar contrasts to those defined in Tutorial 7.4b, Q3 could be defined to explore specific comparisons.

Explore $R^2$

Show MCMCpack code

library(MCMCpack)
library(broom)
partridge.mcmc <- partridge.mcmcpack
Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.6984863 0.02600139 0.6472273 0.7451868

# for comparison with frequentist
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Show JAGS code

partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.6983588 0.02576383 0.6474857 0.7445594

# for comparison with frequentist
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Show RSTAN code

partridge.mcmc = as.matrix(partridge.rstan)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
coefs = partridge.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term estimate std.error  conf.low conf.high
1 var1 0.376688  0.148472 0.1275442 0.6879338

# for comparison with frequentist
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Show RSTANRM code

partridge.mcmc = as.matrix(partridge.rstanarm)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate std.error  conf.low conf.high
1 var1 0.3773228 0.1478388 0.1241807 0.6818132

# for comparison with frequentist
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Show BRMS code

partridge.mcmc = as.matrix(partridge.brm)
Xmat = model.matrix(~TREATMENT + THORAX, newdata)
wch = grep("b_", colnames(partridge.mcmc))
coefs = partridge.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate std.error  conf.low conf.high
1 var1 0.3771413 0.1483151 0.1227373 0.6817201

# for comparison with frequentist
summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))

Call:
lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.226736 -0.058445 -0.003469  0.059961  0.170389 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
THORAX          1.19385    0.09900  12.060  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08364 on 119 degrees of freedom
Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16

Heterogeneous slopes

Constable (1993) compared the inter-radial suture widths of urchins maintained on one of three food regimes (Initial: no additional food supplied above what was in the initial sample, low: food supplied periodically and high: food supplied ad libitum. In an attempt to control for substantial variability in urchin sizes, the initial body volume of each urchin was measured as a covariate.

Download Partridge1 data set

Format of constable.csv data file

TREAT	IV	SUTW
Initial	3.5	0.010
Initial	5.0	0.020
Initial	8.0	0.061
..	..	..

TREAT	Categorical listing of the foot treatment (Initial: no additional food supplied above what was in the initial sample, low: food supplied periodically and high: food supplied ad libium)
IV	Continuous covariate, the initial volume of the urchin
SUTW	Width of the suture. Response variable.

Open

the constable data file.

Show code

constable <- read.csv("../downloads/data/constable.csv", strip.white = T)
head(constable)

    TREAT   IV  SUTW
1 Initial  3.5 0.010
2 Initial  5.0 0.020
3 Initial  8.0 0.061
4 Initial 10.0 0.051
5 Initial 13.0 0.041
6 Initial 13.0 0.061

Perform basic exploratory data analysis to examined.

normality
homogeneity of variance
homogeneity of slopes (and equality of covariate ranges)

View full code

ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth() + geom_point() + theme_classic()

# Or with linear smoother
ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth(method = "lm") + geom_point() +
    theme_classic()

It would appear that it is likely that homogeneity of slopes has been violated - the slope associated with the Low treatment appears less steep than the slopes associated with either High or Initial. Although the range of the continuous covariate (IV) is slightly less for the High treatment (compared to the Initial and Low treatments), the difference is not dramatic (and appears broadly symmetrical).

To obtain a quick estimation of the likely residuals, lets explore the residuals from a simple linear model fitted in a frequentist framework.
View full code
plot(lm(SUTW ~ IV * TREAT, data = constable))

Constable applied a third root transform on IV to improve normality and linearity. For consistency sake, we will do the same.

View full code

plot(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Given the likely heterogeneous slopes, it is arguably best to fit a model that permits heterogeneous slopes (that is, includes the IV by TREAT interaction). Fit the appropriate Bayesian model to explore the effect of treatment and inital volume (and their interaction) on the width of sutures. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,10)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$

View MCMCpack code

library(MCMCpack)
constable.mcmcpack = MCMCregress(SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Show JAGS code

modelString = "
  model {
  #Likelihood
  for (i in 1:n) {
  y[i]~dnorm(mu[i],tau)
  mu[i] <- beta0 + inprod(beta[],X[i,])
  }
  #Priors
  beta0 ~ dnorm(0.01,1.0E-6)
  for (j in 1:nX) {
  beta[j] ~ dnorm(0.01,1.0E-6)
  }
  tau <- 1 / (sigma * sigma)
  sigma~dunif(0,100)
  }
  "

X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
constable.list <- with(constable, list(y = SUTW, X = X[, -1], nX = ncol(X) -
    1, n = nrow(constable)))

params <- c("beta0", "beta", "sigma")
burnInSteps = 3000
nChains = 3
numSavedSteps = 15000
thinSteps = 10
nIter = ceiling((numSavedSteps * thinSteps)/nChains)

constable.r2jags <- jags(data = constable.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)

Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 72
   Unobserved stochastic nodes: 7
   Total graph size: 645

Initializing model

Show RSTAN code

					  modelString=" 
					  data { 
					  int n;   // total number of observations 
					  vector[n] Y;      // response variable 
					  int nX;  // number of effects 
					  matrix[n, nX] X;   // model matrix 
					  } 
					  transformed data { 
					  matrix[n, nX - 1] Xc;  // centered version of X 
					  vector[nX - 1] means_X;  // column means of X before centering 
					  
					  for (i in 2:nX) { 
					  means_X[i - 1] = mean(X[, i]); 
					  Xc[, i - 1] = X[, i] - means_X[i - 1]; 
					  }  
					  } 
					  parameters { 
					  vector[nX-1] beta;  // population-level effects 
					  real cbeta0;  // center-scale intercept 
					  real sigma;  // residual SD 
					  } 
					  transformed parameters { 
					  } 
					  model { 
					  vector[n] mu; 
					  mu = Xc * beta + cbeta0; 
					  // prior specifications 
					  beta ~ normal(0, 10); 
					  cbeta0 ~ normal(0, 10); 
					  sigma ~ cauchy(0, 5); 
					  // likelihood contribution 
					  Y ~ normal(mu, sigma); 
					  } 
					  generated quantities { 
					  real beta0;  // population-level intercept 
					  vector[n] log_lik;
					  beta0 = cbeta0 - dot_product(means_X, beta);
					  for (i in 1:n) {
					  log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma);
					  }
					  }
					  "

X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
constable.list <- with(constable, list(Y = SUTW, X = X, nX = ncol(X), n = nrow(constable)))

library(rstan)
constable.rstan <- stan(data = constable.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500,
    thin = 5)

SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1).

Gradient evaluation took 1.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.144418 seconds (Warm-up)
               0.984664 seconds (Sampling)
               1.12908 seconds (Total)


SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2).

Gradient evaluation took 1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.140768 seconds (Warm-up)
               1.06787 seconds (Sampling)
               1.20863 seconds (Total)


SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3).

Gradient evaluation took 1.6e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 5000 [  0%]  (Warmup)
Iteration:  500 / 5000 [ 10%]  (Warmup)
Iteration:  501 / 5000 [ 10%]  (Sampling)
Iteration: 1000 / 5000 [ 20%]  (Sampling)
Iteration: 1500 / 5000 [ 30%]  (Sampling)
Iteration: 2000 / 5000 [ 40%]  (Sampling)
Iteration: 2500 / 5000 [ 50%]  (Sampling)
Iteration: 3000 / 5000 [ 60%]  (Sampling)
Iteration: 3500 / 5000 [ 70%]  (Sampling)
Iteration: 4000 / 5000 [ 80%]  (Sampling)
Iteration: 4500 / 5000 [ 90%]  (Sampling)
Iteration: 5000 / 5000 [100%]  (Sampling)

 Elapsed Time: 0.15773 seconds (Warm-up)
               1.077 seconds (Sampling)
               1.23473 seconds (Total)

Show RSTANARM code

constable.rstanarm = stan_glm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable,
    iter = 5000, warmup = 500, chains = 3, thin = 5, refresh = 0, prior_intercept = normal(0,
        10), prior = normal(0, 10), prior_aux = cauchy(0, 5))

Gradient evaluation took 2.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.28 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.276502 seconds (Warm-up)
               1.72962 seconds (Sampling)
               2.00612 seconds (Total)


Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.256832 seconds (Warm-up)
               1.64027 seconds (Sampling)
               1.8971 seconds (Total)


Gradient evaluation took 1.3e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.263878 seconds (Warm-up)
               1.5121 seconds (Sampling)
               1.77597 seconds (Total)

Show BRMS code

constable.brm = brm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable, iter = 5000,
    warmup = 500, chains = 3, thin = 5, refresh = 0, prior = c(prior(normal(0,
        10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0,
        5), class = "sigma")))

Gradient evaluation took 1.6e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.143567 seconds (Warm-up)
               1.0836 seconds (Sampling)
               1.22716 seconds (Total)


Gradient evaluation took 9e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.133804 seconds (Warm-up)
               1.12441 seconds (Sampling)
               1.25822 seconds (Total)


Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.139708 seconds (Warm-up)
               1.10982 seconds (Sampling)
               1.24953 seconds (Total)

Explore MCMC diagnostics

Show MCMCpack code

library(MCMCpack)
plot(constable.mcmcpack)

raftery.diag(constable.mcmcpack)

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                                
                          Burn-in  Total Lower bound  Dependence
                          (M)      (N)   (Nmin)       factor (I)
 (Intercept)              2        3865  3746         1.030     
 I(IV^(1/3))              2        3650  3746         0.974     
 TREATInitial             2        3710  3746         0.990     
 TREATLow                 2        3710  3746         0.990     
 I(IV^(1/3)):TREATInitial 2        3994  3746         1.070     
 I(IV^(1/3)):TREATLow     2        3897  3746         1.040     
 sigma2                   2        3929  3746         1.050

autocorr.diag(constable.mcmcpack)

        (Intercept)  I(IV^(1/3)) TREATInitial      TREATLow I(IV^(1/3)):TREATInitial
Lag 0   1.000000000  1.000000000  1.000000000  1.0000000000              1.000000000
Lag 1  -0.005949108 -0.008338260 -0.008217864  0.0004490301             -0.010759741
Lag 5   0.020882211  0.023896208 -0.003101549  0.0115059926              0.001748665
Lag 10  0.015981305  0.019153534 -0.002528803  0.0022271117              0.001748848
Lag 50  0.007896860  0.005767901  0.008420514 -0.0058100440              0.003191181
       I(IV^(1/3)):TREATLow       sigma2
Lag 0          1.0000000000 1.0000000000
Lag 1         -0.0006921279 0.0907121678
Lag 5          0.0146699538 0.0005388206
Lag 10         0.0066382223 0.0170684058
Lag 50        -0.0064255545 0.0075084152

Show JAGS code

library(R2jags)
library(coda)
constable.mcmc = as.mcmc(constable.r2jags)
plot(constable.mcmc)

raftery.diag(constable.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       38330 3746         10.20     
 beta[1]  10       37670 3746         10.10     
 beta[2]  10       37660 3746         10.10     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       38330 3746         10.20     
 beta[5]  20       39000 3746         10.40     
 deviance 10       37660 3746         10.10     
 sigma    20       37020 3746          9.88     


[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       37020 3746          9.88     
 beta[1]  20       36380 3746          9.71     
 beta[2]  10       37660 3746         10.10     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       36380 3746          9.71     
 beta[5]  20       36380 3746          9.71     
 deviance 20       38330 3746         10.20     
 sigma    20       38330 3746         10.20     


[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       38330 3746         10.20     
 beta[1]  20       39000 3746         10.40     
 beta[2]  10       37660 3746         10.10     
 beta[3]  10       37660 3746         10.10     
 beta[4]  20       38330 3746         10.20     
 beta[5]  20       36380 3746          9.71     
 deviance 10       37660 3746         10.10     
 sigma    20       36380 3746          9.71

autocorr.diag(constable.mcmc)

                beta0       beta[1]      beta[2]       beta[3]      beta[4]       beta[5]
Lag 0    1.0000000000  1.0000000000  1.000000000  1.0000000000  1.000000000  1.0000000000
Lag 10  -0.0008583855 -0.0008855223  0.005002736  0.0044326527  0.005410357  0.0049190350
Lag 50  -0.0046942718 -0.0052147927 -0.002716165  0.0001732884 -0.003266711  0.0001285954
Lag 100  0.0073399128  0.0063289399  0.005797145  0.0107637633  0.004719665  0.0102551639
Lag 500  0.0009396995  0.0022012866 -0.004393656 -0.0112211954 -0.003541198 -0.0096620252
            deviance        sigma
Lag 0    1.000000000  1.000000000
Lag 10   0.003079761 -0.004958203
Lag 50  -0.013198225 -0.008532456
Lag 100 -0.006321487 -0.014973307
Lag 500  0.007831084 -0.006825210

Show RSTAN code

library(rstan)
library(coda)
s = as.array(constable.rstan)
constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]", "beta[4]",
    "beta[5]", "beta[6]", "sigma")], 2, as.mcmc))

Error in s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]", "beta[4]", : subscript out of bounds

plot(constable.mcmc)

raftery.diag(constable.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       38330 3746         10.20     
 beta[1]  10       37670 3746         10.10     
 beta[2]  10       37660 3746         10.10     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       38330 3746         10.20     
 beta[5]  20       39000 3746         10.40     
 deviance 10       37660 3746         10.10     
 sigma    20       37020 3746          9.88     


[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       37020 3746          9.88     
 beta[1]  20       36380 3746          9.71     
 beta[2]  10       37660 3746         10.10     
 beta[3]  20       37020 3746          9.88     
 beta[4]  20       36380 3746          9.71     
 beta[5]  20       36380 3746          9.71     
 deviance 20       38330 3746         10.20     
 sigma    20       38330 3746         10.20     


[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 
                                                
          Burn-in  Total Lower bound  Dependence
          (M)      (N)   (Nmin)       factor (I)
 beta0    20       38330 3746         10.20     
 beta[1]  20       39000 3746         10.40     
 beta[2]  10       37660 3746         10.10     
 beta[3]  10       37660 3746         10.10     
 beta[4]  20       38330 3746         10.20     
 beta[5]  20       36380 3746          9.71     
 deviance 10       37660 3746         10.10     
 sigma    20       36380 3746          9.71

autocorr.diag(constable.mcmc)

                beta0       beta[1]      beta[2]       beta[3]      beta[4]       beta[5]
Lag 0    1.0000000000  1.0000000000  1.000000000  1.0000000000  1.000000000  1.0000000000
Lag 10  -0.0008583855 -0.0008855223  0.005002736  0.0044326527  0.005410357  0.0049190350
Lag 50  -0.0046942718 -0.0052147927 -0.002716165  0.0001732884 -0.003266711  0.0001285954
Lag 100  0.0073399128  0.0063289399  0.005797145  0.0107637633  0.004719665  0.0102551639
Lag 500  0.0009396995  0.0022012866 -0.004393656 -0.0112211954 -0.003541198 -0.0096620252
            deviance        sigma
Lag 0    1.000000000  1.000000000
Lag 10   0.003079761 -0.004958203
Lag 50  -0.013198225 -0.008532456
Lag 100 -0.006321487 -0.014973307
Lag 500  0.007831084 -0.006825210

library(rstan)
library(coda)
stan_ac(constable.rstan, pars = c("beta0", "beta", "sigma"))

stan_rhat(constable.rstan, pars = c("beta0", "beta", "sigma"))

stan_ess(constable.rstan, pars = c("beta0", "beta", "sigma"))

# using Bayeseplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(constable.rstan), regex_par = "beta|sigma")

mcmc_trace(as.array(constable.rstan), regex_pars = "beta|sigma")

mcmc_dens(as.array(constable.rstan), regex_par = "beta|sigma")

detach("package:reshape")
library(bayesplot)
mcmc_combo(as.array(constable.rstan), regex_par = "beta|sigma")

Show RSTANARM code

library(rstanarm)
library(coda)
s = as.array(constable.rstanarm)
colnames(as.matrix(constable.rstanarm))

[1] "(Intercept)"              "I(IV^(1/3))"              "TREATInitial"            
[4] "TREATLow"                 "I(IV^(1/3)):TREATInitial" "I(IV^(1/3)):TREATLow"    
[7] "sigma"

constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , colnames(as.matrix(constable.rstanarm))], 2,
    as.mcmc))
plot(constable.mcmc)

raftery.diag(constable.mcmc)

$`1`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`2`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

$`3`

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

autocorr.diag(constable.mcmc)

        (Intercept)  I(IV^(1/3)) TREATInitial     TREATLow I(IV^(1/3)):TREATInitial
Lag 0   1.000000000  1.000000000   1.00000000  1.000000000              1.000000000
Lag 1   0.094390624  0.093078031   0.09427246  0.088553480              0.094495420
Lag 5   0.014550240  0.017891288   0.02036189 -0.003087282              0.025476705
Lag 10 -0.018253834 -0.015818047  -0.01071217 -0.007707348             -0.006127881
Lag 50  0.003953687  0.002826433   0.01488185  0.010416070              0.012186204
       I(IV^(1/3)):TREATLow        sigma
Lag 0           1.000000000  1.000000000
Lag 1           0.085966027  0.019297999
Lag 5           0.004263184 -0.001089781
Lag 10         -0.007911326  0.013127724
Lag 50          0.002241020  0.019541951

library(rstanarm)
library(coda)
stan_ac(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")

stan_rhat(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")

stan_ess(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")

# using Bayeseplot
library(bayesplot)
detach("package:reshape")
mcmc_trace(as.array(constable.rstanarm), regex_par = "Intercept|TREAT|IV|sigma")

mcmc_trace(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")

mcmc_dens(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")

detach("package:reshape")
library(bayesplot)
mcmc_combo(as.array(constable.rstanarm), regex_par = "Intercept|TREAT|IV|sigma")

library(rstanarm)
posterior_vs_prior(constable.rstanarm, color_by = "vs", group_by = TRUE,
    facet_args = list(scales = "free_y"))

Gradient evaluation took 2.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.048119 seconds (Warm-up)
               0.080688 seconds (Sampling)
               0.128807 seconds (Total)


Gradient evaluation took 1.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.038981 seconds (Warm-up)
               0.124669 seconds (Sampling)
               0.16365 seconds (Total)

Show BRMS code

library(coda)
library(brms)
constable.mcmc = as.mcmc(constable.brm)
plot(constable.mcmc)

raftery.diag(constable.mcmc)

[[1]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[2]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

[[3]]

Quantile (q) = 0.025
Accuracy (r) = +/- 0.005
Probability (s) = 0.95 

You need a sample size of at least 3746 with these values of q, r and s

autocorr.diag(constable.mcmc)

Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified

library(coda)
stan_ac(constable.brm$fit)

stan_rhat(constable.brm$fit)

stan_ess(constable.brm$fit)

The chains appear well mixed and have converged on what is likely to be a stable posterior.

Perform model validation

Show MCMCpack code

library(MCMCpack)
constable.mcmc = as.data.frame(constable.mcmcpack)
# generate a model matrix
newdata = constable
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = apply(constable.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = constable$SUTW - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against TREAT level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

library(MCMCpack)
constable.mcmc = as.matrix(constable.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = (coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
    fit[i, ], sqrt(constable.mcmc[i, "sigma2"])))
newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
    gather(key = Sample, value = Value, -TREAT, -IV)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
    color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = IV, group = IV, color = TREAT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")

mcmc_areas(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")

Show JAGS code

constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
newdata = constable
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = apply(constable.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = constable$SUTW - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against TREAT level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
# generate a model matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = (coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
    fit[i, ], constable.mcmc[i, "sigma"]))
newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
    gather(key = Sample, value = Value, -TREAT, -IV)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
    color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = IV, group = IV, color = TREAT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

mcmc_areas(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")

Show RSTAN code

constable.mcmc = as.matrix(constable.rstan)
# generate a model matrix
newdata = constable
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = apply(constable.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = constable$SUTW - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against TREAT level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

constable.mcmc = as.matrix(constable.rstan)
# generate a model matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = (coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
    fit[i, ], constable.mcmc[i, "sigma"]))
newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
    gather(key = Sample, value = Value, -TREAT, -IV)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
    color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = IV, group = IV, color = TREAT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(constable.rstan), regex_pars = "beta|sigma")

mcmc_areas(as.matrix(constable.rstan), regex_pars = "beta|sigma")

Show RSTANARM code

constable.mcmc = as.matrix(constable.rstanarm)
# generate a model matrix
newdata = constable
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = apply(constable.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = (constable$SUTW) - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against TREAT level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

constable.mcmc = as.matrix(constable.rstanarm)
# generate a model matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = (coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
    fit[i, ], constable.mcmc[i, "sigma"]))
newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
    gather(key = Sample, value = Value, -TREAT, -IV)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
    color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = IV, group = IV, color = TREAT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")

mcmc_areas(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")

Show BRMS code

constable.mcmc = as.matrix(constable.brm)
# generate a model matrix
newdata = constable
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("b_", colnames(constable.mcmc))
coefs = apply(constable.mcmc[, wch], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = constable$SUTW - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))

# Plot residuals against TREAT level
newdata = newdata %>% cbind(fit, resid)
newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")

# Standardized residuals
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))

constable.mcmc = as.matrix(constable.brm)
# generate a model matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
## get median parameter estimates
wch = grep("b_", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = (coefs %*% t(Xmat))
## draw samples from this model
yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
    fit[i, ], constable.mcmc[i, "sigma"]))
newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
    gather(key = Sample, value = Value, -TREAT, -IV)
ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
    alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
    fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")

ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
    color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
    x = IV, group = IV, color = TREAT))

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(constable.brm), regex_pars = "b_|sigma")

mcmc_areas(as.matrix(constable.brm), regex_pars = "b_|sigma")

The model validation diagnostics all seem reasonable suggesting that the model is likely to be reliable.

Explore parameter estimates

Show MCMCpack code

library(MCMCpack)
summary(constable.mcmcpack)

Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                               Mean        SD  Naive SE Time-series SE
(Intercept)              -0.0281407 3.212e-02 3.212e-04      3.212e-04
I(IV^(1/3))               0.0467453 1.256e-02 1.256e-04      1.277e-04
TREATInitial             -0.0004898 3.950e-02 3.950e-04      3.950e-04
TREATLow                  0.0627394 3.903e-02 3.903e-04      3.903e-04
I(IV^(1/3)):TREATInitial -0.0113577 1.512e-02 1.512e-04      1.512e-04
I(IV^(1/3)):TREATLow     -0.0376004 1.480e-02 1.480e-04      1.480e-04
sigma2                    0.0004486 8.176e-05 8.176e-07      8.955e-07

2. Quantiles for each variable:

                               2.5%        25%        50%        75%      97.5%
(Intercept)              -0.0904732 -0.0498242 -0.0280381 -0.0068701  0.0348912
I(IV^(1/3))               0.0220741  0.0384528  0.0468257  0.0552233  0.0709941
TREATInitial             -0.0774232 -0.0271221 -0.0004452  0.0258165  0.0769013
TREATLow                 -0.0146895  0.0369222  0.0626721  0.0889185  0.1394793
I(IV^(1/3)):TREATInitial -0.0409856 -0.0213530 -0.0114781 -0.0011979  0.0179675
I(IV^(1/3)):TREATLow     -0.0663009 -0.0474367 -0.0376417 -0.0278228 -0.0084968
sigma2                    0.0003162  0.0003897  0.0004405  0.0004971  0.0006353

library(broom)
tidyMCMC(constable.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")

                      term      estimate    std.error      conf.low     conf.high
1              (Intercept) -0.0281407318 3.211710e-02 -0.0921183584  0.0328974751
2              I(IV^(1/3))  0.0467453220 1.256260e-02  0.0216638045  0.0704309315
3             TREATInitial -0.0004897981 3.950309e-02 -0.0768221293  0.0772328696
4                 TREATLow  0.0627394109 3.902797e-02 -0.0164210766  0.1365547275
5 I(IV^(1/3)):TREATInitial -0.0113576982 1.511950e-02 -0.0399768488  0.0185830940
6     I(IV^(1/3)):TREATLow -0.0376004095 1.480079e-02 -0.0661201076 -0.0083137914
7                   sigma2  0.0004485999 8.176199e-05  0.0003025742  0.0006110242

Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect effect
for (i in 1:length(terms)) mcmcpvalue(constable.mcmcpack[, i])
## Effect of covariate
mcmcpvalue(constable.mcmcpack[, which(terms == 1)])

[1] 6e-04

## Effect of treatment (marginal)
mcmcpvalue(constable.mcmcpack[, which(terms == 2)])

[1] 0.096

## Effect of interaction
mcmcpvalue(constable.mcmcpack[, which(terms == 3)])

[1] 0.0165

# Overal model
wch = grep("TREAT|IV", colnames(constable.mcmcpack))
mcmcpvalue(constable.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show JAGS code

print(constable.r2jags)

Inference for Bugs model at "5", fit using jags,
 3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 14100 iterations saved
          mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
beta[1]     0.047   0.012    0.022    0.039    0.047    0.055    0.071 1.001 14000
beta[2]     0.000   0.039   -0.077   -0.026    0.000    0.025    0.076 1.001 14000
beta[3]     0.063   0.038   -0.013    0.037    0.063    0.088    0.139 1.001 14000
beta[4]    -0.011   0.015   -0.040   -0.021   -0.012   -0.001    0.018 1.001 14000
beta[5]    -0.038   0.015   -0.067   -0.047   -0.038   -0.028   -0.009 1.001 14000
beta0      -0.028   0.032   -0.090   -0.049   -0.028   -0.007    0.035 1.001 14000
sigma       0.021   0.002    0.018    0.020    0.021    0.022    0.025 1.001  6700
deviance -354.340   3.998 -360.045 -357.282 -355.026 -352.167 -344.666 1.001 14000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 8.0 and DIC = -346.3
DIC is an estimate of expected predictive error (lower deviance is better).

library(broom)
tidyMCMC(constable.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")

      term      estimate   std.error      conf.low     conf.high
1  beta[1]  4.670045e-02 0.012301012    0.02305671  7.161889e-02
2  beta[2] -4.557708e-04 0.038871071   -0.07545077  7.762928e-02
3  beta[3]  6.281161e-02 0.038401967   -0.01312925  1.383286e-01
4  beta[4] -1.136903e-02 0.014860096   -0.04083542  1.740297e-02
5  beta[5] -3.762094e-02 0.014575422   -0.06722313 -9.735764e-03
6    beta0 -2.801120e-02 0.031515102   -0.08943297  3.500509e-02
7 deviance -3.543402e+02 3.997734559 -360.86935897 -3.465127e+02
8    sigma  2.089732e-02 0.001885792    0.01735789  2.460693e-02

constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]
str(constable.mcmc)

 num [1:14100, 1:6] -0.0208 -0.00522 -0.01208 -0.00336 0.07698 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...

Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect effect
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))

[1] 0.3675887
[1] 0.000141844
[1] 0.9907801
[1] 0.1035461
[1] 0.4405674
[1] 0.01078014

## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])

[1] 0.000141844

## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])

[1] 0.08524823

## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])

[1] 0.01297872

# Overal model
wch = grep("beta\\[", colnames(constable.mcmc))
mcmcpvalue(constable.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show RSTAN code

library(broom)
tidyMCMC(constable.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"),
    ess = TRUE, rhat = TRUE)

     term      estimate   std.error    conf.low   conf.high      rhat  ess
1   beta0 -0.0278369985 0.032630197 -0.08856619  0.03976331 1.0011292 1778
2 beta[1]  0.0466281311 0.012731092  0.01859167  0.06908683 1.0010891 2018
3 beta[2] -0.0007384085 0.039219652 -0.07668034  0.07500954 0.9997629 2152
4 beta[3]  0.0618907446 0.039193081 -0.01722613  0.13859100 1.0000331 1810
5 beta[4] -0.0112233626 0.014999941 -0.04061147  0.01841634 0.9998709 2106
6 beta[5] -0.0372331205 0.014919054 -0.06591119 -0.00682343 1.0001722 1767
7   sigma  0.0208891691 0.001872266  0.01713790  0.02446211 1.0004836 2571

constable.mcmc = as.matrix(constable.rstan)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]
str(constable.mcmc)

 num [1:2700, 1:6] -0.097 -0.0165 -0.0329 -0.0255 0.0274 ...
 - attr(*, "dimnames")=List of 2
  ..$ iterations: NULL
  ..$ parameters: chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...

Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect effect
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))

[1] 0.387037
[1] 0.0007407407
[1] 0.987037
[1] 0.1166667
[1] 0.422963
[1] 0.01444444

## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])

[1] 0.0007407407

## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])

[1] 0.0962963

## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])

[1] 0.02074074

# Overal model
wch = grep("beta\\[", colnames(constable.mcmc))
mcmcpvalue(constable.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show RSTANARM code

summary(constable.rstan)

$summary
                     mean      se_mean          sd         2.5%          25%           50%
beta[1]      4.662813e-02 2.834136e-04 0.012731092   0.02031882   0.03842780  4.691583e-02
beta[2]     -7.384085e-04 8.455188e-04 0.039219652  -0.07661515  -0.02554020  5.813345e-05
beta[3]      6.189074e-02 9.213031e-04 0.039193081  -0.01636681   0.03653035  6.230858e-02
beta[4]     -1.122336e-02 3.268895e-04 0.014999941  -0.04039101  -0.02063698 -1.126060e-02
beta[5]     -3.723312e-02 3.549614e-04 0.014919054  -0.06648371  -0.04703486 -3.747664e-02
cbeta0       7.244843e-02 4.899537e-05 0.002545874   0.06748716   0.07071114  7.243238e-02
sigma        2.088917e-02 3.692373e-05 0.001872266   0.01745036   0.01961179  2.082794e-02
beta0       -2.783700e-02 7.738103e-04 0.032630197  -0.09105514  -0.04933693 -2.819851e-02
log_lik[1]   2.555053e+00 8.279597e-03 0.412625825   1.48002609   2.37565170  2.668924e+00
log_lik[2]   2.689869e+00 5.886304e-03 0.291712678   1.94169390   2.56823620  2.765803e+00
log_lik[3]   2.481437e+00 6.408258e-03 0.321730766   1.70879340   2.29899530  2.541077e+00
log_lik[4]   2.897622e+00 2.274280e-03 0.116633788   2.64138042   2.83049352  2.907077e+00
log_lik[5]   2.705299e+00 3.510597e-03 0.174263133   2.31349245   2.61010610  2.728462e+00
log_lik[6]   2.876877e+00 2.286666e-03 0.117374730   2.61123537   2.80959929  2.887540e+00
log_lik[7]   2.560570e+00 3.857742e-03 0.194705349   2.12570324   2.45110459  2.581614e+00
log_lik[8]   2.752342e+00 2.840581e-03 0.146866659   2.40683117   2.67032992  2.770447e+00
log_lik[9]   1.775614e+00 6.846079e-03 0.348432336   0.99574820   1.56920160  1.809863e+00
log_lik[10]  2.778273e+00 2.698093e-03 0.135681420   2.48780917   2.70187045  2.792084e+00
log_lik[11]  2.673033e+00 3.005238e-03 0.156156766   2.33437255   2.58537417  2.689725e+00
log_lik[12]  2.686719e+00 2.928559e-03 0.152172402   2.33092385   2.59809399  2.701096e+00
log_lik[13]  1.661739e+00 6.924671e-03 0.359816441   0.87329102   1.45046387  1.695174e+00
log_lik[14]  2.316722e+00 4.462239e-03 0.231864737   1.78428279   2.18156727  2.335296e+00
log_lik[15]  2.333874e+00 4.626360e-03 0.240392696   1.79242627   2.19062150  2.361844e+00
log_lik[16]  2.745435e+00 2.852268e-03 0.148208202   2.40700659   2.66093197  2.759778e+00
log_lik[17]  2.845838e+00 2.312208e-03 0.120145827   2.58647190   2.77648380  2.855412e+00
log_lik[18]  2.849556e+00 2.509111e-03 0.128577370   2.55917320   2.77590000  2.860467e+00
log_lik[19]  2.822053e+00 2.724216e-03 0.141554399   2.49033298   2.74328713  2.837016e+00
log_lik[20]  2.873692e+00 2.581911e-03 0.131199533   2.55664162   2.80698230  2.888037e+00
log_lik[21]  2.026990e+00 8.336486e-03 0.433176512   1.04071341   1.76431161  2.086384e+00
log_lik[22]  2.787911e+00 3.709842e-03 0.192769033   2.29265457   2.70351788  2.824156e+00
log_lik[23]  2.399441e+00 6.861572e-03 0.356537753   1.53604209   2.20306355  2.459279e+00
log_lik[24]  2.839579e+00 3.259153e-03 0.169350543   2.39523909   2.76589671  2.871430e+00
log_lik[25]  2.762956e+00 4.962960e-03 0.241860710   2.11040028   2.66516845  2.820364e+00
log_lik[26]  2.334286e+00 7.549496e-03 0.372101667   1.41557147   2.13302617  2.400365e+00
log_lik[27]  2.723067e+00 4.487012e-03 0.218326652   2.16329698   2.61613234  2.763296e+00
log_lik[28]  2.078010e+00 7.032911e-03 0.364625211   1.26525032   1.86876336  2.121607e+00
log_lik[29]  2.581916e+00 4.920716e-03 0.255687918   1.97016520   2.45489963  2.616756e+00
log_lik[30]  2.886086e+00 2.276165e-03 0.114568562   2.63526733   2.82031241  2.895578e+00
log_lik[31]  2.910017e+00 2.060950e-03 0.104986513   2.68691980   2.84838520  2.915109e+00
log_lik[32]  2.222772e+00 5.339118e-03 0.275582619   1.61533570   2.06599337  2.248303e+00
log_lik[33]  2.924467e+00 1.925423e-03 0.097708975   2.72776828   2.86256038  2.927915e+00
log_lik[34]  2.924467e+00 1.925423e-03 0.097708975   2.72776828   2.86256038  2.927915e+00
log_lik[35]  2.752860e+00 2.808358e-03 0.144358905   2.43182351   2.67047426  2.770488e+00
log_lik[36]  2.490421e+00 3.964137e-03 0.200794362   2.03246369   2.37362879  2.510642e+00
log_lik[37]  2.930899e+00 1.871049e-03 0.095021752   2.73934583   2.86974024  2.932586e+00
log_lik[38]  2.929999e+00 1.879522e-03 0.095393674   2.73761400   2.86804181  2.931895e+00
log_lik[39]  2.438143e+00 4.251045e-03 0.219655037   1.93589288   2.30486686  2.461310e+00
log_lik[40]  2.927682e+00 1.885400e-03 0.096231668   2.73082310   2.86582223  2.929051e+00
log_lik[41]  2.872912e+00 2.539450e-03 0.131822831   2.56366237   2.80429922  2.889157e+00
log_lik[42]  2.615441e+00 5.050475e-03 0.262430360   1.97208172   2.47776371  2.660553e+00
log_lik[43]  2.864143e+00 2.761726e-03 0.143503495   2.50170516   2.79416131  2.883797e+00
log_lik[44]  2.851867e+00 2.988698e-03 0.155297296   2.45692210   2.77823353  2.876845e+00
log_lik[45]  2.873560e+00 2.681272e-03 0.139322967   2.54704307   2.80559996  2.891678e+00
log_lik[46]  2.689669e+00 5.963575e-03 0.280196813   1.97318990   2.56913396  2.756082e+00
log_lik[47]  2.435259e+00 6.821403e-03 0.324916984   1.68842917   2.25753450  2.485678e+00
log_lik[48]  2.699319e+00 5.452140e-03 0.238778108   2.11922450   2.58832085  2.747008e+00
log_lik[49]  2.279966e+00 8.200409e-03 0.392277547   1.34804999   2.06478509  2.343175e+00
log_lik[50]  2.719443e+00 3.563200e-03 0.177344190   2.31249922   2.62214007  2.743720e+00
log_lik[51]  2.418619e+00 4.698342e-03 0.240938206   1.86752765   2.27902509  2.445047e+00
log_lik[52]  1.504658e+00 7.710691e-03 0.400659237   0.63779990   1.25467856  1.541890e+00
log_lik[53]  2.672471e+00 3.090797e-03 0.160602534   2.32001149   2.57212431  2.685993e+00
log_lik[54]  1.854814e+00 6.997527e-03 0.360937583   1.04480966   1.63603052  1.886166e+00
log_lik[55]  2.869673e+00 2.582922e-03 0.128647935   2.56622994   2.80382516  2.884795e+00
log_lik[56]  2.714235e+00 3.056682e-03 0.158474265   2.35245828   2.62540402  2.731998e+00
log_lik[57]  1.609668e+00 7.477827e-03 0.388559294   0.76105837   1.36902866  1.643380e+00
log_lik[58]  2.883513e+00 2.143228e-03 0.109107631   2.64053696   2.82097839  2.890654e+00
log_lik[59]  2.903621e+00 2.067792e-03 0.103244226   2.68945368   2.83949878  2.906093e+00
log_lik[60]  2.841091e+00 2.293749e-03 0.119186689   2.57997266   2.77081249  2.847705e+00
log_lik[61]  2.171238e+00 5.368653e-03 0.277867834   1.58346702   1.99630633  2.198158e+00
log_lik[62]  2.894702e+00 2.153447e-03 0.107158693   2.67477624   2.83273941  2.900048e+00
log_lik[63] -2.080309e+00 2.132534e-02 1.075508483  -4.42410661  -2.74679165 -1.978252e+00
log_lik[64]  2.529744e-01 1.288737e-02 0.665509296  -1.22137507  -0.16784435  3.028655e-01
log_lik[65]  2.866611e+00 2.735208e-03 0.134152799   2.55952547   2.79899747  2.881208e+00
log_lik[66]  1.357742e+00 1.521447e-02 0.735355609  -0.30509519   0.93351831  1.446848e+00
log_lik[67]  1.958808e+00 9.352149e-03 0.462405731   0.96385176   1.67520242  2.014435e+00
log_lik[68]  2.829446e+00 3.852587e-03 0.185710982   2.35089393   2.75399152  2.865516e+00
log_lik[69]  1.858093e+00 1.656846e-02 0.786121382   0.08716029   1.44708394  2.012602e+00
log_lik[70]  2.020418e+00 9.992458e-03 0.517707061   0.79164539   1.72680316  2.100529e+00
log_lik[71]  2.828951e+00 3.433464e-03 0.178408006   2.35566306   2.75449274  2.862395e+00
log_lik[72]  2.224567e+00 6.792642e-03 0.329938655   1.44633068   2.04324904  2.270276e+00
lp__         2.394017e+02 4.082506e-02 2.031752421 234.41983863 238.38380889  2.397764e+02
                      75%         97.5%    n_eff      Rhat
beta[1]       0.055023423   0.071624769 2017.855 1.0010891
beta[2]       0.023974137   0.074990062 2151.597 0.9997629
beta[3]       0.088135179   0.140132781 1809.730 1.0000331
beta[4]      -0.001792672   0.018702992 2105.607 0.9998709
beta[5]      -0.027204509  -0.007036234 1766.527 1.0001722
cbeta0        0.074176438   0.077529500 2700.000 0.9999105
sigma         0.022074894   0.024931207 2571.130 1.0004836
beta0        -0.006626736   0.037656284 1778.158 1.0011292
log_lik[1]    2.854702141   3.027767860 2483.673 0.9998988
log_lik[2]    2.895481029   3.046566686 2455.982 0.9999209
log_lik[3]    2.723139669   2.933952105 2520.607 0.9994487
log_lik[4]    2.972367347   3.102625886 2630.032 1.0005403
log_lik[5]    2.826250193   2.983700261 2464.047 0.9997244
log_lik[6]    2.955006090   3.080157288 2634.777 1.0002897
log_lik[7]    2.694386076   2.885570014 2547.355 0.9994033
log_lik[8]    2.852596334   2.992124738 2673.204 0.9994858
log_lik[9]    2.015621018   2.348219252 2590.318 0.9992977
log_lik[10]   2.870571012   3.011641778 2528.873 0.9998255
log_lik[11]   2.780840934   2.935968260 2700.000 0.9993728
log_lik[12]   2.791360985   2.936695879 2700.000 0.9991006
log_lik[13]   1.914075113   2.262740896 2700.000 0.9992730
log_lik[14]   2.481417212   2.695023365 2700.000 0.9989629
log_lik[15]   2.504695666   2.736091080 2700.000 0.9989709
log_lik[16]   2.845698153   2.998183094 2700.000 0.9993760
log_lik[17]   2.929761685   3.048195661 2700.000 0.9996054
log_lik[18]   2.933501427   3.068342788 2625.968 0.9996591
log_lik[19]   2.917454089   3.057175764 2700.000 0.9994765
log_lik[20]   2.958389389   3.088677398 2582.152 0.9995303
log_lik[21]   2.339246000   2.712177499 2700.000 0.9991631
log_lik[22]   2.916649381   3.066762554 2700.000 0.9992182
log_lik[23]   2.659415848   2.913365312 2700.000 0.9994161
log_lik[24]   2.949364739   3.078818204 2700.000 0.9996608
log_lik[25]   2.923960888   3.063099634 2374.921 0.9992894
log_lik[26]   2.596365553   2.875991184 2429.334 0.9990705
log_lik[27]   2.877477863   3.028095503 2367.550 0.9991650
log_lik[28]   2.331170183   2.673624000 2687.962 0.9990338
log_lik[29]   2.763275814   2.961246284 2700.000 0.9990731
log_lik[30]   2.963759598   3.084771224 2533.514 0.9998388
log_lik[31]   2.979865899   3.100417988 2594.970 1.0002251
log_lik[32]   2.419334769   2.679983872 2664.186 0.9990859
log_lik[33]   2.989942674   3.104349686 2575.233 1.0003795
log_lik[34]   2.989942674   3.104349686 2575.233 1.0003795
log_lik[35]   2.851052734   2.995707580 2642.301 0.9994472
log_lik[36]   2.632252355   2.819382873 2565.699 0.9994201
log_lik[37]   2.994852604   3.109550116 2579.145 1.0004301
log_lik[38]   2.994345912   3.109560401 2575.990 1.0004253
log_lik[39]   2.592934426   2.801667081 2669.874 0.9992269
log_lik[40]   2.993273134   3.109223738 2605.128 1.0003846
log_lik[41]   2.959792583   3.082061406 2694.648 0.9998075
log_lik[42]   2.806068088   2.982019372 2700.000 0.9992635
log_lik[43]   2.959054769   3.085870780 2700.000 0.9997016
log_lik[44]   2.954728408   3.084005662 2700.000 0.9996167
log_lik[45]   2.964382735   3.087830485 2700.000 0.9997871
log_lik[46]   2.884750873   3.039018211 2207.563 1.0007124
log_lik[47]   2.673113795   2.900754398 2268.806 1.0009325
log_lik[48]   2.864892011   3.029914666 1918.029 1.0007769
log_lik[49]   2.564203145   2.851315200 2288.316 1.0009860
log_lik[50]   2.841307097   2.998910586 2477.156 1.0004831
log_lik[51]   2.590685805   2.803870352 2629.796 1.0003989
log_lik[52]   1.785915792   2.185808724 2700.000 1.0004247
log_lik[53]   2.786300204   2.939195545 2700.000 0.9999469
log_lik[54]   2.109215956   2.466082445 2660.573 1.0002311
log_lik[55]   2.951541567   3.081437035 2480.750 1.0001909
log_lik[56]   2.826197665   2.975459589 2687.924 0.9999721
log_lik[57]   1.883403557   2.264911382 2700.000 1.0004782
log_lik[58]   2.956303334   3.078375450 2591.634 0.9998096
log_lik[59]   2.973835591   3.091447248 2492.975 1.0004020
log_lik[60]   2.921379864   3.054351592 2700.000 1.0001861
log_lik[61]   2.371166175   2.625066849 2678.834 0.9994100
log_lik[62]   2.967469714   3.092051497 2476.205 1.0004038
log_lik[63]  -1.345084019  -0.151002795 2543.522 0.9998128
log_lik[64]   0.716672150   1.386447846 2666.733 1.0004144
log_lik[65]   2.954903918   3.080417715 2405.574 1.0002802
log_lik[66]   1.901290620   2.479112735 2336.044 1.0002470
log_lik[67]   2.299771383   2.682749103 2444.689 1.0001665
log_lik[68]   2.947306682   3.076343209 2323.648 1.0001059
log_lik[69]   2.455324591   2.901446564 2251.204 1.0005565
log_lik[70]   2.404917313   2.781645283 2684.253 0.9993194
log_lik[71]   2.945926717   3.076691532 2700.000 0.9994939
log_lik[72]   2.449956379   2.748148303 2359.331 0.9991587
lp__        240.880868913 242.260342463 2476.783 1.0000042

$c_summary
, , chains = chain:1

             stats
parameter             mean          sd         2.5%          25%           50%           75%
  beta[1]       0.04643878 0.012628851   0.02049142   0.03819352   0.047001702   0.054672036
  beta[2]      -0.00140730 0.038707451  -0.07566133  -0.02657238  -0.001206666   0.023579548
  beta[3]       0.06144959 0.038584649  -0.01440674   0.03616748   0.062178000   0.086730348
  beta[4]      -0.01092832 0.014820893  -0.03970598  -0.01989798  -0.010928983  -0.001653754
  beta[5]      -0.03703277 0.014717248  -0.06681856  -0.04624826  -0.037645450  -0.027113155
  cbeta0        0.07233788 0.002567254   0.06744530   0.07047948   0.072270528   0.074185837
  sigma         0.02088342 0.001801060   0.01756746   0.01968722   0.020763381   0.022048136
  beta0        -0.02752243 0.032411022  -0.09009245  -0.04939535  -0.028054042  -0.006953943
  log_lik[1]    2.56619719 0.386706491   1.59400980   2.37328301   2.672661552   2.859236864
  log_lik[2]    2.69769984 0.272845128   1.98620346   2.56874004   2.761938188   2.897393450
  log_lik[3]    2.47594474 0.323693862   1.69910023   2.28717682   2.547969182   2.728703862
  log_lik[4]    2.89727496 0.111618302   2.64978004   2.83629549   2.906427271   2.970347414
  log_lik[5]    2.70914643 0.170436041   2.32168344   2.61641677   2.726831739   2.829876539
  log_lik[6]    2.87563467 0.114646737   2.60723928   2.81059665   2.885256035   2.957128443
  log_lik[7]    2.56457166 0.192918143   2.12902567   2.45110459   2.584004526   2.697265641
  log_lik[8]    2.75010690 0.147052720   2.40386344   2.66460552   2.771962688   2.852821975
  log_lik[9]    1.77134842 0.349560716   0.99003727   1.56550220   1.815295936   2.012138440
  log_lik[10]   2.78006235 0.133421540   2.47801156   2.70383799   2.792799223   2.868294665
  log_lik[11]   2.67475691 0.155625778   2.32203922   2.58018861   2.692522610   2.782818989
  log_lik[12]   2.68498164 0.155023970   2.29977060   2.59330457   2.698320431   2.798590880
  log_lik[13]   1.65985037 0.360418884   0.89068235   1.46004939   1.697094096   1.914231916
  log_lik[14]   2.31487288 0.235761992   1.77381209   2.17536525   2.334570894   2.487175516
  log_lik[15]   2.33468537 0.244450693   1.79728694   2.18363611   2.363513939   2.507809026
  log_lik[16]   2.74531350 0.146942403   2.40098995   2.66206681   2.759728108   2.845661447
  log_lik[17]   2.84473866 0.121651793   2.56637788   2.77018750   2.855964719   2.930368125
  log_lik[18]   2.84836050 0.124991501   2.56720774   2.77502286   2.860786724   2.930722856
  log_lik[19]   2.82067855 0.138656474   2.49797805   2.74079294   2.836743786   2.914811821
  log_lik[20]   2.87160993 0.127822155   2.55664162   2.80458157   2.887362248   2.956733309
  log_lik[21]   2.02326072 0.442043884   1.01995598   1.76598398   2.091522685   2.337897774
  log_lik[22]   2.78483581 0.191962506   2.29043026   2.70465406   2.825989913   2.914067188
  log_lik[23]   2.40144099 0.360625988   1.52170229   2.21428410   2.454859425   2.663272843
  log_lik[24]   2.83878972 0.171493897   2.36936858   2.77060961   2.870977392   2.951985745
  log_lik[25]   2.76133288 0.244915410   2.08467745   2.66901967   2.814460740   2.927555797
  log_lik[26]   2.33706705 0.373495823   1.40039185   2.14087735   2.401893557   2.586771140
  log_lik[27]   2.72367597 0.217117568   2.14375907   2.62772142   2.759618192   2.880211910
  log_lik[28]   2.07252272 0.377822224   1.22616249   1.87548557   2.126163599   2.335521250
  log_lik[29]   2.57746553 0.270483601   1.91450699   2.46143195   2.616645250   2.757665637
  log_lik[30]   2.88675406 0.110410301   2.64244539   2.82040964   2.894386823   2.965726700
  log_lik[31]   2.90888211 0.105058582   2.68203084   2.85311175   2.915896937   2.980100965
  log_lik[32]   2.21823719 0.281967292   1.59731029   2.06529730   2.244658537   2.418590396
  log_lik[33]   2.92415553 0.094569850   2.73343181   2.86704316   2.927491276   2.989958024
  log_lik[34]   2.92415553 0.094569850   2.73343181   2.86704316   2.927491276   2.989958024
  log_lik[35]   2.75061357 0.145843757   2.43537000   2.66769711   2.765627025   2.848613027
  log_lik[36]   2.49607600 0.198002437   2.04899646   2.38959501   2.512395924   2.637258523
  log_lik[37]   2.93143450 0.090363117   2.75250085   2.87302732   2.934560232   2.994320317
  log_lik[38]   2.93030317 0.090979234   2.74790741   2.87207759   2.932461862   2.992971635
  log_lik[39]   2.43531206 0.213065398   1.98449071   2.30651265   2.453205797   2.588139651
  log_lik[40]   2.92917402 0.091618923   2.74163152   2.87227354   2.933409764   2.992376304
  log_lik[41]   2.87407193 0.128092826   2.57449305   2.80683064   2.888796562   2.955843418
  log_lik[42]   2.61463105 0.254516015   2.01089367   2.49005984   2.656709687   2.797789416
  log_lik[43]   2.86815347 0.141087264   2.53640461   2.80010385   2.885622298   2.967183895
  log_lik[44]   2.85616001 0.152884332   2.47186398   2.78796614   2.878885033   2.960758216
  log_lik[45]   2.87559553 0.138250730   2.56917960   2.81058096   2.890960303   2.963947123
  log_lik[46]   2.69008281 0.283917921   1.96586691   2.57628890   2.763585968   2.887108630
  log_lik[47]   2.43747472 0.328414163   1.65197874   2.26339289   2.486732406   2.676129202
  log_lik[48]   2.69839958 0.236417416   2.17502125   2.58295372   2.745340511   2.861814041
  log_lik[49]   2.27883673 0.386355982   1.45222305   2.06555808   2.343264621   2.557833239
  log_lik[50]   2.71678324 0.179708493   2.32029715   2.61505324   2.742927293   2.842417539
  log_lik[51]   2.42328589 0.242035474   1.86492726   2.28534136   2.445765955   2.598111981
  log_lik[52]   1.51516071 0.405765066   0.63754932   1.24500374   1.561765287   1.811060194
  log_lik[53]   2.66809741 0.165060831   2.31515588   2.56318601   2.684335816   2.788183906
  log_lik[54]   1.84906107 0.357617969   1.07283639   1.63513126   1.878543706   2.100243272
  log_lik[55]   2.87035773 0.125928891   2.56220684   2.81045538   2.887266935   2.951675898
  log_lik[56]   2.71733895 0.155961154   2.35872958   2.63572326   2.737815662   2.825749716
  log_lik[57]   1.61918583 0.393637151   0.74655674   1.37246160   1.661039538   1.904720334
  log_lik[58]   2.88490353 0.104276129   2.64891234   2.82659054   2.895265125   2.953385111
  log_lik[59]   2.90174367 0.103735279   2.68693494   2.83570346   2.904555372   2.973073249
  log_lik[60]   2.83763561 0.121571323   2.58481459   2.76487096   2.840841214   2.926235407
  log_lik[61]   2.16193728 0.280500856   1.57032379   1.98537353   2.188761272   2.363179124
  log_lik[62]   2.89219948 0.107328018   2.67744335   2.83274848   2.896589033   2.967400958
  log_lik[63]  -2.10034498 1.041646594  -4.24981910  -2.73987793  -2.014045595  -1.380158421
  log_lik[64]   0.27561537 0.665348554  -1.20843591  -0.15119014   0.330771393   0.725514176
  log_lik[65]   2.86342990 0.134861235   2.55853032   2.79991943   2.876403988   2.951796612
  log_lik[66]   1.33790401 0.735276393  -0.38787265   0.92206097   1.431817388   1.869504079
  log_lik[67]   1.97541155 0.463248298   0.98445109   1.70116814   2.043817962   2.308607555
  log_lik[68]   2.82591211 0.188062849   2.33717742   2.75523581   2.863449237   2.944952099
  log_lik[69]   1.88260766 0.782023975   0.09272054   1.50201925   2.030936022   2.469296567
  log_lik[70]   2.03136356 0.502394575   0.76271761   1.72487556   2.107985446   2.397895212
  log_lik[71]   2.83371694 0.175854009   2.36938926   2.76275895   2.868649328   2.955230203
  log_lik[72]   2.22923323 0.331626335   1.42320658   2.05182700   2.271105437   2.446813604
  lp__        239.44026955 2.010619423 234.67689436 238.46085590 239.797177966 240.872201552
             stats
parameter             97.5%
  beta[1]       0.070997930
  beta[2]       0.073470942
  beta[3]       0.135432823
  beta[4]       0.017911495
  beta[5]      -0.006989883
  cbeta0        0.077166216
  sigma         0.024781110
  beta0         0.036654853
  log_lik[1]    3.035155201
  log_lik[2]    3.058167044
  log_lik[3]    2.914011898
  log_lik[4]    3.087978727
  log_lik[5]    2.996974110
  log_lik[6]    3.070112311
  log_lik[7]    2.889723644
  log_lik[8]    2.988011154
  log_lik[9]    2.346405109
  log_lik[10]   3.015262487
  log_lik[11]   2.935126827
  log_lik[12]   2.934023523
  log_lik[13]   2.262333605
  log_lik[14]   2.697797765
  log_lik[15]   2.735874665
  log_lik[16]   2.998391043
  log_lik[17]   3.039732820
  log_lik[18]   3.068541307
  log_lik[19]   3.056813475
  log_lik[20]   3.088247309
  log_lik[21]   2.719985017
  log_lik[22]   3.056647660
  log_lik[23]   2.918513926
  log_lik[24]   3.070462984
  log_lik[25]   3.058959782
  log_lik[26]   2.870613272
  log_lik[27]   3.021389854
  log_lik[28]   2.663906666
  log_lik[29]   2.957058564
  log_lik[30]   3.081060162
  log_lik[31]   3.099254939
  log_lik[32]   2.662048937
  log_lik[33]   3.102196023
  log_lik[34]   3.102196023
  log_lik[35]   3.000591811
  log_lik[36]   2.810342407
  log_lik[37]   3.102119633
  log_lik[38]   3.102970987
  log_lik[39]   2.792410045
  log_lik[40]   3.105380151
  log_lik[41]   3.087018410
  log_lik[42]   2.973726612
  log_lik[43]   3.071301206
  log_lik[44]   3.072751994
  log_lik[45]   3.093960956
  log_lik[46]   3.023221570
  log_lik[47]   2.887093200
  log_lik[48]   3.018906758
  log_lik[49]   2.855439411
  log_lik[50]   3.002832613
  log_lik[51]   2.803885796
  log_lik[52]   2.193960653
  log_lik[53]   2.939089632
  log_lik[54]   2.466902790
  log_lik[55]   3.072813188
  log_lik[56]   2.959424332
  log_lik[57]   2.264911382
  log_lik[58]   3.066635705
  log_lik[59]   3.087659949
  log_lik[60]   3.045881107
  log_lik[61]   2.622928278
  log_lik[62]   3.082103318
  log_lik[63]  -0.240078709
  log_lik[64]   1.404206011
  log_lik[65]   3.074284684
  log_lik[66]   2.444088025
  log_lik[67]   2.705141768
  log_lik[68]   3.070300109
  log_lik[69]   2.907808508
  log_lik[70]   2.770486038
  log_lik[71]   3.068707350
  log_lik[72]   2.761004676
  lp__        242.332331928

, , chains = chain:2

             stats
parameter              mean          sd         2.5%          25%           50%           75%
  beta[1]       0.046573687 0.012968365   0.01922452   0.03835562  4.661458e-02   0.055590980
  beta[2]      -0.000679594 0.040203025  -0.07731360  -0.02586551  5.623988e-04   0.024419305
  beta[3]       0.061865889 0.039153730  -0.01625011   0.03530993  6.245849e-02   0.088713322
  beta[4]      -0.011300228 0.015393923  -0.04019140  -0.02072802 -1.139336e-02  -0.001914545
  beta[5]      -0.037263246 0.014880247  -0.06671254  -0.04719578 -3.782046e-02  -0.027202808
  cbeta0        0.072487794 0.002452994   0.06759219   0.07086659  7.248487e-02   0.074086053
  sigma         0.020822161 0.001911947   0.01719854   0.01954794  2.080912e-02   0.021998851
  beta0        -0.027567000 0.033285659  -0.09202049  -0.04997486 -2.714554e-02  -0.006274045
  log_lik[1]    2.555705813 0.403423762   1.48454389   2.35885186  2.666857e+00   2.857311348
  log_lik[2]    2.692566472 0.282362063   1.98030750   2.55752521  2.773079e+00   2.895380478
  log_lik[3]    2.488818904 0.314253339   1.73848551   2.30260169  2.550341e+00   2.718012127
  log_lik[4]    2.904301284 0.114464995   2.66369293   2.83579996  2.910541e+00   2.973724689
  log_lik[5]    2.708181240 0.165508750   2.34289436   2.60738262  2.726828e+00   2.825767111
  log_lik[6]    2.882838378 0.116550615   2.64664631   2.81232773  2.891636e+00   2.959361921
  log_lik[7]    2.562381074 0.185216998   2.15783856   2.44856916  2.578104e+00   2.694194514
  log_lik[8]    2.757078048 0.143584590   2.44330464   2.67191527  2.772100e+00   2.858925630
  log_lik[9]    1.772948432 0.336986219   1.01470755   1.56772350  1.812002e+00   2.003120904
  log_lik[10]   2.782270219 0.129327353   2.51908851   2.70783879  2.789442e+00   2.872401573
  log_lik[11]   2.676493721 0.149165473   2.37028892   2.58833726  2.685417e+00   2.781310637
  log_lik[12]   2.689346179 0.147548781   2.34898298   2.59842165  2.703917e+00   2.790301885
  log_lik[13]   1.655117646 0.350212508   0.89035564   1.44754192  1.692204e+00   1.897835567
  log_lik[14]   2.315735602 0.223570596   1.82856620   2.18855504  2.339754e+00   2.470968679
  log_lik[15]   2.336257552 0.234659918   1.80962277   2.20220904  2.359357e+00   2.499221976
  log_lik[16]   2.750067765 0.144799562   2.43936329   2.66846252  2.761700e+00   2.847352381
  log_lik[17]   2.849237344 0.119040432   2.59302233   2.77905012  2.855412e+00   2.936732183
  log_lik[18]   2.854873665 0.128383725   2.56594111   2.78634984  2.867374e+00   2.939324668
  log_lik[19]   2.827445694 0.141079131   2.49274560   2.75050895  2.844707e+00   2.921729538
  log_lik[20]   2.879222346 0.132562882   2.56239744   2.81618183  2.892368e+00   2.960501675
  log_lik[21]   2.031276845 0.430813964   1.04375922   1.77254493  2.092124e+00   2.340981437
  log_lik[22]   2.794210646 0.192898810   2.29975020   2.70992941  2.829427e+00   2.922225572
  log_lik[23]   2.394610738 0.354565258   1.57687928   2.19509281  2.461157e+00   2.655566303
  log_lik[24]   2.842089822 0.170187987   2.42147977   2.75955151  2.876108e+00   2.954115976
  log_lik[25]   2.766117805 0.248386281   2.10687229   2.67176434  2.825790e+00   2.924098495
  log_lik[26]   2.331365276 0.375907485   1.38641093   2.13389389  2.390509e+00   2.596039001
  log_lik[27]   2.724524902 0.222618176   2.17864248   2.62328313  2.766477e+00   2.875725092
  log_lik[28]   2.079046902 0.359767402   1.25986051   1.88166778  2.115781e+00   2.314875490
  log_lik[29]   2.586032398 0.249523499   1.99385199   2.45626700  2.620800e+00   2.763710609
  log_lik[30]   2.888643896 0.117231901   2.63984452   2.82861591  2.899498e+00   2.961617625
  log_lik[31]   2.913703139 0.107853256   2.68263237   2.84652764  2.916332e+00   2.982253251
  log_lik[32]   2.223400348 0.275288841   1.61533570   2.06813725  2.253168e+00   2.415660064
  log_lik[33]   2.927525467 0.101472925   2.72150855   2.86208301  2.929968e+00   2.992745671
  log_lik[34]   2.927525467 0.101472925   2.72150855   2.86208301  2.929968e+00   2.992745671
  log_lik[35]   2.755647376 0.147836897   2.41523010   2.67709073  2.772998e+00   2.855836564
  log_lik[36]   2.487393541 0.206046024   1.98245468   2.36167166  2.510462e+00   2.632465902
  log_lik[37]   2.933274090 0.098615500   2.74204186   2.86895264  2.934076e+00   2.997768181
  log_lik[38]   2.932564812 0.099111750   2.73933259   2.86591848  2.934297e+00   2.996539613
  log_lik[39]   2.437828466 0.228237669   1.90492161   2.29833236  2.456494e+00   2.601064487
  log_lik[40]   2.929237732 0.099076377   2.73652812   2.86610808  2.928029e+00   2.993847562
  log_lik[41]   2.872977576 0.137155080   2.53364805   2.80200486  2.891932e+00   2.961984770
  log_lik[42]   2.613639764 0.273410265   1.94461724   2.46231211  2.656621e+00   2.815443167
  log_lik[43]   2.863288232 0.147422553   2.49647431   2.79417469  2.882816e+00   2.954996633
  log_lik[44]   2.850717739 0.159741131   2.43437337   2.77547577  2.877347e+00   2.949145748
  log_lik[45]   2.872785300 0.143342279   2.52580455   2.80503759  2.892572e+00   2.962509449
  log_lik[46]   2.685380832 0.292611053   1.93400117   2.55999152  2.750716e+00   2.886388763
  log_lik[47]   2.427475556 0.333365670   1.63538610   2.24720149  2.475169e+00   2.674218088
  log_lik[48]   2.703845523 0.238349124   2.11995209   2.58779645  2.756249e+00   2.868074998
  log_lik[49]   2.284398132 0.394152205   1.31242504   2.05321112  2.354677e+00   2.571721537
  log_lik[50]   2.725743277 0.175421428   2.32753495   2.62855619  2.751185e+00   2.842482819
  log_lik[51]   2.410985515 0.243619157   1.83048111   2.27471731  2.440028e+00   2.582200688
  log_lik[52]   1.484840583 0.407022483   0.58639078   1.23938840  1.509267e+00   1.760637242
  log_lik[53]   2.679988030 0.157482077   2.34833440   2.57916554  2.699494e+00   2.793792204
  log_lik[54]   1.860775940 0.356882336   1.03674010   1.63799099  1.898505e+00   2.115788390
  log_lik[55]   2.870298842 0.134041249   2.56742267   2.80068961  2.882107e+00   2.952984959
  log_lik[56]   2.711573351 0.159903051   2.33805445   2.62600057  2.723940e+00   2.823512093
  log_lik[57]   1.591186467 0.394174023   0.73181106   1.35362082  1.613820e+00   1.866826900
  log_lik[58]   2.884902478 0.110680249   2.63966371   2.82448235  2.888689e+00   2.957854051
  log_lik[59]   2.909263246 0.105483239   2.69454185   2.84375911  2.913201e+00   2.975829342
  log_lik[60]   2.848098240 0.120460802   2.60433367   2.77054882  2.859614e+00   2.924853777
  log_lik[61]   2.179722887 0.267678945   1.63529342   2.01854716  2.206163e+00   2.368992763
  log_lik[62]   2.900948499 0.110290993   2.67519424   2.83559986  2.908828e+00   2.972706274
  log_lik[63]  -2.089037883 1.055614388  -4.46485297  -2.74823615 -1.978084e+00  -1.388993058
  log_lik[64]   0.217412591 0.691331758  -1.30244131  -0.19300339  2.577232e-01   0.718684971
  log_lik[65]   2.873034012 0.138028791   2.57703490   2.80208973  2.890364e+00   2.958817045
  log_lik[66]   1.365285258 0.724725875  -0.19645973   0.93482568  1.448547e+00   1.898831201
  log_lik[67]   1.943082513 0.468814524   0.92274043   1.65337128  2.000942e+00   2.292893347
  log_lik[68]   2.835324941 0.191329297   2.38977540   2.75220845  2.876349e+00   2.954466999
  log_lik[69]   1.837905780 0.800885947   0.03459476   1.41330361  2.013041e+00   2.437525180
  log_lik[70]   2.011678229 0.546238574   0.75603394   1.71100922  2.105416e+00   2.420740559
  log_lik[71]   2.827298672 0.183832251   2.34311443   2.74559818  2.865355e+00   2.942688040
  log_lik[72]   2.219976843 0.331687049   1.45466437   2.03431183  2.269058e+00   2.444227358
  lp__        239.409255794 2.010666043 234.21417225 238.40189389  2.397621e+02 240.927403580
             stats
parameter             97.5%
  beta[1]       0.071992668
  beta[2]       0.073293156
  beta[3]       0.141115378
  beta[4]       0.019578524
  beta[5]      -0.007556109
  cbeta0        0.077377172
  sigma         0.024884828
  beta0         0.039992168
  log_lik[1]    3.026332813
  log_lik[2]    3.047029543
  log_lik[3]    2.936209844
  log_lik[4]    3.118233931
  log_lik[5]    2.975066210
  log_lik[6]    3.095420634
  log_lik[7]    2.872583557
  log_lik[8]    2.988620973
  log_lik[9]    2.327503062
  log_lik[10]   3.010633233
  log_lik[11]   2.928746939
  log_lik[12]   2.931229033
  log_lik[13]   2.219096585
  log_lik[14]   2.685523533
  log_lik[15]   2.739687069
  log_lik[16]   2.991905960
  log_lik[17]   3.061505579
  log_lik[18]   3.070723784
  log_lik[19]   3.060320927
  log_lik[20]   3.094216870
  log_lik[21]   2.706689967
  log_lik[22]   3.069935435
  log_lik[23]   2.914197728
  log_lik[24]   3.096854114
  log_lik[25]   3.072929553
  log_lik[26]   2.874959521
  log_lik[27]   3.027264044
  log_lik[28]   2.673131999
  log_lik[29]   2.954345189
  log_lik[30]   3.091642472
  log_lik[31]   3.112538394
  log_lik[32]   2.674622119
  log_lik[33]   3.120917517
  log_lik[34]   3.120917517
  log_lik[35]   2.991665016
  log_lik[36]   2.821154493
  log_lik[37]   3.123357094
  log_lik[38]   3.121133963
  log_lik[39]   2.820126271
  log_lik[40]   3.112857625
  log_lik[41]   3.092796353
  log_lik[42]   3.000980740
  log_lik[43]   3.093727037
  log_lik[44]   3.092411004
  log_lik[45]   3.094175887
  log_lik[46]   3.048142404
  log_lik[47]   2.897845283
  log_lik[48]   3.034282903
  log_lik[49]   2.857102923
  log_lik[50]   3.005570431
  log_lik[51]   2.803074764
  log_lik[52]   2.164799968
  log_lik[53]   2.945753353
  log_lik[54]   2.481880445
  log_lik[55]   3.091733647
  log_lik[56]   2.979217346
  log_lik[57]   2.244341122
  log_lik[58]   3.084134002
  log_lik[59]   3.107411797
  log_lik[60]   3.063353061
  log_lik[61]   2.617976030
  log_lik[62]   3.107239376
  log_lik[63]  -0.104810738
  log_lik[64]   1.361214817
  log_lik[65]   3.101222371
  log_lik[66]   2.505238319
  log_lik[67]   2.673459163
  log_lik[68]   3.088420640
  log_lik[69]   2.888233643
  log_lik[70]   2.812263455
  log_lik[71]   3.088213914
  log_lik[72]   2.745370189
  lp__        242.154798132

, , chains = chain:3

             stats
parameter              mean          sd         2.5%          25%           50%           75%
  beta[1]      4.687192e-02 0.012603092   0.02224220   0.03867635   0.047139790   0.054893417
  beta[2]     -1.283318e-04 0.038763274  -0.07783005  -0.02446199   0.000084502   0.023693054
  beta[3]      6.235675e-02 0.039868629  -0.01668995   0.03842740   0.061796315   0.088541641
  beta[4]     -1.144154e-02 0.014789280  -0.04132340  -0.02073492  -0.011456795  -0.001800738
  beta[5]     -3.740335e-02 0.015170379  -0.06604108  -0.04750805  -0.037285619  -0.027253067
  cbeta0       7.251963e-02 0.002613811   0.06749783   0.07076701   0.072521665   0.074274167
  sigma        2.096193e-02 0.001901251   0.01773725   0.01956603   0.020908996   0.022155558
  beta0       -2.842157e-02 0.032212332  -0.08978096  -0.04848694  -0.029013968  -0.007174124
  log_lik[1]   2.543255e+00 0.445648129   1.35643247   2.38973617   2.668732449   2.848637326
  log_lik[2]   2.679340e+00 0.318002708   1.83242494   2.57231879   2.767129757   2.891120727
  log_lik[3]   2.479547e+00 0.327324202   1.66949962   2.31774815   2.533466420   2.720535050
  log_lik[4]   2.891291e+00 0.123267181   2.61890325   2.82533162   2.903477493   2.971991740
  log_lik[5]   2.698571e+00 0.186173125   2.27612565   2.61092779   2.729292028   2.824701722
  log_lik[6]   2.872157e+00 0.120721951   2.60140898   2.80731748   2.886327032   2.950186436
  log_lik[7]   2.554756e+00 0.205521180   2.08956405   2.45378128   2.586371941   2.691147705
  log_lik[8]   2.749841e+00 0.149942242   2.39220996   2.67530446   2.766726334   2.848744030
  log_lik[9]   1.782544e+00 0.358691448   0.96038763   1.57342869   1.801725763   2.028949554
  log_lik[10]  2.772485e+00 0.143839111   2.46159142   2.69159289   2.793381958   2.871859762
  log_lik[11]  2.667849e+00 0.163393600   2.32331294   2.58386800   2.689725491   2.778114534
  log_lik[12]  2.685829e+00 0.153970898   2.33092385   2.60561674   2.700734799   2.788799366
  log_lik[13]  1.670250e+00 0.368810375   0.81809852   1.45175679   1.697262192   1.924279780
  log_lik[14]  2.319557e+00 0.236269756   1.77811673   2.18246032   2.333584566   2.484271179
  log_lik[15]  2.330679e+00 0.242191082   1.77962269   2.18436460   2.361122876   2.501892752
  log_lik[16]  2.740925e+00 0.152790818   2.38029758   2.65480512   2.757621960   2.843398770
  log_lik[17]  2.843539e+00 0.119788649   2.60376944   2.77674104   2.854694013   2.921465129
  log_lik[18]  2.845433e+00 0.132216436   2.54584447   2.77123839   2.854735799   2.931767124
  log_lik[19]  2.818036e+00 0.144849657   2.48360608   2.73542575   2.830595022   2.914239450
  log_lik[20]  2.870244e+00 0.133116362   2.55836437   2.80230385   2.884702813   2.958389389
  log_lik[21]  2.026432e+00 0.426974261   1.06753753   1.76148862   2.075696186   2.334955456
  log_lik[22]  2.784688e+00 0.193502497   2.29078868   2.69561585   2.821072439   2.914967101
  log_lik[23]  2.402272e+00 0.354735536   1.53525342   2.21494043   2.463363730   2.664450834
  log_lik[24]  2.837859e+00 0.166489089   2.43289367   2.76790740   2.867090885   2.942032049
  log_lik[25]  2.761416e+00 0.232218915   2.16651161   2.65500375   2.818873499   2.922452774
  log_lik[26]  2.334425e+00 0.367239856   1.43915107   2.12329551   2.406861297   2.610997989
  log_lik[27]  2.720999e+00 0.215406481   2.18526727   2.60539194   2.764038407   2.877389917
  log_lik[28]  2.082459e+00 0.356254341   1.30625966   1.85106525   2.123343001   2.347841754
  log_lik[29]  2.582251e+00 0.246605185   1.99841557   2.44865824   2.614345599   2.765280355
  log_lik[30]  2.882860e+00 0.115999558   2.62788423   2.81367354   2.894781170   2.963137365
  log_lik[31]  2.907466e+00 0.101980319   2.70196050   2.84611255   2.911904471   2.977204132
  log_lik[32]  2.226680e+00 0.269593263   1.63323880   2.06213572   2.246627350   2.423137703
  log_lik[33]  2.921720e+00 0.096980106   2.72719266   2.85846907   2.925002858   2.988224740
  log_lik[34]  2.921720e+00 0.096980106   2.72719266   2.85846907   2.925002858   2.988224740
  log_lik[35]  2.752319e+00 0.139375641   2.45220917   2.66904890   2.767407925   2.850811976
  log_lik[36]  2.487794e+00 0.198335059   2.06437099   2.37395622   2.506328349   2.625296854
  log_lik[37]  2.927989e+00 0.095929992   2.73202238   2.86684606   2.929645461   2.993729564
  log_lik[38]  2.927129e+00 0.095941733   2.73326851   2.86616420   2.927792276   2.993193641
  log_lik[39]  2.441287e+00 0.217589122   1.91279392   2.31437840   2.476516993   2.590618562
  log_lik[40]  2.924636e+00 0.097867797   2.71596945   2.86192363   2.926170520   2.992888492
  log_lik[41]  2.871686e+00 0.130185605   2.56676868   2.80296213   2.887305574   2.962437543
  log_lik[42]  2.618054e+00 0.259268166   1.97570786   2.48792198   2.670616488   2.808522364
  log_lik[43]  2.860988e+00 0.141985036   2.51742669   2.78987478   2.882226215   2.954328525
  log_lik[44]  2.848723e+00 0.153247799   2.46276145   2.77345460   2.875510662   2.949810676
  log_lik[45]  2.872298e+00 0.136415664   2.56090920   2.80057521   2.892332344   2.966459442
  log_lik[46]  2.693543e+00 0.263516998   2.00993637   2.57235205   2.755530615   2.880694694
  log_lik[47]  2.440828e+00 0.312830395   1.75240830   2.25904706   2.490531141   2.665107608
  log_lik[48]  2.695712e+00 0.241731074   2.08650310   2.59475265   2.739669276   2.859903144
  log_lik[49]  2.276663e+00 0.396646685   1.30603706   2.07779517   2.335814769   2.560675260
  log_lik[50]  2.715804e+00 0.176903798   2.28948005   2.62036002   2.736187641   2.836878428
  log_lik[51]  2.421584e+00 0.237196957   1.89738930   2.27718525   2.447709815   2.593513923
  log_lik[52]  1.513972e+00 0.388635526   0.67569262   1.27511861   1.557145771   1.781665415
  log_lik[53]  2.669327e+00 0.159077357   2.27953195   2.57883574   2.678221201   2.780516762
  log_lik[54]  1.854605e+00 0.368501379   1.02196362   1.63704496   1.883634141   2.119129762
  log_lik[55]  2.868364e+00 0.125936387   2.56736765   2.80216209   2.882735413   2.950846082
  log_lik[56]  2.713792e+00 0.159650706   2.35490186   2.61368570   2.735357061   2.827847699
  log_lik[57]  1.618631e+00 0.377404851   0.79196877   1.38615520   1.660955990   1.877974517
  log_lik[58]  2.880732e+00 0.112270531   2.63823331   2.80849040   2.885498452   2.959566422
  log_lik[59]  2.899855e+00 0.100322020   2.68656831   2.83802576   2.902549832   2.972149943
  log_lik[60]  2.837540e+00 0.115255717   2.57048769   2.77478079   2.841990214   2.914173713
  log_lik[61]  2.172053e+00 0.285151335   1.56670605   1.98080320   2.206015395   2.383092413
  log_lik[62]  2.890959e+00 0.103594113   2.66722047   2.82970932   2.896851054   2.965754884
  log_lik[63] -2.051545e+00 1.127862154  -4.48890840  -2.74414429  -1.920156284  -1.241284998
  log_lik[64]  2.658952e-01 0.638058407  -1.14200598  -0.14728861   0.329610001   0.713418734
  log_lik[65]  2.863370e+00 0.129342612   2.55939992   2.79486933   2.877800212   2.951068869
  log_lik[66]  1.370036e+00 0.746314536  -0.27054195   0.93559717   1.461945778   1.951360927
  log_lik[67]  1.957931e+00 0.454997252   0.98142421   1.67520242   2.000501556   2.306393491
  log_lik[68]  2.827101e+00 0.177525467   2.35745483   2.75316732   2.859642402   2.943337063
  log_lik[69]  1.853766e+00 0.775452868   0.13718003   1.41781292   1.995595237   2.456305549
  log_lik[70]  2.018212e+00 0.503666454   0.85765690   1.74248777   2.088647125   2.386342436
  log_lik[71]  2.825837e+00 0.175513384   2.36819786   2.74830371   2.854341721   2.941178275
  log_lik[72]  2.224492e+00 0.326780516   1.46098246   2.03768262   2.272001753   2.459073299
  lp__         2.393557e+02 2.074657630 234.39765993 238.34523662 239.750981231 240.847796260
             stats
parameter             97.5%
  beta[1]       0.071101532
  beta[2]       0.078602922
  beta[3]       0.140317772
  beta[4]       0.018162768
  beta[5]      -0.007173446
  cbeta0        0.077935288
  sigma         0.025175510
  beta0         0.036585080
  log_lik[1]    3.020214142
  log_lik[2]    3.039543731
  log_lik[3]    2.944433765
  log_lik[4]    3.095518023
  log_lik[5]    2.981648625
  log_lik[6]    3.065689222
  log_lik[7]    2.889227799
  log_lik[8]    3.000572582
  log_lik[9]    2.378855814
  log_lik[10]   3.009334319
  log_lik[11]   2.939300034
  log_lik[12]   2.945842961
  log_lik[13]   2.308624793
  log_lik[14]   2.699801767
  log_lik[15]   2.731453705
  log_lik[16]   2.994657001
  log_lik[17]   3.045674943
  log_lik[18]   3.065330036
  log_lik[19]   3.056970236
  log_lik[20]   3.081258086
  log_lik[21]   2.707667044
  log_lik[22]   3.067973384
  log_lik[23]   2.906512509
  log_lik[24]   3.066921951
  log_lik[25]   3.057859099
  log_lik[26]   2.883942507
  log_lik[27]   3.034807944
  log_lik[28]   2.673624000
  log_lik[29]   2.972272327
  log_lik[30]   3.082047715
  log_lik[31]   3.088764392
  log_lik[32]   2.687800554
  log_lik[33]   3.092161010
  log_lik[34]   3.092161010
  log_lik[35]   2.989691164
  log_lik[36]   2.824570429
  log_lik[37]   3.100103357
  log_lik[38]   3.097497426
  log_lik[39]   2.791071280
  log_lik[40]   3.097253257
  log_lik[41]   3.074381737
  log_lik[42]   2.977064743
  log_lik[43]   3.095820345
  log_lik[44]   3.092828391
  log_lik[45]   3.076685914
  log_lik[46]   3.039502292
  log_lik[47]   2.907380515
  log_lik[48]   3.028609563
  log_lik[49]   2.837183843
  log_lik[50]   2.985369808
  log_lik[51]   2.805323449
  log_lik[52]   2.217762713
  log_lik[53]   2.932874831
  log_lik[54]   2.444394038
  log_lik[55]   3.071198074
  log_lik[56]   2.986448837
  log_lik[57]   2.291878056
  log_lik[58]   3.070533537
  log_lik[59]   3.084007544
  log_lik[60]   3.047210898
  log_lik[61]   2.627840652
  log_lik[62]   3.083295756
  log_lik[63]  -0.135317133
  log_lik[64]   1.394156822
  log_lik[65]   3.074725146
  log_lik[66]   2.480596310
  log_lik[67]   2.685455658
  log_lik[68]   3.069122790
  log_lik[69]   2.911229857
  log_lik[70]   2.775577287
  log_lik[71]   3.080374891
  log_lik[72]   2.729579098
  lp__        242.277502989

library(broom)
tidyMCMC(constable.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)

                      term     estimate   std.error     conf.low     conf.high      rhat  ess
1              (Intercept)  -0.02662339 0.030640340  -0.08652069   0.032339963 0.9990184 2266
2              I(IV^(1/3))   0.04617694 0.011955033   0.02348520   0.069677622 0.9990403 2279
3             TREATInitial  -0.00262214 0.037692592  -0.07602074   0.073892281 0.9990874 2277
4                 TREATLow   0.06178365 0.037596699  -0.01092534   0.133310013 0.9989826 2298
5 I(IV^(1/3)):TREATInitial  -0.01050189 0.014390980  -0.03897242   0.018017304 0.9991925 2267
6     I(IV^(1/3)):TREATLow  -0.03725927 0.014237953  -0.06442732  -0.009792636 0.9990557 2308
7                    sigma   0.02083855 0.001867566   0.01735171   0.024641950 0.9999474 2569
8                 mean_PPD   0.07249442 0.003450626   0.06591716   0.079287256 0.9993001 2598
9            log-posterior 169.80857221 1.999673060 165.95405422 172.982238703 1.0003394 2513

constable.mcmc = as.matrix(constable.rstanarm)
str(constable.mcmc)

 num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ...
 - attr(*, "dimnames")=List of 2
  ..$ iterations: NULL
  ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...

Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect effect
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))

[1] 0.3711111
[1] 0.0003703704
[1] 0.9322222
[1] 0.09259259
[1] 0.4592593
[1] 0.01111111

## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])

[1] 0.0003703704

## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])

[1] 0.08888889

## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])

[1] 0.01555556

# Overal model
wch = grep("TREAT|IV", colnames(constable.mcmc))
mcmcpvalue(constable.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show BRMS code

summary(constable.brm)

 Family: gaussian(identity) 
Formula: SUTW ~ I(IV^(1/3)) * TREAT 
   Data: constable (Number of observations: 72) 
Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 5; 
         total post-warmup samples = 2700
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
                     Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept               -0.03      0.03    -0.09     0.03       2365    1
IIVE1D3                  0.05      0.01     0.02     0.07       2361    1
TREATInitial             0.00      0.04    -0.07     0.07       2375    1
TREATLow                 0.06      0.04    -0.01     0.14       2432    1
IIVE1D3:TREATInitial    -0.01      0.01    -0.04     0.02       2369    1
IIVE1D3:TREATLow        -0.04      0.01    -0.07    -0.01       2417    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.02         0     0.02     0.02       2583    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).

library(broom)
tidyMCMC(constable.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)

                    term      estimate   std.error    conf.low    conf.high      rhat  ess
1            b_Intercept -0.0280889091 0.030786671 -0.08771759  0.030012311 1.0014465 2365
2              b_IIVE1D3  0.0467122190 0.012000835  0.02321772  0.068893899 1.0012960 2361
3         b_TREATInitial -0.0009689161 0.038416903 -0.07024299  0.077346297 1.0008347 2375
4             b_TREATLow  0.0629078547 0.038275435 -0.01974641  0.129499738 1.0000511 2432
5 b_IIVE1D3:TREATInitial -0.0111653298 0.014644996 -0.04043846  0.016701214 1.0009657 2369
6     b_IIVE1D3:TREATLow -0.0376073240 0.014462735 -0.06485635 -0.008515056 1.0000974 2417
7                  sigma  0.0209160338 0.001855845  0.01739081  0.024625223 0.9993802 2583

constable.mcmc = as.matrix(constable.rstanarm)
str(constable.mcmc)

 num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ...
 - attr(*, "dimnames")=List of 2
  ..$ iterations: NULL
  ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...

Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
terms = attr(Xmat, "assign")
## Effect effect
for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))

[1] 0.3711111
[1] 0.0003703704
[1] 0.9322222
[1] 0.09259259
[1] 0.4592593
[1] 0.01111111

## Effect of covariate
mcmcpvalue(constable.mcmc[, which(terms == 1)])

[1] 0.0003703704

## Effect of treatment (marginal)
mcmcpvalue(constable.mcmc[, which(terms == 2)])

[1] 0.08888889

## Effect of interaction
mcmcpvalue(constable.mcmc[, which(terms == 3)])

[1] 0.01555556

# Overal model
wch = grep("b_TREAT|IV", colnames(constable.mcmc))
mcmcpvalue(constable.mcmcpack[, wch])

[1] 0

## Frequentist for comparison
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

The presence of an interaction means that we cannot simply make conclusions about the effect of treatment without reference to intial volume. That is, we need to delve further into the interaction. One approach we took for multiple linear regression was to explore the effect of one predictor at set levels of the other predictor.

Whilst we could take a similar approach here, Constable, took a different approach. Rather than explore the effects of the different treatments at 3-5 different suture widths, Constable elected to estimate the range of suture widths over which pairwise treatment differences were different. We will take the same approach here, just in a Bayesian framework.

Explore the range of suture widths for which pairwise treatment differences occur.

Show MCMCpack code

library(MCMCpack)
constable.mcmc = constable.mcmcpack
## It is easier to first determine the range of IV for which the the treatment pairs do not differ.
## After which we can subtract this interval from the full domain Lets consider the three treatments
## (High, Initial, Low) Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High

b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High

## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}

## We will also define the maximum and minimum domain in which to explore the range of inital volume.
minx = min(constable$IV)
maxx = max(constable$IV)

## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")

intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))

     lower    upper
1 9.026002 25.59764

## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))

     lower upper
1 10.62578  47.5

## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))

     lower upper
1 23.49624  47.5

Show JAGS code

constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]

## It is easier to first determine the range of IV for which the the treatment pairs do not differ.
## After which we can subtract this interval from the full domain Lets consider the three treatments
## (High, Initial, Low) Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High

b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High

## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}

## We will also define the maximum and minimum domain in which to explore the range of inital volume.
minx = min(constable$IV)
maxx = max(constable$IV)

## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")

intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))

     lower    upper
1 8.816773 31.58686

## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))

     lower upper
1 10.39011  47.5

## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))

     lower upper
1 22.97178  47.5

Show RSTAN code

constable.mcmc = as.matrix(constable.rstan)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]

## It is easier to first determine the range of IV for which the the treatment pairs do not differ.
## After which we can subtract this interval from the full domain Lets consider the three treatments
## (High, Initial, Low) Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High

b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High

## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}

## We will also define the maximum and minimum domain in which to explore the range of inital volume.
minx = min(constable$IV)
maxx = max(constable$IV)

## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")

intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))

     lower    upper
1 9.191592 28.29357

## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))

     lower upper
1 10.52219  47.5

## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))

     lower upper
1 23.28955  47.5

Show RSTANARM code

constable.mcmc = as.matrix(constable.rstan)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
constable.mcmc = constable.mcmc[, wch]

## It is easier to first determine the range of IV for which the the treatment pairs do not differ.
## After which we can subtract this interval from the full domain Lets consider the three treatments
## (High, Initial, Low) Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High

b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High

## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}

## We will also define the maximum and minimum domain in which to explore the range of inital volume.
minx = min(constable$IV)
maxx = max(constable$IV)

## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")

intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))

     lower    upper
1 9.191592 28.29357

## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))

     lower upper
1 10.52219  47.5

## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))

     lower upper
1 23.28955  47.5

Show BRMS code

constable.mcmc = as.matrix(constable.brm)
wch = grep("b_", colnames(constable.mcmc))
constable.mcmc = constable.mcmc[, wch]

## It is easier to first determine the range of IV for which the the treatment pairs do not differ.
## After which we can subtract this interval from the full domain Lets consider the three treatments
## (High, Initial, Low) Start by calculating the intercepts and slopes.
a.High = as.vector(constable.mcmc[, 1])
a.Initial = as.vector(constable.mcmc[, 3]) + a.High
a.Low = as.vector(constable.mcmc[, 4]) + a.High

b.High = as.vector(constable.mcmc[, 2])
b.Initial = as.vector(constable.mcmc[, 5]) + b.High
b.Low = as.vector(constable.mcmc[, 6]) + b.High

## Generate a function that finds the intersection of two lines
inter <- function(m1, a1, m2, a2) {
    data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
}

## We will also define the maximum and minimum domain in which to explore the range of inital volume.
minx = min(constable$IV)
maxx = max(constable$IV)

## Start with High vs Initial
intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
## Narrow this to the likely domain
intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")

intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))

     lower    upper
1 8.654387 31.48765

## Now High vs Low
intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))

     lower upper
1 10.36942  47.5

## Finally Initial vs Low
intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
(comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))

     lower upper
1 22.98159  47.5

Generate a summary figure.

Show MCMCpack code

library(MCMCpack)
constable.mcmc = constable.mcmcpack
newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)

newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) + geom_point(data = constable, aes(y = SUTW,
    x = IV, shape = TREAT, fill = TREAT)) + geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1,
    1], xend = comp.HvsI[1, 2])) + annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]),
    label = "High vs Initial") + geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1,
    1], xend = comp.HvsL[1, 2])) + annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]),
    label = "High vs Low") + geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1,
    2])) + annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
    geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + scale_y_continuous("Suture width (mm)") +
    scale_x_continuous("Suture width (mm)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3,
        domain = c(0, Inf))) + scale_shape_manual("Food regime", values = c(21, 21, 21)) + scale_fill_manual("Food regime",
    values = c("black", "grey40", "white")) + scale_linetype_manual("Food regime", values = c("solid",
    "dashed", "dotted")) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0,
    1), legend.key.width = unit(1, "cm"), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
    size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))

Show JAGS code

                                          constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
                                          newdata <- with(constable, expand.grid(TREAT=levels(TREAT),
                                                          IV=seq(min(IV), max(IV), len=100)))
                                          Xmat <- model.matrix(~I(IV^(1/3))*TREAT, data=newdata)
                                          wch = c(which(colnames(constable.mcmc)=='beta0'), grep('beta\\[', colnames(constable.mcmc)))

                                          coefs = constable.mcmc[,wch]
                                          fit = coefs %*% t(Xmat)

                                          newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int=TRUE, conf.method='HPDinterval'))
                                          ggplot(newdata, aes(y=estimate, x=IV, linetype=TREAT)) +
                                            geom_point(data=constable, aes(y=SUTW, x=IV, shape=TREAT, fill=TREAT))+
                                            geom_segment(data=NULL, aes(y=0.02,yend=0.02, x=comp.HvsI[1,1], xend=comp.HvsI[1,2]))+
                                            annotate(geom='text', y=0.025,x=rowMeans(comp.HvsI[1,1:2]), label='High vs Initial')+
                                            geom_segment(data=NULL, aes(y=0.01,yend=0.01, x=comp.HvsL[1,1], xend=comp.HvsL[1,2]))+
                                            annotate(geom='text', y=0.015,x=rowMeans(comp.HvsL[1,1:2]), label='High vs Low')+
                                            geom_segment(data=NULL, aes(y=0.00,yend=0.00, x=comp.IvsL[1,1], xend=comp.IvsL[1,2]))+
                                            annotate(geom='text', y=0.005,x=rowMeans(comp.IvsL[1,1:2]), label='Initial vs Low')+
                                            geom_line()+
                                                geom_ribbon(aes(ymin=conf.low, ymax=conf.high), fill='grey',alpha=0.2)+
                                                scale_y_continuous('Suture width (mm)')+
                                                scale_x_continuous('Suture width (mm)',trans=trans_new('root',function(x) x^(1/3),
                          function(x) x^3, domain=c(0,Inf)))+
                                                scale_shape_manual('Food regime', values=c(21,21,21))+
                                                scale_fill_manual('Food regime', values=c('black','grey40','white'))+
                                                scale_linetype_manual('Food regime', values=c('solid','dashed','dotted'))+
                                                theme_classic() +
                                                theme(legend.justification=c(0,1), legend.position=c(0,1), legend.key.width=unit(1,'cm'),
                                                 axis.title.y=element_text(vjust=2, size=rel(1.25)),
                                                 axis.title.x=element_text(vjust=-2, size=rel(1.25)),
                                                 plot.margin=unit(c(0.5,0.5,2,2), 'lines'))

Show RSTAN code

                                          constable.mcmc = as.matrix(constable.rstan)
                                          newdata <- with(constable, expand.grid(TREAT=levels(TREAT),
                                          IV=seq(min(IV), max(IV), len=100)))
                                          Xmat <- model.matrix(~I(IV^(1/3))*TREAT, data=newdata)
                                          wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
                                          coefs = constable.mcmc[,wch]
                                          fit = coefs %*% t(Xmat)

                                          newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int=TRUE, conf.method='HPDinterval'))
                                          ggplot(newdata, aes(y=estimate, x=IV, linetype=TREAT)) +
                                          geom_point(data=constable, aes(y=SUTW, x=IV, shape=TREAT, fill=TREAT))+
                                          geom_segment(data=NULL, aes(y=0.02,yend=0.02, x=comp.HvsI[1,1], xend=comp.HvsI[1,2]))+
                                          annotate(geom='text', y=0.025,x=rowMeans(comp.HvsI[1,1:2]), label='High vs Initial')+
                                          geom_segment(data=NULL, aes(y=0.01,yend=0.01, x=comp.HvsL[1,1], xend=comp.HvsL[1,2]))+
                                          annotate(geom='text', y=0.015,x=rowMeans(comp.HvsL[1,1:2]), label='High vs Low')+
                                          geom_segment(data=NULL, aes(y=0.00,yend=0.00, x=comp.IvsL[1,1], xend=comp.IvsL[1,2]))+
                                          annotate(geom='text', y=0.005,x=rowMeans(comp.IvsL[1,1:2]), label='Initial vs Low')+
                                          geom_line()+
                                          geom_ribbon(aes(ymin=conf.low, ymax=conf.high), fill='grey',alpha=0.2)+
                                          scale_y_continuous('Suture width (mm)')+
                                          scale_x_continuous('Suture width (mm)',trans=trans_new('root',function(x) x^(1/3),
                      function(x) x^3, domain=c(0,Inf)))+
                                          scale_shape_manual('Food regime', values=c(21,21,21))+
                                          scale_fill_manual('Food regime', values=c('black','grey40','white'))+
                                          scale_linetype_manual('Food regime', values=c('solid','dashed','dotted'))+
                                          theme_classic() +
                                          theme(legend.justification=c(0,1), legend.position=c(0,1), legend.key.width=unit(1,'cm'),
                                          axis.title.y=element_text(vjust=2, size=rel(1.25)),
                                          axis.title.x=element_text(vjust=-2, size=rel(1.25)),
                                          plot.margin=unit(c(0.5,0.5,2,2), 'lines'))

Show RSTANARM code

                                          constable.mcmc = as.matrix(constable.rstanarm)
                                          newdata <- with(constable, expand.grid(TREAT=levels(TREAT),
                                          IV=seq(min(IV), max(IV), len=100)))
                                          Xmat <- model.matrix(~I(IV^(1/3))*TREAT, data=newdata)
                                          wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
                                          coefs = constable.mcmc[,wch]
                                          fit = coefs %*% t(Xmat)

                                          newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int=TRUE, conf.method='HPDinterval'))
                                          ggplot(newdata, aes(y=estimate, x=IV, linetype=TREAT)) +
                                          geom_point(data=constable, aes(y=SUTW, x=IV, shape=TREAT, fill=TREAT))+
                                          geom_segment(data=NULL, aes(y=0.02,yend=0.02, x=comp.HvsI[1,1], xend=comp.HvsI[1,2]))+
                                          annotate(geom='text', y=0.025,x=rowMeans(comp.HvsI[1,1:2]), label='High vs Initial')+
                                          geom_segment(data=NULL, aes(y=0.01,yend=0.01, x=comp.HvsL[1,1], xend=comp.HvsL[1,2]))+
                                          annotate(geom='text', y=0.015,x=rowMeans(comp.HvsL[1,1:2]), label='High vs Low')+
                                          geom_segment(data=NULL, aes(y=0.00,yend=0.00, x=comp.IvsL[1,1], xend=comp.IvsL[1,2]))+
                                          annotate(geom='text', y=0.005,x=rowMeans(comp.IvsL[1,1:2]), label='Initial vs Low')+
                                          geom_line()+
                                          geom_ribbon(aes(ymin=conf.low, ymax=conf.high), fill='grey',alpha=0.2)+
                                          scale_y_continuous('Suture width (mm)')+
                                          scale_x_continuous('Suture width (mm)',trans=trans_new('root',function(x) x^(1/3),
                      function(x) x^3, domain=c(0,Inf)))+
                                          scale_shape_manual('Food regime', values=c(21,21,21))+
                                          scale_fill_manual('Food regime', values=c('black','grey40','white'))+
                                          scale_linetype_manual('Food regime', values=c('solid','dashed','dotted'))+
                                          theme_classic() +
                                          theme(legend.justification=c(0,1), legend.position=c(0,1), legend.key.width=unit(1,'cm'),
                                          axis.title.y=element_text(vjust=2, size=rel(1.25)),
                                          axis.title.x=element_text(vjust=-2, size=rel(1.25)),
                                          plot.margin=unit(c(0.5,0.5,2,2), 'lines'))

Show BRMS code

                                          constable.mcmc = as.matrix(constable.brm)
                                          newdata <- with(constable, expand.grid(TREAT=levels(TREAT),
                                          IV=seq(min(IV), max(IV), len=100)))
                                          Xmat <- model.matrix(~I(IV^(1/3))*TREAT, data=newdata)
                                          wch = grep("b_", colnames(constable.mcmc))
                                          coefs = constable.mcmc[,wch]
                                          fit = coefs %*% t(Xmat)

                                          newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int=TRUE, conf.method='HPDinterval'))
                                          ggplot(newdata, aes(y=estimate, x=IV, linetype=TREAT)) +
                                          geom_point(data=constable, aes(y=SUTW, x=IV, shape=TREAT, fill=TREAT))+
                                          geom_segment(data=NULL, aes(y=0.02,yend=0.02, x=comp.HvsI[1,1], xend=comp.HvsI[1,2]))+
                                          annotate(geom='text', y=0.025,x=rowMeans(comp.HvsI[1,1:2]), label='High vs Initial')+
                                          geom_segment(data=NULL, aes(y=0.01,yend=0.01, x=comp.HvsL[1,1], xend=comp.HvsL[1,2]))+
                                          annotate(geom='text', y=0.015,x=rowMeans(comp.HvsL[1,1:2]), label='High vs Low')+
                                          geom_segment(data=NULL, aes(y=0.00,yend=0.00, x=comp.IvsL[1,1], xend=comp.IvsL[1,2]))+
                                          annotate(geom='text', y=0.005,x=rowMeans(comp.IvsL[1,1:2]), label='Initial vs Low')+
                                          geom_line()+
                                          geom_ribbon(aes(ymin=conf.low, ymax=conf.high), fill='grey',alpha=0.2)+
                                          scale_y_continuous('Suture width (mm)')+
                                          scale_x_continuous('Suture width (mm)',trans=trans_new('root',function(x) x^(1/3),
                      function(x) x^3, domain=c(0,Inf)))+
                                          scale_shape_manual('Food regime', values=c(21,21,21))+
                                          scale_fill_manual('Food regime', values=c('black','grey40','white'))+
                                          scale_linetype_manual('Food regime', values=c('solid','dashed','dotted'))+
                                          theme_classic() +
                                          theme(legend.justification=c(0,1), legend.position=c(0,1), legend.key.width=unit(1,'cm'),
                                          axis.title.y=element_text(vjust=2, size=rel(1.25)),
                                          axis.title.x=element_text(vjust=-2, size=rel(1.25)),
                                          plot.margin=unit(c(0.5,0.5,2,2), 'lines'))

Explore $R^2$

Show MCMCpack code

library(MCMCpack)
library(broom)
constable.mcmc <- constable.mcmcpack
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
wch = grep("Intercept|IV|TREAT", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, constable$SUTW, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term estimate  std.error  conf.low conf.high
1 var1 0.488892 0.06173019 0.3642251  0.596765

# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show JAGS code

library(broom)
constable.mcmc <- constable.r2jags$BUGSoutput$sims.matrix
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, constable$SUTW, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.4889044 0.05956118 0.3673575 0.5925647

# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show RSTAN code

library(broom)
constable.mcmc <- as.matrix(constable.rstan)
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, constable$SUTW, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.4882446 0.06035995 0.3734576 0.6005657

# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show RSTANARM code

library(broom)
constable.mcmc <- as.matrix(constable.rstanarm)
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, constable$SUTW, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.4892397 0.05991533 0.3732155 0.5966039

# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09

Show BRMS code

library(broom)
constable.mcmc <- as.matrix(constable.brm)
Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
wch = grep("b_", colnames(constable.mcmc))
coefs = constable.mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, constable$SUTW, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")

  term  estimate  std.error  conf.low conf.high
1 var1 0.4888112 0.06064694 0.3706096 0.5948545

# for comparison with frequentist
summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))

Call:
lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.047764 -0.012727  0.001153  0.012959  0.065303 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02048 on 66 degrees of freedom
Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09