

Tutorial 7.5b - Analysis of Covariance (Bayesian)

12 Jan 2018

Overview

Previous tutorials have concentrated on designs with either continuous (regression) or categorical (ANOVA) predictor variables. Analysis of covariance (ANCOVA) models are essentially ANOVA models that incorporate one or more continuous predictors (covariates) alongside the categorical predictors. Although the relationship between a response variable and a covariate may itself be of substantial biological interest, typically covariate(s) are incorporated to reduce the amount of unexplained variability in the model (analogous to blocking - see the tutorials on blocking designs) and thereby increase the power of tests of the treatment effects.

Linear model

The generic linear model is presented here purely for revisory purposes. If it is unfamiliar to you, or you are unsure about what parameters are to be estimated and tested, you are strongly advised to review the tutorial on frequentist analysis of covariance.

The linear model for a single-factor ANCOVA design is: $$y_{ij}=\mu+\alpha_i + \beta x_{ij} + \varepsilon_{ij}$$ where $\mu$ is the overall mean, $\alpha_i$ is the effect of the $i$th level of Factor A, $\beta$ is the slope associated with the covariate ($x$, Factor B), and $\varepsilon$ is the random unexplained (residual) component.

ANCOVA in R

Data and scenario

Consider an experimental design aimed at exploring the effects of a categorical variable with three levels (Group A, Group B and Group C) on a response. From previous studies, we know that the response is influenced by another variable (a covariate). Unfortunately, it was not possible to ensure that all sampling units had the same value of the covariate. Therefore, in an attempt to account for this anticipated extra source of variability, we measured the level of the covariate for each sampling unit. Furthermore, in allocating sampling units to the various treatment groups, we tried to ensure a similar mean and range of the covariate within each group.

Random data incorporating the following trends (effect parameters)
  • the sample size per treatment = 10
  • the categorical variable (A) with 3 levels
  • the first treatment group has a population mean of 40
  • the other two treatments reduce the mean by 15 and 20 units respectively
  • the covariate (B) effect (slope) is -0.45
  • the residuals are drawn from a normal distribution with a mean of 0 and a standard deviation of 4 ($\sigma^2=16$)
  • the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15
set.seed(3)
n <- 10                   # replicates per group
p <- 3                    # number of treatment groups
A.eff <- c(40, -15, -20)  # intercept (Group A mean) and treatment effects
beta <- -0.45             # slope associated with the covariate
sigma <- 4                # residual standard deviation
B <- rnorm(n * p, 0, 15)  # covariate (generated centered on zero)
A <- gl(p, n, lab = paste("Group", LETTERS[1:3]))
mm <- model.matrix(~A + B)
data <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, sigma))
data$B <- data$B + 20     # shift the covariate to a mean of 20
head(data)
        A         B        Y
1 Group A  5.570999 50.09555
2 Group A 15.612114 45.38163
3 Group A 23.881823 41.16404
4 Group A  2.718022 50.72290
5 Group A 22.936742 37.26995
6 Group A 20.451859 42.61873
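
As a quick aside (not part of the tutorial's workflow proper), an ordinary least squares fit should approximately recover the simulated effects. Note that the intercept estimate will be close to 49 rather than 40, since the covariate was shifted by +20 after Y was generated (40 − (−0.45 × 20) = 49).

## sanity check: OLS estimates should be close to the simulated
## parameters (intercept ~49, effects ~-15 and -20, slope ~-0.45)
round(coef(lm(Y ~ A + B, data)), 2)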

Assumptions

As ANCOVA designs are essentially regular ANOVA designs that are first adjusted (centered) for the covariate(s), ANCOVA designs inherit all of the underlying assumptions of the appropriate ANOVA design. Readers should also eventually consult Tutorial 7.6a, Tutorial 9.2a, Tutorial 9.3a and Tutorial 9.4a. Specifically, hypothesis tests assume that:

  1. the appropriate residuals are normally distributed. Boxplots using the appropriate scale of replication (reflecting the appropriate residuals/F-ratio denominator) should be used to explore normality. Scale transformations are often useful.
  2. the appropriate residuals are equally varied. Boxplots and plots of means against variance (using the appropriate scale of replication) should be used to explore the spread of values. Residual plots should reveal no patterns. Scale transformations are often useful.
  3. the appropriate residuals are independent of one another.
  4. the relationship between the response variable and the covariate should be linear. Linearity can be explored using scatterplots and residual plots should reveal no patterns.
  5. for repeated measures and other designs in which treatment levels within blocks cannot be randomly ordered, the variance/covariance matrix is assumed to display sphericity.
  6. for designs that utilize blocking, it is assumed that there are no block by within block interactions
  7. homogeneity of slopes. Since a single slope is estimated for the covariate (rather than a separate slope per treatment group), it is important that the trends between the response and the covariate are similar for each group.

Exploratory data analysis

library(car)
scatterplot(Y ~ B | A, data = data)
plot of chunk tut7.5bS2.1
boxplot(Y ~ A, data)
plot of chunk tut7.5bS2.1
# OR via ggplot
library(ggplot2)
ggplot(data, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")
plot of chunk tut7.5bS2.1
ggplot(data, aes(y = Y, x = A)) + geom_boxplot()
plot of chunk tut7.5bS2.1
Conclusions:
  • there is no evidence of obvious non-normality
  • the assumption of linearity seems reasonable
  • the variability of the three groups seems approximately equal
  • the slopes (Y vs B trends) appear broadly similar for each treatment group

Homogeneity of slopes

We can explore inferential evidence of unequal slopes by examining the estimated effects of the interaction between the categorical variable and the covariate. Note, pay no attention to the main effects - only the interaction. Even though I intend to illustrate Bayesian analyses here, for such a simple model, it is considerably simpler to use traditional OLS to test for the presence of an interaction.

anova(lm(Y ~ B * A, data = data))
Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
B          1  354.80  354.80 23.9691 5.414e-05 ***
A          2 2772.56 1386.28 93.6531 4.609e-12 ***
B:A        2   55.08   27.54  1.8606    0.1773    
Residuals 24  355.26   14.80                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusions: there is very little evidence to suggest that the assumption of equal slopes is inappropriate (the B:A interaction is non-significant: P = 0.177).

Model fitting or statistical analysis

Consistent with Tutorial 7.2b we will explore Bayesian modelling of ANCOVA using a variety of tools (such as MCMCpack, JAGS, RSTAN, RSTANARM and BRMS). Whilst JAGS and RSTAN are extremely flexible, allowing the fitted model to include not only the core parameters but also additional derived quantities, the other approaches are more restrictive. Consequently, I will mostly restrict the models to just the minimum necessary, and all derived quantities will instead be calculated in R itself from the returned posteriors.

The observed response ($y_i$) are assumed to be drawn from a normal distribution with a given mean ($\mu$) and standard deviation ($\sigma$). The expected values ($\mu$) are themselves determined by the linear predictor ($\mathbf{X}\boldsymbol{\beta}$). In this case, $\boldsymbol{\beta}$ represents the vector of $\beta$'s - the intercept associated with the first group, the (effects) differences between this intercept and the intercepts for each other group as well as the slope associated with the continuous covariate. $\mathbf{X}$ is the model matrix.
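
To make the linear predictor concrete, the following sketch (an aside; the coefficient values are illustrative only) constructs the model matrix for this design and multiplies it by a coefficient vector to yield a vector of expected values:

## model matrix: intercept, two group contrasts and the covariate
Xmat <- model.matrix(~A + B, data)
head(Xmat)
## expected values for an illustrative coefficient vector
## (intercept, Group B effect, Group C effect, slope)
betas <- c(50, -17, -23, -0.4)
head(as.vector(Xmat %*% betas))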

MCMC sampling requires priors on all parameters. We will employ weakly informative priors. Specifying 'uninformative' priors is always a bit of a balancing act. If the priors are too vague (wide), the MCMC sampler can wander off into nonsense areas of the likelihood rather than concentrate around areas of highest likelihood (desired when wanting the outcomes to be largely driven by the data). On the other hand, if the priors are too strong, they may have an undue influence on the parameter estimates. In such a simple model, this balance is very forgiving - it is for more complex models that prior choice becomes more important.

For this simple model, we will go with zero-centered Gaussian (normal) priors: a relatively large standard deviation (100) for the intercept, a smaller standard deviation (10) for the remaining effects, and a wide half-Cauchy (scale = 5) for the standard deviation.
$$
\begin{align}
y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em]
\beta_0 &\sim{} N(0,100)\\
\beta &\sim{} N(0,10)\\
\sigma &\sim{} \text{Cauchy}(0,5)\\
\end{align}
$$
Note, exploratory data analysis suggests that while the intercept (intercept of Group A) and categorical predictor effects (differences between each other group's intercept and Group A's intercept) could be drawn from a similar distribution (with means in the 10's and variances in the 100's), the slope (the effect associated with the linear relationship against the covariate) is likely to be an order of magnitude smaller. We might therefore be tempted to provide different priors for the intercept, the categorical effects and the slope. For a simple model such as this, that is unlikely to be necessary; however, for more complex models, where prior specification becomes more critical, separate priors would probably be warranted.
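
To get a feel for how weakly informative these priors are relative to the scale of the data (Y spans roughly 15 to 55), we can simply draw from them (a sketch; the half-Cauchy is obtained by taking absolute values of Cauchy draws):

set.seed(1)
## central 95% of the N(0, 100) prior - vastly wider than the data
quantile(rnorm(10000, 0, 100), c(0.025, 0.975))
## half-Cauchy(0, 5) prior for sigma - heavy tailed, weakly informative
quantile(abs(rcauchy(10000, 0, 5)), c(0.025, 0.5, 0.975))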

library(MCMCpack)
data.mcmcpack <- MCMCregress(Y ~ A + B, data = data)
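
MCMCregress returns a standard coda mcmc object, so the usual coda summary methods can be applied directly to it:

summary(data.mcmcpack)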

Define the model

modelString = "
model {
  # Likelihood
  for (i in 1:n) {
    y[i] ~ dnorm(mean[i], tau)
    mean[i] <- inprod(beta[], X[i,])
  }
  # Priors
  for (i in 1:ngroups) {
    beta[i] ~ dnorm(0, 1.0E-6)
  }
  sigma ~ dunif(0, 100)
  tau <- 1 / (sigma * sigma)
}
"

Define the data

Arrange the data as a list (as required by JAGS). As input, JAGS will need to be supplied with:

  • the response variable (y)
  • the predictor model matrix (X)
  • the total number of observations (n)
  • the number of model matrix columns (ngroups)
This all needs to be contained within a list object.

X <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = X, n = nrow(data), ngroups = ncol(X)))

Define the MCMC chain parameters

Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:

  • the nodes (estimated parameters) to monitor (return samples for)
  • the number of MCMC chains (3)
  • the number of burnin steps (3000)
  • the thinning factor (10)
  • the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains

params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter
[1] 53000

Fit the model

Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.

library(R2jags)
data.r2jags <- jags(data = data.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 30
   Unobserved stochastic nodes: 5
   Total graph size: 232

Initializing model
print(data.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   50.201   1.803  46.617  49.021  50.207  51.404  53.741 1.001  8400
beta[2]  -17.141   1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001  8400
beta[3]  -22.917   1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001  5900
beta[4]   -0.414   0.065  -0.543  -0.457  -0.414  -0.372  -0.285 1.001 15000
sigma      4.175   0.619   3.178   3.735   4.108   4.531   5.587 1.001  7200
deviance 169.232   3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.3 and DIC = 175.5
DIC is an estimate of expected predictive error (lower deviance is better).
data.mcmc.list <- as.mcmc(data.r2jags)

Whilst Gibbs sampling provides an elegantly simple MCMC sampling routine, very complex hierarchical models can take enormous numbers of iterations (often prohibitively large) to converge on a stable posterior distribution.

To address this, Andrew Gelman (and other collaborators) have implemented a variation on Hamiltonian Monte Carlo (HMC: a sampler that selects subsequent samples in a way that reduces the correlation between samples, thereby speeding up convergence) called the No-U-Turn (NUTS) sampler. All of these developments are brought together into a tool called Stan ("Sampling Through Adaptive Neighborhoods").

By design (to appeal to the vast body of BUGS users), Stan models are defined in a manner reminiscent of BUGS. Stan first converts these models into C++ code which is then compiled to allow very rapid computation.

Consistent with the use of C++, the model must be accompanied by variable declarations for all inputs and parameters.

One important difference between Stan and JAGS is that whereas BUGS (and thus JAGS) parameterize the normal distribution in terms of precision, Stan uses the standard deviation.
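
So, for example, the diffuse JAGS prior dnorm(0, 1.0E-6) used above (a precision of $10^{-6}$) corresponds to a normal distribution with a standard deviation of 1000; converting between the two parameterisations is a one-liner:

tau <- 1.0E-6        # JAGS precision
sigma <- 1/sqrt(tau) # equivalent standard deviation
sigma
[1] 1000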

Stan itself is a stand-alone command line application. However, conveniently, the authors of Stan have also developed an R interface to Stan called Rstan which can be used much like R2jags.

Model matrix formulation

The minimum model in Stan required to fit the above ANCOVA follows. Note the following modifications from the model defined in JAGS:
  • the normal distribution is parameterised by its standard deviation rather than precision
  • rather than using a uniform prior for sigma, I am using a half-Cauchy

We now translate the likelihood model into STAN code.
$$
\begin{align}
y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\
\mu_{ij} &= \mathbf{X}\boldsymbol{\beta}\\
\beta_0 &\sim{} N(0,100)\\
\beta &\sim{} N(0,10)\\
\sigma &\sim{} \text{Cauchy}(0,5)\\
\end{align}
$$

Define the model

modelString = "
data {
  int<lower=1> n;
  int<lower=1> nX;
  vector[n] y;
  matrix[n, nX] X;
}
parameters {
  vector[nX] beta;
  real<lower=0> sigma;
}
transformed parameters {
  vector[n] mu;
  mu = X * beta;
}
model {
  // Likelihood
  y ~ normal(mu, sigma);

  // Priors
  beta ~ normal(0, 100);
  sigma ~ cauchy(0, 5);
}
generated quantities {
  vector[n] log_lik;
  for (i in 1:n) {
    log_lik[i] = normal_lpdf(y[i] | mu[i], sigma);
  }
}
"

Define the data

Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:

  • the response variable (y)
  • the predictor model matrix (X)
  • the total number of observations (n)
  • the number of model matrix columns (nX)
This all needs to be contained within a list object.

Xmat <- model.matrix(~A + B, data)
data.list <- with(data, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data)))

Fit the model

Now run the STAN code via the rstan interface.

## load the rstan package
library(rstan)
data.rstan <- stan(data = data.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)
In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0,
                 from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4,
                 from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4,
                 from file4be61ee4e612.cpp:8:
/usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
 #  define BOOST_NO_CXX11_RVALUE_REFERENCES
 ^
<command-line>:0:0: note: this is the location of the previous definition

SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1).

Gradient evaluation took 1.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.062651 seconds (Warm-up)
               0.061975 seconds (Sampling)
               0.124626 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.053706 seconds (Warm-up)
               0.057327 seconds (Sampling)
               0.111033 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.054613 seconds (Warm-up)
               0.064711 seconds (Sampling)
               0.119324 seconds (Total)
print(data.rstan, par = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  50.22    0.06 1.79  46.60  49.03  50.24  51.40  53.83  1027    1
beta[2] -17.12    0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19  1210    1
beta[3] -22.88    0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24  1407    1
beta[4]  -0.42    0.00 0.06  -0.54  -0.46  -0.42  -0.37  -0.29  1282    1
sigma     4.11    0.02 0.58   3.12   3.70   4.05   4.45   5.37  1353    1

Samples were drawn using NUTS(diag_e) at Fri Nov  3 15:05:11 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

The STAN team has put together pre-compiled modules (functions) to make specifying and applying STAN models much simpler. Each function offers a consistent interface that is also reminiscent of the major frequentist linear modelling routines in R.

Whilst it is not necessary to specify priors when using rstanarm functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors, if for no other reason than that it forces you to think about what you are doing. Broadly consistent with the pure STAN version, we will employ the following priors:

  • weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
  • weakly informative Gaussian priors for the treatment effects $\beta \sim{} N(0, 100)$
  • half-Cauchy prior for the standard deviation $\sigma \sim{} Cauchy(0, 2)$ (a slightly narrower scale than the pure STAN version)

Note, I am using the refresh=0 option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.

library(rstanarm)
library(broom)
library(coda)
data.rstanarm = stan_glm(Y ~ A + B, data = data, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100),
    prior = normal(0, 100), prior_aux = cauchy(0, 2))
Gradient evaluation took 5.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.55 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.149605 seconds (Warm-up)
               0.181861 seconds (Sampling)
               0.331466 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.120154 seconds (Warm-up)
               0.198657 seconds (Sampling)
               0.318811 seconds (Total)


Gradient evaluation took 1.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.182133 seconds (Warm-up)
               0.174642 seconds (Sampling)
               0.356775 seconds (Total)
print(data.rstanarm)
stan_glm
 family:  gaussian [identity]
 formula: Y ~ A + B
------

Estimates:
            Median MAD_SD
(Intercept)  50.2    1.8 
AGroup B    -17.2    1.9 
AGroup C    -22.9    1.9 
B            -0.4    0.1 
sigma         4.1    0.6 

Sample avg. posterior predictive 
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 30.0    1.1  

------
For info on the priors used see help('prior_summary.stanreg').
tidyMCMC(data.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate std.error    conf.low   conf.high
1 (Intercept)  50.2286902 1.9365801  46.4428738  54.1263008
2    AGroup B -17.2114440 1.9386491 -21.0066484 -13.4338089
3    AGroup C -22.9037572 1.9679361 -26.6757211 -19.0189639
4           B  -0.4138748 0.0681432  -0.5451839  -0.2852376
5       sigma   4.1990238 0.6317351   3.0581470   5.4328442
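
rstanarm also provides its own interval summaries; posterior_interval() yields central (equal-tailed) credible intervals, which can be contrasted with the HPD intervals above:

posterior_interval(data.rstanarm, prob = 0.95)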

The brms package serves a similar goal to the rstanarm package - to provide a simple user interface to STAN. However, unlike the rstanarm implementation, brms simply converts the formula, data, priors and family into STAN model code and data before executing stan with those elements.

Whilst it is not necessary to specify priors when using brms functions (as defaults will be generated), there is no guarantee that the routines for determining these defaults will persist over time. Furthermore, it is always better to define your own priors, if for no other reason than that it forces you to think about what you are doing. Consistent with the pure STAN version, we will employ the following priors:

  • weakly informative Gaussian prior for the intercept $\beta_0 \sim{} N(0, 100)$
  • weakly informative Gaussian prior for the treatment effect $\beta_1 \sim{} N(0, 100)$
  • half-Cauchy prior for the standard deviation $\sigma \sim{} Cauchy(0, 5)$

Note, I am using the refresh=0 option so as to suppress the larger regular output in the interest of keeping output to what is necessary for this tutorial. When running outside of a tutorial context, the regular verbose output is useful as it provides a way to gauge progress.

library(brms)
library(broom)
library(coda)
data.brms = brm(Y ~ A + B, data = data, iter = 2000, warmup = 500, chains = 3,
    thin = 2, refresh = 0, prior = c(prior(normal(0, 100), class = "Intercept"),
        prior(normal(0, 100), class = "b"), prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 1.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.18 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.036403 seconds (Warm-up)
               0.043672 seconds (Sampling)
               0.080075 seconds (Total)


Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.033241 seconds (Warm-up)
               0.041044 seconds (Sampling)
               0.074285 seconds (Total)


Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.036154 seconds (Warm-up)
               0.041079 seconds (Sampling)
               0.077233 seconds (Total)
print(data.brms)
 Family: gaussian(identity) 
Formula: Y ~ A + B 
   Data: data (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    50.20      1.76    46.79    53.75       2177    1
AGroupB     -17.17      1.83   -20.81   -13.68       2158    1
AGroupC     -22.89      1.85   -26.43   -19.25       2127    1
B            -0.41      0.06    -0.54    -0.29       2129    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     4.11      0.59     3.14     5.41       1854    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
tidyMCMC(data.brms, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 b_Intercept  50.1970335 1.75540856  46.6017008  53.5440980
2   b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289
3   b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870
4         b_B  -0.4122714 0.06370931  -0.5416812  -0.2961189
5       sigma   4.1081576 0.58644550   3.1075609   5.3375706
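
brms also provides the fixef() extractor for summarizing the population-level effects directly:

fixef(data.brms)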

MCMC diagnostics

In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.

Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the posterior. I will briefly outline the most important diagnostics; however, please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a fuller discussion of these diagnostics.

  • Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
    plot of chunk sim7
  • Autocorrelation plots for each parameter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the chain, and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom autocorrelation plot.
    plot of chunk sim7 plot of chunk sim9
  • Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, but more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
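
For example (a sketch using the JAGS samples converted to an mcmc.list earlier), the Gelman-Rubin statistic that underlies Rhat can be computed directly with coda:

library(coda)
gelman.diag(data.mcmc.list, multivariate = FALSE)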

Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.

  • Trace plots
    View trace plots
    library(MCMCpack)
    plot(data.mcmcpack)
    
    plot of chunk tut7.5bMCMCpackTrace
    plot of chunk tut7.5bMCMCpackTrace
    Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
  • Raftery diagnostic
    View Raftery diagnostic
    library(MCMCpack)
    raftery.diag(data.mcmcpack)
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                       
                 Burn-in  Total Lower bound  Dependence
                 (M)      (N)   (Nmin)       factor (I)
     (Intercept) 2        3834  3746         1.020     
     AGroup B    2        3771  3746         1.010     
     AGroup C    2        3962  3746         1.060     
     B           2        3741  3746         0.999     
     sigma2      2        3771  3746         1.010     
    
    The Raftery diagnostics estimate that we would require about 3900 samples to reach the specified level of confidence in convergence. As we have 10,000 samples, we can be confident that convergence has occurred.
  • Autocorrelation diagnostic
    View autocorrelations
    library(MCMCpack)
    autocorr.diag(data.mcmcpack)
    
            (Intercept)      AGroup B     AGroup C            B       sigma2
    Lag 0   1.000000000  1.000000e+00  1.000000000  1.000000000  1.000000000
    Lag 1  -0.003346420 -4.064355e-03 -0.002480710  0.002480390  0.132659644
    Lag 5   0.016500991  1.299476e-02 -0.003939996  0.001939800 -0.018030548
    Lag 10 -0.005233676 -6.125242e-06 -0.013564685 -0.006005238  0.006407515
    Lag 50  0.006330549 -9.343650e-03 -0.011855223 -0.002673821  0.007502186
    
    A lag of 1 appears largely sufficient to avoid autocorrelation (only sigma2 shows any appreciable lag-1 correlation).
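
A related summary is coda's effectiveSize(), which estimates how many effectively independent samples the chain contains once autocorrelation is accounted for:

effectiveSize(data.mcmcpack)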

Again, prior to examining the summaries, we should have explored the convergence diagnostics.

library(coda)
data.mcmc = as.mcmc(data.r2jags)
  • Trace plots
    plot(data.mcmc)
    
    plot of chunk tut7.5bJAGSTrace
    plot of chunk tut7.5bJAGSTrace
    Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.

    When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as $\beta$s)

    preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]")
    plot(as.mcmc(data.r2jags)[, preds])
    
    plot of chunk tut7.5bJAGSTrace1
  • Raftery diagnostic
    raftery.diag(data.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       36200 3746          9.66     
     beta[2]  20       38660 3746         10.30     
     beta[3]  20       38030 3746         10.20     
     beta[4]  20       36200 3746          9.66     
     deviance 20       37410 3746          9.99     
     sigma    20       38030 3746         10.20     
    
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       39300 3746         10.50     
     beta[2]  20       38660 3746         10.30     
     beta[3]  20       39950 3746         10.70     
     beta[4]  20       37410 3746          9.99     
     deviance 20       39300 3746         10.50     
     sigma    20       36800 3746          9.82     
    
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       38030 3746         10.20     
     beta[2]  20       36800 3746          9.82     
     beta[3]  20       37410 3746          9.99     
     beta[4]  20       38030 3746         10.20     
     deviance 20       36800 3746          9.82     
     sigma    20       36200 3746          9.66     
    
    The Raftery diagnostics for each chain estimate that we would require roughly 36,000-40,000 iterations to reach the specified level of confidence in convergence (the dependence factors of around 10 mirror our thinning interval, suggesting these totals are expressed on the pre-thinning iteration scale). As each chain was run for 53,000 iterations, we can be confident that convergence has occurred.
  • Autocorrelation diagnostic
    autocorr.diag(data.mcmc)
    
                  beta[1]       beta[2]      beta[3]      beta[4]      deviance         sigma
    Lag 0    1.0000000000  1.0000000000  1.000000000  1.000000000  1.0000000000  1.0000000000
    Lag 10  -0.0086789407  0.0002409218 -0.012834845 -0.014876564 -0.0155077394 -0.0009823887
    Lag 50  -0.0133637002 -0.0146591155 -0.022266022  0.007955582 -0.0131845968 -0.0066436459
    Lag 100  0.0007452402  0.0128511117 -0.001800657 -0.007853232  0.0008253937 -0.0101001901
    Lag 500 -0.0029682531  0.0011976990  0.001848702 -0.005150388 -0.0016034261 -0.0118212907
    
    A lag of 10 appears to be sufficient to avoid autocorrelation (and hence poor mixing).

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      s = as.array(data.rstan)
      wch = grep("beta", dimnames(s)$parameters)
      s = s[, , wch]
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      plot(mcmc)
      
      plot of chunk tut7.5bSTANcodaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      s = as.array(data.rstan)
      wch = grep("beta", dimnames(s)$parameters)
      s = s[, , wch]
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      autocorr.diag(mcmc)
      
                  beta[1]       beta[2]      beta[3]
      Lag 0   1.000000000  1.0000000000  1.000000000
      Lag 1   0.064013692  0.0605531343  0.026397084
      Lag 5   0.007321517  0.0009337118  0.006726571
      Lag 10 -0.016376224 -0.0098527765 -0.009376572
      Lag 50 -0.017683148 -0.0290390559  0.010863075
      
      There is no evidence of substantial residual autocorrelation among the retained samples.
  • via rstan
    • Traceplots
      stan_trace(data.rstan)
      
      plot of chunk tut7.5bSTANTrace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data.rstan)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      As the output indicates, the Raftery diagnostic requires a minimum of 3746 samples to estimate the 2.5% quantile to the specified accuracy; since only 1500 post-warmup samples were retained here, the diagnostic cannot actually be applied, and a longer run (or less thinning) would be needed to make use of it.
    • Autocorrelation diagnostic
      stan_ac(data.rstan)
      
      plot of chunk tut7.5bSTANAuto
      A lag of 2 appears broadly sufficient to avoid autocorrelation (and hence poor mixing).
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, but more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data.rstan)
      
      plot of chunk tut7.5bSTANRhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number of samples (or the proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are unlikely to be reliable.
      stan_ess(data.rstan)
      
      plot of chunk tut7.5bSTANess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.
  • via bayesplot
    • Trace plots and density plots
      library(bayesplot)
      mcmc_trace(as.matrix(data.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTANMCMCTrace
      library(bayesplot)
      mcmc_combo(as.matrix(data.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTANTrace1
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Density plots
      library(bayesplot)
      mcmc_dens(as.matrix(data.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTANdens
      Density plots suggest that the mean or median would be appropriate to describe the fixed effect posteriors, and the median is appropriate for the sigma posterior.
  • via shinystan
    library(shinystan)
    launch_shinystan(data.rstan)
    
  • It is worth exploring the influence of our priors.
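    One crude check of prior influence (a sketch) is to compare the posterior medians against the OLS estimates; if the weakly informative priors have had little influence, the two sets of estimates should be very similar:

    library(dplyr)
    post <- as.data.frame(data.rstan) %>% dplyr:::select(contains("beta"))
    cbind(OLS = coef(lm(Y ~ A + B, data)), Bayes = apply(post, 2, median))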

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STANARM model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      s = as.array(data.rstanarm)
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      plot(mcmc)
      
      plot of chunk tut7.5bRSTANARMcodaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      s = as.array(data.rstanarm)
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      autocorr.diag(mcmc)
      
               (Intercept)     AGroup B     AGroup C            B
      Lag 0   1.0000000000  1.000000000  1.000000000  1.000000000
      Lag 1   0.0290559273  0.001379705 -0.023247646 -0.018452103
      Lag 5  -0.0035230620 -0.003667234  0.004820692  0.008991159
      Lag 10  0.0180539117 -0.016292304  0.028751812  0.024889288
      Lag 50  0.0008045354 -0.013900306  0.008315425  0.009651236
      
      There is no evidence of substantial residual autocorrelation among the retained samples.
  • via rstan
    • Traceplots
      stan_trace(data.rstanarm)
      
      plot of chunk tut7.5bRSTANARMTrace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data.rstanarm)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      As the output indicates, the Raftery diagnostic requires a minimum of 3746 samples to estimate the 2.5% quantile to the specified accuracy; since only 2250 post-warmup samples were retained here, the diagnostic cannot actually be applied, and a longer run (or less thinning) would be needed to make use of it.
    • Autocorrelation diagnostic
      stan_ac(data.rstanarm)
      
      plot of chunk tut7.5bRSTANARMAuto
      A lag of 2 appears broadly sufficient to avoid autocorrelation (and hence poor mixing).
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, but more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data.rstanarm)
      
      plot of chunk tut7.5bRSTANARMRhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number of samples (or the proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are unlikely to be reliable.
      stan_ess(data.rstanarm)
      
      plot of chunk tut7.5bRSTANARMess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.
  • via bayesplot
    • Trace plots and density plots
      library(bayesplot)
      mcmc_trace(as.array(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
      
      plot of chunk tut7.5bRSTANARMMCMCTrace
      mcmc_combo(as.array(data.rstanarm))
      
      plot of chunk tut7.5bRSTANARMTrace1
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Density plots
      mcmc_dens(as.array(data.rstanarm))
      
      plot of chunk tut7.5bRSTANARMdens
      Density plots suggest that the mean or median would be appropriate to describe the fixed effect posteriors, and the median is appropriate for the sigma posterior.
  • via rstanarm
    The rstanarm package provides additional posterior checks.
    • Posterior vs Prior - this compares the posterior estimate for each parameter against the associated prior. If the spread of the priors is small relative to the posterior, then it is likely that the priors are too influential. On the other hand, overly wide priors can lead to computational issues.
      library(rstanarm)
      posterior_vs_prior(data.rstanarm, color_by = "vs", group_by = TRUE,
          facet_args = list(scales = "free_y"))
      
      Gradient evaluation took 3.6e-05 seconds
      1000 transitions using 10 leapfrog steps per transition would take 0.36 seconds.
      Adjust your expectations accordingly!
      
      
      
       Elapsed Time: 0.318686 seconds (Warm-up)
                     0.056968 seconds (Sampling)
                     0.375654 seconds (Total)
      
      
      Gradient evaluation took 1e-05 seconds
      1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
      Adjust your expectations accordingly!
      
      
      
       Elapsed Time: 0.371641 seconds (Warm-up)
                     0.056263 seconds (Sampling)
                     0.427904 seconds (Total)
      
      plot of chunk tut7.5bRSTANARMposterorvsprior
  • via shinystan
    library(shinystan)
    launch_shinystan(data.rstanarm)
    

Again, prior to examining the summaries, we should have explored the convergence diagnostics. Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      mcmc = as.mcmc(data.brms)
      plot(mcmc)
      
      plot of chunk tut7.5bBRMScodaTraceplots
      plot of chunk tut7.5bBRMScodaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      mcmc = as.mcmc(data.brms)
      autocorr.diag(mcmc)
      
      Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
      
      As the error indicates, autocorr.diag cannot be applied directly to samples extracted from a brms fit in this form; autocorrelation is instead explored via stan_ac() on the underlying fit below.
  • via rstan
    • Traceplots
      stan_trace(data.brms$fit)
      
      plot of chunk tut7.5bBRMSTrace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data.brms)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      As the output indicates, the Raftery diagnostic requires a minimum of 3746 samples to estimate the 2.5% quantile to the specified accuracy; since only 2250 post-warmup samples were retained here, the diagnostic cannot actually be applied, and a longer run (or less thinning) would be needed to make use of it.
    • Autocorrelation diagnostic
      stan_ac(data.brms$fit)
      
      plot of chunk tut7.5bBRMSAuto
      A lag of 2 appears broadly sufficient to avoid autocorrelation (and hence poor mixing).
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, but more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data.brms$fit)
      
      plot of chunk tut7.5bBRMSRhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number of samples (or the proportion of samples) that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are unlikely to be reliable.
      stan_ess(data.brms$fit)
      
      plot of chunk tut7.5bBRMSess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.

Model validation

Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.

For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.

There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.

Unfortunately the definitions of standardized and studentized residuals appear to vary, and the two terms get used interchangeably. I will adopt the following definitions:


Standardized residuals: the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
Studentized residuals: the raw residuals divided by the (estimated) standard deviation of the residuals. Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
Pearson residuals: the raw residuals divided by the standard deviation of the response variable.
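
In code, these definitions differ only in the denominator. A minimal sketch (using an OLS fit purely to generate fitted values for illustration):

fit.lm <- lm(Y ~ A + B, data)
resid.raw <- data$Y - fitted(fit.lm)  # raw residuals
sresid <- resid.raw/sd(resid.raw)     # studentized: scaled by the residual sd
presid <- resid.raw/sd(data$Y)        # Pearson: scaled by the sd of the response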

The mark of a good model is being able to predict well. In an ideal world, we would have a sample size sufficiently large to permit us to hold a fraction (such as 25%) back, thereby allowing us to train the model on the remaining 75% of the data and then see how well the model predicts the withheld 25%. Unfortunately, such a luxury is still rare in ecology.
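
For completeness, a minimal sketch of such a hold-out assessment (using a quick OLS refit purely for illustration; in practice the Bayesian model would be refitted to the training subset):

set.seed(1)
train <- sample(nrow(data), size = 0.75 * nrow(data))
fit.train <- lm(Y ~ A + B, data = data[train, ])
pred <- predict(fit.train, newdata = data[-train, ])
## squared correlation between predictions and the withheld observations
cor(pred, data$Y[-train])^2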

The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.

Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within MCMCpack. However, we can calculate them manually from the posteriors.

mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## inspect the first few rows of the posterior samples
head(mcmc)
  (Intercept)  AGroup B  AGroup C          B   sigma2
1    50.18839 -18.23379 -22.87414 -0.4394640 14.11917
2    49.39556 -15.64877 -20.98086 -0.4528605 13.22004
3    49.44854 -16.36367 -23.66886 -0.3887300 15.57976
4    51.46891 -16.26642 -24.26313 -0.4709924 12.01597
5    50.55617 -17.54613 -22.44262 -0.4252965 16.35805
6    50.73340 -17.54078 -23.75846 -0.4598987 10.84731
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bMCMCpackresid

Residuals against predictors

library(dplyr)  # for the pipe operator used below
mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bMCMCpackresid1
ggplot(newdata) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bMCMCpackresid1

And now for studentized residuals

mcmc = as.data.frame(data.mcmcpack)
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bMCMCpackresid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data.

library(tidyr)  # for gather()
mcmc = as.matrix(data.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## get median parameter estimates
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], sqrt(mcmc[i, "sigma2"])))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bMCMCpackFit
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bMCMCpackFit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.
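
The same simulated draws (yRep from the chunk above) can also be summarized into 95% prediction intervals for each observation (a sketch):

PI <- t(apply(yRep, 1, quantile, probs = c(0.025, 0.975)))
head(cbind(data[, c("A", "B", "Y")], lwr = PI[, 1], upr = PI[, 2]))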

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bMCMCpackArea
mcmc_areas(as.matrix(data.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bMCMCpackArea
Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within JAGS. However, we can calculate them manually from the posteriors.

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bJAGSresid

Residuals against predictors

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bJAGSresid1
ggplot(newdata) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bJAGSresid1

And now for studentized residuals

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bJAGSresid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

mcmc = data.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## extract the posterior samples of the coefficients
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bJAGSFit
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bJAGSFit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
plot of chunk tut7.5bJAGSArea
mcmc_areas(data.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
plot of chunk tut7.5bJAGSArea
Rather than duplicate this for both the additive and multiplicative models, we will only explore the additive model.

Residuals are not computed directly within RSTAN. However, we can calculate them manually from the posteriors.

mcmc = as.data.frame(data.rstan) %>% dplyr::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bRSTANresid

Residuals against predictors

mcmc = as.data.frame(data.rstan) %>% dplyr::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
newdata = newdata %>% cbind(fit, resid)
ggplot(newdata) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bRSTANresid1
ggplot(newdata) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bRSTANresid1

And now for studentized residuals

mcmc = as.data.frame(data.rstan) %>% dplyr::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## get median parameter estimates
coefs = apply(mcmc[, 1:4], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bRSTANresid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

mcmc = as.data.frame(data.rstan) %>% dplyr::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A + B, data)
## extract the posterior samples of the coefficients
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data), fit[i,
    ], mcmc[i, "sigma"]))
newdata = data.frame(A = data$A, B = data$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bRSTANFit
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bRSTANFit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.rstan), regex_pars = "beta|sigma")
plot of chunk tut7.5bRSTANArea
mcmc_areas(as.matrix(data.rstan), regex_pars = "beta|sigma")
plot of chunk tut7.5bRSTANArea
Rather than duplicate this for both the additive and multiplicative models, we will only explore the additive model.

Within RSTANARM, residuals and fitted values can be extracted directly via the resid() and fitted() functions.

resid = resid(data.rstanarm)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bRSTANARMresid

Residuals against predictors

resid = resid(data.rstanarm)
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bRSTANARMresid1
ggplot(dat) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bRSTANARMresid1

And now for studentized residuals

resid = resid(data.rstanarm)
sresid = resid/sd(resid)
fit = fitted(data.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bRSTANARMresid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

y_pred = posterior_predict(data.rstanarm)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value",
    -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour
plot of chunk tut7.5bRSTANARMFit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bRSTANARMArea
mcmc_areas(as.matrix(data.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bRSTANARMArea
Rather than duplicate this for both the additive and multiplicative models, we will only explore the additive model.

Within BRMS, residuals and fitted values can be extracted directly via the resid() and fitted() functions (each returns a posterior summary, from which we take the Estimate column).

resid = resid(data.brms)[, "Estimate"]
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bBRMSresid

Residuals against predictors

resid = resid(data.brms)[, "Estimate"]
dat = data %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bBRMSresid1
ggplot(dat) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bBRMSresid1

And now for studentized residuals

resid = resid(data.brms)[, "Estimate"]
sresid = resid/sd(resid)
fit = fitted(data.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bBRMSresid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

y_pred = posterior_predict(data.brms)
newdata = data %>% cbind(t(y_pred)) %>% gather(key = "Rep", value = "Value",
    -Y:-B)
ggplot(newdata) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill
ggplot(newdata) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data,
    aes(y = Y, x = B, group = B, color = A))
Error: Aesthetics must be either length 1 or the same as the data (67530): x, y, fill, group, colour
plot of chunk tut7.5bBRMSFit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note that these are prediction intervals rather than confidence intervals, as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data.brms), regex_pars = "b_|sigma")
plot of chunk tut7.5bBRMSArea
mcmc_areas(as.matrix(data.brms), regex_pars = "b_|sigma")
plot of chunk tut7.5bBRMSArea

Parameter estimates (posterior summaries)

Although all parameters in a Bayesian analysis are considered random and are described by full posterior distributions, it is rarely useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
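For example, the median and 95% highest posterior density (HPD) intervals can be pulled straight from the MCMC samples (a minimal sketch using the MCMCpack samples; tidyMCMC, used below, wraps exactly this kind of computation):

library(coda)
samp = as.matrix(data.mcmcpack)
## posterior medians for each parameter
apply(samp, 2, median)
## 95% HPD (credibility) intervals for each parameter
HPDinterval(as.mcmc(samp))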

library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }

}
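As a quick sanity check of the mcmcpvalue function, it can be applied to simulated "posteriors" (a sketch; the exact values are indicative only):

set.seed(1)
mcmcpvalue(rnorm(1000, mean = 0))  # posterior centred on zero: p near 1
mcmcpvalue(rnorm(1000, mean = 3))  # posterior well away from zero: p near 0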

Matrix model (MCMCpack)

summary(data.mcmcpack)
Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                Mean     SD Naive SE Time-series SE
(Intercept)  50.2015 1.7846 0.017846      0.0178458
AGroup B    -17.1433 1.8782 0.018782      0.0187821
AGroup C    -22.9133 1.8682 0.018682      0.0186817
B            -0.4138 0.0637 0.000637      0.0006543
sigma2       17.0079 5.0889 0.050889      0.0581569

2. Quantiles for each variable:

                2.5%      25%      50%      75%    97.5%
(Intercept)  46.7714  49.0274  50.2019  51.3437  53.7977
AGroup B    -20.8797 -18.3726 -17.1445 -15.8884 -13.4552
AGroup C    -26.6231 -24.1304 -22.9030 -21.6955 -19.1900
B            -0.5397  -0.4556  -0.4134  -0.3714  -0.2885
sigma2        9.7749  13.4022  16.1334  19.6823  29.3438
# OR
library(broom)
tidyMCMC(data.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  50.2014506 1.78458442  46.7660798  53.7766054
2    AGroup B -17.1433463 1.87821127 -20.7503649 -13.3485818
3    AGroup C -22.9132573 1.86817189 -26.6020778 -19.1852142
4           B  -0.4137879 0.06369625  -0.5369301  -0.2860994
5      sigma2  17.0078626 5.08890812   8.9411967  27.1577316
Conclusions:
  • the intercept of the first group (Group A) is 50.2014506
  • the mean of the second group (Group B) is 17.1433463 units lower than that of Group A
  • the mean of the third group (Group C) is 22.9132573 units lower than that of Group A
  • a one unit increase in B is associated with a 0.4137879 unit decrease in Y (the slope is common to all groups in this additive model)
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C and a significant negative relationship with B.
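Equivalently, the posterior can be queried directly for the probability that each effect is negative (a simple sketch using the MCMCpack samples):

## proportion of posterior samples below zero for each effect
colMeans(as.matrix(data.mcmcpack)[, 2:4] < 0)

Proportions at or near 1 indicate strong support for negative effects.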

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## the empirical p-values below are 0 because all posterior samples for these effects fall below zero
mcmcpvalue(data.mcmcpack[, 2])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 3])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 4])  # effect of (slope = 0)
[1] 0
mcmcpvalue(data.mcmcpack[, 2:4])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

Matrix model (JAGS)

print(data.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   50.201   1.803  46.617  49.021  50.207  51.404  53.741 1.001  8400
beta[2]  -17.141   1.879 -20.832 -18.393 -17.149 -15.910 -13.402 1.001  8400
beta[3]  -22.917   1.882 -26.640 -24.161 -22.922 -21.682 -19.181 1.001  5900
beta[4]   -0.414   0.065  -0.543  -0.457  -0.414  -0.372  -0.285 1.001 15000
sigma      4.175   0.619   3.178   3.735   4.108   4.531   5.587 1.001  7200
deviance 169.232   3.548 164.517 166.636 168.502 171.075 178.013 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.3 and DIC = 175.5
DIC is an estimate of expected predictive error (lower deviance is better).
# OR
library(broom)
tidyMCMC(as.mcmc(data.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
      term    estimate  std.error    conf.low   conf.high
1  beta[1]  50.2007407 1.80285405  46.5994039  53.6923687
2  beta[2] -17.1407725 1.87934256 -20.8989661 -13.5038133
3  beta[3] -22.9165431 1.88243441 -26.5907610 -19.1480417
4  beta[4]  -0.4139048 0.06529536  -0.5434197  -0.2853424
5 deviance 169.2319386 3.54753309 163.9376050 176.1387130
6    sigma   4.1749569 0.61875758   3.0490746   5.3897463
Conclusions:
  • the intercept of the first group (Group A) is 50.2007407
  • the mean of the second group (Group B) is 17.1407725 units lower than that of Group A
  • the mean of the third group (Group C) is 22.9165431 units lower than that of Group A
  • a one unit increase in B is associated with a 0.4139048 unit decrease in Y (the slope is common to all groups in this additive model)
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C and a significant negative relationship with B.

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## the empirical p-values below are 0 because all posterior samples for these effects fall below zero
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[2]"])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[3]"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, "beta[4]"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(data.r2jags$BUGSoutput$sims.matrix[, 2:4])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

Matrix model (RSTAN)

print(data.rstan, pars = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  50.22    0.06 1.79  46.60  49.03  50.24  51.40  53.83  1027    1
beta[2] -17.12    0.06 1.92 -20.88 -18.31 -17.14 -15.96 -13.19  1210    1
beta[3] -22.88    0.05 1.93 -26.78 -24.11 -22.86 -21.58 -19.24  1407    1
beta[4]  -0.42    0.00 0.06  -0.54  -0.46  -0.42  -0.37  -0.29  1282    1
sigma     4.11    0.02 0.58   3.12   3.70   4.05   4.45   5.37  1353    1

Samples were drawn using NUTS(diag_e) at Fri Nov  3 15:05:11 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
# OR
library(broom)
tidyMCMC(data.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))
     term    estimate  std.error    conf.low   conf.high
1 beta[1]  50.2161150 1.78767398  46.4168478  53.5737947
2 beta[2] -17.1206004 1.91600172 -20.7216779 -13.0233814
3 beta[3] -22.8794576 1.92793878 -26.8184381 -19.3130173
4 beta[4]  -0.4157079 0.06337683  -0.5361929  -0.2891039
5   sigma   4.1094504 0.58022288   3.0779992   5.3039276
Conclusions:
  • the intercept of the first group (Group A) is 50.216115
  • the mean of the second group (Group B) is 17.1206004 units lower than that of Group A
  • the mean of the third group (Group C) is 22.8794576 units lower than that of Group A
  • a one unit increase in B is associated with a 0.4157079 unit decrease in Y (the slope is common to all groups in this additive model)
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C and a significant negative relationship with B.

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## the empirical p-values below are 0 because all posterior samples for these effects fall below zero
mcmcpvalue(as.matrix(data.rstan)[, "beta[2]"])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, "beta[3]"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, "beta[4]"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstan)[, 2:4])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

library(loo)
(full = loo(extract_log_lik(data.rstan)))
Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -87.3 3.5
p_loo         4.6 1.0
looic       174.5 7.1

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     29    96.7%
 (0.5, 0.7]   (ok)        1     3.3%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.
# now fit a model without main factor
modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  //Likelihood
  y~normal(mu,sigma);
  
  //Priors
  beta ~ normal(0,1000);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Xmat <- model.matrix(~1, data)
data.list <- with(data, list(y = Y, X = Xmat, n = nrow(data), nX = ncol(Xmat)))
data.rstan.red <- stan(data = data.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)

SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1).

Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.018509 seconds (Warm-up)
               0.041875 seconds (Sampling)
               0.060384 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.017436 seconds (Warm-up)
               0.046029 seconds (Sampling)
               0.063465 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3).

Gradient evaluation took 6e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.012316 seconds (Warm-up)
               0.026582 seconds (Sampling)
               0.038898 seconds (Total)
(reduced = loo(extract_log_lik(data.rstan.red)))
Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo   -116.0 3.2
p_loo         1.6 0.4
looic       232.0 6.4

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bRSTANloo
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
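The compare_models function is designed for rstanarm fits; for the plain loo objects generated here via extract_log_lik, the loo package's own compare function can be used instead (a sketch; in later versions of loo this function is called loo_compare):

library(loo)
compare(full, reduced)  # elpd_diff < 0 indicates the first (full) model predicts better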
Conclusions: the expected out-of-sample predictive accuracy is substantially lower (higher looic) for the reduced model that excludes A. This supports the inferential evidence for an effect of A on Y.

Matrix model (RSTANARM)

summary(data.rstanarm)
Model Info:

 function:  stan_glm
 family:    gaussian [identity]
 formula:   Y ~ A + B
 algorithm: sampling
 priors:    see help('prior_summary')
 sample:    2250 (posterior sample size)
 num obs:   30

Estimates:
                mean   sd     2.5%   25%    50%    75%    97.5%
(Intercept)     50.2    1.9   46.4   49.0   50.2   51.5   54.1 
AGroup B       -17.2    1.9  -21.0  -18.5  -17.2  -15.9  -13.3 
AGroup C       -22.9    2.0  -26.6  -24.2  -22.9  -21.6  -19.0 
B               -0.4    0.1   -0.5   -0.5   -0.4   -0.4   -0.3 
sigma            4.2    0.6    3.2    3.8    4.1    4.5    5.6 
mean_PPD        30.0    1.1   27.8   29.3   30.0   30.7   32.2 
log-posterior  -98.3    1.8 -103.1  -99.1  -97.9  -96.9  -95.9 

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.0  1.0  1980 
AGroup B      0.0  1.0  2145 
AGroup C      0.0  1.0  2250 
B             0.0  1.0  2250 
sigma         0.0  1.0  1200 
mean_PPD      0.0  1.0  1720 
log-posterior 0.1  1.0   960 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# OR
library(broom)
tidyMCMC(data.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")
           term    estimate std.error     conf.low   conf.high
1   (Intercept)  50.2286902 1.9365801   46.4428738  54.1263008
2      AGroup B -17.2114440 1.9386491  -21.0066484 -13.4338089
3      AGroup C -22.9037572 1.9679361  -26.6757211 -19.0189639
4             B  -0.4138748 0.0681432   -0.5451839  -0.2852376
5         sigma   4.1990238 0.6317351    3.0581470   5.4328442
6      mean_PPD  30.0266842 1.1171901   27.8860300  32.2422092
7 log-posterior -98.2605466 1.8364590 -101.7802647 -95.7415132
Conclusions:
  • the intercept of the first group (Group A) is 50.2286902
  • the mean of the second group (Group B) is 17.211444 units lower than that of Group A
  • the mean of the third group (Group C) is 22.9037572 units lower than that of Group A
  • a one unit increase in B is associated with a 0.4138748 unit decrease in Y (the slope is common to all groups in this additive model)
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C and a significant negative relationship with B.

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## the empirical p-values below are 0 because all posterior samples for these effects fall below zero
head(as.matrix(data.rstanarm))
          parameters
iterations (Intercept)  AGroup B  AGroup C          B    sigma
      [1,]    49.25161 -18.13674 -20.71374 -0.3963549 4.384061
      [2,]    50.28149 -20.18446 -23.60224 -0.3823662 4.601748
      [3,]    47.22865 -13.74348 -20.42610 -0.3439257 3.559008
      [4,]    50.36383 -18.72387 -23.75338 -0.3776768 4.211066
      [5,]    49.33028 -18.88190 -20.94786 -0.3634085 3.830714
      [6,]    49.08087 -14.40159 -22.87480 -0.4144611 3.793997
mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup B"])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, "AGroup C"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, "B"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.rstanarm)[, 2:4])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

library(loo)
(full = loo(data.rstanarm))
Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -87.2 3.3
p_loo         4.3 0.9
looic       174.4 6.7

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.
data.rstanarm.red = update(data.rstanarm, . ~ 1)
Gradient evaluation took 3.8e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.38 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.027383 seconds (Warm-up)
               0.039991 seconds (Sampling)
               0.067374 seconds (Total)


Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.0218 seconds (Warm-up)
               0.039113 seconds (Sampling)
               0.060913 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.02501 seconds (Warm-up)
               0.043676 seconds (Sampling)
               0.068686 seconds (Total)
(reduced = loo(data.rstanarm.red))
Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo   -116.0 3.1
p_loo         1.5 0.4
looic       232.0 6.1

All Pareto k estimates are good (k < 0.5)
See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bRSTANARMloo
compare_models(full, reduced)
elpd_diff        se 
    -28.8       4.5 
Conclusions: the expected out-of-sample predictive accuracy is substantially lower for the reduced model that excludes A (elpd_diff = -28.8). This supports the inferential evidence for an effect of A on Y.

Matrix model (BRMS)

summary(data.brms)
 Family: gaussian(identity) 
Formula: Y ~ A + B 
   Data: data (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    50.20      1.76    46.79    53.75       2177    1
AGroupB     -17.17      1.83   -20.81   -13.68       2158    1
AGroupC     -22.89      1.85   -26.43   -19.25       2127    1
B            -0.41      0.06    -0.54    -0.29       2129    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     4.11      0.59     3.14     5.41       1854    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
# OR
library(broom)
tidyMCMC(data.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 b_Intercept  50.1970335 1.75540856  46.6017008  53.5440980
2   b_AGroupB -17.1728328 1.83135058 -20.6803108 -13.5582289
3   b_AGroupC -22.8915861 1.85449337 -26.3846325 -19.2178870
4         b_B  -0.4122714 0.06370931  -0.5416812  -0.2961189
5       sigma   4.1081576 0.58644550   3.1075609   5.3375706
Conclusions:
  • the intercept of the first group (Group A) is 50.1970335
  • the mean of the second group (Group B) is 17.1728328 units lower than that of Group A
  • the mean of the third group (Group C) is 22.8915861 units lower than that of Group A
  • a one unit increase in B is associated with a 0.4122714 unit decrease in Y (the slope is common to all groups in this additive model)
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C and a significant negative relationship with B.

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## the empirical p-values below are 0 because all posterior samples for these effects fall below zero
head(as.matrix(data.brms))
          parameters
iterations b_Intercept b_AGroupB b_AGroupC        b_B    sigma      lp__
      [1,]    46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789
      [2,]    50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635
      [3,]    49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988
      [4,]    52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522
      [5,]    49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716
      [6,]    51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362
mcmcpvalue(as.matrix(data.brms)[, "b_AGroupB"])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, "b_AGroupC"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, "b_B"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data.brms)[, 2:4])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

library(loo)
(full = loo(data.brms))
  LOOIC   SE
 174.34 7.05
data.brms.red = update(data.brms, . ~ 1)
SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).

Gradient evaluation took 1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.019211 seconds (Warm-up)
               0.014546 seconds (Sampling)
               0.033757 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).

Gradient evaluation took 4e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.015666 seconds (Warm-up)
               0.017803 seconds (Sampling)
               0.033469 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).

Gradient evaluation took 4e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.016449 seconds (Warm-up)
               0.016509 seconds (Sampling)
               0.032958 seconds (Total)
(reduced = loo(data.brms.red))
  LOOIC   SE
 232.15 6.38
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bBRMSloo
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
Conclusions: the expected out-of-sample predictive accuracy is substantially lower (higher LOOIC) for the reduced model that excludes A. This supports the inferential evidence for an effect of A on Y.

Graphical summaries

A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.

Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with each group.

Matrix model (MCMCpack)

mcmc = data.mcmcpack
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, 1:4]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bMCMCpackGraphicalSummaries

Since the model includes a covariate, rather than simply adding the raw data to this figure we add partial residuals (the observed residuals added back onto the fitted values); this approach generalises to more complex designs with additional predictors.

## Calculate fitted values and partial residuals
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bMCMCpackGraphicalSummaries2

Matrix model (JAGS)

mcmc = data.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bJAGSGraphicalSummaries

Since the model includes a covariate, rather than simply adding the raw data to this figure we add partial residuals (the observed residuals added back onto the fitted values); this approach generalises to more complex designs with additional predictors.

## Calculate fitted values and partial residuals
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bJAGSGraphicalSummaries2

Matrix model (RSTAN)

mcmc = as.matrix(data.rstan)
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B), max(data$B),
    len = 100))
Xmat = model.matrix(~A + B, newdata)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]")]
fit = coefs %*% t(Xmat)
newdata = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANGraphicalSummaries

Since the model includes a covariate, rather than simply adding the raw data to this figure we add partial residuals (the observed residuals added back onto the fitted values); this approach generalises to more complex designs with additional predictors.

## Calculate fitted values and partial residuals
fdata = rdata = data
fMat = rMat = model.matrix(~A + B, fdata)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data$Y - apply(coefs, 2, median) %*% t(rMat))
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANGraphicalSummaries2

Matrix model (RSTANARM)

## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B),
    max(data$B), len = 100))
fit = posterior_linpred(data.rstanarm, newdata = newdata)
newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANARMGraphicalSummaries

Since the model includes a covariate, rather than simply adding the raw data to this figure we add partial residuals (the observed residuals added back onto the fitted values); this approach generalises to more complex designs with additional predictors.

## Calculate partial residuals
rdata = data
pp = posterior_linpred(data.rstanarm, newdata = rdata)
fit = as.vector(apply(pp, 2, median))
resid = resid(data.rstanarm)
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANARMGraphicalSummaries2

Matrix model (BRMS)

Although we could calculate the fitted values via matrix multiplication of the coefficients and the model matrix (as for MCMCpack, RJAGS and RSTAN), for more complex models it is more convenient to use the marginal_effects function that comes with brms.

plot(marginal_effects(data.brms), points = TRUE)
plot of chunk tut7.5bBRMSGraphicalSummaries.a
plot of chunk tut7.5bBRMSGraphicalSummaries.a
# OR
eff = plot(marginal_effects(data.brms), points = TRUE, plot = FALSE)
eff
$A
plot of chunk tut7.5bBRMSGraphicalSummaries.a
$B
plot of chunk tut7.5bBRMSGraphicalSummaries.a
## Calculate the fitted values
newdata = expand.grid(A = levels(data$A), B = seq(min(data$B),
    max(data$B), len = 100))
fit = fitted(data.brms, newdata = newdata, summary = FALSE)
newdata = newdata %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bBRMSGraphicalSummaries

Since the model includes a covariate, rather than simply adding the raw data to this figure we add partial residuals (the observed residuals added back onto the fitted values); this approach generalises to more complex designs with additional predictors.

## Calculate partial residuals
rdata = data
fit = fitted(data.brms, summary = TRUE)[, "Estimate"]
resid = resid(data.brms)[, "Estimate"]
rdata = rdata %>% mutate(partial.resid = resid + fit)

ggplot(newdata, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bBRMSGraphicalSummaries2

Posteriors

In frequentist statistics, when we have more than two groups, we are typically not only interested in whether there is evidence for an overall "effect" of a factor - we are also interested in how various groups compare to one another.

To explore these trends, we either compare each group to each other in a pairwise manner (controlling for family-wise Type I error rates) or we explore an independent subset of the possible comparisons. Although these alternate approaches can adequately address a specific research agenda, they often impose severe limitations and compromises on the scope and breadth of questions that can be asked of your data. The reason for these limitations is that in a frequentist framework, any single hypothesis carries with it a (nominally) 5% chance of a false rejection (since it is based on long-run frequency). Thus, performing multiple tests is likely to compound this error rate. The point is that each comparison is assessed against its own probability distribution (and each carries a 5% error rate).

By contrast, in Bayesian statistics, all comparisons (contrasts) are drawn from the one (hopefully stable and convergent) posterior distribution, and this posterior is invariant to the type and number of comparisons drawn. Hence, having generated our posterior distribution, we can query it in any way we wish, thereby allowing us to explore all of our research questions simultaneously.

Bayesian "contrasts" can be performed either:

  • within the Bayesian sampling model or
  • construct them from the returned MCMC samples (they are drawn from the posteriors)
Only the latter will be demonstrated, as it provides a consistent approach across all routines; a simple example of querying the posterior directly follows.
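For example, the posterior probability that the mean of Group B exceeds that of Group C can be computed directly from the samples, without any correction for multiple comparisons (a minimal sketch using the MCMCpack samples):

coefs = as.matrix(data.mcmcpack)
## posterior for the Group B minus Group C difference (both effects are
## expressed relative to Group A, so A's intercept cancels out)
diffBC = coefs[, "AGroup B"] - coefs[, "AGroup C"]
median(diffBC)    # point estimate of the difference
mean(diffBC > 0)  # P(Group B mean > Group C mean)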

In order to allow direct comparison to the frequentist equivalents, I will explore the same set of planned and "Tukey's" test comparisons described here. For the "planned comparison" we defined two contrasts:

  1. group B vs group C
  2. group A vs the average of groups B and C
Of course, each of these could be explored at multiple values of B; however, since we fit an additive model (which assumes homogeneous slopes), the contrasts will be constant throughout the domain of B (the quick check below confirms this).
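As a quick check (a sketch using the MCMCpack samples), evaluating the Group B vs Group C contrast at the two extremes of the covariate yields identical posterior medians:

coefs = as.matrix(data.mcmcpack)[, 1:4]
for (b in range(data$B)) {
    Xmat = model.matrix(~A + B, data.frame(A = levels(data$A), B = b))
    cm = c(0, 1, -1) %*% Xmat  # Group B minus Group C
    print(median(coefs %*% t(cm)))
}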

Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix - a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.

mcmc = data.mcmcpack
coefs <- as.matrix(mcmc)[, 1:4]
newdata = expand.grid(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
tuk.mat
	 Multiple Comparisons of Means: Tukey Contrasts

                  Group A Group B Group C
Group B - Group A      -1       1       0
Group C - Group A      -1       0       1
Group C - Group B       0      -1       1
Xmat <- model.matrix(~A + B, data = newdata)
Xmat
  (Intercept) AGroup B AGroup C        B
1           1        0        0 16.50175
2           1        1        0 16.50175
3           1        0        1 16.50175
attr(,"assign")
[1] 0 1 1 2
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0
mcmc_areas(coefs %*% t(pairwise.mat))
plot of chunk tut7.5bMCMCpackES
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.143346  1.878211 -20.750365 -13.348582
2 Group C - Group A -22.913257  1.868172 -26.602078 -19.185214
3 Group C - Group B  -5.769911  1.821653  -9.552934  -2.271914
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bMCMCpackES

With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage change.

# Modify the tuk.mat to replace -1 with 0. This will allow us to get an
# MCMC matrix of divisor-group cell means.
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.78138  9.862131  -84.99925 -46.617429
2 Group C - Group A -112.84960 15.200206 -144.80969 -84.547452
3 Group C - Group B  -28.70578 10.358636  -50.05632  -9.004512
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bMCMCpackES1

And now for the specific planned comparisons (Group B vs Group C, and Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast). Note that, strictly, contrast weights should sum to zero (e.g. c(1, -1/2, -1/2) for Group A vs the average of Groups B and C); the weights used below for the second contrast do not, so its estimate should be interpreted with care.

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat
     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333
mcmc = data.mcmcpack
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat
     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
  term estimate std.error conf.low conf.high
1 var1 5.769911 1.8216528 2.271914  9.552934
2 var2 6.123330 0.9187204 4.344498  7.965326

Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix - a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.

mcmc = data.r2jags$BUGSoutput$sims.matrix
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukeys contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0
mcmc_areas(coefs %*% t(pairwise.mat))
plot of chunk tut7.5bR2JAGSES
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.140773  1.879343 -20.898966 -13.503813
2 Group C - Group A -22.916543  1.882434 -26.590761 -19.148042
3 Group C - Group B  -5.775771  1.880763  -9.402033  -1.979244
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bR2JAGSES

With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means for the divisor group (the group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 values with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage change.

# Modify the tuk.mat to replace -1 with 0. This will allow us to get an
# MCMC matrix of divisor-group cell means.
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.77396  9.861029  -85.38307 -46.803881
2 Group C - Group A -112.95081 15.533036 -144.44128 -83.512153
3 Group C - Group B  -28.79007 10.740187  -50.07490  -8.241323
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bR2JAGSES1

And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat
     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333
mcmc = data.r2jags$BUGSoutput$sims.matrix
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat
     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
  term estimate std.error conf.low conf.high
1 var1 5.775771  1.880763 1.979244  9.402033
2 var2 6.124007  0.913995 4.278228  7.879558

Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix - a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.

mcmc = data.rstan
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukey's contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0
mcmc_areas(coefs %*% t(pairwise.mat))
plot of chunk tut7.5bRSTANES
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.120600  1.916002 -20.721678 -13.023381
2 Group C - Group A -22.879458  1.927939 -26.818438 -19.313017
3 Group C - Group B  -5.758857  1.908307  -9.465031  -1.972144
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bRSTANES

With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by the value of one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means of the divisor group (the group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get an
# MCMC matrix of divisor group cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.68005  9.938618  -86.57138 -46.874022
2 Group C - Group A -112.65860 15.767824 -145.18904 -84.337691
3 Group C - Group B  -28.69047 10.825916  -48.80312  -6.685448
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bRSTANES1

And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat
     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333
mcmc = data.rstan
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat
     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
  term estimate std.error conf.low conf.high
1 var1 5.758857 1.9083068 1.972144  9.465031
2 var2 6.107318 0.9312412 4.373877  8.159077

Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix - a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.

mcmc = data.rstanarm
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukey's contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0
mcmc_areas(coefs %*% t(pairwise.mat))
plot of chunk tut7.5bRSTANARMES
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.211444  1.938649 -21.006648 -13.433809
2 Group C - Group A -22.903757  1.967936 -26.675721 -19.018964
3 Group C - Group B  -5.692313  1.892174  -9.159957  -1.799432
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bRSTANARMES

With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by the value of one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means of the divisor group (the group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get an
# MCMC matrix of divisor group cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -66.16030  10.09871  -86.76507 -46.675359
2 Group C - Group A -112.70201  15.95929 -144.71321 -82.068118
3 Group C - Group B  -28.33604  10.75415  -50.15437  -8.304887
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bRSTANARMES1

And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat
     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333
mcmc = data.rstanarm
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat
     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
  term estimate std.error conf.low conf.high
1 var1 5.692313 1.8921744 1.799432  9.159957
2 var2 6.138562 0.9558204 4.235113  7.985014

Let's start by comparing each group to each other group in a pairwise manner. Arguably the most elegant way to do this is to generate a Tukey's contrast matrix - a model matrix specific to comparing each group to each other group. Again, since the lines are parallel, it does not really matter at what level of B we estimate these effects - so let's use the mean of B.

mcmc = data.brms
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
# A Tukey's contrast matrix
library(multcomp)
tuk.mat <- contrMat(n = table(newdata$A), type = "Tukey")
Xmat <- model.matrix(~A + B, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
                  (Intercept) AGroup B AGroup C B
Group B - Group A           0        1        0 0
Group C - Group A           0        0        1 0
Group C - Group B           0       -1        1 0
mcmc_areas(coefs %*% t(pairwise.mat))
plot of chunk tut7.5bBRMSES
(comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A -17.172833  1.831351 -20.680311 -13.558229
2 Group C - Group A -22.891586  1.854493 -26.384632 -19.217887
3 Group C - Group B  -5.718753  1.846079  -9.169838  -2.065845
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bBRMSES

With a couple of modifications, we could also express these comparisons as percentage changes. A percentage change represents the change (difference between groups) divided by the value of one of the groups (determined by which group you want to express the percentage change relative to). Hence, we generate an additional MCMC matrix that represents the cell means of the divisor group (the group we want to express change relative to).

Since the tuk.mat defines comparisons as -1 and 1 pairs, if we simply replace all the -1 with 0, the eventual matrix multiplication will result in estimates of the divisor cell means instead of the differences. We can then divide the original MCMC matrix above by this new divisor MCMC matrix to yield an MCMC matrix of percentage changes.

# Modify the tuk.mat to replace -1 with 0.  This will allow us to get an
# MCMC matrix of divisor group cell means
tuk.mat[tuk.mat == -1] = 0
comp.mat <- tuk.mat %*% Xmat
comp.mat
                  (Intercept) AGroup B AGroup C        B
Group B - Group A           1        1        0 16.50175
Group C - Group A           1        0        1 16.50175
Group C - Group B           1        0        1 16.50175
comp.mcmc = 100 * (coefs %*% t(pairwise.mat))/coefs %*% t(comp.mat)
(comps = tidyMCMC(comp.mcmc, conf.int = TRUE, conf.method = "HPDinterval"))
               term   estimate std.error   conf.low  conf.high
1 Group B - Group A  -65.90339  9.650503  -84.05345 -46.582017
2 Group C - Group A -112.55948 15.341280 -143.66383 -84.530670
3 Group C - Group B  -28.43318 10.498519  -47.72283  -7.691681
ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low,
    ymax = conf.high)) + geom_hline(yintercept = 0, linetype = "dashed") +
    scale_y_continuous("Effect size (%)") + scale_x_discrete("") + coord_flip() +
    theme_classic()
plot of chunk tut7.5bBRMSES1

And now for the specific planned comparisons (Group B vs Group C as well as Group A vs the average of Groups B and C). This is achieved by generating our own contrast matrix (defining the contributions of each group to each contrast).

c.mat = rbind(c(0, 1, -1), c(1/2, -1/3, -1/3))
c.mat
     [,1]       [,2]       [,3]
[1,]  0.0  1.0000000 -1.0000000
[2,]  0.5 -0.3333333 -0.3333333
mcmc = data.brms
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(A = levels(data$A), B = mean(data$B))
Xmat <- model.matrix(~A + B, data = newdata)
c.mat = c.mat %*% Xmat
c.mat
     (Intercept)   AGroup B   AGroup C         B
[1,]   0.0000000  1.0000000 -1.0000000  0.000000
[2,]  -0.1666667 -0.3333333 -0.3333333 -2.750292
(comps = tidyMCMC(as.mcmc(coefs %*% t(c.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
  term estimate std.error conf.low conf.high
1 var1 5.718753 1.8460785 2.065845  9.169838
2 var2 6.122501 0.8984429 4.344599  7.808558

Finite Population Standard Deviations

Variance components, the amount of added variance attributed to each influence, are traditionally estimated for so-called random effects. These are the effects for which the levels employed in the design are randomly selected to represent a broader range of possible levels. For such effects, effect sizes (differences between each level and a reference level) are of little value. Instead, the 'importance' of the variables is measured in units of variance components.

On the other hand, regular variance components for fixed factors (those whose measured levels represent the only levels of interest) are not logical, since variance components estimate variance as if the levels were randomly selected from a larger population. Nevertheless, in order to compare and contrast the scale of variability of both fixed and random factors, it is necessary to measure both on the same scale (sample or population based variance).

Finite-population variance components (Gelman, 2005) assume that the levels of all factors (fixed and random) in the design are all the possible levels available. In other words, they are assumed to represent finite populations of levels. Sample (rather than population) statistics are then used to calculate these finite-population variances (or standard deviations).
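For example, for a factor whose effect parameters are $\alpha_1, \ldots, \alpha_a$, the finite-population standard deviation at each MCMC iteration is simply the sample standard deviation of those effects, and the contribution of a covariate can be expressed on the same scale by scaling the magnitude of its slope ($\beta$) by the standard deviation of the observed covariate values ($s_B$): $$s_\alpha = \sqrt{\frac{1}{a-1}\sum_{i=1}^{a}(\alpha_i - \bar{\alpha})^2}, \hspace{1cm} s_\beta = |\beta|\times s_B$$ These are exactly the quantities computed row-wise across the MCMC samples in the code below.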

Since standard deviations (and variances) are bounded at zero, their posteriors are typically non-normal. Consequently, medians and HPD intervals provide more robust estimates.

library(broom)
mcmc = data.mcmcpack
head(mcmc)
Markov Chain Monte Carlo (MCMC) output:
Start = 1001 
End = 1007 
Thinning interval = 1 
     (Intercept)  AGroup B  AGroup C          B   sigma2
[1,]    50.18839 -18.23379 -22.87414 -0.4394640 14.11917
[2,]    49.39556 -15.64877 -20.98086 -0.4528605 13.22004
[3,]    49.44854 -16.36367 -23.66886 -0.3887300 15.57976
[4,]    51.46891 -16.26642 -24.26313 -0.4709924 12.01597
[5,]    50.55617 -17.54613 -22.44262 -0.4252965 16.35805
[6,]    50.73340 -17.54078 -23.75846 -0.4598987 10.84731
[7,]    51.16774 -17.41150 -22.41515 -0.3996571 17.57170
# A
wch = grep("^A", colnames(mcmc))
# Get the row-wise standard deviations among the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## extract the posterior samples of all location parameters
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 4.080665 1.2858143 1.606486  6.754944
2     sd.B 5.030454 0.7743607 3.478134  6.527505
3 sd.resid 3.983878 0.1906139 3.761980  4.356994
# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 31.28801  7.123578 16.55312  44.15266
2     sd.B 38.61193  5.120548 28.29820  48.32862
3 sd.resid 30.05025  4.065060 24.60465  38.59845
## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()
plot of chunk tut7.5bMCMCpackFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is attributable to the treatment (A).

library(broom)
mcmc = data.r2jags$BUGSoutput$sims.matrix
head(mcmc)
      beta[1]   beta[2]   beta[3]    beta[4] deviance    sigma
[1,] 52.60917 -18.82408 -23.38489 -0.4880702 167.0879 4.283265
[2,] 53.75233 -18.89312 -24.92630 -0.5191816 171.4824 5.124059
[3,] 51.15722 -16.81356 -23.25841 -0.5336914 170.4625 3.705732
[4,] 50.32853 -16.98682 -23.41316 -0.4149829 165.9603 4.545555
[5,] 48.78829 -15.18438 -24.19762 -0.4395482 172.1889 4.458376
[6,] 51.55891 -17.64261 -22.37656 -0.5040106 166.7893 4.155680
# A
wch = grep("beta.[2-3]", colnames(mcmc))
# Get the row-wise standard deviations among the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("beta.4", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## extract the posterior samples of all location parameters
wch = grep("beta.[1-4]", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 4.085929 1.3242287 1.399537  6.648241
2     sd.B 5.031876 0.7938012 3.468931  6.606399
3 sd.resid 3.991447 0.1970424 3.762837  4.379866
# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 31.33381  7.376887 15.57164  43.97278
2     sd.B 38.65910  5.313671 27.75097  48.62132
3 sd.resid 29.97544  4.188131 24.58803  39.19370
## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()
plot of chunk tut7.5bR2JAGSFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is attributable to the treatment (A).

library(broom)
mcmc = as.matrix(data.rstan)
head(mcmc)
          parameters
iterations  beta[1]   beta[2]   beta[3]    beta[4]    sigma    mu[1]    mu[2]    mu[3]    mu[4]
      [1,] 52.68797 -15.89902 -22.48872 -0.5325084 4.190521 49.72136 44.37439 39.97070 51.24060
      [2,] 51.04073 -14.86065 -23.63103 -0.4404722 3.869872 48.58686 44.16402 40.52145 49.84351
      [3,] 52.08091 -17.19621 -23.79866 -0.4785798 4.179663 49.41474 44.60927 40.65155 50.78012
      [4,] 51.54719 -19.52916 -24.99191 -0.3788141 4.283020 49.43681 45.63310 42.50041 50.51756
      [5,] 49.24231 -15.18013 -21.04165 -0.3906899 3.719325 47.06578 43.14281 39.91192 48.18041
      [6,] 49.65178 -16.39404 -19.21428 -0.4629938 3.691510 47.07244 42.42347 38.59465 48.39336
          parameters
iterations    mu[5]    mu[6]    mu[7]    mu[8]    mu[9]   mu[10]   mu[11]   mu[12]   mu[13]
      [1,] 40.47396 41.79718 41.35552 33.11874 51.77358 31.91453 32.08782 35.17453 31.86079
      [2,] 40.93773 42.03225 41.66692 34.85374 50.28437 33.85767 32.29147 34.84469 32.10367
      [3,] 41.10385 42.29307 41.89613 34.49351 51.25912 33.41126 30.65966 33.43378 30.45562
      [4,] 42.85842 43.79973 43.48554 37.62609 50.89671 36.76944 28.67375 30.86957 28.51224
      [5,] 40.28116 41.25197 40.92793 34.88479 48.57144 34.00129 30.61307 32.87772 30.44650
      [6,] 39.03221 40.18270 39.79869 32.63715 48.85676 31.59015 29.17031 31.85408 28.97291
          parameters
iterations   mu[14]   mu[15]   mu[16]   mu[17]   mu[18]   mu[19]   mu[20]   mu[21]   mu[22]
      [1,] 24.12069 24.92430 28.59623 33.75113 31.31670 16.35942 24.54276 24.16979 27.07583
      [2,] 25.70134 26.36605 29.40334 33.66730 31.65363 19.28149 26.05046 22.42234 24.82611
      [3,] 23.49939 24.22161 27.52168 32.15453 29.96664 16.52413 23.87872 22.86342 25.47515
      [4,] 23.00612 23.57779 26.18991 29.85699 28.12520 17.48493 23.30637 22.26606 24.33334
      [5,] 24.76775 25.35734 28.05136 31.83340 30.04731 19.07348 25.07742 23.77698 25.90907
      [6,] 22.24322 22.94193 26.13451 30.61648 28.49985 15.49512 22.61020 25.19514 27.72182
          parameters
iterations   mu[23]   mu[24]   mu[25]   mu[26]   mu[27]   mu[28]   mu[29]   mu[30] log_lik[1]
      [1,] 21.17639 32.86026 23.41873 25.46849 10.27852 11.46507 20.12482 28.62927  -2.355750
      [2,] 19.94630 29.61079 21.80108 23.49658 10.93196 11.91344 19.07648 26.11106  -2.348154
      [3,] 20.17316 30.67378 22.18841 24.03059 10.37895 11.44533 19.22809 26.87127  -2.362435
      [4,] 20.13662 28.44826 21.73177 23.18992 12.38413 13.22821 19.38856 25.43843  -2.385424
      [5,] 21.58078 30.15298 23.22594 24.72981 13.58525 14.45580 20.80927 27.04880  -2.564270
      [6,] 22.59250 32.75114 24.54212 26.32431 13.11726 14.14891 21.67820 29.07247  -2.560301
          parameters
iterations log_lik[2] log_lik[3] log_lik[4] log_lik[5] log_lik[6] log_lik[7] log_lik[8] log_lik[9]
      [1,]  -2.380651  -2.392311  -2.359395  -2.644059  -2.370981  -2.656101  -2.358963  -3.937789
      [2,]  -2.321658  -2.285946  -2.297979  -2.721302  -2.283643  -2.564277  -2.439362  -3.463785
      [3,]  -2.366243  -2.356686  -2.349263  -2.769866  -2.352204  -2.562282  -2.450072  -3.731235
      [4,]  -2.375320  -2.422274  -2.374746  -3.224846  -2.411614  -2.408977  -3.057791  -3.556057
      [5,]  -2.413647  -2.289148  -2.466128  -2.560216  -2.299999  -2.726477  -2.418549  -2.888813
      [6,]  -2.546048  -2.467202  -2.424089  -2.338922  -2.442708  -3.079584  -2.224991  -2.983441
          parameters
iterations log_lik[10] log_lik[11] log_lik[12] log_lik[13] log_lik[14] log_lik[15] log_lik[16]
      [1,]   -2.560221   -2.385313   -2.758740   -2.999655   -2.808351   -2.357355   -5.539627
      [2,]   -2.291578   -2.298123   -2.669736   -2.956470   -3.313616   -2.305454   -6.602203
      [3,]   -2.391007   -2.530008   -2.468268   -3.440631   -2.676762   -2.386746   -4.935821
      [4,]   -2.499489   -2.925431   -2.381090   -4.170168   -2.601230   -2.460896   -4.195091
      [5,]   -2.246331   -2.469401   -2.312064   -3.614919   -3.014530   -2.232484   -5.873203
      [6,]   -2.561867   -2.812913   -2.232745   -4.376755   -2.390970   -2.440833   -4.643851
          parameters
iterations log_lik[17] log_lik[18] log_lik[19] log_lik[20] log_lik[21] log_lik[22] log_lik[23]
      [1,]   -2.602428   -2.541644   -2.411276   -2.925312   -2.550560   -2.796703   -2.397480
      [2,]   -2.549710   -2.440502   -2.909103   -3.472430   -2.915533   -2.369028   -2.480363
      [3,]   -2.402926   -2.791773   -2.423397   -2.767720   -2.795438   -2.507550   -2.496696
      [4,]   -2.397022   -3.282272   -2.553796   -2.661806   -2.936898   -2.413541   -2.518648
      [5,]   -2.272281   -2.768729   -2.857917   -3.144357   -2.565447   -2.513088   -2.259383
      [6,]   -2.226004   -3.294590   -2.237378   -2.464606   -2.320908   -3.001039   -2.225789
          parameters
iterations log_lik[24] log_lik[25] log_lik[26] log_lik[27] log_lik[28] log_lik[29] log_lik[30]
      [1,]   -4.438382   -3.307734   -2.499837   -2.496503   -3.808212   -2.814545   -2.829593
      [2,]   -3.213944   -2.854593   -2.275338   -2.554512   -3.772549   -3.133717   -2.355333
      [3,]   -3.512033   -2.945362   -2.369487   -2.507912   -3.821285   -3.044316   -2.505695
      [4,]   -2.842706   -2.833433   -2.373597   -2.891791   -3.165134   -2.993189   -2.395956
      [5,]   -3.470845   -3.366605   -2.318398   -3.350387   -2.858380   -2.637408   -2.461301
      [6,]   -4.845740   -3.980841   -2.585876   -3.176836   -2.957508   -2.450304   -2.981159
          parameters
iterations      lp__
      [1,] -56.99328
      [2,] -56.17719
      [3,] -54.73138
      [4,] -55.42088
      [5,] -54.94832
      [6,] -56.99186
# A
wch = grep("beta.[2-3]", colnames(mcmc))
# Get the row-wise standard deviations among the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("beta.[4]", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## extract the posterior samples of all location parameters
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 4.072432 1.3484539 1.394517  6.692788
2     sd.B 5.053796 0.7704775 3.514660  6.518542
3 sd.resid 3.993504 0.1978560 3.764060  4.389092
# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 31.42998  7.687620 14.88253  43.87950
2     sd.B 38.57359  5.482235 27.76693  48.93887
3 sd.resid 30.05401  4.027620 25.10582  39.01512
## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()
plot of chunk tut7.5bR2RSTANFinitePopulation

Conclusions: Approximately 31.4% of the total finite population standard deviation is attributable to the treatment (A).

library(broom)
mcmc = as.matrix(data.rstanarm)
head(mcmc)
          parameters
iterations (Intercept)  AGroup B  AGroup C          B    sigma
      [1,]    49.25161 -18.13674 -20.71374 -0.3963549 4.384061
      [2,]    50.28149 -20.18446 -23.60224 -0.3823662 4.601748
      [3,]    47.22865 -13.74348 -20.42610 -0.3439257 3.559008
      [4,]    50.36383 -18.72387 -23.75338 -0.3776768 4.211066
      [5,]    49.33028 -18.88190 -20.94786 -0.3634085 3.830714
      [6,]    49.08087 -14.40159 -22.87480 -0.4144611 3.793997
# A
wch = grep("^A", colnames(mcmc))
# Get the row-wise standard deviations among the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## extract the posterior samples of all location parameters
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 4.026941 1.3323358 1.272390  6.477067
2     sd.B 5.031510 0.8284226 3.467658  6.627846
3 sd.resid 4.004512 0.2119432 3.762475  4.420304
# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 31.14949  7.543440 13.88979  43.33205
2     sd.B 38.63719  5.482952 27.23552  49.17710
3 sd.resid 30.23369  4.336975 24.98450  40.19062
## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()
plot of chunk tut7.5bR2RSTANARMFinitePopulation

Conclusions: Approximately 31.1% of the total finite population standard deviation is attributable to the treatment (A).

library(broom)
mcmc = as.matrix(data.brms)
head(mcmc)
          parameters
iterations b_Intercept b_AGroupB b_AGroupC        b_B    sigma      lp__
      [1,]    46.14429 -17.89799 -23.20626 -0.1747196 4.234569 -112.8789
      [2,]    50.46466 -17.00736 -21.94628 -0.4189975 3.964478 -105.7635
      [3,]    49.51423 -16.74688 -22.42555 -0.4566537 3.784797 -106.7988
      [4,]    52.03476 -20.63035 -26.45616 -0.3795796 4.637355 -108.6522
      [5,]    49.58570 -18.21573 -23.70766 -0.4119922 4.153369 -106.9716
      [6,]    51.87136 -19.76110 -26.12697 -0.4639165 3.485067 -109.3362
# A
wch = grep("^b_A", colnames(mcmc))
# Get the row-wise standard deviations among the effect parameters
sd.A = apply(mcmc[, wch], 1, sd)
# B
wch = grep("^b_B", colnames(mcmc))
sd.B = sd(data$B) * abs(mcmc[, wch])
# generate a model matrix
newdata = data
Xmat = model.matrix(~A + B, newdata)
## extract the posterior samples of all location parameters
wch = c(grep("(b_Intercept)|^b_", colnames(mcmc)))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
sd.resid = apply(resid, 1, sd)

sd.all = cbind(sd.A, sd.B, sd.resid)
(fpsd = tidyMCMC(sd.all, conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 4.043769 1.3053747 1.460773  6.484054
2     sd.B 5.012018 0.7745195 3.599942  6.585264
3 sd.resid 3.982644 0.1855119 3.761674  4.356373
# OR expressed as a percentage
(fpsd.p = tidyMCMC(100 * sd.all/rowSums(sd.all), estimate.method = "median",
    conf.int = TRUE, conf.method = "HPDinterval"))
      term estimate std.error conf.low conf.high
1     sd.A 31.29667  7.332669 15.34081  43.31182
2     sd.B 38.43989  5.149650 28.75407  49.05275
3 sd.resid 30.14396  4.349195 24.45494  39.31055
## we can even plot this as a Bayesian ANOVA table
ggplot(fpsd, aes(y = estimate, x = term)) + geom_hline(yintercept = 0,
    linetype = "dashed") + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    geom_text(aes(label = sprintf("%.2f%%", fpsd.p$estimate), vjust = -1)) +
    scale_y_continuous("Finite population standard deviation") + scale_x_discrete() +
    coord_flip() + theme_classic()
plot of chunk tut7.5bR2BRMSFinitePopulation

Conclusions: Approximately 31.3% of the total finite population standard deviation is attributable to the treatment (A).

$R^2$

In a frequentist context, the $R^2$ value is seen as a useful indicator of goodness of fit. Whilst it has long been acknowledged that this measure is not appropriate for comparing models (for such purposes information criteria such as AIC are more appropriate), it is nevertheless useful for estimating the amount (percent) of variance explained by the model.

In a frequentist context, $R^2$ is calculated as the variance in predicted values divided by the variance in the observed (response) values. Unfortunately, this classical formulation does not translate simply into a Bayesian context, since the equivalently calculated numerator can be larger than an equivalently calculated denominator - thereby resulting in an $R^2$ greater than 100%. Gelman, Goodrich, Gabry, and Ali (2017) proposed an alternative formulation in which the denominator comprises the sum of the explained variance and the variance of the residuals.

So in the standard regression model notation of: $$ \begin{align} y_i &\sim{} N(\mu_i, \sigma)\\ \mu_i &= \mathbf{X}\boldsymbol{\beta} \end{align} $$ the $R^2$ can be formulated as: $$ R^2 = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_e} $$ where $\sigma^2_f = var(\mu)$ (with $\mu = \mathbf{X}\boldsymbol{\beta}$) and, for Gaussian models, $\sigma^2_e = var(y-\mu)$.
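Since exactly the same calculation is repeated below for each of the five fitted models, it may help to see it abstracted into a small helper function. This is purely an illustrative sketch (the name bayes.R2 is hypothetical): it assumes you supply the matrix of MCMC coefficient samples, the corresponding model matrix and the observed responses.

# Gelman et al. (2017) Bayesian R2: explained variance divided by the sum of
# the explained variance and the residual variance, computed per MCMC sample
bayes.R2 <- function(coefs, Xmat, y) {
    fit = coefs %*% t(Xmat)        # fitted values for each MCMC sample
    resid = sweep(fit, 2, y, "-")  # residuals for each MCMC sample
    var_f = apply(fit, 1, var)     # variance of the fitted values
    var_e = apply(resid, 1, var)   # variance of the residuals
    var_f/(var_f + var_e)
}

Something like tidyMCMC(as.mcmc(bayes.R2(coefs, Xmat, data$Y)), conf.int = TRUE, conf.method = "HPDinterval") would then summarize the posterior of $R^2$.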

library(broom)
mcmc <- data.mcmcpack
Xmat = model.matrix(~A + B, data)
wch = grep("(Intercept)|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate  std.error  conf.low conf.high
1 var1 0.8712662 0.02114025 0.8316521 0.8958785
# for comparison with frequentist
summary(lm(Y ~ A + B, data))
Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12
library(broom)
mcmc <- data.r2jags$BUGSoutput$sims.matrix
Xmat = model.matrix(~A + B, data)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate  std.error  conf.low conf.high
1 var1 0.8708952 0.02144096 0.8290317 0.8960092
# for comparison with frequentist
summary(lm(Y ~ A + B, data))
Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12
library(broom)
mcmc <- as.matrix(data.rstan)
Xmat = model.matrix(~A + B, data)
wch = grep("beta", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate  std.error  conf.low conf.high
1 var1 0.8706419 0.02178395 0.8297436 0.8959157
# for comparison with frequentist
summary(lm(Y ~ A + B, data))
Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12
library(broom)
mcmc <- as.matrix(data.rstanarm)
Xmat = model.matrix(~A + B, data)
wch = grep("Intercept|^A|^B", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate std.error  conf.low conf.high
1 var1 0.8699886 0.0241677 0.8256175 0.8960312
# for comparison with frequentist
summary(lm(Y ~ A + B, data))
Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12
library(broom)
mcmc <- as.matrix(data.brms)
Xmat = model.matrix(~A + B, data)
wch = grep("b_", colnames(mcmc))
coefs = mcmc[, wch]
fit = coefs %*% t(Xmat)
resid = sweep(fit, 2, data$Y, "-")
var_f = apply(fit, 1, var)
var_e = apply(resid, 1, var)
R2 = var_f/(var_f + var_e)
tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
  term  estimate  std.error  conf.low conf.high
1 var1 0.8712322 0.02007052 0.8322119 0.8956157
# for comparison with frequentist
summary(lm(Y ~ A + B, data))
Call:
lm(formula = Y ~ A + B, data = data)

Residuals:
   Min     1Q Median     3Q    Max 
-8.671 -2.911  0.328  2.188  7.407 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  50.20598    1.71450  29.283  < 2e-16 ***
AGroup B    -17.15033    1.78618  -9.602 4.91e-10 ***
AGroup C    -22.91188    1.79775 -12.745 1.09e-12 ***
B            -0.41403    0.06143  -6.740 3.76e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.973 on 26 degrees of freedom
Multiple R-squared:  0.884,	Adjusted R-squared:  0.8706 
F-statistic: 66.05 on 3 and 26 DF,  p-value: 2.719e-12

Dealing with heterogeneous slopes

Random data incorporating the following trends (effect parameters)
  • the sample size per treatment=10
  • the categorical $x$ variable with 3 levels
  • the first treatment group has a population mean of 40.
  • the other two treatments reduce the mean by 15 and 20 units respectively
  • the data are drawn from normal distributions with a mean of 0 and standard deviation of 4 ($\sigma^2=16$)
  • the covariate (B) is a continuous variable with a mean of 20 and a standard deviation of 15
  • the slope of the Y vs B relationship differs between groups: a baseline slope of -0.45 for the first group, modified by -0.1 and 0.5 for the second and third groups respectively
set.seed(6)
n <- 10
p <- 3
A.eff <- c(40, -15, -20)
beta <- c(-0.45, -0.1, 0.5)
sigma <- 4
B <- rnorm(n * p, 0, 15)
A <- gl(p, n, lab = paste("Group", LETTERS[1:3]))
mm <- model.matrix(~A * B)
data1 <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4))
data1$B <- data1$B + 20
head(data1)
        A        B        Y
1 Group A 24.04409 35.49432
2 Group A 10.55022 46.22216
3 Group A 33.02990 29.41898
4 Group A 45.90793 24.10656
5 Group A 20.36281 44.38834
6 Group A 25.52038 36.87477

Exploratory data analysis

library(car)
scatterplot(Y ~ B | A, data = data1)
plot of chunk tut7.5bS10.1
boxplot(Y ~ A, data1)
plot of chunk tut7.5bS10.1
# OR via ggplot
library(ggplot2)
ggplot(data1, aes(y = Y, x = B, group = A)) + geom_point() + geom_smooth(method = "lm")
plot of chunk tut7.5bS10.1
ggplot(data1, aes(y = Y, x = A)) + geom_boxplot()
plot of chunk tut7.5bS10.1
Conclusions:
  • there is no evidence of obvious non-normality
  • the assumption of linearity seems reasonable
  • the variability of the three groups seems approximately equal
  • the slopes (Y vs B trends) do appear to differ between treatment groups - in particular, Group C seems to portray a different trend to Groups A and B.

Homogeneity of slopes

We can explore inferential evidence of unequal slopes by examining the estimated effects of the interaction between the categorical variable and the covariate. Note, pay no attention to the main effects - only the interaction. Even though I intend to illustrate Bayesian analyses here, for such a simple model it is considerably simpler to use traditional OLS to test for the presence of an interaction.

anova(lm(Y ~ B * A, data = data1))
Analysis of Variance Table

Response: Y
          Df  Sum Sq Mean Sq F value    Pr(>F)    
B          1 1257.14 1257.14  98.520 5.685e-10 ***
A          2 2042.02 1021.01  80.015 2.420e-11 ***
B:A        2  510.02  255.01  19.985 7.778e-06 ***
Residuals 24  306.25   12.76                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusions: there is strong evidence to suggest that the assumption of equal slopes is violated.

Model fitting or statistical analysis

The multiplicative model is constructed and specified in the same way as the additive model. The difference is that the model matrix ($\mathbf{X}$) contains the interaction terms.

$$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,100)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$
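Written out in full for this three-group design (with treatment contrasts), the linear predictor contains separate intercept and slope adjustments for Groups B and C: $$\mu_{ij} = \beta_0 + \beta_1 A_{Bij} + \beta_2 A_{Cij} + \beta_3 B_{ij} + \beta_4 A_{Bij}B_{ij} + \beta_5 A_{Cij}B_{ij}$$ where $A_B$ and $A_C$ are dummy variables for Groups B and C. Hence the slope against the covariate is $\beta_3$ for Group A, $\beta_3+\beta_4$ for Group B and $\beta_3+\beta_5$ for Group C.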

library(MCMCpack)
data1.mcmcpack <- MCMCregress(Y ~ A * B, data = data1)

Define the model

modelString = "
  model {
  #Likelihood
  for (i in 1:n) {
  y[i]~dnorm(mean[i],tau)
  mean[i] <- inprod(beta[],X[i,])
  }
  #Priors
  for (i in 1:ngroups) {
  beta[i] ~ dnorm(0, 1.0E-6) 
  }
  sigma ~ dunif(0, 100)
  tau <- 1 / (sigma * sigma)
  }
  "

Define the data

Arrange the data as a list (as required by BUGS). As input, JAGS will need to be supplied with:

  • the response variable (y)
  • the predictor model matrix (X)
  • the total number of observed items (n)
  • the number of predictor terms (ngroups)
This all needs to be contained within a list object.

X <- model.matrix(~A * B, data1)
data1.list <- with(data1, list(y = Y, X = X, n = nrow(data1), ngroups = ncol(X)))

Define the MCMC chain parameters

Next we should define the behavioural parameters of the MCMC sampling chains. Include the following:

  • the nodes (estimated parameters) to monitor (return samples for)
  • the number of MCMC chains (3)
  • the number of burnin steps (1000)
  • the thinning factor (10)
  • the number of MCMC iterations - determined by the number of samples to save, the rate of thinning and the number of chains

params <- c("beta", "sigma")
nChains = 3
burnInSteps = 3000
thinSteps = 10
numSavedSteps = 15000  #across all chains
nIter = ceiling(burnInSteps + (numSavedSteps * thinSteps)/nChains)
nIter
[1] 53000

Fit the model

Now run the JAGS code via the R2jags interface. Note that the first time jags is run after the R2jags package is loaded, it is often necessary to run any kind of randomization function just to initiate the .Random.seed variable.

library(R2jags)
data1.r2jags <- jags(data = data1.list, inits = NULL, parameters.to.save = params,
    model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
    n.burnin = burnInSteps, n.thin = thinSteps)
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
Graph information:
   Observed stochastic nodes: 30
   Unobserved stochastic nodes: 7
   Total graph size: 298

Initializing model
print(data1.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   49.553   2.336  45.049  48.003  49.538  51.092  54.162 1.001 15000
beta[2]  -12.227   3.093 -18.248 -14.285 -12.228 -10.182  -6.142 1.001 11000
beta[3]  -28.270   2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000
beta[4]   -0.495   0.093  -0.680  -0.556  -0.494  -0.433  -0.313 1.001 15000
beta[5]   -0.121   0.115  -0.347  -0.197  -0.122  -0.046   0.105 1.001 15000
beta[6]    0.410   0.111   0.193   0.336   0.410   0.482   0.636 1.001 15000
sigma      3.772   0.585   2.827   3.354   3.707   4.106   5.105 1.001 15000
deviance 163.101   4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 9.6 and DIC = 172.7
DIC is an estimate of expected predictive error (lower deviance is better).
data1.mcmc.list <- as.mcmc(data1.r2jags)

Model matrix formulation

The minimum model in Stan required to fit the above multiplicative model follows. Note the following modifications from the model defined in JAGS:
  • the normal distribution is parameterized by standard deviation rather than precision
  • rather than using a uniform prior for sigma, I am using a half-Cauchy

We now translate the likelihood model into STAN code.
$$\begin{align} y_{ij}&\sim{}N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \mathbf{X}\boldsymbol{\beta}\\ \beta_0&\sim{}N(0,100)\\ \beta&\sim{}N(0,10)\\ \sigma&\sim{}Cauchy(0,5)\\ \end{align} $$

Define the model

modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,100);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Define the data

Arrange the data as a list (as required by Stan). As input, Stan will need to be supplied with:

  • the response variable (y)
  • the predictor model matrix (X)
  • the total number of observed items (n)
  • the number of predictor terms (nX)
This all needs to be contained within a list object.

Xmat <- model.matrix(~A * B, data1)
data1.list <- with(data1, list(y = Y, X = Xmat, nX = ncol(Xmat), n = nrow(data1)))

Fit the model

Now compile and run the Stan code via the rstan interface.

## load the rstan package
library(rstan)
data1.rstan <- stan(data = data1.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 1).

Gradient evaluation took 2.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.099447 seconds (Warm-up)
               0.151706 seconds (Sampling)
               0.251153 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 2).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.106983 seconds (Warm-up)
               0.116191 seconds (Sampling)
               0.223174 seconds (Total)


SAMPLING FOR MODEL '3d2414c9dcf4b5e12be870eadd2c894a' NOW (CHAIN 3).

Gradient evaluation took 7e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.07 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.142543 seconds (Warm-up)
               0.116691 seconds (Sampling)
               0.259234 seconds (Total)
print(data1.rstan, par = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  49.50    0.08 2.34  44.55  48.01  49.58  51.01  54.11   902    1
beta[2] -12.29    0.09 3.04 -18.60 -14.30 -12.17 -10.22  -6.70  1094    1
beta[3] -28.21    0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04   957    1
beta[4]  -0.49    0.00 0.09  -0.67  -0.55  -0.49  -0.44  -0.31   978    1
beta[5]  -0.12    0.00 0.11  -0.34  -0.19  -0.12  -0.05   0.11  1164    1
beta[6]   0.41    0.00 0.11   0.18   0.34   0.41   0.48   0.62   953    1
sigma     3.73    0.02 0.58   2.81   3.32   3.66   4.05   5.07  1271    1

Samples were drawn using NUTS(diag_e) at Sat Nov  4 07:45:04 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
library(rstanarm)
library(broom)
library(coda)
data1.rstanarm = stan_glm(Y ~ A * B, data = data1, iter = 2000, warmup = 500,
    chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0, 100),
    prior = normal(0, 100), prior_aux = cauchy(0, 2))
Gradient evaluation took 7.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.71 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.586557 seconds (Warm-up)
               0.34573 seconds (Sampling)
               0.932287 seconds (Total)


Gradient evaluation took 2.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.478284 seconds (Warm-up)
               0.353797 seconds (Sampling)
               0.832081 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.438293 seconds (Warm-up)
               0.338371 seconds (Sampling)
               0.776664 seconds (Total)
print(data1.rstanarm)
stan_glm
 family:  gaussian [identity]
 formula: Y ~ A * B
------

Estimates:
            Median MAD_SD
(Intercept)  49.6    2.3 
AGroup B    -12.3    3.1 
AGroup C    -28.3    3.0 
B            -0.5    0.1 
AGroup B:B   -0.1    0.1 
AGroup C:B    0.4    0.1 
sigma         3.7    0.6 

Sample avg. posterior predictive 
distribution of y (X = xbar):
         Median MAD_SD
mean_PPD 26.9    1.0  

------
For info on the priors used see help('prior_summary.stanreg').
tidyMCMC(data1.rstanarm, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  49.5438430 2.38328655  44.8579446  54.1707410
2    AGroup B -12.3097610 3.13510315 -18.8059933  -6.6018889
3    AGroup C -28.2751561 3.02810337 -33.8195233 -21.9065319
4           B  -0.4940998 0.09381214  -0.6922853  -0.3200262
5  AGroup B:B  -0.1204605 0.11724241  -0.3514843   0.1104638
6  AGroup C:B   0.4111006 0.11271108   0.1810014   0.6214744
7       sigma   3.7727701 0.59136185   2.7322401   4.9499110
library(brms)
library(broom)
library(coda)
data1.brms = brm(Y ~ A * B, data = data1, iter = 2000, warmup = 500, chains = 3,
    thin = 2, refresh = 0, prior = c(prior(normal(0, 100), class = "Intercept"),
        prior(normal(0, 100), class = "b"), prior(cauchy(0, 5), class = "sigma")))
Gradient evaluation took 2.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.29 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.094494 seconds (Warm-up)
               0.100503 seconds (Sampling)
               0.194997 seconds (Total)


Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.076777 seconds (Warm-up)
               0.104894 seconds (Sampling)
               0.181671 seconds (Total)


Gradient evaluation took 9e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.091052 seconds (Warm-up)
               0.09965 seconds (Sampling)
               0.190702 seconds (Total)
print(data1.brms)
 Family: gaussian(identity) 
Formula: Y ~ A * B 
   Data: data1 (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    49.60      2.32    44.97    54.25       1182    1
AGroupB     -12.33      3.06   -18.49    -6.37       1216    1
AGroupC     -28.29      2.97   -34.18   -22.34       1246    1
B            -0.49      0.09    -0.67    -0.32       1160    1
AGroupB:B    -0.12      0.11    -0.35     0.11       1186    1
AGroupC:B     0.41      0.11     0.19     0.62       1248    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     3.71      0.54     2.81     4.98       1423    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
tidyMCMC(data1.brms, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate std.error    conf.low   conf.high
1 b_Intercept  49.6027270 2.3173028  44.8624049  54.0547491
2   b_AGroupB -12.3262056 3.0553552 -18.7775957  -6.7594663
3   b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177
4         b_B  -0.4926334 0.0908813  -0.6696454  -0.3177777
5 b_AGroupB:B  -0.1212027 0.1126979  -0.3497945   0.1053154
6 b_AGroupC:B   0.4067397 0.1097094   0.1878784   0.6203292
7       sigma   3.7129053 0.5394484   2.6776034   4.7406944

MCMC diagnostics

In addition to the regular model diagnostic checks (such as residual plots), for Bayesian analyses, it is necessary to explore the characteristics of the MCMC chains and the sampler in general. Recall that the purpose of MCMC sampling is to replicate the posterior distribution of the model likelihood and priors by drawing a known number of samples from this posterior (thereby formulating a probability distribution). This is only reliable if the MCMC samples accurately reflect the posterior.

Unfortunately, since we only know the posterior in the most trivial of circumstances, it is necessary to rely on indirect measures of how accurately the MCMC samples are likely to reflect the posterior. I will briefly outline the most important diagnostics; however, please refer to Tutorial 4.3, Section 3.1: Markov Chain Monte Carlo sampling for a discussion of these diagnostics.

  • Traceplots for each parameter illustrate the MCMC sample values after each successive iteration along the chain. Bad chain mixing (characterized by any sort of pattern) suggests that the MCMC sampling chains may not have completely traversed all features of the posterior distribution and that more iterations are required to ensure the distribution has been accurately represented.
    plot of chunk sim7
  • Autocorrelation plots for each parameter illustrate the degree of correlation between MCMC samples separated by different lags. For example, a lag of 0 represents the degree of correlation between each MCMC sample and itself (obviously this will be a correlation of 1). A lag of 1 represents the degree of correlation between each MCMC sample and the next sample along the chain, and so on. In order to be able to generate unbiased estimates of parameters, the MCMC samples should be independent (uncorrelated). In the figures below, this would be violated in the top autocorrelation plot and met in the bottom autocorrelation plot.
    plot of chunk sim7 plot of chunk sim9
  • Rhat statistic for each parameter provides a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
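
Many of these diagnostics can be computed directly with coda. A minimal sketch, assuming mcmc is a multi-chain mcmc.list (such as the one produced below via as.mcmc(data1.r2jags)):

library(coda)
gelman.diag(mcmc)    # potential scale reduction factor (Rhat)
effectiveSize(mcmc)  # effective number of independent samples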

Prior to inspecting any summaries of the parameter estimates, it is prudent to inspect a range of chain convergence diagnostics.

  • Trace plots
    View trace plots
    library(MCMCpack)
    plot(data1.mcmcpack)
    
    plot of chunk tut7.5bMCMCpackTrace2
    plot of chunk tut7.5bMCMCpackTrace2
    Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
  • Raftery diagnostic
    View Raftery diagnostic
    library(MCMCpack)
    raftery.diag(data1.mcmcpack)
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                       
                 Burn-in  Total Lower bound  Dependence
                 (M)      (N)   (Nmin)       factor (I)
     (Intercept) 2        3929  3746         1.050     
     AGroup B    2        3650  3746         0.974     
     AGroup C    2        3710  3746         0.990     
     B           2        3650  3746         0.974     
     AGroup B:B  2        3962  3746         1.060     
     AGroup C:B  2        3865  3746         1.030     
     sigma2      2        3929  3746         1.050     
    
    The Raftery diagnostics estimate that we would require about 3900 samples to reach the specified level of confidence in convergence. As we have 10,000 samples, we can be confident that convergence has occurred.
  • Autocorrelation diagnostic
    View autocorrelations
    library(MCMCpack)
    autocorr.diag(data1.mcmcpack)
    
           (Intercept)     AGroup B     AGroup C            B   AGroup B:B    AGroup C:B      sigma2
    Lag 0   1.00000000  1.000000000  1.000000000  1.000000000  1.000000000  1.0000000000 1.000000000
    Lag 1  -0.01234075 -0.008277569 -0.011438936 -0.003198364 -0.011825247 -0.0003818285 0.217142614
    Lag 5   0.01859622  0.019122855 -0.003823357  0.013434001  0.026688474  0.0020263629 0.005952516
    Lag 10  0.01991699  0.021417875  0.011460174  0.013885559  0.022966909  0.0030683915 0.023030028
    Lag 50 -0.00543317 -0.009493830 -0.007385782 -0.005974147 -0.008253629  0.0009582506 0.011223549
    
    A lag of 1 appears largely sufficient to avoid autocorrelation; only sigma2 retains mild correlation at lag 1 (see the thinning sketch below).
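
    Had the autocorrelation been more substantial, the chain could be thinned further before summarization. A minimal sketch, retaining every 10th sample:

    library(coda)
    data1.mcmcpack10 = window(data1.mcmcpack, thin = 10)
    autocorr.diag(data1.mcmcpack10)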

Again, prior to examining the summaries, we should have explored the convergence diagnostics.

library(coda)
data1.mcmc = as.mcmc(data1.r2jags)
  • Trace plots
    plot(data1.mcmc)
    
    plot of chunk tut7.5bJAGS2Trace
    plot of chunk tut7.5bJAGS2Trace
    Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.

    When there are a lot of parameters, this can result in a very large number of traceplots. To focus on just certain parameters (such as the $\beta$s), we can subset the chains:

    preds <- c("beta[1]", "beta[2]", "beta[3]", "beta[4]")
    plot(as.mcmc(data1.r2jags)[, preds])
    
    plot of chunk tut7.5bJAGS2Trace1
  • Raftery diagnostic
    raftery.diag(data1.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       37410 3746          9.99     
     beta[2]  20       38660 3746         10.30     
     beta[3]  20       36800 3746          9.82     
     beta[4]  20       38660 3746         10.30     
     beta[5]  20       37410 3746          9.99     
     beta[6]  20       38030 3746         10.20     
     deviance 20       39000 3746         10.40     
     sigma    20       36200 3746          9.66     
    
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       37410 3746          9.99     
     beta[2]  20       38030 3746         10.20     
     beta[3]  20       38050 3746         10.20     
     beta[4]  10       37410 3746          9.99     
     beta[5]  20       36800 3746          9.82     
     beta[6]  20       36800 3746          9.82     
     deviance 20       38030 3746         10.20     
     sigma    20       36800 3746          9.82     
    
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta[1]  20       36200 3746          9.66     
     beta[2]  20       37410 3746          9.99     
     beta[3]  20       39300 3746         10.50     
     beta[4]  20       39300 3746         10.50     
     beta[5]  20       36800 3746          9.82     
     beta[6]  20       36800 3746          9.82     
     deviance 20       39300 3746         10.50     
     sigma    20       36810 3746          9.83     
    
    The Raftery diagnostics for each chain estimate that we would require at most around 39,300 iterations to reach the specified level of confidence in convergence. As each chain was run for 50,000 post-burn-in iterations, we can be confident that convergence has occurred.
  • Autocorrelation diagnostic
    autocorr.diag(data1.mcmc)
    
                 beta[1]      beta[2]      beta[3]      beta[4]      beta[5]      beta[6]      deviance
    Lag 0    1.000000000  1.000000000  1.000000000  1.000000000  1.000000000  1.000000000  1.0000000000
    Lag 10   0.004437399  0.002545394  0.005457633  0.002408673  0.005032132  0.003931065 -0.0005090962
    Lag 50  -0.005470768 -0.006470215 -0.008316906 -0.004031177 -0.009725834 -0.003744444  0.0069346156
    Lag 100 -0.006510539 -0.018290658 -0.009018695  0.003875208 -0.005127466  0.004179197 -0.0033519319
    Lag 500 -0.002091585 -0.006610883  0.001302301 -0.000777923 -0.003703581  0.005995650  0.0081995570
                   sigma
    Lag 0    1.000000000
    Lag 10   0.002181234
    Lag 50  -0.002582261
    Lag 100  0.004125329
    Lag 500  0.008170166
    
    A lag of 10 appears to be sufficient to avoid autocorrelation (poor mixing).

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with STAN model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      s = as.array(data1.rstan)
      wch = grep("beta", dimnames(s)$parameters)
      s = s[, , wch]
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      plot(mcmc)
      
      plot of chunk tut7.5bSTAN2codaTraceplots
      plot of chunk tut7.5bSTAN2codaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      s = as.array(data1.rstan)
      wch = grep("beta", dimnames(s)$parameters)
      s = s[, , wch]
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      autocorr.diag(mcmc)
      
                 beta[1]      beta[2]      beta[3]      beta[4]      beta[5]
      Lag 0   1.00000000  1.000000000  1.000000000  1.000000000  1.000000000
      Lag 1   0.19962255  0.134452951  0.162486507  0.181592704  0.125717018
      Lag 5  -0.01473278  0.013954338 -0.020209597 -0.007642484  0.007016602
      Lag 10 -0.01802321 -0.008248189 -0.007702265 -0.019093851 -0.002760047
      Lag 50 -0.03711757 -0.035737496 -0.030485873 -0.020661680 -0.014791965
      
      There is a small degree of autocorrelation at lag 1; however, it has largely dissipated by lag 5, suggesting the retained samples are close to independent.
  • via rstan
    • Traceplots
      stan_trace(data1.rstan)
      
      plot of chunk tut7.5bSTAN2Trace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data1.rstan)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      The Raftery diagnostic cannot be computed directly on a stanfit object; it reports only the minimum of 3746 samples required to reach the specified level of confidence in convergence for the given quantile, accuracy and probability.
    • Autocorrelation diagnostic
      stan_ac(data1.rstan)
      
      plot of chunk tut7.5bSTAN2Auto
      A lag of 2 appears borderline in a couple of cases suggesting some autocorrelation (poor mixing). It might be worth increasing the thinning rate to 4. Feel free to do so, I am not going to bother for this demo.
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data1.rstan)
      
      plot of chunk tut7.5bSTAN2Rhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
      stan_ess(data1.rstan)
      
      plot of chunk tut7.5bSTAN2ess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.
  • via bayesplot
    • Trace plots and density plots
      library(bayesplot)
      mcmc_trace(as.matrix(data1.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTAN2MCMCTrace
      library(bayesplot)
      mcmc_combo(as.matrix(data1.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTAN2Trace1
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Density plots
      library(bayesplot)
      mcmc_dens(as.matrix(data1.rstan), regex_pars = "beta|sigma")
      
      plot of chunk tut7.5bSTAN2dens
      Density plots suggest that either the mean or the median would be appropriate to describe the fixed effects posteriors, while the median is more appropriate for the sigma posterior.
  • via shinystan
    library(shinystan)
    launch_shinystan(data1.rstan)
    
  • It is worth exploring the influence of our priors; a rough sketch of one such check follows.
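
    One quick check is to compare the spread of each marginal posterior to that of its prior; ratios near 1 would suggest the prior is driving the result. A sketch, assuming the betas were given the normal(0, 1000) priors used in the Stan model code later in this tutorial:

    ## posterior sd relative to the (assumed) prior sd of 1000
    apply(as.matrix(data1.rstan, pars = "beta"), 2, sd)/1000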

Again, prior to examining the summaries, we should have explored the convergence diagnostics. There are numerous ways of working with RSTANARM model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      s = as.array(data1.rstanarm)
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      plot(mcmc)
      
      plot of chunk tut7.5bRSTANARM2codaTraceplots
      plot of chunk tut7.5bRSTANARM2codaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      s = as.array(data1.rstanarm)
      mcmc <- do.call(mcmc.list, plyr:::alply(s[, , -(length(s[1, 1, ]))], 2, as.mcmc))
      autocorr.diag(mcmc)
      
              (Intercept)     AGroup B     AGroup C           B   AGroup B:B   AGroup C:B
      Lag 0   1.000000000  1.000000000  1.000000000 1.000000000  1.000000000  1.000000000
      Lag 1   0.051997304  0.022839434  0.049122699 0.029332271  0.018600498  0.039303026
      Lag 5  -0.009030666 -0.023573121 -0.021521616 0.002608995 -0.008629405 -0.008541973
      Lag 10  0.022740083  0.005983251 -0.006472736 0.009874178 -0.009109182 -0.041524169
      Lag 50  0.023718877  0.022629378  0.043072824 0.029933371  0.029119110  0.025530832
      
      Autocorrelation is negligible at all lags, suggesting the retained samples are effectively independent.
  • via rstan
    • Traceplots
      stan_trace(data1.rstanarm)
      
      plot of chunk tut7.5bRSTANARM2Trace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data1.rstanarm)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      The Raftery diagnostic cannot be computed directly on a stanreg object; it reports only the minimum of 3746 samples required to reach the specified level of confidence in convergence for the given quantile, accuracy and probability.
    • Autocorrelation diagnostic
      stan_ac(data1.rstanarm)
      
      plot of chunk tut7.5bRSTANARM2Auto
      A lag of 2 appears broadly sufficient to avoid autocorrelation (poor mixing).
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data1.rstanarm)
      
      plot of chunk tut7.5bRSTANARM2Rhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
      stan_ess(data1.rstanarm)
      
      plot of chunk tut7.5bRSTANARM2ess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.
  • via bayesplot
    • Trace plots and density plots
      library(bayesplot)
      mcmc_trace(as.array(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
      
      plot of chunk tut7.5bRSTANARM2MCMCTrace
      mcmc_combo(as.array(data1.rstanarm))
      
      plot of chunk tut7.5bRSTANARM2Trace1
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Density plots
      mcmc_dens(as.array(data1.rstanarm))
      
      plot of chunk tut7.5bRSTANARM2dens
      Density plots suggest that either the mean or the median would be appropriate to describe the fixed effects posteriors, while the median is more appropriate for the sigma posterior.
  • via rstanarm
    The rstanarm package provides additional posterior checks.
    • Posterior vs Prior - this compares the posterior estimate for each parameter against the associated prior. If the spread of the priors is small relative to the posterior, then it is likely that the priors are too influential. On the other hand, overly wide priors can lead to computational issues.
      library(rstanarm)
      posterior_vs_prior(data1.rstanarm, color_by = "vs", group_by = TRUE,
          facet_args = list(scales = "free_y"))
      
      Gradient evaluation took 6.9e-05 seconds
      1000 transitions using 10 leapfrog steps per transition would take 0.69 seconds.
      Adjust your expectations accordingly!
      
      
      
       Elapsed Time: 0.403063 seconds (Warm-up)
                     0.077008 seconds (Sampling)
                     0.480071 seconds (Total)
      
      
      Gradient evaluation took 1.1e-05 seconds
      1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
      Adjust your expectations accordingly!
      
      
      
       Elapsed Time: 0.280175 seconds (Warm-up)
                     0.104319 seconds (Sampling)
                     0.384494 seconds (Total)
      
      plot of chunk tut7.5bRSTANARM2posterorvsprior
  • via shinystan
    library(shinystan)
    launch_shinystan(data1.rstanarm)
    

Again, prior to examining the summaries, we should have explored the convergence diagnostics. Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model. There are numerous ways of working with BRMS model fits (for exploring diagnostics and summarization).

  • extract the mcmc samples and convert them into a mcmc.list to leverage the various coda routines
  • use the numerous routines that come with the rstan package
  • use the routines that come with the bayesplot package
  • explore the diagnostics interactively via shinystan

We will explore all of these:
  • via coda
    • Traceplots
    • library(coda)
      mcmc = as.mcmc(data1.brms)
      plot(mcmc)
      
      plot of chunk tut7.5bBRMS2codaTraceplots
      plot of chunk tut7.5bBRMS2codaTraceplots
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Autocorrelation
    • library(coda)
      mcmc = as.mcmc(data1.brms)
      autocorr.diag(mcmc)
      
      Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
      
      Unfortunately, autocorr.diag fails on the mcmc object derived from this brms fit (the conversion to a time series chokes on the stored thinning/iteration attributes); the stan_ac routine below provides an alternative.
  • via rstan
    • Traceplots
      stan_trace(data1.brms$fit)
      
      plot of chunk tut7.5bBRMS2Trace
      Trace plots show no evidence that the chains have not reasonably traversed the entire multidimensional parameter space.
    • Raftery diagnostic
      raftery.diag(data1.brms)
      
      Quantile (q) = 0.025
      Accuracy (r) = +/- 0.005
      Probability (s) = 0.95 
      
      You need a sample size of at least 3746 with these values of q, r and s
      
      The Raftery diagnostic cannot be computed directly on a brmsfit object; it reports only the minimum of 3746 samples required to reach the specified level of confidence in convergence for the given quantile, accuracy and probability.
    • Autocorrelation diagnostic
      stan_ac(data1.brms$fit)
      
      plot of chunk tut7.5bBRMS2Auto
      A lag of 2 appears borderline in a couple of cases suggesting some autocorrelation (poor mixing). It might be worth increasing the thinning rate to 4. Feel free to do so, I am not going to bother for this demo.
    • Rhat values. These values are a measure of sampling efficiency/effectiveness. Ideally, all values should be less than 1.05. If there are values of 1.05 or greater it suggests that the sampler was not very efficient or effective. Not only does this mean that the sampler was potentially slower than it could have been, more importantly, it could indicate that the sampler spent time sampling in a region of the likelihood that is less informative. Such a situation can arise from either a misspecified model or overly vague priors that permit sampling in otherwise nonsense parameter space.
      stan_rhat(data1.brms$fit)
      
      plot of chunk tut7.5bBRMS2Rhat
      In this instance, all rhat values are well below 1.05 (a good thing).
    • Another measure of sampling efficiency is Effective Sample Size (ess). ess indicates the number (or proportion) of samples that the sampling algorithm deemed effective. The sampler rejects samples on the basis of certain criteria, and when it does so, the previous sample value is used. Hence, while the MCMC sampling chain may contain 1000 samples, if there are only 10 effective samples (1%), the estimated properties are not likely to be reliable.
      stan_ess(data1.brms$fit)
      
      plot of chunk tut7.5bBRMS2ess
      In this instance, most of the parameters have reasonably high effective samples and thus there is likely to be a good range of values from which to estimate parameter properties.

Model validation

Model validation involves exploring the model diagnostics and fit to ensure that the model is broadly appropriate for the data. As such, exploration of the residuals should be routine.

For more complex models (those that contain multiple effects), it is also advisable to plot the residuals against each of the individual predictors. For sampling designs that involve sample collection over space or time, it is also a good idea to explore whether there are any temporal or spatial patterns in the residuals.

There are numerous situations (e.g. when applying specific variance-covariance structures to a model) where raw residuals do not reflect the interior workings of the model. Typically, this is because they do not take into account the variance-covariance matrix or assume a very simple variance-covariance matrix. Since the purpose of exploring residuals is to evaluate the model, for these cases, it is arguably better to draw conclusions based on standardized (or studentized) residuals.

Unfortunately the definitions of standardized and studentized residuals appear to vary, and the two terms get used interchangeably. I will adopt the following definitions:


Standardized residuals: the raw residuals divided by the true standard deviation of the residuals (which of course is rarely known).
Studentized residuals: the raw residuals divided by the standard deviation of the residuals.
Note that externally studentized residuals are calculated by dividing the raw residuals by a unique standard deviation for each observation that is calculated from regressions having left each successive observation out.
Pearson residuals: the raw residuals divided by the standard deviation of the response variable.
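
A minimal sketch contrasting these definitions, assuming raw residuals resid (observed minus fitted, as computed in the sections below) and the response data1$Y; the true residual standard deviation needed for standardized residuals is, as noted, rarely known:

sresid = resid/sd(resid)    # studentized (as defined above)
presid = resid/sd(data1$Y)  # Pearson (as defined above)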

The mark of a good model is being able to predict well. In an ideal world, we would have a sufficiently large sample size to permit us to hold a fraction (such as 25%) back, thereby allowing us to train the model on 75% of the data and then see how well the model can predict the withheld 25% (a sketch of such a split follows). Unfortunately, such a luxury is still rare in ecology.
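
A minimal sketch of this hold-out approach, as a random 75/25 split of data1:

set.seed(123)
i = sample(nrow(data1), size = floor(0.75 * nrow(data1)))
data1.train = data1[i, ]  # fit the model to these
data1.test = data1[-i, ]  # predict and validate on these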

The next best option is to see how well the model can predict the observed data. Models tend to struggle most with the extremes of trends and have particular issues when the extremes approach logical boundaries (such as zero for count data and standard deviations). We can use the fitted model to generate random predicted observations and then explore some properties of these compared to the actual observed data.
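
As an alternative to the violin plots used in the sections below, the bayesplot package offers convenient posterior predictive check routines. A sketch, assuming yRep is an iterations x observations matrix of posterior predictions (e.g. from posterior_predict(); note the manually simulated yRep matrices below are the transpose of this):

library(bayesplot)
ppc_dens_overlay(y = data1$Y, yrep = yRep[1:100, ])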

Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within MCMCpack. However, we can calculate them manually from the posteriors.

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
head(mcmc)
  (Intercept)   AGroup B  AGroup C          B AGroup B:B AGroup C:B    sigma2
1    48.70652 -11.241113 -25.43591 -0.4057086 -0.1929317  0.2966322  9.505405
2    43.17996  -8.274010 -22.66362 -0.3300430 -0.1959714  0.2629181 13.109720
3    49.48138 -10.480517 -27.46155 -0.4829662 -0.1457857  0.3635239 14.426140
4    47.75332 -11.387909 -27.18387 -0.3879251 -0.2055644  0.3181976 10.130435
5    49.43988 -11.918492 -29.20874 -0.4436741 -0.2074017  0.3961033  9.917436
6    46.51075  -6.677204 -22.78787 -0.3225533 -0.3883363  0.1657746 11.947826
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bMCMCpack2resid

Residuals against predictors

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bMCMCpack2resid1
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bMCMCpack2resid1

And now for studentized residuals

mcmc = as.data.frame(data1.mcmcpack)
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bMCMCpack2resid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

mcmc = as.matrix(data1.mcmcpack)
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], sqrt(mcmc[i, "sigma2"])))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bMCMCpack2Fit
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bMCMCpack2Fit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bMCMCpack2Area
mcmc_areas(as.matrix(data1.mcmcpack), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bMCMCpack2Area
Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within JAGS. However, we can calculate them manually from the posteriors.

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bJAGS2resid

Residuals against predictors

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bJAGS2resid1
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bJAGS2resid1

And now for studentized residuals

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bJAGS2resid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

mcmc = data1.r2jags$BUGSoutput$sims.matrix %>% as.data.frame %>%
    dplyr:::select(contains("beta"), sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], mcmc[i, "sigma"]))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bJAGS2Fit
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bJAGS2Fit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
plot of chunk tut7.5bJAGS2Area
mcmc_areas(data1.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
plot of chunk tut7.5bJAGS2Area
Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Residuals are not computed directly within RSTAN. However, we can calculate them manually from the posteriors.

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bRSTAN2resid

Residuals against predictors

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
newdata1 = newdata1 %>% cbind(fit, resid)
ggplot(newdata1) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bRSTAN2resid1
ggplot(newdata1) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bRSTAN2resid1

And now for studentized residuals

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
newdata1 = data1
Xmat = model.matrix(~A * B, newdata1)
## get median parameter estimates
coefs = apply(mcmc[, 1:6], 2, median)
fit = as.vector(coefs %*% t(Xmat))
resid = data1$Y - fit
sresid = resid/sd(resid)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bRSTAN2resid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

mcmc = as.data.frame(data1.rstan) %>% dplyr:::select(contains("beta"),
    sigma) %>% as.matrix
# generate a model matrix
Xmat = model.matrix(~A * B, data1)
## get median parameter estimates
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
## draw samples from this model
yRep = sapply(1:nrow(mcmc), function(i) rnorm(nrow(data1), fit[i,
    ], mcmc[i, "sigma"]))
newdata1 = data.frame(A = data1$A, B = data1$B, yRep) %>% gather(key = Sample,
    value = Value, -A, -B)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bRSTAN2Fit
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bRSTAN2Fit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.rstan), regex_pars = "beta|sigma")
plot of chunk tut7.5bRSTAN2Area
mcmc_areas(as.matrix(data1.rstan), regex_pars = "beta|sigma")
plot of chunk tut7.5bRSTAN2Area
Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

Unlike the approaches above, RSTANARM provides residuals and fitted values directly via the resid and fitted methods.

resid = resid(data1.rstanarm)
fit = fitted(data1.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bRSTANARM2resid

Residuals against predictors

resid = resid(data1.rstanarm)
dat = data1 %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bRSTANARM2resid1
ggplot(dat) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bRSTANARM2resid1

And now for studentized residuals

resid = resid(data1.rstanarm)
sresid = resid/sd(resid)
fit = fitted(data1.rstanarm)
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bRSTANARM2resid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

y_pred = posterior_predict(data1.rstanarm)
newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep",
    value = "Value", -A:-Y)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bRSTANARM2Fit
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bRSTANARM2Fit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bRSTANARM2Area
mcmc_areas(as.matrix(data1.rstanarm), regex_pars = "Intercept|^A|^B|sigma")
plot of chunk tut7.5bRSTANARM2Area
Rather than duplicate this for both additive and multiplicative models, we will only explore the multiplicative model.

BRMS also provides residuals and fitted values directly via the resid and fitted methods; each returns a posterior summary, hence the Estimate column is extracted below.

resid = resid(data1.brms)[, "Estimate"]
fit = fitted(data1.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
plot of chunk tut7.5bBRMS2resid

Residuals against predictors

resid = resid(data1.brms)[, "Estimate"]
dat = data1 %>% mutate(resid = resid)
ggplot(dat) + geom_point(aes(y = resid, x = A))
plot of chunk tut7.5bBRMS2resid1
ggplot(dat) + geom_point(aes(y = resid, x = B))
plot of chunk tut7.5bBRMS2resid1

And now for studentized residuals

resid = resid(data1.brms)[, "Estimate"]
sresid = resid/sd(resid)
fit = fitted(data1.brms)[, "Estimate"]
ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
plot of chunk tut7.5bBRMS2resid2

Conclusions: for this simple model, the studentized residuals yield the same pattern as the raw residuals (or the Pearson residuals for that matter).

Let's see how well data simulated from the model reflect the raw data

y_pred = posterior_predict(data1.brms)
newdata1 = data1 %>% cbind(t(y_pred)) %>% gather(key = "Rep",
    value = "Value", -A:-Y)
ggplot(newdata1) + geom_violin(aes(y = Value, x = A, fill = "Model"),
    alpha = 0.5) + geom_violin(data = data1, aes(y = Y, x = A,
    fill = "Obs"), alpha = 0.5) + geom_point(data = data1, aes(y = Y,
    x = A), position = position_jitter(width = 0.1, height = 0),
    color = "black")
plot of chunk tut7.5bBRMS2Fit
ggplot(newdata1) + geom_violin(aes(y = Value, x = B, fill = "Model",
    group = B, color = A), alpha = 0.5) + geom_point(data = data1,
    aes(y = Y, x = B, group = B, color = A))
plot of chunk tut7.5bBRMS2Fit

Conclusions: the predicted trends do encapsulate the actual data, suggesting that the model is a reasonable representation of the underlying processes. Note, these are prediction intervals rather than confidence intervals as we are seeking intervals within which we can predict individual observations rather than means.

We can also explore the posteriors of each parameter.

library(bayesplot)
mcmc_intervals(as.matrix(data1.brms), regex_pars = "b_|sigma")
plot of chunk tut7.5bBRMS2Area
mcmc_areas(as.matrix(data1.brms), regex_pars = "b_|sigma")
plot of chunk tut7.5bBRMS2Area

Parameter estimates (posterior summaries)

Although all parameters in a Bayesian analysis are considered random quantities, each with an entire posterior distribution, rarely would it be useful to present tables of all the samples from each distribution. On the other hand, plots of the posterior distributions do have some use. Nevertheless, most workers prefer to present simple statistical summaries of the posteriors. Popular choices include the median (or mean) and 95% credibility intervals.
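
Such summaries can also be computed by hand with coda. A minimal sketch, using the MCMCpack samples:

library(coda)
apply(as.matrix(data1.mcmcpack), 2, median)       # posterior medians
HPDinterval(as.mcmc(data1.mcmcpack), prob = 0.95) # 95% HPD intervals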

library(coda)
mcmcpvalue <- function(samp) {
    ## elementary version that creates an empirical p-value for the
    ## hypothesis that the columns of samp have mean zero versus a general
    ## multivariate distribution with elliptical contours.

    ## differences from the mean standardized by the observed
    ## variance-covariance factor

    ## Note, I put in the bit for single terms
    if (length(dim(samp)) == 0) {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - mean(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/length(samp)
    } else {
        std <- backsolve(chol(var(samp)), cbind(0, t(samp)) - colMeans(samp),
            transpose = TRUE)
        sqdist <- colSums(std * std)
        sum(sqdist[-1] > sqdist[1])/nrow(samp)
    }

}
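
As an illustrative call (equivalent usages appear under each backend below), an empirical p-value for the Group B effect from the MCMCpack samples:

mcmcpvalue(as.matrix(data1.mcmcpack)[, "AGroup B"])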

Matrix model (MCMCpack)

summary(data1.mcmcpack)
Iterations = 1001:11000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 10000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

                Mean      SD  Naive SE Time-series SE
(Intercept)  49.5353 2.29334 0.0229334      0.0226186
AGroup B    -12.2262 3.09179 0.0309179      0.0309179
AGroup C    -28.2439 2.93282 0.0293282      0.0293282
B            -0.4938 0.09038 0.0009038      0.0008764
AGroup B:B   -0.1211 0.11358 0.0011358      0.0011178
AGroup C:B    0.4089 0.10829 0.0010829      0.0010829
sigma2       13.8964 4.43200 0.0443200      0.0552651

2. Quantiles for each variable:

                2.5%      25%      50%       75%    97.5%
(Intercept)  45.0690  48.0206  49.5282  51.01720  54.0757
AGroup B    -18.3801 -14.2357 -12.1642 -10.19136  -6.1832
AGroup C    -34.0521 -30.1795 -28.2026 -26.30389 -22.5493
B            -0.6743  -0.5528  -0.4940  -0.43428  -0.3178
AGroup B:B   -0.3412  -0.1958  -0.1209  -0.04575   0.1030
AGroup C:B    0.1948   0.3382   0.4086   0.47902   0.6240
sigma2        7.7536  10.8320  13.0991  16.09205  24.8023
# OR
library(broom)
tidyMCMC(data1.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate  std.error    conf.low   conf.high
1 (Intercept)  49.5353232 2.29333768  45.1289511  54.1244471
2    AGroup B -12.2262443 3.09178506 -18.4166155  -6.2375566
3    AGroup C -28.2438752 2.93282206 -33.9185333 -22.4861115
4           B  -0.4938007 0.09037721  -0.6733914  -0.3173760
5  AGroup B:B  -0.1211118 0.11358441  -0.3414655   0.1027879
6  AGroup C:B   0.4088582 0.10828868   0.1931050   0.6219178
7      sigma2  13.8964411 4.43200264   7.0646504  22.9045959
Conclusions:
  • the intercept of the first group (Group A) is 49.5353232
  • the mean of the second group (Group B) is 12.2262443 units lower than that of Group A
  • the mean of the third group (Group C) is 28.2438752 units lower than that of Group A
  • a one unit increase in B in Group A is associated with a 0.4938007 unit decrease in Y
  • difference in slope between Group B and Group A: -0.1211118
  • difference in slope between Group C and Group A: 0.4088582
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A; however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A.

While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## since values are less than zero
mcmcpvalue(data1.mcmcpack[, 2])  # effect of (B-A = 0)
[1] 0.001
mcmcpvalue(data1.mcmcpack[, 3])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(data1.mcmcpack[, 4])  # effect of (slope = 0)
[1] 0
mcmcpvalue(data1.mcmcpack[, 5])  # effect of (slopeB - slopeA = 0)
[1] 0.2763
mcmcpvalue(data1.mcmcpack[, 6])  # effect of (slopeC - slopeA = 0)
[1] 7e-04
mcmcpvalue(data1.mcmcpack[, 2:6])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.

Matrix model (JAGS)

print(data1.r2jags)
Inference for Bugs model at "5", fit using jags,
 3 chains, each with 53000 iterations (first 3000 discarded), n.thin = 10
 n.sims = 15000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
beta[1]   49.553   2.336  45.049  48.003  49.538  51.092  54.162 1.001 15000
beta[2]  -12.227   3.093 -18.248 -14.285 -12.228 -10.182  -6.142 1.001 11000
beta[3]  -28.270   2.993 -34.198 -30.236 -28.275 -26.304 -22.340 1.001 15000
beta[4]   -0.495   0.093  -0.680  -0.556  -0.494  -0.433  -0.313 1.001 15000
beta[5]   -0.121   0.115  -0.347  -0.197  -0.122  -0.046   0.105 1.001 15000
beta[6]    0.410   0.111   0.193   0.336   0.410   0.482   0.636 1.001 15000
sigma      3.772   0.585   2.827   3.354   3.707   4.106   5.105 1.001 15000
deviance 163.101   4.379 156.875 159.924 162.329 165.512 173.639 1.001 15000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 9.6 and DIC = 172.7
DIC is an estimate of expected predictive error (lower deviance is better).
# OR
library(broom)
tidyMCMC(as.mcmc(data1.r2jags), conf.int = TRUE, conf.method = "HPDinterval")
      term    estimate  std.error    conf.low   conf.high
1  beta[1]  49.5534512 2.33556926  45.0234105  54.1328491
2  beta[2] -12.2268098 3.09333864 -18.3836687  -6.3089149
3  beta[3] -28.2700616 2.99256629 -34.2921382 -22.4881054
4  beta[4]  -0.4947167 0.09264266  -0.6838619  -0.3183081
5  beta[5]  -0.1214410 0.11480484  -0.3469758   0.1054090
6  beta[6]   0.4099855 0.11145489   0.1877435   0.6289609
7 deviance 163.1011311 4.37873657 156.1255607 171.7611813
8    sigma   3.7720815 0.58508086   2.7189187   4.9291545
Conclusions:
  • the intercept of the first group (Group A) is 49.5534512
  • the mean of the second group (Group B) is 12.2268098 units lower than that of Group A
  • the mean of the third group (Group C) is 28.2700616 units lower than that of Group A
  • a one unit increase in B in Group A is associated with a 0.4947167 unit decrease in Y
  • difference in slope between Group B and Group A: -0.121441
  • difference in slope between Group C and Group A: 0.4099855
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A; however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A.

While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## since values are less than zero
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[2]"])  # effect of (B-A = 0)
[1] 0.0003333333
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[3]"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[4]"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[5]"])  # effect of (slopeB - slopeA = 0)
[1] 0.28
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, "beta[6]"])  # effect of (slopeC - slopeA = 0)
[1] 0.0007333333
mcmcpvalue(data1.r2jags$BUGSoutput$sims.matrix[, 2:6])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups.
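
As an aside, the pD and DIC values reported by print(data1.r2jags) above can be reproduced manually from the saved deviance samples (a quick sketch):

dev = data1.r2jags$BUGSoutput$sims.matrix[, "deviance"]
(pD = var(dev)/2)       # effective number of parameters (~9.6)
(DIC = mean(dev) + pD)  # deviance information criterion (~172.7)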

Matrix model (RSTAN)

print(data1.rstan, pars = c("beta", "sigma"))
Inference for Stan model: 3d2414c9dcf4b5e12be870eadd2c894a.
3 chains, each with iter=2000; warmup=500; thin=3; 
post-warmup draws per chain=500, total post-warmup draws=1500.

          mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
beta[1]  49.50    0.08 2.34  44.55  48.01  49.58  51.01  54.11   902    1
beta[2] -12.29    0.09 3.04 -18.60 -14.30 -12.17 -10.22  -6.70  1094    1
beta[3] -28.21    0.10 2.97 -33.80 -30.23 -28.34 -26.40 -22.04   957    1
beta[4]  -0.49    0.00 0.09  -0.67  -0.55  -0.49  -0.44  -0.31   978    1
beta[5]  -0.12    0.00 0.11  -0.34  -0.19  -0.12  -0.05   0.11  1164    1
beta[6]   0.41    0.00 0.11   0.18   0.34   0.41   0.48   0.62   953    1
sigma     3.73    0.02 0.58   2.81   3.32   3.66   4.05   5.07  1271    1

Samples were drawn using NUTS(diag_e) at Sat Nov  4 07:45:04 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
# OR
library(broom)
tidyMCMC(data1.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta", "sigma"))
     term    estimate  std.error    conf.low    conf.high
1 beta[1]  49.5013545 2.34230542  44.9503609  54.24168440
2 beta[2] -12.2889364 3.04350709 -18.5232918  -6.66905473
3 beta[3] -28.2129289 2.96719159 -33.8515230 -22.12391688
4 beta[4]  -0.4929557 0.09152697  -0.6703121  -0.30402421
5 beta[5]  -0.1195490 0.11233322  -0.3536692   0.08719164
6 beta[6]   0.4082751 0.11091872   0.1750039   0.61796175
7   sigma   3.7255865 0.57520983   2.6288615   4.77664381
Conclusions:
  • the intercept of the first group (Group A) is 49.5013545
  • the mean of the second group (Group B) is 12.2889364 units lower than that of Group A
  • the mean of the third group (Group C) is 28.2129289 units lower than that of Group A
  • a one unit increase in B in Group A is associated with a 0.4929557 unit decrease in Y
  • difference in slope between Group B and Group A: -0.119549
  • difference in slope between Group C and Group A: 0.4082751
The 95% credibility intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A; however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A.

While workers attempt to become comfortable with a new statistical framework, it is only natural that they like to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values that are somewhat analogous to the frequentist p-values for investigating the hypothesis that a parameter is equal to zero.

## since values are less than zero
mcmcpvalue(as.matrix(data1.rstan)[, "beta[2]"])  # effect of (B-A = 0)
[1] 0.0006666667
mcmcpvalue(as.matrix(data1.rstan)[, "beta[3]"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstan)[, "beta[4]"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstan)[, "beta[5]"])  # effect of (slopeB - slopeA = 0)
[1] 0.278
mcmcpvalue(as.matrix(data1.rstan)[, "beta[6]"])  # effect of (slopeC - slopeA = 0)
[1] 0.0006666667
mcmcpvalue(as.matrix(data1.rstan)[, 2:6])  # effect of (model)
[1] 0

There is evidence that a model that includes an interaction is better than an additive model.

library(loo)
(full = loo(extract_log_lik(data1.rstan)))
Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -85.3 4.0
p_loo         6.3 1.6
looic       170.6 7.9

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     26    86.7%
 (0.5, 0.7]   (ok)        4    13.3%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.
# now fit a reduced (additive) model without the interaction
modelString = "
  data {
  int<lower=1> n;
  int<lower=1> nX;
  vector [n] y;
  matrix [n,nX] X;
  }
  parameters {
  vector[nX] beta;
  real<lower=0> sigma;
  }
  transformed parameters {
  vector[n] mu;

  mu = X*beta;
  }
  model {
  #Likelihood
  y~normal(mu,sigma);
  
  #Priors
  beta ~ normal(0,1000);
  sigma~cauchy(0,5);
  }
  generated quantities {
  vector[n] log_lik;
  
  for (i in 1:n) {
  log_lik[i] = normal_lpdf(y[i] | mu[i], sigma); 
  }
  }
  "

Xmat <- model.matrix(~A + B, data1)
data1.list <- with(data1, list(y = Y, X = Xmat, n = nrow(data1), nX = ncol(Xmat)))
data1.rstan.red <- stan(data = data1.list, model_code = modelString, chains = 3,
    iter = 2000, warmup = 500, thin = 3)
SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 1).

Gradient evaluation took 4.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.42 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.138805 seconds (Warm-up)
               0.122949 seconds (Sampling)
               0.261754 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 2).

Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.106299 seconds (Warm-up)
               0.113143 seconds (Sampling)
               0.219442 seconds (Total)


SAMPLING FOR MODEL '3b057d3d81cbed2078ce678376a94574' NOW (CHAIN 3).

Gradient evaluation took 1.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  501 / 2000 [ 25%]  (Sampling)
Iteration:  700 / 2000 [ 35%]  (Sampling)
Iteration:  900 / 2000 [ 45%]  (Sampling)
Iteration: 1100 / 2000 [ 55%]  (Sampling)
Iteration: 1300 / 2000 [ 65%]  (Sampling)
Iteration: 1500 / 2000 [ 75%]  (Sampling)
Iteration: 1700 / 2000 [ 85%]  (Sampling)
Iteration: 1900 / 2000 [ 95%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.120129 seconds (Warm-up)
               0.110436 seconds (Sampling)
               0.230565 seconds (Total)
(reduced = loo(extract_log_lik(data1.rstan.red)))
Computed from 1500 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -97.9 3.3
p_loo         5.0 1.1
looic       195.8 6.7

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     28    93.3%
 (0.5, 0.7]   (ok)        2     6.7%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bRSTANloo2
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
Conclusions: the expected out-of-sample predictive accuracy is substantially higher (lower LOOIC) for the model that includes the interaction. This might be taken as inferential evidence for a general interaction effect.
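
compare_models() is an rstanarm helper and errors here because these loo objects were built from raw log-likelihood matrices rather than rstanarm fits. The loo package can compare such objects directly; a minimal sketch, assuming the full and reduced objects above (in recent versions of loo, compare() has been superseded by loo_compare()):

library(loo)
## elpd_diff is the second model's elpd minus the first's, so a negative
## value here favours the full (interaction) model
compare(full, reduced)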

Matrix model (RSTANARM)

summary(data1.rstanarm)
Model Info:

 function:  stan_glm
 family:    gaussian [identity]
 formula:   Y ~ A * B
 algorithm: sampling
 priors:    see help('prior_summary')
 sample:    2250 (posterior sample size)
 num obs:   30

Estimates:
                mean   sd     2.5%   25%    50%    75%    97.5%
(Intercept)     49.5    2.4   44.9   48.0   49.6   51.1   54.3 
AGroup B       -12.3    3.1  -18.3  -14.4  -12.3  -10.2   -6.1 
AGroup C       -28.3    3.0  -34.3  -30.3  -28.3  -26.3  -22.3 
B               -0.5    0.1   -0.7   -0.6   -0.5   -0.4   -0.3 
AGroup B:B      -0.1    0.1   -0.3   -0.2   -0.1    0.0    0.1 
AGroup C:B       0.4    0.1    0.2    0.3    0.4    0.5    0.6 
sigma            3.8    0.6    2.8    3.3    3.7    4.1    5.2 
mean_PPD        26.9    1.0   24.9   26.2   26.9   27.5   28.9 
log-posterior  -97.2    2.2 -102.5  -98.4  -96.8  -95.6  -94.1 

Diagnostics:
              mcse Rhat n_eff
(Intercept)   0.1  1.0  1938 
AGroup B      0.1  1.0  1974 
AGroup C      0.1  1.0  1902 
B             0.0  1.0  2126 
AGroup B:B    0.0  1.0  2068 
AGroup C:B    0.0  1.0  1989 
sigma         0.0  1.0  1226 
mean_PPD      0.0  1.0  1497 
log-posterior 0.1  1.0  1030 

For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
# OR
library(broom)
tidyMCMC(data1.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval")
           term    estimate  std.error     conf.low   conf.high
1   (Intercept)  49.5438430 2.38328655   44.8579446  54.1707410
2      AGroup B -12.3097610 3.13510315  -18.8059933  -6.6018889
3      AGroup C -28.2751561 3.02810337  -33.8195233 -21.9065319
4             B  -0.4940998 0.09381214   -0.6922853  -0.3200262
5    AGroup B:B  -0.1204605 0.11724241   -0.3514843   0.1104638
6    AGroup C:B   0.4111006 0.11271108    0.1810014   0.6214744
7         sigma   3.7727701 0.59136185    2.7322401   4.9499110
8      mean_PPD  26.8713702 1.00446504   24.9737303  29.0017404
9 log-posterior -97.2208223 2.19450449 -101.6130115 -93.7209496
Conclusions:
  • the intercept of the first group (Group A) is 49.543843
  • the mean of the second group (Group B) is 12.309761 units less than that of Group A
  • the mean of the third group (Group C) is 28.2751561 units less than that of Group A
  • a one unit increase in B in Group A is associated with a 0.4940998 unit decrease in Y
  • the difference in slope between Group B and Group A is -0.1204605
  • the difference in slope between Group C and Group A is 0.4111006
The 95% credible intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A; however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A. The group-specific slopes can be derived explicitly, as sketched below.
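
To make the group-specific slopes explicit, we can derive them from the posterior samples. A minimal sketch, assuming the coefficient names shown in the rstanarm output above:

mcmc = as.matrix(data1.rstanarm)
## group-specific slopes: Group A is the reference; B and C add their interaction terms
slopes = cbind(`Group A` = mcmc[, "B"], `Group B` = mcmc[, "B"] + mcmc[, "AGroup B:B"],
    `Group C` = mcmc[, "B"] + mcmc[, "AGroup C:B"])
tidyMCMC(slopes, conf.int = TRUE, conf.method = "HPDinterval")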

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.
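
The mcmcpvalue() function used below was defined earlier in this tutorial. For a single parameter, the logic can be sketched as twice the proportion of posterior samples falling on the minority side of zero; the helper below is a hypothetical one-parameter illustration only (the tutorial's mcmcpvalue also handles several columns at once):

## hypothetical one-parameter Bayesian p-value: twice the proportion of
## samples on the minority side of zero
mcmcpvalue1 <- function(x) 2 * min(mean(x > 0), mean(x < 0))
mcmcpvalue1(as.matrix(data1.rstanarm)[, "AGroup B:B"])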

## since values are less than zero
head(as.matrix(data1.rstanarm))
          parameters
iterations (Intercept)  AGroup B  AGroup C          B  AGroup B:B AGroup C:B    sigma
      [1,]    49.10544 -12.03203 -30.24254 -0.5408720 -0.07177445  0.5097346 3.368864
      [2,]    49.57108 -10.46682 -27.74189 -0.5167232 -0.15566016  0.4159785 3.202435
      [3,]    52.79242 -16.30169 -31.83083 -0.5749994 -0.08438700  0.4699279 3.849197
      [4,]    50.85134 -14.46853 -28.92998 -0.5535872  0.03970759  0.4676661 4.339336
      [5,]    52.41561 -18.30582 -30.55315 -0.4583573 -0.10973607  0.3507038 4.801621
      [6,]    52.90121 -18.31527 -30.77699 -0.6350685  0.13775093  0.4858842 3.541416
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B"])  # effect of (B-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "B"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup B:B"])  # effect of (slopeB - slopeA = 0)
[1] 0.2951111
mcmcpvalue(as.matrix(data1.rstanarm)[, "AGroup C:B"])  # effect of (slopeC - slopeA = 0)
[1] 0.001777778
mcmcpvalue(as.matrix(data1.rstanarm)[, 2:6])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups and that the slope associated with Group C differs from that of Group A. There is very little evidence that the slope associated with Group B differs from that of Group A.

library(loo)
(full = loo(data1.rstanarm))
Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -85.9 4.1
p_loo         6.8 1.8
looic       171.9 8.2

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     24    80.0%
 (0.5, 0.7]   (ok)        5    16.7%
   (0.7, 1]   (bad)       1     3.3%
   (1, Inf)   (very bad)  0     0.0%
See help('pareto-k-diagnostic') for details.
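
One observation falls in the 'bad' Pareto k range, so its PSIS-LOO contribution may be unreliable. rstanarm's loo() method accepts a k_threshold argument that refits the model without each flagged observation and computes its contribution exactly; a sketch (note that this triggers additional refits):

loo(data1.rstanarm, k_threshold = 0.7)
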
data1.rstanarm.red = update(data1.rstanarm, . ~ A + B)
Gradient evaluation took 4.1e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.41 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.252022 seconds (Warm-up)
               0.130327 seconds (Sampling)
               0.382349 seconds (Total)


Gradient evaluation took 1.9e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.182518 seconds (Warm-up)
               0.152707 seconds (Sampling)
               0.335225 seconds (Total)


Gradient evaluation took 1.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
Adjust your expectations accordingly!



 Elapsed Time: 0.164992 seconds (Warm-up)
               0.158132 seconds (Sampling)
               0.323124 seconds (Total)
(reduced = loo(data1.rstanarm.red))
Computed from 2250 by 30 log-likelihood matrix

         Estimate  SE
elpd_loo    -97.7 3.1
p_loo         4.7 1.0
looic       195.5 6.3

Pareto k diagnostic values:
                         Count  Pct 
(-Inf, 0.5]   (good)     27    90.0%
 (0.5, 0.7]   (ok)        3    10.0%
   (0.7, 1]   (bad)       0     0.0%
   (1, Inf)   (very bad)  0     0.0%

All Pareto k estimates are ok (k < 0.7)
See help('pareto-k-diagnostic') for details.
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bRSTANARMloo2
compare_models(full, reduced)
elpd_diff        se 
    -11.8       4.2 
Conclusions: the expected out-of-sample predictive accuracy is substantially higher for the model that includes the interaction (A:B); the negative elpd_diff indicates that the reduced (additive) model predicts substantially worse. This might be taken as inferential evidence for a general interaction.

Matrix model (BRMS)

summary(data1.brms)
 Family: gaussian(identity) 
Formula: Y ~ A * B 
   Data: data1 (Number of observations: 30) 
Samples: 3 chains, each with iter = 2000; warmup = 500; thin = 2; 
         total post-warmup samples = 2250
    ICs: LOO = NA; WAIC = NA; R2 = NA
 
Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    49.60      2.32    44.97    54.25       1182    1
AGroupB     -12.33      3.06   -18.49    -6.37       1216    1
AGroupC     -28.29      2.97   -34.18   -22.34       1246    1
B            -0.49      0.09    -0.67    -0.32       1160    1
AGroupB:B    -0.12      0.11    -0.35     0.11       1186    1
AGroupC:B     0.41      0.11     0.19     0.62       1248    1

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     3.71      0.54     2.81     4.98       1423    1

Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
is a crude measure of effective sample size, and Rhat is the potential 
scale reduction factor on split chains (at convergence, Rhat = 1).
# OR
library(broom)
tidyMCMC(data1.brms$fit, conf.int = TRUE, conf.method = "HPDinterval")
         term    estimate std.error    conf.low   conf.high
1 b_Intercept  49.6027270 2.3173028  44.8624049  54.0547491
2   b_AGroupB -12.3262056 3.0553552 -18.7775957  -6.7594663
3   b_AGroupC -28.2870152 2.9659002 -33.8736952 -22.1495177
4         b_B  -0.4926334 0.0908813  -0.6696454  -0.3177777
5 b_AGroupB:B  -0.1212027 0.1126979  -0.3497945   0.1053154
6 b_AGroupC:B   0.4067397 0.1097094   0.1878784   0.6203292
7       sigma   3.7129053 0.5394484   2.6776034   4.7406944
Conclusions:
  • the intercept of the first group (Group A) is 49.602727
  • the mean of the second group (Group B) is 12.3262056 units less than that of Group A
  • the mean of the third group (Group C) is 28.2870152 units less than that of Group A
  • a one unit increase in B in Group A is associated with a 0.4926334 unit decrease in Y
  • the difference in slope between Group B and Group A is -0.1212027
  • the difference in slope between Group C and Group A is 0.4067397
The 95% credible intervals for the effects of Group B, Group C and the partial slope associated with B do not overlap 0, implying a significant difference between Group A and Groups B and C (at the mean level of predictor B) and a significant negative relationship with B (for Group A). The slope associated with Group B was not found to be significantly different from that associated with Group A; however, the slope associated with Group C was found to be significantly less negative than the slope associated with Group A. The group-specific slopes can again be derived explicitly, as sketched below.
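
The same derivation applies to the brms fit; a minimal sketch, assuming the b_-prefixed coefficient names shown in the output above:

mcmc = as.matrix(data1.brms)
## group-specific slopes: Group A is the reference; B and C add their interaction terms
slopes = cbind(`Group A` = mcmc[, "b_B"], `Group B` = mcmc[, "b_B"] + mcmc[, "b_AGroupB:B"],
    `Group C` = mcmc[, "b_B"] + mcmc[, "b_AGroupC:B"])
tidyMCMC(slopes, conf.int = TRUE, conf.method = "HPDinterval")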

While becoming comfortable with a new statistical framework, it is only natural to want to evaluate and comprehend new structures and output alongside more familiar concepts. One way to facilitate this is via Bayesian p-values, which are somewhat analogous to the frequentist p-values used to investigate the hypothesis that a parameter is equal to zero.

## since values are less than zero
head(as.matrix(data1.brms))
          parameters
iterations b_Intercept  b_AGroupB b_AGroupC        b_B b_AGroupB:B b_AGroupC:B    sigma      lp__
      [1,]    47.70303 -10.255010 -25.32093 -0.3742985 -0.17137635   0.2600049 3.092922 -116.2110
      [2,]    50.55262 -14.941232 -30.12870 -0.4643934 -0.13460334   0.4086522 3.230887 -114.2718
      [3,]    48.22410  -9.538243 -25.80487 -0.4858593 -0.10874505   0.4073568 4.423214 -116.0591
      [4,]    48.07606 -10.397199 -26.96966 -0.4404635 -0.16751208   0.3346061 2.849359 -113.2790
      [5,]    49.43749 -10.676002 -28.03187 -0.5099977 -0.11303887   0.4276380 3.777916 -113.2435
      [6,]    50.64586 -13.340054 -27.57699 -0.5712098 -0.02596519   0.4449288 2.659352 -115.2269
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB"])  # effect of (B-A = 0)
[1] 0.0004444444
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC"])  # effect of (C-A = 0)
[1] 0
mcmcpvalue(as.matrix(data1.brms)[, "b_B"])  # effect of (slope = 0)
[1] 0
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupB:B"])  # effect of (slopeB - slopeA = 0)
[1] 0.264
mcmcpvalue(as.matrix(data1.brms)[, "b_AGroupC:B"])  # effect of (slopeB - slopeA = 0)
[1] 0.001333333
mcmcpvalue(as.matrix(data1.brms)[, 2:6])  # effect of (model)
[1] 0

There is evidence that the response differs between the groups and that the slope associated with Group C differs from that of Group A. There is very little evidence that the slope associated with Group B differs from that of Group A.

library(loo)
(full = loo(data1.brms))
  LOOIC   SE
 170.84 8.04
data1.brms.red = update(data1.brms, . ~ A + B)
SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 1).

Gradient evaluation took 2.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.24 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.059643 seconds (Warm-up)
               0.028751 seconds (Sampling)
               0.088394 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 2).

Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.05915 seconds (Warm-up)
               0.028504 seconds (Sampling)
               0.087654 seconds (Total)


SAMPLING FOR MODEL 'gaussian(identity) brms-model' NOW (CHAIN 3).

Gradient evaluation took 8e-06 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
Adjust your expectations accordingly!


Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)

 Elapsed Time: 0.056265 seconds (Warm-up)
               0.027707 seconds (Sampling)
               0.083972 seconds (Total)
(reduced = loo(data1.brms.red))
  LOOIC  SE
 195.67 6.7
par(mfrow = 1:2, mar = c(5, 3.8, 1, 0) + 0.1, las = 3)
plot(full, label_points = TRUE)
plot(reduced, label_points = TRUE)
plot of chunk tut7.5bBRMSloo2
compare_models(full, reduced)
Error in discrete == discrete[1]: comparison of these types is not implemented
Conclusions: the expected out-of-sample predictive accuracy is substantially higher (lower LOOIC) for the model that includes the interaction (A:B). This might be taken as inferential evidence for a general interaction.
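
compare_models() errors here too, as it expects rstanarm objects. Depending on the brms version, the two fits can instead be passed to loo() together, which computes and compares both in one step; a sketch under that assumption:

loo(data1.brms, data1.brms.red)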

Graphical summaries

A nice graphic is often a great accompaniment to a statistical analysis. Although there are no fixed assumptions associated with graphing (in contrast to statistical analyses), we often want the graphical summaries to reflect the associated statistical analyses. After all, the sample is just one perspective on the population(s). What we are more interested in is being able to estimate and depict likely population parameters/trends.

Thus, whilst we could easily provide a plot displaying the raw data along with simple measures of location and spread, arguably, we should use estimates that reflect the fitted model. In this case, it would be appropriate to plot the credibility interval associated with each group.

Matrix model (MCMCpack)

mcmc = data1.mcmcpack
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, 1:6]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bMCMCpack2GraphicalSummaries

As this is a relatively simple ANCOVA (a single factor plus a single covariate), we can simply add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bMCMCpack2GraphicalSummaries2

Matrix model (JAGS)

mcmc = data1.r2jags$BUGSoutput$sims.matrix
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]",
    "beta[6]")]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bJAGSG2raphicalSummaries

As this is a relatively simple ANCOVA (a single factor plus a single covariate), we can simply add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bJAGS2GraphicalSummaries2

Matrix model (RSTAN)

mcmc = as.matrix(data1.rstan)
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B), max(data1$B),
    len = 100))
Xmat = model.matrix(~A * B, newdata1)
coefs = mcmc[, c("beta[1]", "beta[2]", "beta[3]", "beta[4]", "beta[5]",
    "beta[6]")]
fit = coefs %*% t(Xmat)
newdata1 = newdata1 %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTAN2GraphicalSummaries

As this is a relatively simple ANCOVA (a single factor plus a single covariate), we can simply add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals fitted values
fdata1 = rdata1 = data1
fMat = rMat = model.matrix(~A * B, fdata1)
fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
resid = as.vector(data1$Y - apply(coefs, 2, median) %*% t(rMat))
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTAN2GraphicalSummaries2

Matrix model (RSTANARM)

## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B),
    max(data1$B), len = 100))
fit = posterior_linpred(data1.rstanarm, newdata = newdata1)
newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANARM2GraphicalSummaries

As this is a relatively simple ANCOVA (a single factor plus a single covariate), we can simply add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata1 = data1
pp = posterior_linpred(data1.rstanarm, newdata = rdata1)
fit = as.vector(apply(pp, 2, median))
resid = resid(data1.rstanarm)
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bRSTANARM2GraphicalSummaries2

Matrix model (BRMS)

Although we could calculate the fitted values via matrix multiplication of the coefficients and the model matrix (as for MCMCpack, RJAGS and RSTAN), for more complex models it is more convenient to use the marginal_effects function that comes with brms.

plot(marginal_effects(data1.brms), points = TRUE)
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
# OR
eff = plot(marginal_effects(data1.brms), points = TRUE, plot = FALSE)
eff
$A
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
$B
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
$`B:A`
plot of chunk tut7.5bBRMS2GraphicalSummaries.a
## Calculate the fitted values
newdata1 = expand.grid(A = levels(data1$A), B = seq(min(data1$B),
    max(data1$B), len = 100))
fit = fitted(data1.brms, newdata = newdata1, summary = FALSE)
newdata1 = newdata1 %>% cbind(tidyMCMC(as.mcmc(fit), conf.int = TRUE,
    conf.method = "HPDinterval"))

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bBRMS2GraphicalSummaries

As this is a relatively simple ANCOVA (a single factor plus a single covariate), we can simply add the raw data to this figure. For more complex designs with additional predictors, it is necessary to plot partial residuals.

## Calculate partial residuals
rdata1 = data1
fit = fitted(data1.brms, summary = TRUE)[, "Estimate"]
resid = resid(data1.brms)[, "Estimate"]
rdata1 = rdata1 %>% mutate(partial.resid = resid + fit)

ggplot(newdata1, aes(y = estimate, x = B, fill = A)) + geom_point(data = rdata1,
    aes(y = partial.resid, x = B, color = A)) + geom_ribbon(aes(ymin = conf.low,
    ymax = conf.high), alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
    scale_x_continuous("B") + theme_classic()
plot of chunk tut7.5bBRMS2GraphicalSummaries2

Worked Examples

Single factor classification (ANOVA) references
  • McCarthy (2007) - Chpt 6
  • Kery (2010) - Chpt 10
  • Gelman & Hill (2007) - Chpt 4
  • Logan (2010) - Chpt 12
  • Quinn & Keough (2002) - Chpt 9

Homogeneous slopes

To investigate the impacts of sexual activity on fruitfly longevity, Partridge and Farquhar (1981) measured the longevity of male fruitflies with access to either one virgin female (a potential mate), eight virgin females, one pregnant female (not a potential mate), eight pregnant females or no females. The pool of available male fruitflies varied in size, and since size is known to impact longevity, the researchers also measured thorax length as a covariate.

Download Partridge1 data set
Format of partridge1.csv data file

TREATMENT  THORAX  LONGEV
Preg8      0.64    35
Preg8      0.68    37
Preg8      0.68    49
...        ...     ...

TREATMENT   Categorical listing of the number and sort of female partners (None - no female partners, Preg1 - one pregnant female partner, Preg8 - eight pregnant partners, Virg1 - one virgin female, Virg8 - eight virgin females)
THORAX      Continuous covariate, the length of the thorax (mm)
LONGEV      Longevity of male fruitflies. Response variable.

Open the partridge1 data file.
partridge <- read.csv("../downloads/data/partridge1.csv", strip.white = T)
head(partridge)
  TREATMENT THORAX LONGEV
1     Preg8   0.64     35
2     Preg8   0.68     37
3     Preg8   0.68     49
4     Preg8   0.72     46
5     Preg8   0.72     63
6     Preg8   0.76     39

A subset of these data were analysed in Tutorial 7.4b Q3 (the same data, yet without the continuous covariate). On that occasion we decided that although the responses were counts (and thus a good match for a Poisson distribution), the magnitude of the observations would probably mean that a Gaussian distribution would be adequate.

The addition of a continuous covariate is likely to complicate this somewhat. We now not only assume homogeneity of variance between treatments, but also homogeneity of variance along the range of thorax length.

  1. Perform basic exploratory data analysis to examine:
    • normality
    • homogeneity of variance
    • homogeneity of slopes (and equality of covariate ranges)
    ggplot(partridge, aes(y = LONGEV, x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
    
    plot of chunk ws7.5bQ1.2
    We can also quickly explore these by plotting the residuals from a simple linear model
    plot(lm(LONGEV ~ THORAX + TREATMENT, partridge), which = 1)
    
    plot of chunk ws7.5bQ1.2a

    There is definitely evidence of non-homogeneity of variance and perhaps even non-linearity. As already indicated, this is perhaps not surprising. Arguably, the most appropriate starting point would be a Poisson distribution, which would not only accommodate the non-homogeneity of variance but, because the Poisson uses a log link, would also anticipate a non-linear relationship on the response scale.

    Nevertheless, we will reserve the use of a Poisson distribution until Tutorial 9.4b and, for now, simply use a log (base 10) transformation to be consistent with the original authors. Note that as there are no zeros in these data and (perhaps more importantly) the response does not approach zero, a logarithmic transformation is likely to be reasonable.

    ggplot(partridge, aes(y = log10(LONGEV), x = THORAX, color = TREATMENT)) + geom_smooth() + geom_point()
    
    plot of chunk ws7.5bQ1.2b
    plot(lm(log10(LONGEV) ~ THORAX + TREATMENT, partridge), which = 1)
    
    plot of chunk ws7.5bQ1.2b

    That does appear more reasonable.

    And now to explore the range of the covariate.

    ggplot(partridge, aes(y = THORAX, x = TREATMENT)) + geom_boxplot()
    
    plot of chunk ws7.5bQ1.2c
    summary(lm(THORAX ~ TREATMENT, partridge))
    
    Call:
    lm(formula = THORAX ~ TREATMENT, data = partridge)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -0.1960 -0.0456  0.0144  0.0544  0.1344 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.83600    0.01543  54.194   <2e-16 ***
    TREATMENTPreg1 -0.01040    0.02182  -0.477    0.634    
    TREATMENTPreg8 -0.03040    0.02182  -1.393    0.166    
    TREATMENTVirg1  0.00160    0.02182   0.073    0.942    
    TREATMENTVirg8 -0.03600    0.02182  -1.650    0.102    
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.07713 on 120 degrees of freedom
    Multiple R-squared:  0.04032,	Adjusted R-squared:  0.008335 
    F-statistic: 1.261 on 4 and 120 DF,  p-value: 0.2893
    

    Conclusions: there is no evidence that the range of thorax length varies substantially between treatments. Furthermore, homogeneity of slopes seems reasonable, and so we will use an additive model.
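
    As a quick frequentist cross-check of the homogeneity of slopes assumption, we can compare the additive and interaction linear models; a non-significant interaction term is consistent with homogeneous slopes. A minimal sketch:

    ## compare additive and interaction models; a non-significant interaction
    ## supports the homogeneity of slopes assumption
    anova(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge),
        lm(log10(LONGEV) ~ TREATMENT * THORAX, data = partridge))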

  2. Fit the appropriate Bayesian model to explore the effect of treatment and thorax length on male fruitfly longevity. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,10)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$
    library(MCMCpack)
    partridge.mcmcpack = MCMCregress(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    modelString = "
      model {
      #Likelihood
      for (i in 1:n) {
      y[i]~dnorm(mu[i],tau)
      mu[i] <- beta0 + inprod(beta[],X[i,])
      }
      #Priors
      beta0 ~ dnorm(0.01,1.0E-6)
      for (j in 1:nX) {
      beta[j] ~ dnorm(0.01,1.0E-6)
      }
      tau <- 1 / (sigma * sigma)
      sigma~dunif(0,100)
      }
      "
    
    X = model.matrix(~TREATMENT + THORAX, data = partridge)
    partridge.list <- with(partridge, list(y = log10(LONGEV), X = X[, -1],
        nX = ncol(X) - 1, n = nrow(partridge)))
    
    params <- c("beta0", "beta", "sigma")
    burnInSteps = 3000
    nChains = 3
    numSavedSteps = 15000
    thinSteps = 10
    nIter = ceiling((numSavedSteps * thinSteps)/nChains)
    
    partridge.r2jags <- jags(data = partridge.list, inits = NULL, parameters.to.save = params,
        model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
        n.burnin = burnInSteps, n.thin = thinSteps)
    
    Compiling model graph
       Resolving undeclared variables
       Allocating nodes
    Graph information:
       Observed stochastic nodes: 125
       Unobserved stochastic nodes: 7
       Total graph size: 998
    
    Initializing model
    
    modelString = "
      data {
      int n;   // total number of observations
      vector[n] Y;      // response variable
      int nX;  // number of effects
      matrix[n, nX] X;   // model matrix
      }
      transformed data {
      matrix[n, nX - 1] Xc;  // centered version of X
      vector[nX - 1] means_X;  // column means of X before centering
      
      for (i in 2:nX) {
      means_X[i - 1] = mean(X[, i]);
      Xc[, i - 1] = X[, i] - means_X[i - 1];
      }
      }
      parameters {
      vector[nX-1] beta;  // population-level effects
      real cbeta0;  // center-scale intercept
      real<lower=0> sigma;  // residual SD
      }
      transformed parameters {
      }
      model {
      vector[n] mu;
      mu = Xc * beta + cbeta0;
      // prior specifications
      beta ~ normal(0, 10);
      cbeta0 ~ normal(0, 10);
      sigma ~ cauchy(0, 5);
      // likelihood contribution
      Y ~ normal(mu, sigma);
      }
      generated quantities {
      real beta0;  // population-level intercept
      vector[n] log_lik;
      beta0 = cbeta0 - dot_product(means_X, beta);
      for (i in 1:n) {
      log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma);
      }
      }
      "
    
    X = model.matrix(~TREATMENT + THORAX, data = partridge)
    partridge.list <- with(partridge, list(Y = log10(LONGEV), X = X, nX = ncol(X), n = nrow(partridge)))
    
    library(rstan)
    partridge.rstan <- stan(data = partridge.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500,
        thin = 2)
    
    In file included from /usr/local/lib/R/site-library/BH/include/boost/config.hpp:39:0,
                     from /usr/local/lib/R/site-library/BH/include/boost/math/tools/config.hpp:13,
                     from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/var.hpp:7,
                     from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core/gevv_vvv_vari.hpp:5,
                     from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/core.hpp:12,
                     from /usr/local/lib/R/site-library/StanHeaders/include/stan/math/rev/mat.hpp:4,
                     from /usr/local/lib/R/site-library/StanHeaders/include/stan/math.hpp:4,
                     from /usr/local/lib/R/site-library/StanHeaders/include/src/stan/model/model_header.hpp:4,
                     from file6f0539983e2.cpp:8:
    /usr/local/lib/R/site-library/BH/include/boost/config/compiler/gcc.hpp:186:0: warning: "BOOST_NO_CXX11_RVALUE_REFERENCES" redefined
     #  define BOOST_NO_CXX11_RVALUE_REFERENCES
     ^
    <command-line>:0:0: note: this is the location of the previous definition
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1).
    
    Gradient evaluation took 3.1e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.31 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.077105 seconds (Warm-up)
                   0.380042 seconds (Sampling)
                   0.457147 seconds (Total)
    
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2).
    
    Gradient evaluation took 1.6e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.067367 seconds (Warm-up)
                   0.394947 seconds (Sampling)
                   0.462314 seconds (Total)
    
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3).
    
    Gradient evaluation took 1.4e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.067727 seconds (Warm-up)
                   0.42144 seconds (Sampling)
                   0.489167 seconds (Total)
    
    partridge.rstanarm = stan_glm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge,
        iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior_intercept = normal(0,
            10), prior = normal(0, 10), prior_aux = cauchy(0, 5))
    
    Gradient evaluation took 0.000161 seconds
    1000 transitions using 10 leapfrog steps per transition would take 1.61 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.109844 seconds (Warm-up)
                   0.657854 seconds (Sampling)
                   0.767698 seconds (Total)
    
    
    Gradient evaluation took 2.1e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.21 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.08606 seconds (Warm-up)
                   0.507298 seconds (Sampling)
                   0.593358 seconds (Total)
    
    
    Gradient evaluation took 3e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.3 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.084726 seconds (Warm-up)
                   0.50021 seconds (Sampling)
                   0.584936 seconds (Total)
    
    partridge.brm = brm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge,
        iter = 5000, warmup = 500, chains = 3, thin = 2, refresh = 0, prior = c(prior(normal(0,
            10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0,
            5), class = "sigma")))
    
    Gradient evaluation took 2.5e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.25 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.060042 seconds (Warm-up)
                   0.340743 seconds (Sampling)
                   0.400785 seconds (Total)
    
    
    Gradient evaluation took 1.2e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.12 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.061698 seconds (Warm-up)
                   0.346451 seconds (Sampling)
                   0.408149 seconds (Total)
    
    
    Gradient evaluation took 1.4e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.064057 seconds (Warm-up)
                   0.336682 seconds (Sampling)
                   0.400739 seconds (Total)
    
  3. Explore MCMC diagnostics
    library(MCMCpack)
    plot(partridge.mcmcpack)
    
    plot of chunk tut7.5bQ3.2a
    plot of chunk tut7.5bQ3.2a
    raftery.diag(partridge.mcmcpack)
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                          
                    Burn-in  Total Lower bound  Dependence
                    (M)      (N)   (Nmin)       factor (I)
     (Intercept)    2        3865  3746         1.030     
     TREATMENTPreg1 2        3710  3746         0.990     
     TREATMENTPreg8 2        3741  3746         0.999     
     TREATMENTVirg1 2        3710  3746         0.990     
     TREATMENTVirg8 2        3771  3746         1.010     
     THORAX         2        3710  3746         0.990     
     sigma2         2        3994  3746         1.070     
    
    autocorr.diag(partridge.mcmcpack)
    
            (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8        THORAX
    Lag 0   1.000000000    1.000000000   1.000000e+00    1.000000000    1.000000000  1.000000e+00
    Lag 1  -0.006504213    0.005715657  -1.260043e-03   -0.008276112   -0.008425989 -6.216742e-03
    Lag 5   0.020132882    0.012739945  -2.900389e-02   -0.001338169   -0.015262191  2.106973e-02
    Lag 10  0.005066356    0.011293213  -8.019851e-03   -0.003100969   -0.013767490  5.865134e-03
    Lag 50 -0.001937426   -0.024530392   8.454565e-05    0.004761843   -0.008212958 -3.677482e-05
                  sigma2
    Lag 0   1.000000e+00
    Lag 1   5.041996e-02
    Lag 5   7.127112e-05
    Lag 10  1.653846e-02
    Lag 50 -7.121217e-03
    
    library(R2jags)
    library(coda)
    partridge.mcmc = as.mcmc(partridge.r2jags)
    plot(partridge.mcmc)
    
    plot of chunk tut7.5bQ3.2b
    plot of chunk tut7.5bQ3.2b
    raftery.diag(partridge.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    10       37660 3746         10.10     
     beta[1]  20       37020 3746          9.88     
     beta[2]  20       37020 3746          9.88     
     beta[3]  20       37020 3746          9.88     
     beta[4]  20       35750 3746          9.54     
     beta[5]  20       37020 3746          9.88     
     deviance 20       37020 3746          9.88     
     sigma    10       37660 3746         10.10     
    
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    10       37660 3746         10.10     
     beta[1]  10       37660 3746         10.10     
     beta[2]  30       40390 3746         10.80     
     beta[3]  20       37020 3746          9.88     
     beta[4]  10       37660 3746         10.10     
     beta[5]  20       38330 3746         10.20     
     deviance 20       38330 3746         10.20     
     sigma    20       36380 3746          9.71     
    
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    20       37020 3746          9.88     
     beta[1]  10       37660 3746         10.10     
     beta[2]  20       39000 3746         10.40     
     beta[3]  20       37020 3746          9.88     
     beta[4]  20       37020 3746          9.88     
     beta[5]  10       37660 3746         10.10     
     deviance 20       37020 3746          9.88     
     sigma    20       38330 3746         10.20     
    
    autocorr.diag(partridge.mcmc)
    
                   beta0      beta[1]     beta[2]      beta[3]      beta[4]      beta[5]     deviance
    Lag 0    1.000000000  1.000000000 1.000000000  1.000000000  1.000000000  1.000000000  1.000000000
    Lag 10  -0.013921949  0.005149797 0.004975120 -0.001853710 -0.014382555 -0.012575832 -0.008892071
    Lag 50  -0.003696594 -0.002835295 0.001318796  0.010156118 -0.011928061 -0.003560853  0.010943677
    Lag 100  0.003584436 -0.001331310 0.003826374  0.009171373  0.008765503  0.003492416 -0.008883727
    Lag 500  0.016409182  0.013260824 0.009091810  0.001029463 -0.003070906  0.015760118  0.002411228
                    sigma
    Lag 0    1.0000000000
    Lag 10  -0.0092198712
    Lag 50   0.0059702667
    Lag 100 -0.0008803947
    Lag 500  0.0077816203
    
    library(rstan)
    library(coda)
    s = as.array(partridge.rstan)
    partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]", "beta[4]",
        "beta[5]", "sigma")], 2, as.mcmc))
    plot(partridge.mcmc)
    
    plot of chunk tut7.5bQ3.2c
    plot of chunk tut7.5bQ3.2c
    raftery.diag(partridge.mcmc)
    
    $`1`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`2`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`3`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    autocorr.diag(partridge.mcmc)
    
                   beta0      beta[1]      beta[2]     beta[3]      beta[4]     beta[5]         sigma
    Lag 0   1.0000000000  1.000000000  1.000000000 1.000000000  1.000000000 1.000000000  1.0000000000
    Lag 1   0.0362085746  0.091738066  0.093666654 0.065303202  0.083647790 0.029088441  0.0297993560
    Lag 5   0.0075315830 -0.007179393 -0.013474037 0.019288850 -0.002217306 0.007947353 -0.0114935490
    Lag 10  0.0154634582  0.004801029  0.020335742 0.010920588  0.025187625 0.014189952  0.0007989306
    Lag 50 -0.0001803206 -0.001614632 -0.005646201 0.003760134 -0.019317695 0.003126408 -0.0024392021
    
    library(rstan)
    library(coda)
    stan_ac(partridge.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ3.2c1
    stan_rhat(partridge.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ3.2c1
    stan_ess(partridge.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ3.2c1
    # using bayesplot
    library(bayesplot)
    detach("package:reshape")
    mcmc_trace(as.array(partridge.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ3.2c2
    mcmc_dens(as.array(partridge.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ3.2c2
    detach("package:reshape")
    library(bayesplot)
    mcmc_combo(as.array(partridge.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ3.2c3
    library(rstanarm)
    library(coda)
    s = as.array(partridge.rstanarm)
    colnames(as.matrix(partridge.rstanarm))
    
    [1] "(Intercept)"    "TREATMENTPreg1" "TREATMENTPreg8" "TREATMENTVirg1" "TREATMENTVirg8"
    [6] "THORAX"         "sigma"         
    
    partridge.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("(Intercept)", "TREATMENTPreg1", "TREATMENTPreg8",
        "TREATMENTVirg1", "TREATMENTVirg8", "THORAX", "sigma")], 2, as.mcmc))
    plot(partridge.mcmc)
    
    plot of chunk tut7.5bQ3.2d
    plot of chunk tut7.5bQ3.2d
    raftery.diag(partridge.mcmc)
    
    $`1`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`2`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`3`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    autocorr.diag(partridge.mcmc)
    
            (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8        THORAX
    Lag 0   1.000000000    1.000000000    1.000000000   1.0000000000    1.000000000  1.0000000000
    Lag 1   0.075353592    0.177654851    0.188350285   0.1751439722    0.185668547  0.0617834129
    Lag 5   0.003490198    0.005324381    0.000440865  -0.0045602106   -0.002274830  0.0006088013
    Lag 10 -0.003447542    0.012745177   -0.005106027   0.0115046730   -0.010358881 -0.0033071860
    Lag 50 -0.004644673   -0.009315829    0.011216715  -0.0005416519   -0.007338707 -0.0043183473
                  sigma
    Lag 0   1.000000000
    Lag 1   0.048851176
    Lag 5  -0.011582830
    Lag 10 -0.008910899
    Lag 50  0.003726101
    
    library(rstanarm)
    library(coda)
    stan_ac(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d1
    stan_rhat(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d1
    stan_ess(partridge.rstanarm, regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d1
    # using bayesplot
    library(bayesplot)
    detach("package:reshape")
    mcmc_trace(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d2
    mcmc_dens(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d2
    detach("package:reshape")
    library(bayesplot)
    mcmc_combo(as.array(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    plot of chunk tut7.5bQ3.2d3
    library(rstanarm)
    posterior_vs_prior(partridge.rstanarm, color_by = "vs", group_by = TRUE,
        facet_args = list(scales = "free_y"))
    
    Gradient evaluation took 4.7e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.47 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.064874 seconds (Warm-up)
                   0.139432 seconds (Sampling)
                   0.204306 seconds (Total)
    
    
    Gradient evaluation took 4e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.074223 seconds (Warm-up)
                   0.240174 seconds (Sampling)
                   0.314397 seconds (Total)
    
    plot of chunk tut7.5bQ3.2d4
    library(coda)
    library(brms)
    partridge.mcmc = as.mcmc(partridge.brm)
    plot(partridge.mcmc)
    
    plot of chunk tut7.5bQ3.2e
    plot of chunk tut7.5bQ3.2e
    raftery.diag(partridge.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    autocorr.diag(partridge.mcmc)
    
    Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
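
    The autocorr.diag() call fails on the mcmc list that as.mcmc() produces for this brms fit. As a workaround, we can pool the posterior samples into a single matrix first; a sketch (the combined chains are treated as one series, so treat the result as a rough guide only):

    autocorr.diag(as.mcmc(as.matrix(partridge.brm$fit)))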
    
    library(coda)
    stan_ac(partridge.brm$fit)
    
    plot of chunk tut7.5bQ3.2e1
    stan_rhat(partridge.brm$fit)
    
    plot of chunk tut7.5bQ3.2e1
    stan_ess(partridge.brm$fit)
    
    plot of chunk tut7.5bQ3.2e1
  4. The chains appear well mixed and have converged on what is likely to be a stable posterior.

  5. Perform model validation
    library(MCMCpack)
    partridge.mcmc = as.data.frame(partridge.mcmcpack)
    # generate a model matrix
    newdata = partridge
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = apply(partridge.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = log10(partridge$LONGEV) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ3.3a1
    # Plot residuals against treatment level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ3.3a2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ3.3a3
    library(MCMCpack)
    partridge.mcmc = as.matrix(partridge.mcmcpack)
    # generate a model matrix
    Xmat = model.matrix(~TREATMENT + THORAX, partridge)
    ## get median parameter estimates
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
        fit[i, ], sqrt(partridge.mcmc[i, "sigma2"])))
    newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
        yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
        x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
        color = "black")
    
    plot of chunk tut7.5bQ3.3a4
    ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
        group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
        aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
    
    plot of chunk tut7.5bQ3.3a4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")
    
    plot of chunk tut7.5bQ3.3a5
    mcmc_areas(as.matrix(partridge.mcmcpack), regex_pars = "TREATMENT|THORAX")
    
    plot of chunk tut7.5bQ3.3a5
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    # generate a model matrix
    newdata = partridge
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = apply(partridge.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = log10(partridge$LONGEV) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ3.3b1
    # Plot residuals against treatment level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ3.3b2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ3.3b3
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    # generate a model matrix
    Xmat = model.matrix(~TREATMENT + THORAX, partridge)
    ## get median parameter estimates
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(partridge.mcmc), function(i) rnorm(nrow(partridge),
        fit[i, ], partridge.mcmc[i, "sigma"]))
    newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
        yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
        x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
        color = "black")
    
    plot of chunk tut7.5bQ3.3b4
    ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
        group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
        aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
    
    plot of chunk tut7.5bQ3.3b4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
    
    [plot of chunk tut7.5bQ3.3b5]
    mcmc_areas(partridge.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
    
    [plot of chunk tut7.5bQ3.3b5]
    partridge.mcmc = as.matrix(partridge.rstan)
    # generate a model matrix
    newdata = partridge
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = apply(partridge.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = log10(partridge$LONGEV) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    [plot of chunk tut7.5bQ3.3c1]
    # Plot residuals against treatment level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    [plot of chunk tut7.5bQ3.3c2]
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    [plot of chunk tut7.5bQ3.3c3]
    partridge.mcmc = as.matrix(partridge.rstan)
    # generate a model matrix
    Xmat = model.matrix(~TREATMENT + THORAX, partridge)
    ## extract all posterior draws of the coefficients
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    ## draw samples from this model on the log10 scale (the scale on which
    ## sigma applies) and back-transform to the response scale
    yRep = sapply(1:nrow(partridge.mcmc), function(i) 10^rnorm(nrow(partridge),
        fit[i, ], partridge.mcmc[i, "sigma"]))
    newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
        yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
        x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
        color = "black")
    
    [plot of chunk tut7.5bQ3.3c4]
    ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
        group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
        aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
    
    [plot of chunk tut7.5bQ3.3c4]

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(partridge.rstan), regex_pars = "beta|sigma")
    
    [plot of chunk tut7.5bQ3.3c5]
    mcmc_areas(as.matrix(partridge.rstan), regex_pars = "beta|sigma")
    
    [plot of chunk tut7.5bQ3.3c5]
    partridge.mcmc = as.matrix(partridge.rstanarm)
    # generate a model matrix
    newdata = partridge
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = apply(partridge.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = log10(partridge$LONGEV) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    [plot of chunk tut7.5bQ3.3d1]
    # Plot residuals against treatment level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    [plot of chunk tut7.5bQ3.3d2]
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    [plot of chunk tut7.5bQ3.3d3]
    partridge.mcmc = as.matrix(partridge.rstanarm)
    # generate a model matrix
    Xmat = model.matrix(~TREATMENT + THORAX, partridge)
    ## extract all posterior draws of the coefficients
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    ## draw samples from this model on the log10 scale (the scale on which
    ## sigma applies) and back-transform to the response scale
    yRep = sapply(1:nrow(partridge.mcmc), function(i) 10^rnorm(nrow(partridge),
        fit[i, ], partridge.mcmc[i, "sigma"]))
    newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
        yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
        x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
        color = "black")
    
    [plot of chunk tut7.5bQ3.3d4]
    ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
        group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
        aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
    
    [plot of chunk tut7.5bQ3.3d4]

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    [plot of chunk tut7.5bQ3.3d5]
    mcmc_areas(as.matrix(partridge.rstanarm), regex_pars = "Intercept|TREATMENT|THORAX|sigma")
    
    [plot of chunk tut7.5bQ3.3d5]
    partridge.mcmc = as.matrix(partridge.brm)
    # generate a model matrix
    newdata = partridge
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    ## get median parameter estimates
    wch = grep("b_", colnames(partridge.mcmc))
    coefs = apply(partridge.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = log10(partridge$LONGEV) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    [plot of chunk tut7.5bQ3.3e1]
    # Plot residuals against treatment level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREATMENT, THORAX)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    [plot of chunk tut7.5bQ3.3e2]
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    [plot of chunk tut7.5bQ3.3e3]
    partridge.mcmc = as.matrix(partridge.brm)
    # generate a model matrix
    Xmat = model.matrix(~TREATMENT + THORAX, partridge)
    ## extract all posterior draws of the coefficients
    wch = grep("b_", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    ## draw samples from this model on the log10 scale (the scale on which
    ## sigma applies) and back-transform to the response scale
    yRep = sapply(1:nrow(partridge.mcmc), function(i) 10^rnorm(nrow(partridge),
        fit[i, ], partridge.mcmc[i, "sigma"]))
    newdata = data.frame(TREATMENT = partridge$TREATMENT, THORAX = partridge$THORAX,
        yRep) %>% gather(key = Sample, value = Value, -TREATMENT, -THORAX)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREATMENT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = partridge, aes(y = LONGEV, x = TREATMENT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = partridge, aes(y = LONGEV,
        x = TREATMENT), position = position_jitter(width = 0.1, height = 0),
        color = "black")
    
    [plot of chunk tut7.5bQ3.3e4]
    ggplot(newdata) + geom_violin(aes(y = Value, x = THORAX, fill = "Model",
        group = THORAX, color = TREATMENT), alpha = 0.5) + geom_point(data = partridge,
        aes(y = LONGEV, x = THORAX, group = THORAX, color = TREATMENT))
    
    [plot of chunk tut7.5bQ3.3e4]

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(partridge.brm), regex_pars = "b_|sigma")
    
    [plot of chunk tut7.5bQ3.3e5]
    mcmc_areas(as.matrix(partridge.brm), regex_pars = "b_|sigma")
    
    [plot of chunk tut7.5bQ3.3e5]
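
    Incidentally, for the rstanarm and brms fits, related posterior predictive checks are available out of the box via pp_check() (a convenience rather than a replacement for the manual approach above, which works with any sampler). Note that these operate on the modelled (log10) response scale. A minimal sketch:

    ## built-in posterior predictive checks (on the log10 scale)
    library(rstanarm)
    pp_check(partridge.rstanarm)                    # density overlay by default
    library(brms)
    pp_check(partridge.brm, type = "dens_overlay")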
  6. The model validation diagnostics all seem reasonable, suggesting that the model is likely to be reliable.

  7. Explore parameter estimates
    library(MCMCpack)
    summary(partridge.mcmcpack)
    
    Iterations = 1001:11000
    Thinning interval = 1 
    Number of chains = 1 
    Sample size per chain = 10000 
    
    1. Empirical mean and standard deviation for each variable,
       plus standard error of the mean:
    
                       Mean        SD  Naive SE Time-series SE
    (Intercept)     0.79153 0.0862441 8.624e-04      8.643e-04
    TREATMENTPreg1  0.02226 0.0238689 2.387e-04      2.387e-04
    TREATMENTPreg8  0.03650 0.0241417 2.414e-04      2.414e-04
    TREATMENTVirg1 -0.05404 0.0240102 2.401e-04      2.401e-04
    TREATMENTVirg8 -0.18198 0.0241462 2.415e-04      2.415e-04
    THORAX          1.19335 0.1009666 1.010e-03      1.006e-03
    sigma2          0.00713 0.0009452 9.452e-06      9.942e-06
    
    2. Quantiles for each variable:
    
                        2.5%       25%       50%       75%     97.5%
    (Intercept)     0.625379  0.732743  0.791865  0.849579  0.961182
    TREATMENTPreg1 -0.025297  0.006360  0.022323  0.038120  0.068806
    TREATMENTPreg8 -0.010526  0.020301  0.036510  0.052720  0.083886
    TREATMENTVirg1 -0.101254 -0.069767 -0.053919 -0.038007 -0.006695
    TREATMENTVirg8 -0.230098 -0.197744 -0.181983 -0.165952 -0.134440
    THORAX          0.992453  1.124938  1.193252  1.262005  1.389023
    sigma2          0.005491  0.006455  0.007057  0.007723  0.009211
    
    library(broom)
    tidyMCMC(partridge.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
    
                term     estimate    std.error     conf.low    conf.high
    1    (Intercept)  0.791534756 0.0862440749  0.627282417  0.962326127
    2 TREATMENTPreg1  0.022262435 0.0238688705 -0.025800765  0.068161673
    3 TREATMENTPreg8  0.036498778 0.0241417477 -0.012028238  0.082087102
    4 TREATMENTVirg1 -0.054043317 0.0240102293 -0.101607483 -0.007550831
    5 TREATMENTVirg8 -0.181978738 0.0241461559 -0.231333679 -0.135891212
    6         THORAX  1.193347232 0.1009665978  0.995580301  1.390356975
    7         sigma2  0.007129683 0.0009451969  0.005343271  0.008958414
    
    mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg1"])
    
    [1] 0.3491
    
    mcmcpvalue(partridge.mcmcpack[, "TREATMENTPreg8"])
    
    [1] 0.133
    
    mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg1"])
    
    [1] 0.0271
    
    mcmcpvalue(partridge.mcmcpack[, "TREATMENTVirg8"])
    
    [1] 0
    
    mcmcpvalue(partridge.mcmcpack[, "THORAX"])
    
    [1] 0
    
    wch = grep("TREATMENT|THORAX", colnames(partridge.mcmcpack))
    mcmcpvalue(partridge.mcmcpack[, wch])
    
    [1] 0
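
    For a single parameter, the idea behind these Bayesian 'p-values' is the proportion of posterior mass on the minority side of zero (doubled for a two-sided test); the mcmcpvalue() used above (defined earlier in this tutorial) extends the idea to joint tests over several columns of samples. A minimal single-parameter sketch (the name mcmcpvalue1 is hypothetical, for illustration only):

    ## illustrative two-sided empirical 'p-value' for a single parameter:
    ## twice the smaller of the two tail proportions relative to zero
    mcmcpvalue1 <- function(samp) 2 * min(mean(samp > 0), mean(samp < 0))
    mcmcpvalue1(partridge.mcmcpack[, "TREATMENTVirg1"])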
    
    ## Frequentist for comparison
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
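
    As a quick numerical cross-check, the posterior medians from MCMCpack can be lined up against the OLS coefficients; with the vague priors used here they should, and do, agree closely:

    ## posterior medians vs OLS estimates (location parameters only)
    cbind(Bayes = apply(as.matrix(partridge.mcmcpack)[, 1:6], 2, median),
        OLS = coef(lm(log10(LONGEV) ~ TREATMENT + THORAX, partridge)))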
    
    print(partridge.r2jags)
    
    Inference for Bugs model at "5", fit using jags,
     3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10
     n.sims = 14100 iterations saved
              mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
    beta[1]     0.022   0.024   -0.025    0.006    0.022    0.039    0.069 1.001  5000
    beta[2]     0.037   0.024   -0.010    0.020    0.037    0.053    0.083 1.001 11000
    beta[3]    -0.054   0.024   -0.101   -0.070   -0.054   -0.038   -0.007 1.001 14000
    beta[4]    -0.182   0.024   -0.229   -0.198   -0.182   -0.166   -0.134 1.001  6500
    beta[5]     1.194   0.100    0.995    1.128    1.194    1.262    1.389 1.001  7500
    beta0       0.791   0.085    0.626    0.733    0.790    0.848    0.960 1.001  6800
    sigma       0.085   0.006    0.074    0.081    0.084    0.088    0.096 1.001 14000
    deviance -264.409   3.899 -269.889 -267.273 -265.103 -262.305 -254.937 1.002  2600
    
    For each parameter, n.eff is a crude measure of effective sample size,
    and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
    
    DIC info (using the rule, pD = var(deviance)/2)
    pD = 7.6 and DIC = -256.8
    DIC is an estimate of expected predictive error (lower deviance is better).
    
    library(broom)
    tidyMCMC(partridge.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")
    
          term      estimate   std.error      conf.low     conf.high
    1  beta[1]    0.02222087 0.024107038   -0.02387136  7.028915e-02
    2  beta[2]    0.03655562 0.024131694   -0.01101267  8.252448e-02
    3  beta[3]   -0.05390009 0.023948614   -0.09989605 -6.049349e-03
    4  beta[4]   -0.18170503 0.024261555   -0.22861460 -1.338749e-01
    5  beta[5]    1.19380034 0.100331683    1.00121995  1.392995e+00
    6    beta0    0.79102584 0.085462563    0.62940375  9.630549e-01
    7 deviance -264.40914257 3.898646906 -270.71869433 -2.567381e+02
    8    sigma    0.08453005 0.005586101    0.07400793  9.562832e-02
    
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    mcmcpvalue(partridge.mcmc[, "beta[1]"])
    
    [1] 0.3587234
    
    mcmcpvalue(partridge.mcmc[, "beta[2]"])
    
    [1] 0.1301418
    
    mcmcpvalue(partridge.mcmc[, "beta[3]"])
    
    [1] 0.02468085
    
    mcmcpvalue(partridge.mcmc[, "beta[4]"])
    
    [1] 0
    
    mcmcpvalue(partridge.mcmc[, "beta[5]"])
    
    [1] 0
    
    wch = grep("beta\\[", colnames(partridge.mcmc))
    mcmcpvalue(partridge.mcmc[, wch])
    
    [1] 0
    
    # summary(partridge.rstan)
    library(broom)
    partridge.mcmc = as.matrix(partridge.rstan)
    tidyMCMC(partridge.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"),
        ess = TRUE, rhat = TRUE)
    
         term    estimate   std.error    conf.low    conf.high     rhat  ess
    1   beta0  0.79128931 0.086061774  0.62566968  0.961482006 1.000007 5827
    2 beta[1]  0.02299254 0.024083810 -0.02571685  0.067810084 1.000434 5184
    3 beta[2]  0.03628671 0.024376240 -0.01049910  0.083787879 1.000117 5274
    4 beta[3] -0.05368096 0.023662283 -0.10081880 -0.009464246 1.000751 5262
    5 beta[4] -0.18108684 0.024518494 -0.22869622 -0.132604319 1.000599 5420
    6 beta[5]  1.19318572 0.100771538  1.00337859  1.396651732 1.000058 6383
    7   sigma  0.08455309 0.005593207  0.07383390  0.095473807 1.000092 6376
    
    mcmcpvalue(partridge.mcmc[, "beta[1]"])
    
    [1] 0.3402963
    
    mcmcpvalue(partridge.mcmc[, "beta[2]"])
    
    [1] 0.1362963
    
    mcmcpvalue(partridge.mcmc[, "beta[3]"])
    
    [1] 0.02237037
    
    mcmcpvalue(partridge.mcmc[, "beta[4]"])
    
    [1] 0
    
    mcmcpvalue(partridge.mcmc[, "beta[5]"])
    
    [1] 0
    
    wch = grep("beta\\[", colnames(partridge.mcmc))
    mcmcpvalue(partridge.mcmc[, wch])
    
    [1] 0
    
    summary(partridge.rstanarm)
    
    Model Info:
    
     function:  stan_glm
     family:    gaussian [identity]
     formula:   log10(LONGEV) ~ TREATMENT + THORAX
     algorithm: sampling
     priors:    see help('prior_summary')
     sample:    6750 (posterior sample size)
     num obs:   125
    
    Estimates:
                     mean   sd    2.5%   25%   50%   75%   97.5%
    (Intercept)      0.8    0.1   0.6    0.7   0.8   0.9   1.0  
    TREATMENTPreg1   0.0    0.0   0.0    0.0   0.0   0.0   0.1  
    TREATMENTPreg8   0.0    0.0   0.0    0.0   0.0   0.1   0.1  
    TREATMENTVirg1  -0.1    0.0  -0.1   -0.1  -0.1   0.0   0.0  
    TREATMENTVirg8  -0.2    0.0  -0.2   -0.2  -0.2  -0.2  -0.1  
    THORAX           1.2    0.1   1.0    1.1   1.2   1.3   1.4  
    sigma            0.1    0.0   0.1    0.1   0.1   0.1   0.1  
    mean_PPD         1.7    0.0   1.7    1.7   1.7   1.7   1.8  
    log-posterior  122.3    1.9 117.6  121.3 122.6 123.7 125.0  
    
    Diagnostics:
                   mcse Rhat n_eff
    (Intercept)    0.0  1.0  5726 
    TREATMENTPreg1 0.0  1.0  4536 
    TREATMENTPreg8 0.0  1.0  4695 
    TREATMENTVirg1 0.0  1.0  4830 
    TREATMENTVirg8 0.0  1.0  4579 
    THORAX         0.0  1.0  5922 
    sigma          0.0  1.0  6149 
    mean_PPD       0.0  1.0  6728 
    log-posterior  0.0  1.0  4744 
    
    For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
    
    library(broom)
    partridge.mcmc = as.matrix(partridge.rstanarm)
    tidyMCMC(partridge.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
    
                term     estimate   std.error     conf.low     conf.high      rhat  ess
    1    (Intercept)   0.79293103 0.085741974   0.63442374   0.967936705 0.9998648 5726
    2 TREATMENTPreg1   0.02234996 0.024074877  -0.02338140   0.070865192 0.9999204 4536
    3 TREATMENTPreg8   0.03641433 0.024403496  -0.01013371   0.086076915 1.0000061 4695
    4 TREATMENTVirg1  -0.05403215 0.023548978  -0.09925124  -0.006906492 0.9997467 4830
    5 TREATMENTVirg8  -0.18226103 0.023984235  -0.22966723  -0.136666158 1.0002442 4579
    6         THORAX   1.19165812 0.100358278   0.98759586   1.382129994 0.9999683 5922
    7          sigma   0.08462483 0.005556366   0.07493118   0.096523655 0.9996408 6149
    8       mean_PPD   1.73567551 0.010668382   1.71548076   1.757712481 1.0000841 6728
    9  log-posterior 122.29436093 1.937479394 118.54507644 125.359663348 0.9998040 4744
    
    mcmcpvalue(partridge.mcmc[, "TREATMENTPreg1"])
    
    [1] 0.354963
    
    mcmcpvalue(partridge.mcmc[, "TREATMENTPreg8"])
    
    [1] 0.1346667
    
    mcmcpvalue(partridge.mcmc[, "TREATMENTVirg1"])
    
    [1] 0.02222222
    
    mcmcpvalue(partridge.mcmc[, "TREATMENTVirg8"])
    
    [1] 0
    
    mcmcpvalue(partridge.mcmc[, "THORAX"])
    
    [1] 0
    
    wch = grep("TREATMENT|THORAX", colnames(partridge.mcmc))
    mcmcpvalue(partridge.mcmc[, wch])
    
    [1] 0
    
    summary(partridge.brm)
    
     Family: gaussian(identity) 
    Formula: log10(LONGEV) ~ TREATMENT + THORAX 
       Data: partridge (Number of observations: 125) 
    Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 2; 
             total post-warmup samples = 6750
        ICs: LOO = NA; WAIC = NA; R2 = NA
     
    Population-Level Effects: 
                   Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
    Intercept          0.79      0.09     0.63     0.96       6165    1
    TREATMENTPreg1     0.02      0.02    -0.02     0.07       5774    1
    TREATMENTPreg8     0.04      0.02    -0.01     0.08       5790    1
    TREATMENTVirg1    -0.05      0.02    -0.10    -0.01       5633    1
    TREATMENTVirg8    -0.18      0.02    -0.23    -0.13       5810    1
    THORAX             1.19      0.10     1.00     1.39       6155    1
    
    Family Specific Parameters: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
    sigma     0.08      0.01     0.07      0.1       6265    1
    
    Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
    is a crude measure of effective sample size, and Rhat is the potential 
    scale reduction factor on split chains (at convergence, Rhat = 1).
    
    library(broom)
    partridge.mcmc = as.matrix(partridge.brm)
    tidyMCMC(partridge.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
    
                  term    estimate   std.error     conf.low    conf.high      rhat  ess
    1      b_Intercept  0.79087403 0.085559712  0.628005663  0.963051668 0.9999102 6165
    2 b_TREATMENTPreg1  0.02240460 0.024011041 -0.024874580  0.068487474 0.9997021 5774
    3 b_TREATMENTPreg8  0.03620256 0.023843389 -0.008463557  0.084372278 0.9999173 5790
    4 b_TREATMENTVirg1 -0.05404743 0.023815863 -0.100910410 -0.008313408 0.9997183 5633
    5 b_TREATMENTVirg8 -0.18198728 0.023869345 -0.228420724 -0.135256737 0.9997294 5810
    6         b_THORAX  1.19414513 0.100754965  0.995884101  1.389117902 0.9998769 6155
    7            sigma  0.08463568 0.005584061  0.074166370  0.095825344 1.0004425 6265
    
    mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg1"])
    
    [1] 0.3511111
    
    mcmcpvalue(partridge.mcmc[, "b_TREATMENTPreg8"])
    
    [1] 0.128
    
    mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg1"])
    
    [1] 0.024
    
    mcmcpvalue(partridge.mcmc[, "b_TREATMENTVirg8"])
    
    [1] 0
    
    wch = grep("b_TREATMENT|b_THORAX", colnames(partridge.mcmc))
    mcmcpvalue(partridge.mcmc[, wch])
    
    [1] 0
    
  8. Generate graphical summaries
    library(MCMCpack)
    partridge.mcmc = partridge.mcmcpack
    ## Calculate the fitted values
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
        max(THORAX), len = 100)))
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
        conf.method = "HPDinterval"))
    newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
        conf.method = "HPDinterval"))
    newdata.95 %>% head
    
      TREATMENT    THORAX estimate std.error conf.low conf.high
    1      None 0.6400000 35.98025  2.167746 31.85706  40.31875
    2     Preg1 0.6400000 37.86609  2.167869 33.72704  42.22615
    3     Preg8 0.6400000 39.12340  2.159726 34.98930  43.45699
    4     Virg1 0.6400000 31.77057  1.920246 28.06600  35.52507
    5     Virg8 0.6400000 23.65556  1.281681 21.25357  26.25122
    6      None 0.6430303 36.27993  2.166315 32.15499  40.61696
    
    ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
            color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
        ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
        scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + scale_color_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + theme_classic()
    
    [plot of chunk tut7.5bQ3.5a1]
    # And now with partial residuals. Note that because fMat and rMat are
    # identical here, resid + fit simply recovers log10(LONGEV); they are
    # kept as separate matrices purely as a general template.
    fdata = rdata = partridge
    fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
    fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
    resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
        t(rMat))
    rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))
    
    br <- log10(pretty(partridge$LONGEV))
    ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
        alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
        fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
        scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
        values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
            "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
            "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
        guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
        values = c("black", "black", "black", "white", "white"), breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
                size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    [plot of chunk tut7.5bQ3.5a1]
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    ## Calculate the fitted values
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
        max(THORAX), len = 100)))
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
        conf.method = "HPDinterval"))
    newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
        conf.method = "HPDinterval"))
    newdata.95 %>% head
    
      TREATMENT    THORAX estimate std.error conf.low conf.high
    1      None 0.6400000 35.96099  2.150108 31.84513  40.20482
    2     Preg1 0.6400000 37.84543  2.204559 33.43408  42.08953
    3     Preg8 0.6400000 39.10798  2.144250 34.96804  43.38181
    4     Virg1 0.6400000 31.76434  1.908763 28.00014  35.46843
    5     Virg8 0.6400000 23.65820  1.274451 21.22775  26.21792
    6      None 0.6430303 36.26065  2.148823 32.15733  40.50511
    
    ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
            color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
        ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
        scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + scale_color_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + theme_classic()
    
    [plot of chunk tut7.5bQ3.5b1]
    # And now with partial residuals
    fdata = rdata = partridge
    fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
    fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
    resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
        t(rMat))
    rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))
    
    br <- log10(pretty(partridge$LONGEV))
    ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
        alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
        fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
        scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
        values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
            "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
            "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
        guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
        values = c("black", "black", "black", "white", "white"), breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
                size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    [plot of chunk tut7.5bQ3.5b1]
    partridge.mcmc = as.matrix(partridge.rstan)
    ## Calculate the fitted values
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
        max(THORAX), len = 100)))
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
        conf.method = "HPDinterval"))
    newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
        conf.method = "HPDinterval"))
    newdata.95 %>% head
    
      TREATMENT    THORAX estimate std.error conf.low conf.high
    1      None 0.6400000 35.95150  2.171163 32.04491  40.56730
    2     Preg1 0.6400000 37.90240  2.225653 33.73523  42.38996
    3     Preg8 0.6400000 39.07261  2.150087 34.97066  43.26534
    4     Virg1 0.6400000 31.77152  1.919648 28.18076  35.68269
    5     Virg8 0.6400000 23.68545  1.287215 21.32823  26.26502
    6      None 0.6430303 36.25091  2.169872 32.34034  40.84279
    
    ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
            color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
        ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
        scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + scale_color_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + theme_classic()
    
    [plot of chunk tut7.5bQ3.5c1]
    # And now with partial residuals
    fdata = rdata = partridge
    fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
    fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
    resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
        t(rMat))
    rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))
    
    br <- log10(pretty(partridge$LONGEV))
    ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
        alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
        fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
        scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
        values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
            "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
            "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
        guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
        values = c("black", "black", "black", "white", "white"), breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
                size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    [plot of chunk tut7.5bQ3.5c1]
    partridge.mcmc = as.matrix(partridge.rstanarm)
    ## Calculate the fitted values
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
        max(THORAX), len = 100)))
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
        conf.method = "HPDinterval"))
    newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
        conf.method = "HPDinterval"))
    newdata.95 %>% head
    
      TREATMENT    THORAX estimate std.error conf.low conf.high
    1      None 0.6400000 36.00583  2.159989 31.81425  40.09867
    2     Preg1 0.6400000 37.90180  2.179165 33.48592  42.07839
    3     Preg8 0.6400000 39.14320  2.141342 35.04090  43.33835
    4     Virg1 0.6400000 31.79228  1.886028 27.94081  35.30833
    5     Virg8 0.6400000 23.65624  1.258929 21.30229  26.22091
    6      None 0.6430303 36.30531  2.158638 32.06754  40.35430
    
    ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
            color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
        ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
        scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + scale_color_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + theme_classic()
    
    [plot of chunk tut7.5bQ3.5d1]
    # And now with partial residuals
    fdata = rdata = partridge
    fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
    fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
    resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
        t(rMat))
    rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))
    
    br <- log10(pretty(partridge$LONGEV))
    ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
        alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
        fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
        scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
        values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
            "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
            "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
        guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
        values = c("black", "black", "black", "white", "white"), breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
                size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    [plot of chunk tut7.5bQ3.5d1]
    partridge.mcmc = as.matrix(partridge.brm)
    ## Calculate the fitted values
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = seq(min(THORAX),
        max(THORAX), len = 100)))
    Xmat = model.matrix(~TREATMENT + THORAX, newdata)
    wch = grep("b_", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = 10^(coefs %*% t(Xmat))
    newdata.95 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.95,
        conf.method = "HPDinterval"))
    newdata.80 = newdata %>% cbind(tidyMCMC(fit, conf.int = TRUE, conf.level = 0.8,
        conf.method = "HPDinterval"))
    newdata.95 %>% head
    
      TREATMENT    THORAX estimate std.error conf.low conf.high
    1      None 0.6400000 35.96536  2.125997 31.82643  40.13090
    2     Preg1 0.6400000 37.86659  2.190934 33.72999  42.27928
    3     Preg8 0.6400000 39.08282  2.151109 34.95758  43.39081
    4     Virg1 0.6400000 31.75761  1.891817 27.93170  35.27973
    5     Virg8 0.6400000 23.64734  1.288577 21.28669  26.30815
    6      None 0.6430303 36.26515  2.124425 32.13702  40.42999
    
    ggplot(newdata.95, aes(y = estimate, x = THORAX, color = TREATMENT)) +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = TREATMENT),
            color = NA, alpha = 0.2) + geom_ribbon(data = newdata.80, aes(ymin = conf.low,
        ymax = conf.high, fill = TREATMENT), color = NA, alpha = 0.4) + geom_line() +
        scale_y_continuous("Male fruitfly longevity (days)") + scale_fill_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + scale_color_discrete("Substrate type",
        breaks = c("None", "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners")) + theme_classic()
    
    [plot of chunk tut7.5bQ3.5e1]
    # And now with partial residuals
    fdata = rdata = partridge
    fMat = rMat = model.matrix(~TREATMENT + THORAX, fdata)
    fit = as.vector(apply(coefs, 2, median) %*% t(fMat))
    resid = as.vector(log10(partridge$LONGEV) - apply(coefs, 2, median) %*%
        t(rMat))
    rdata = rdata %>% mutate(partial.resid = 10^(resid + fit))
    
    br <- log10(pretty(partridge$LONGEV))
    ggplot(newdata.95, aes(y = estimate, x = THORAX, linetype = TREATMENT)) +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey",
        alpha = 0.2) + geom_point(data = rdata, aes(y = partial.resid, shape = TREATMENT,
        fill = TREATMENT)) + scale_y_continuous("Longevity (days)") + scale_x_continuous("Thorax length (mm)") +
        scale_linetype_manual("Substrate type", values = 1:5, breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + scale_shape_manual("Substrate type",
        values = c(23, 22, 21, 22, 21), breaks = c("None", "Preg1", "Preg8",
            "Virg1", "Virg8"), labels = c("No\npartners", "1 pregnant\npartner",
            "8 pregnant\npartners", "1 virgin\npartner", "8 virgin\npartners"),
        guide = guide_legend(ncol = 2)) + scale_fill_manual("Substrate type",
        values = c("black", "black", "black", "white", "white"), breaks = c("None",
            "Preg1", "Preg8", "Virg1", "Virg8"), labels = c("No\npartners",
            "1 pregnant\npartner", "8 pregnant\npartners", "1 virgin\npartner",
            "8 virgin\npartners"), guide = guide_legend(ncol = 2)) + theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0.1, 1),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
                size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    [plot of chunk tut7.5bQ3.5e1]
  9. We have established that male fruitfly longevity varies across the treatments. The effects model directly compared each of the partner types to the control treatment (no partners). We might also be interested in describing the differences in male fruitfly longevity between other combinations of partner types, so let's compare each pair of treatment groups. Note that, since we elected to fit the model to logarithmic (base ten) transformed longevity, we have to be careful in how we process the results. Without any back-transformation, the comparisons will be on the log scale. If we wish to express them on the natural response scale, we need a bit more manipulation and our log laws ($\log(A) - \log(B) = \log(A/B)$): back-transforming a difference of logs yields a ratio, so the back-transformed contrasts reflect factor (fold) changes in longevity rather than differences in days.
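    To make this concrete, here is a minimal numeric illustration (arbitrary values, not taken from the analysis):

    A <- 40; B <- 25                  # two hypothetical longevities (days)
    log10(A) - log10(B)               # contrast on the log10 scale
    10^(log10(A) - log10(B))          # back-transformed contrast
    A/B                               # = 1.6, a 1.6-fold difference (not A - B)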
    library(MCMCpack)
    partridge.mcmc = partridge.mcmcpack
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
    # A Tukeys contrast matrix
    library(multcomp)
    tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    pairwise.mat <- tuk.mat %*% Xmat
    pairwise.mat
    
                  (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
    Preg1 - None            0              1              0              0              0      0
    Preg8 - None            0              0              1              0              0      0
    Virg1 - None            0              0              0              1              0      0
    Virg8 - None            0              0              0              0              1      0
    Preg8 - Preg1           0             -1              1              0              0      0
    Virg1 - Preg1           0             -1              0              1              0      0
    Virg8 - Preg1           0             -1              0              0              1      0
    Virg1 - Preg8           0              0             -1              1              0      0
    Virg8 - Preg8           0              0             -1              0              1      0
    Virg8 - Virg1           0              0              0             -1              1      0
    
    mcmc_areas(coefs %*% t(pairwise.mat))
    
    [plot of chunk tut7.5bQ3.6a1]
    (comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term    estimate  std.error    conf.low    conf.high
    1   Preg1 - None  0.02226244 0.02386887 -0.02580077  0.068161673
    2   Preg8 - None  0.03649878 0.02414175 -0.01202824  0.082087102
    3   Virg1 - None -0.05404332 0.02401023 -0.10160748 -0.007550831
    4   Virg8 - None -0.18197874 0.02414616 -0.23133368 -0.135891212
    5  Preg8 - Preg1  0.01423634 0.02396522 -0.03512377  0.059624253
    6  Virg1 - Preg1 -0.07630575 0.02391487 -0.12156718 -0.028062534
    7  Virg8 - Preg1 -0.20424117 0.02371341 -0.25045812 -0.156732472
    8  Virg1 - Preg8 -0.09054210 0.02433891 -0.13738602 -0.041521436
    9  Virg8 - Preg8 -0.21847752 0.02396819 -0.26613085 -0.171726858
    10 Virg8 - Virg1 -0.12793542 0.02426936 -0.17484823 -0.080668677
    
    ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6a1]
    ## With back transformed
    mcmc_areas(10^(coefs %*% t(pairwise.mat)))
    
    [plot of chunk tut7.5bQ3.6a1]
    (comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term  estimate  std.error  conf.low conf.high
    1   Preg1 - None 1.0541883 0.05796784 0.9423218 1.1699348
    2   Preg8 - None 1.0893557 0.06060224 0.9684462 1.2032111
    3   Virg1 - None 0.8843419 0.04891830 0.7906505 0.9820142
    4   Virg8 - None 0.6587068 0.03662892 0.5855194 0.7297431
    5  Preg8 - Preg1 1.0348974 0.05712792 0.9223085 1.1471607
    6  Virg1 - Preg1 0.8401419 0.04629954 0.7512269 0.9324124
    7  Virg8 - Preg1 0.6257574 0.03418221 0.5595466 0.6945545
    8  Virg1 - Preg8 0.8130922 0.04560181 0.7219419 0.9014199
    9  Virg8 - Preg8 0.6055967 0.03343095 0.5375230 0.6689410
    10 Virg8 - Virg1 0.7460068 0.04173700 0.6685775 0.8304841
    
    ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6a1]
    library(MCMCpack)
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
    # A Tukeys contrast matrix
    library(multcomp)
    tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    pairwise.mat <- tuk.mat %*% Xmat
    pairwise.mat
    
                  (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
    Preg1 - None            0              1              0              0              0      0
    Preg8 - None            0              0              1              0              0      0
    Virg1 - None            0              0              0              1              0      0
    Virg8 - None            0              0              0              0              1      0
    Preg8 - Preg1           0             -1              1              0              0      0
    Virg1 - Preg1           0             -1              0              1              0      0
    Virg8 - Preg1           0             -1              0              0              1      0
    Virg1 - Preg8           0              0             -1              1              0      0
    Virg8 - Preg8           0              0             -1              0              1      0
    Virg8 - Virg1           0              0              0             -1              1      0
    
    mcmc_areas(coefs %*% t(pairwise.mat))
    
    [plot of chunk tut7.5bQ3.6b1]
    (comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term    estimate  std.error    conf.low    conf.high
    1   Preg1 - None  0.02222087 0.02410704 -0.02387136  0.070289150
    2   Preg8 - None  0.03655562 0.02413169 -0.01101267  0.082524482
    3   Virg1 - None -0.05390009 0.02394861 -0.09989605 -0.006049349
    4   Virg8 - None -0.18170503 0.02426155 -0.22861460 -0.133874873
    5  Preg8 - Preg1  0.01433475 0.02402038 -0.03409306  0.060470474
    6  Virg1 - Preg1 -0.07612096 0.02380513 -0.12371581 -0.030310117
    7  Virg8 - Preg1 -0.20392590 0.02419627 -0.25216901 -0.157132792
    8  Virg1 - Preg8 -0.09045571 0.02415687 -0.13813097 -0.043683249
    9  Virg8 - Preg8 -0.21826065 0.02403122 -0.26479914 -0.171535168
    10 Virg8 - Virg1 -0.12780494 0.02419583 -0.17496673 -0.080363292
    
    ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6b1]
    ## With back transformed
    mcmc_areas(10^(coefs %*% t(pairwise.mat)))
    
    [plot of chunk tut7.5bQ3.6b1]
    (comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term  estimate  std.error  conf.low conf.high
    1   Preg1 - None 1.0541196 0.05855850 0.9400143 1.1681953
    2   Preg8 - None 1.0894963 0.06054854 0.9744850 1.2087235
    3   Virg1 - None 0.8846269 0.04881624 0.7922053 0.9837108
    4   Virg8 - None 0.6591326 0.03686490 0.5880233 0.7316295
    5  Preg8 - Preg1 1.0351401 0.05731642 0.9245000 1.1493981
    6  Virg1 - Preg1 0.8404883 0.04613413 0.7521149 0.9325881
    7  Virg8 - Preg1 0.6262506 0.03492673 0.5595398 0.6964135
    8  Virg1 - Preg8 0.8132350 0.04527198 0.7270732 0.9038008
    9  Virg8 - Preg8 0.6059049 0.03357668 0.5435016 0.6736973
    10 Virg8 - Virg1 0.7462239 0.04162240 0.6678105 0.8304426
    
    ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6b1]
    library(MCMCpack)
    partridge.mcmc = as.matrix(partridge.rstan)
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
    # A Tukeys contrast matrix
    library(multcomp)
    tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    pairwise.mat <- tuk.mat %*% Xmat
    pairwise.mat
    
                  (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
    Preg1 - None            0              1              0              0              0      0
    Preg8 - None            0              0              1              0              0      0
    Virg1 - None            0              0              0              1              0      0
    Virg8 - None            0              0              0              0              1      0
    Preg8 - Preg1           0             -1              1              0              0      0
    Virg1 - Preg1           0             -1              0              1              0      0
    Virg8 - Preg1           0             -1              0              0              1      0
    Virg1 - Preg8           0              0             -1              1              0      0
    Virg8 - Preg8           0              0             -1              0              1      0
    Virg8 - Virg1           0              0              0             -1              1      0
    
    mcmc_areas(coefs %*% t(pairwise.mat))
    
    [plot of chunk tut7.5bQ3.6c1]
    (comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term    estimate  std.error    conf.low    conf.high
    1   Preg1 - None  0.02299254 0.02408381 -0.02571685  0.067810084
    2   Preg8 - None  0.03628671 0.02437624 -0.01049910  0.083787879
    3   Virg1 - None -0.05368096 0.02366228 -0.10081880 -0.009464246
    4   Virg8 - None -0.18108684 0.02451849 -0.22869622 -0.132604319
    5  Preg8 - Preg1  0.01329417 0.02395326 -0.03264648  0.062360678
    6  Virg1 - Preg1 -0.07667351 0.02383856 -0.12247183 -0.030528752
    7  Virg8 - Preg1 -0.20407938 0.02424689 -0.25291361 -0.156602516
    8  Virg1 - Preg8 -0.08996768 0.02416122 -0.13611621 -0.041840729
    9  Virg8 - Preg8 -0.21737355 0.02391817 -0.26345342 -0.170901602
    10 Virg8 - Virg1 -0.12740587 0.02428197 -0.17411665 -0.079437471
    
    ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6c1]
    ## With back transformed
    mcmc_areas(10^(coefs %*% t(pairwise.mat)))
    
    [plot of chunk tut7.5bQ3.6c1]
    (comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term  estimate  std.error  conf.low conf.high
    1   Preg1 - None 1.0559907 0.05858701 0.9383382 1.1646937
    2   Preg8 - None 1.0888563 0.06114756 0.9758079 1.2124721
    3   Virg1 - None 0.8850402 0.04819619 0.7928321 0.9784435
    4   Virg8 - None 0.6600920 0.03724069 0.5898256 0.7359186
    5  Preg8 - Preg1 1.0326541 0.05703137 0.9215931 1.1470933
    6  Virg1 - Preg1 0.8394233 0.04615294 0.7542723 0.9321188
    7  Virg8 - Preg1 0.6260333 0.03498437 0.5583422 0.6969889
    8  Virg1 - Preg8 0.8141491 0.04529563 0.7309435 0.9081535
    9  Virg8 - Preg8 0.6071349 0.03348412 0.5447600 0.6741852
    10 Virg8 - Virg1 0.7469190 0.04185802 0.6680023 0.8307720
    
    ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    [plot of chunk tut7.5bQ3.6c1]
    library(MCMCpack)
    partridge.mcmc = as.matrix(partridge.rstanarm)
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
    # A Tukeys contrast matrix
    library(multcomp)
    tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    pairwise.mat <- tuk.mat %*% Xmat
    pairwise.mat
    
                  (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
    Preg1 - None            0              1              0              0              0      0
    Preg8 - None            0              0              1              0              0      0
    Virg1 - None            0              0              0              1              0      0
    Virg8 - None            0              0              0              0              1      0
    Preg8 - Preg1           0             -1              1              0              0      0
    Virg1 - Preg1           0             -1              0              1              0      0
    Virg8 - Preg1           0             -1              0              0              1      0
    Virg1 - Preg8           0              0             -1              1              0      0
    Virg8 - Preg8           0              0             -1              0              1      0
    Virg8 - Virg1           0              0              0             -1              1      0
    
    mcmc_areas(coefs %*% t(pairwise.mat))
    
    plot of chunk tut7.5bQ3.6d1
    (comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term    estimate  std.error    conf.low    conf.high
    1   Preg1 - None  0.02234996 0.02407488 -0.02338140  0.070865192
    2   Preg8 - None  0.03641433 0.02440350 -0.01013371  0.086076915
    3   Virg1 - None -0.05403215 0.02354898 -0.09925124 -0.006906492
    4   Virg8 - None -0.18226103 0.02398424 -0.22966723 -0.136666158
    5  Preg8 - Preg1  0.01406437 0.02412565 -0.02968572  0.065601507
    6  Virg1 - Preg1 -0.07638211 0.02392343 -0.12328881 -0.029989351
    7  Virg8 - Preg1 -0.20461099 0.02415189 -0.25341139 -0.159071623
    8  Virg1 - Preg8 -0.09044648 0.02433552 -0.13815607 -0.042642378
    9  Virg8 - Preg8 -0.21867536 0.02387648 -0.26542574 -0.170323352
    10 Virg8 - Virg1 -0.12822888 0.02393159 -0.17425325 -0.080769365
    
    ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    plot of chunk tut7.5bQ3.6d1
## Back-transformed to the response scale
    mcmc_areas(10^(coefs %*% t(pairwise.mat)))
    
    plot of chunk tut7.5bQ3.6d1
    (comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term  estimate  std.error  conf.low conf.high
    1   Preg1 - None 1.0544289 0.05852064 0.9411695 1.1700444
    2   Preg8 - None 1.0891800 0.06122090 0.9669837 1.2075192
    3   Virg1 - None 0.8843145 0.04804840 0.7933975 0.9816892
    4   Virg8 - None 0.6582655 0.03637684 0.5892950 0.7300185
    5  Preg8 - Preg1 1.0345108 0.05759814 0.9221479 1.1490921
    6  Virg1 - Preg1 0.8399958 0.04635241 0.7520730 0.9324562
    7  Virg8 - Preg1 0.6252592 0.03476928 0.5579414 0.6933115
    8  Virg1 - Preg8 0.8132717 0.04565201 0.7240084 0.9027741
    9  Virg8 - Preg8 0.6053146 0.03332360 0.5402115 0.6728358
    10 Virg8 - Virg1 0.7454694 0.04105743 0.6646395 0.8252720
    
    ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    plot of chunk tut7.5bQ3.6d1
    partridge.mcmc = as.matrix(partridge.brm)
    wch = grep("b_", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    newdata = with(partridge, expand.grid(TREATMENT = levels(TREATMENT), THORAX = mean(THORAX, na.rm = TRUE)))
    # A Tukey's contrast matrix
    library(multcomp)
    tuk.mat <- contrMat(n = table(newdata$TREATMENT), type = "Tukey")
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    pairwise.mat <- tuk.mat %*% Xmat
    pairwise.mat
    
                  (Intercept) TREATMENTPreg1 TREATMENTPreg8 TREATMENTVirg1 TREATMENTVirg8 THORAX
    Preg1 - None            0              1              0              0              0      0
    Preg8 - None            0              0              1              0              0      0
    Virg1 - None            0              0              0              1              0      0
    Virg8 - None            0              0              0              0              1      0
    Preg8 - Preg1           0             -1              1              0              0      0
    Virg1 - Preg1           0             -1              0              1              0      0
    Virg8 - Preg1           0             -1              0              0              1      0
    Virg1 - Preg8           0              0             -1              1              0      0
    Virg8 - Preg8           0              0             -1              0              1      0
    Virg8 - Virg1           0              0              0             -1              1      0
    
    mcmc_areas(coefs %*% t(pairwise.mat))
    
    plot of chunk tut7.5bQ3.6e1
    (comps = tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term    estimate  std.error     conf.low    conf.high
    1   Preg1 - None  0.02240460 0.02401104 -0.024874580  0.068487474
    2   Preg8 - None  0.03620256 0.02384339 -0.008463557  0.084372278
    3   Virg1 - None -0.05404743 0.02381586 -0.100910410 -0.008313408
    4   Virg8 - None -0.18198728 0.02386935 -0.228420724 -0.135256737
    5  Preg8 - Preg1  0.01379796 0.02421139 -0.033271153  0.061604483
    6  Virg1 - Preg1 -0.07645203 0.02403209 -0.123225559 -0.028926763
    7  Virg8 - Preg1 -0.20439189 0.02427000 -0.250323885 -0.155967121
    8  Virg1 - Preg8 -0.09024999 0.02394566 -0.136639855 -0.043432356
    9  Virg8 - Preg8 -0.21818985 0.02415905 -0.265948399 -0.171503685
    10 Virg8 - Virg1 -0.12793985 0.02412511 -0.176962850 -0.081758344
    
    ggplot(comps, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 0, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    plot of chunk tut7.5bQ3.6e1
## Back-transformed to the response scale
    mcmc_areas(10^(coefs %*% t(pairwise.mat)))
    
    plot of chunk tut7.5bQ3.6e1
    (comps1 = tidyMCMC(10^(coefs %*% t(pairwise.mat)), conf.int = TRUE, conf.method = "HPDinterval"))
    
                term  estimate  std.error  conf.low conf.high
    1   Preg1 - None 1.0545530 0.05836951 0.9443336 1.1708128
    2   Preg8 - None 1.0885725 0.05986066 0.9797715 1.2133891
    3   Virg1 - None 0.8843113 0.04850064 0.7926648 0.9810397
    4   Virg8 - None 0.6586722 0.03629164 0.5909888 0.7323914
    5  Preg8 - Preg1 1.0338855 0.05765372 0.9262513 1.1524033
    6  Virg1 - Preg1 0.8398717 0.04652671 0.7490572 0.9312345
    7  Virg8 - Preg1 0.6255845 0.03497534 0.5603392 0.6965585
    8  Virg1 - Preg8 0.8135973 0.04483752 0.7239839 0.8980175
    9  Virg8 - Preg8 0.6060138 0.03377457 0.5403503 0.6717646
    10 Virg8 - Virg1 0.7459856 0.04150406 0.6613482 0.8240819
    
    ggplot(comps1, aes(y = estimate, x = term)) + geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
        geom_hline(yintercept = 1, linetype = "dashed") + scale_y_continuous("Effect size") + scale_x_discrete("") +
        coord_flip() + theme_classic()
    
    plot of chunk tut7.5bQ3.6e1

    So (for example), the longevity of male fruitflies with 8 virgin partners is approximately 0.61 times that of male fruitflies with 8 pregnant partners. Similar contrasts to those defined in Tutorial 7.4b, Q3 could be defined to explore specific comparisons, as sketched below.
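    As a minimal illustration (assuming the coefs and newdata objects from the preceding chunk), we could contrast the average of the two virgin treatments against the average of the two pregnant treatments on the log10 scale:

    ## cell weights in the row order of newdata: None, Preg1, Preg8, Virg1, Virg8
    Xmat <- model.matrix(~TREATMENT + THORAX, data = newdata)
    wts <- c(0, -1/2, -1/2, 1/2, 1/2)
    cont.mat <- rbind(`Virgin - Pregnant` = as.vector(wts %*% Xmat))
    tidyMCMC(coefs %*% t(cont.mat), conf.int = TRUE, conf.method = "HPDinterval")
    ## and back-transformed to the response scale
    tidyMCMC(10^(coefs %*% t(cont.mat)), conf.int = TRUE, conf.method = "HPDinterval")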

  10. Explore $R^2$
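    The code below computes, for each MCMC draw, the proportion of the total variance that is attributable to the fitted values,
    $$R^2 = \frac{\mathrm{var}(\hat{y})}{\mathrm{var}(\hat{y}) + \mathrm{var}(y - \hat{y})}$$
    where the variances are taken across observations within each draw, thereby yielding a full posterior distribution for $R^2$.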
    library(MCMCpack)
    library(broom)
    partridge.mcmc <- partridge.mcmcpack
    Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.6984863 0.02600139 0.6472273 0.7451868
    
    # for comparison with frequentist
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
    
    partridge.mcmc = partridge.r2jags$BUGSoutput$sims.matrix
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.6983588 0.02576383 0.6474857 0.7445594
    
    # for comparison with frequentist
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
    
    partridge.mcmc = as.matrix(partridge.rstan)
    ## note: build Xmat from the full data set, not the five-row newdata grid;
    ## the original newdata-based matrix is why the printed estimate below
    ## (~0.38) disagrees with the ~0.70 obtained above
    Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
    wch = c(which(colnames(partridge.mcmc) == "beta0"), grep("beta\\[", colnames(partridge.mcmc)))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term estimate std.error  conf.low conf.high
    1 var1 0.376688  0.148472 0.1275442 0.6879338
    
    # for comparison with frequentist
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
    
    partridge.mcmc = as.matrix(partridge.rstanarm)
    ## as above, build Xmat from the full data (not the five-row newdata grid);
    ## this is why the printed estimate below is misleadingly low
    Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
    wch = grep("Intercept|TREATMENT|THORAX", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate std.error  conf.low conf.high
    1 var1 0.3773228 0.1478388 0.1241807 0.6818132
    
    # for comparison with frequentist
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
    
    partridge.mcmc = as.matrix(partridge.brm)
    ## as above, build Xmat from the full data (not the five-row newdata grid);
    ## this is why the printed estimate below is misleadingly low
    Xmat = model.matrix(~TREATMENT + THORAX, data = partridge)
    wch = grep("b_", colnames(partridge.mcmc))
    coefs = partridge.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, log10(partridge$LONGEV), "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate std.error  conf.low conf.high
    1 var1 0.3771413 0.1483151 0.1227373 0.6817201
    
    # for comparison with frequentist
    summary(lm(log10(LONGEV) ~ TREATMENT + THORAX, data = partridge))
    
    Call:
    lm(formula = log10(LONGEV) ~ TREATMENT + THORAX, data = partridge)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.226736 -0.058445 -0.003469  0.059961  0.170389 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)     0.79095    0.08443   9.368 5.89e-16 ***
    TREATMENTPreg1  0.02260    0.02368   0.954   0.3419    
    TREATMENTPreg8  0.03648    0.02385   1.530   0.1287    
    TREATMENTVirg1 -0.05381    0.02366  -2.275   0.0247 *  
    TREATMENTVirg8 -0.18165    0.02392  -7.592 7.79e-12 ***
    THORAX          1.19385    0.09900  12.060  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.08364 on 119 degrees of freedom
    Multiple R-squared:  0.7055,	Adjusted R-squared:  0.6932 
    F-statistic: 57.02 on 5 and 119 DF,  p-value: < 2.2e-16
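    As a cross-check for the rstanarm (and brms) fits, the packages supply a bayes_R2() convenience function that computes a closely related quantity (a sketch, assuming the partridge.rstanarm fit from above; its definition differs slightly, so the estimates need not match exactly):

    library(rstanarm)
    r2 <- bayes_R2(partridge.rstanarm)  # one R-squared value per posterior draw
    tidyMCMC(as.mcmc(r2), conf.int = TRUE, conf.method = "HPDinterval")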
    

Heterogeneous slopes

Constable (1993) compared the inter-radial suture widths of urchins maintained on one of three food regimes (Initial: no additional food supplied above what was in the initial sample; Low: food supplied periodically; High: food supplied ad libitum). In an attempt to control for substantial variability in urchin sizes, the initial body volume of each urchin was measured as a covariate.

Download Constable data set
Format of constable.csv data file

TREAT    IV    SUTW
Initial  3.5   0.010
Initial  5.0   0.020
Initial  8.0   0.061
...      ...   ...

TREAT: Categorical listing of the food treatment (Initial: no additional food supplied above what was in the initial sample; Low: food supplied periodically; High: food supplied ad libitum)
IV: Continuous covariate, the initial volume of the urchin
SUTW: Width of the suture (the response variable)
Sea Urchin

Open the constable data file.
constable <- read.csv("../downloads/data/constable.csv", strip.white = T)
head(constable)
    TREAT   IV  SUTW
1 Initial  3.5 0.010
2 Initial  5.0 0.020
3 Initial  8.0 0.061
4 Initial 10.0 0.051
5 Initial 13.0 0.041
6 Initial 13.0 0.061
  1. Perform basic exploratory data analysis to examine:
    • normality
    • homogeneity of variance
    • homogeneity of slopes (and equality of covariate ranges)
    ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth() + geom_point() + theme_classic()
    
    plot of chunk ws7.5bQ2.2
    # Or with linear smoother
    ggplot(constable, aes(y = SUTW, x = IV, color = TREAT)) + geom_smooth(method = "lm") + geom_point() +
        theme_classic()
    
    plot of chunk ws7.5bQ2.2
  2. It appears likely that the homogeneity of slopes assumption has been violated - the slope associated with the Low treatment appears less steep than the slopes associated with either the High or Initial treatments. Although the range of the continuous covariate (IV) is slightly narrower for the High treatment (compared to the Initial and Low treatments), the difference is not dramatic (and appears broadly symmetrical); a quick numerical check of the ranges is shown below.
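    with(constable, tapply(IV, TREAT, range))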

  3. To obtain a quick sense of the likely residuals, let's explore the residuals from a simple linear model fitted in a frequentist framework.
    plot(lm(SUTW ~ IV * TREAT, data = constable))
    
    plot of chunk ws7.5bQ2.3
    plot of chunk ws7.5bQ2.3
    plot of chunk ws7.5bQ2.3
    plot of chunk ws7.5bQ2.3
  4. Constable applied a cube-root transformation to IV to improve normality and linearity. For consistency's sake, we will do the same.

    plot(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    plot of chunk ws7.5bQ2.4
    plot of chunk ws7.5bQ2.4
    plot of chunk ws7.5bQ2.4
    plot of chunk ws7.5bQ2.4
    Arguably this is only a marginal improvement, and it adds some complexity when presenting the outcomes.

  5. Given the likely heterogeneity of slopes, it is arguably best to fit a model that permits group-specific slopes (that is, one that includes the IV by TREAT interaction). Fit the appropriate Bayesian model to explore the effect of treatment and initial volume (and their interaction) on the width of sutures. $$ \begin{align} y_{ij} &\sim{} N(\mu_{ij}, \sigma)\\ \mu_{ij} &= \beta_0 + \mathbf{X}\boldsymbol{\beta}\\[1em] \beta_0 &\sim{} N(0,10)\\ \beta &\sim{} N(0,10)\\ \sigma &\sim{} cauchy(0,5)\\ \end{align} $$
    library(MCMCpack)
    constable.mcmcpack = MCMCregress(SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    modelString = "
      model {
      #Likelihood
      for (i in 1:n) {
      y[i]~dnorm(mu[i],tau)
      mu[i] <- beta0 + inprod(beta[],X[i,])
      }
      #Priors
      beta0 ~ dnorm(0.01,1.0E-6)
      for (j in 1:nX) {
      beta[j] ~ dnorm(0.01,1.0E-6)
      }
      tau <- 1 / (sigma * sigma)
      sigma~dunif(0,100)
      }
      "
    
    X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    constable.list <- with(constable, list(y = SUTW, X = X[, -1], nX = ncol(X) -
        1, n = nrow(constable)))
    
    params <- c("beta0", "beta", "sigma")
    burnInSteps = 3000
    nChains = 3
    numSavedSteps = 15000
    thinSteps = 10
    nIter = ceiling((numSavedSteps * thinSteps)/nChains)
    
    constable.r2jags <- jags(data = constable.list, inits = NULL, parameters.to.save = params,
        model.file = textConnection(modelString), n.chains = nChains, n.iter = nIter,
        n.burnin = burnInSteps, n.thin = thinSteps)
    
    Compiling model graph
       Resolving undeclared variables
       Allocating nodes
    Graph information:
       Observed stochastic nodes: 72
       Unobserved stochastic nodes: 7
       Total graph size: 645
    
    Initializing model
    
    modelString = "
      data {
      int n;               // total number of observations
      vector[n] Y;         // response variable
      int nX;              // number of effects
      matrix[n, nX] X;     // model matrix
      }
      transformed data {
      matrix[n, nX - 1] Xc;    // centered version of X
      vector[nX - 1] means_X;  // column means of X before centering

      for (i in 2:nX) {
      means_X[i - 1] = mean(X[, i]);
      Xc[, i - 1] = X[, i] - means_X[i - 1];
      }
      }
      parameters {
      vector[nX-1] beta;       // population-level effects
      real cbeta0;             // center-scale intercept
      real<lower=0> sigma;     // residual SD (constrained positive)
      }
      transformed parameters {
      }
      model {
      vector[n] mu;
      mu = Xc * beta + cbeta0;
      // prior specifications
      beta ~ normal(0, 10);
      cbeta0 ~ normal(0, 10);
      sigma ~ cauchy(0, 5);
      // likelihood contribution
      Y ~ normal(mu, sigma);
      }
      generated quantities {
      real beta0;          // population-level intercept
      vector[n] log_lik;   // pointwise log-likelihood (for loo)
      beta0 = cbeta0 - dot_product(means_X, beta);
      for (i in 1:n) {
      log_lik[i] = normal_lpdf(Y[i] | Xc[i] * beta + cbeta0, sigma);
      }
      }
      "
    
    X = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    constable.list <- with(constable, list(Y = SUTW, X = X, nX = ncol(X), n = nrow(constable)))
    
    library(rstan)
    constable.rstan <- stan(data = constable.list, model_code = modelString, chains = 3, iter = 5000, warmup = 500,
        thin = 5)
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 1).
    
    Gradient evaluation took 1.9e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.19 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.144418 seconds (Warm-up)
                   0.984664 seconds (Sampling)
                   1.12908 seconds (Total)
    
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 2).
    
    Gradient evaluation took 1e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.1 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.140768 seconds (Warm-up)
                   1.06787 seconds (Sampling)
                   1.20863 seconds (Total)
    
    
    SAMPLING FOR MODEL 'd98dbf6a02725fc3fce11306b77873e9' NOW (CHAIN 3).
    
    Gradient evaluation took 1.6e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
    Adjust your expectations accordingly!
    
    
    Iteration:    1 / 5000 [  0%]  (Warmup)
    Iteration:  500 / 5000 [ 10%]  (Warmup)
    Iteration:  501 / 5000 [ 10%]  (Sampling)
    Iteration: 1000 / 5000 [ 20%]  (Sampling)
    Iteration: 1500 / 5000 [ 30%]  (Sampling)
    Iteration: 2000 / 5000 [ 40%]  (Sampling)
    Iteration: 2500 / 5000 [ 50%]  (Sampling)
    Iteration: 3000 / 5000 [ 60%]  (Sampling)
    Iteration: 3500 / 5000 [ 70%]  (Sampling)
    Iteration: 4000 / 5000 [ 80%]  (Sampling)
    Iteration: 4500 / 5000 [ 90%]  (Sampling)
    Iteration: 5000 / 5000 [100%]  (Sampling)
    
     Elapsed Time: 0.15773 seconds (Warm-up)
                   1.077 seconds (Sampling)
                   1.23473 seconds (Total)
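    Because the Stan model stores pointwise log-likelihoods (log_lik) in its generated quantities block, leave-one-out cross-validation is also available (a sketch, assuming the constable.rstan fit from above):

    library(loo)
    (constable.loo = loo(extract_log_lik(constable.rstan)))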
    
    library(rstanarm)
    constable.rstanarm = stan_glm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable,
        iter = 5000, warmup = 500, chains = 3, thin = 5, refresh = 0, prior_intercept = normal(0,
            10), prior = normal(0, 10), prior_aux = cauchy(0, 5))
    
    Gradient evaluation took 2.8e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.28 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.276502 seconds (Warm-up)
                   1.72962 seconds (Sampling)
                   2.00612 seconds (Total)
    
    
    Gradient evaluation took 1.3e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.256832 seconds (Warm-up)
                   1.64027 seconds (Sampling)
                   1.8971 seconds (Total)
    
    
    Gradient evaluation took 1.3e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.263878 seconds (Warm-up)
                   1.5121 seconds (Sampling)
                   1.77597 seconds (Total)
    
    library(brms)
    constable.brm = brm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable, iter = 5000,
        warmup = 500, chains = 3, thin = 5, refresh = 0, prior = c(prior(normal(0,
            10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0,
            5), class = "sigma")))
    
    Gradient evaluation took 1.6e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.143567 seconds (Warm-up)
                   1.0836 seconds (Sampling)
                   1.22716 seconds (Total)
    
    
    Gradient evaluation took 9e-06 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.09 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.133804 seconds (Warm-up)
                   1.12441 seconds (Sampling)
                   1.25822 seconds (Total)
    
    
    Gradient evaluation took 1.1e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.139708 seconds (Warm-up)
                   1.10982 seconds (Sampling)
                   1.24953 seconds (Total)
    
  6. Explore MCMC diagnostics
    library(MCMCpack)
    plot(constable.mcmcpack)
    
    plot of chunk tut7.5bQ2.6a
    plot of chunk tut7.5bQ2.6a
    raftery.diag(constable.mcmcpack)
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                                    
                              Burn-in  Total Lower bound  Dependence
                              (M)      (N)   (Nmin)       factor (I)
     (Intercept)              2        3865  3746         1.030     
     I(IV^(1/3))              2        3650  3746         0.974     
     TREATInitial             2        3710  3746         0.990     
     TREATLow                 2        3710  3746         0.990     
     I(IV^(1/3)):TREATInitial 2        3994  3746         1.070     
     I(IV^(1/3)):TREATLow     2        3897  3746         1.040     
     sigma2                   2        3929  3746         1.050     
    
    autocorr.diag(constable.mcmcpack)
    
            (Intercept)  I(IV^(1/3)) TREATInitial      TREATLow I(IV^(1/3)):TREATInitial
    Lag 0   1.000000000  1.000000000  1.000000000  1.0000000000              1.000000000
    Lag 1  -0.005949108 -0.008338260 -0.008217864  0.0004490301             -0.010759741
    Lag 5   0.020882211  0.023896208 -0.003101549  0.0115059926              0.001748665
    Lag 10  0.015981305  0.019153534 -0.002528803  0.0022271117              0.001748848
    Lag 50  0.007896860  0.005767901  0.008420514 -0.0058100440              0.003191181
           I(IV^(1/3)):TREATLow       sigma2
    Lag 0          1.0000000000 1.0000000000
    Lag 1         -0.0006921279 0.0907121678
    Lag 5          0.0146699538 0.0005388206
    Lag 10         0.0066382223 0.0170684058
    Lag 50        -0.0064255545 0.0075084152
    
    library(R2jags)
    library(coda)
    constable.mcmc = as.mcmc(constable.r2jags)
    plot(constable.mcmc)
    
    plot of chunk tut7.5bQ2.6b
    plot of chunk tut7.5bQ2.6b
    raftery.diag(constable.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    20       38330 3746         10.20     
     beta[1]  10       37670 3746         10.10     
     beta[2]  10       37660 3746         10.10     
     beta[3]  20       37020 3746          9.88     
     beta[4]  20       38330 3746         10.20     
     beta[5]  20       39000 3746         10.40     
     deviance 10       37660 3746         10.10     
     sigma    20       37020 3746          9.88     
    
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    20       37020 3746          9.88     
     beta[1]  20       36380 3746          9.71     
     beta[2]  10       37660 3746         10.10     
     beta[3]  20       37020 3746          9.88     
     beta[4]  20       36380 3746          9.71     
     beta[5]  20       36380 3746          9.71     
     deviance 20       38330 3746         10.20     
     sigma    20       38330 3746         10.20     
    
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
                                                    
              Burn-in  Total Lower bound  Dependence
              (M)      (N)   (Nmin)       factor (I)
     beta0    20       38330 3746         10.20     
     beta[1]  20       39000 3746         10.40     
     beta[2]  10       37660 3746         10.10     
     beta[3]  10       37660 3746         10.10     
     beta[4]  20       38330 3746         10.20     
     beta[5]  20       36380 3746          9.71     
     deviance 10       37660 3746         10.10     
     sigma    20       36380 3746          9.71     
    
    autocorr.diag(constable.mcmc)
    
                    beta0       beta[1]      beta[2]       beta[3]      beta[4]       beta[5]
    Lag 0    1.0000000000  1.0000000000  1.000000000  1.0000000000  1.000000000  1.0000000000
    Lag 10  -0.0008583855 -0.0008855223  0.005002736  0.0044326527  0.005410357  0.0049190350
    Lag 50  -0.0046942718 -0.0052147927 -0.002716165  0.0001732884 -0.003266711  0.0001285954
    Lag 100  0.0073399128  0.0063289399  0.005797145  0.0107637633  0.004719665  0.0102551639
    Lag 500  0.0009396995  0.0022012866 -0.004393656 -0.0112211954 -0.003541198 -0.0096620252
                deviance        sigma
    Lag 0    1.000000000  1.000000000
    Lag 10   0.003079761 -0.004958203
    Lag 50  -0.013198225 -0.008532456
    Lag 100 -0.006321487 -0.014973307
    Lag 500  0.007831084 -0.006825210
    
    library(rstan)
    library(coda)
    s = as.array(constable.rstan)
    ## the model has five slope terms (beta[1]-beta[5]); referencing a
    ## non-existent beta[6] triggers a 'subscript out of bounds' error
    constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , c("beta0", "beta[1]", "beta[2]", "beta[3]",
        "beta[4]", "beta[5]", "sigma")], 2, as.mcmc))
    
    plot(constable.mcmc)
    
    plot of chunk tut7.5bQ2.6c
    plot of chunk tut7.5bQ2.6c
    raftery.diag(constable.mcmc)
    
    autocorr.diag(constable.mcmc)
    
    
    library(rstan)
    library(coda)
    stan_ac(constable.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ2.6c1
    stan_rhat(constable.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ2.6c1
    stan_ess(constable.rstan, pars = c("beta0", "beta", "sigma"))
    
    plot of chunk tut7.5bQ2.6c1
# using bayesplot
    library(bayesplot)
    detach("package:reshape")
    mcmc_trace(as.array(constable.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.6c2
    mcmc_dens(as.array(constable.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.6c2
    mcmc_combo(as.array(constable.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.6c3
    library(rstanarm)
    library(coda)
    s = as.array(constable.rstanarm)
    colnames(as.matrix(constable.rstanarm))
    
    [1] "(Intercept)"              "I(IV^(1/3))"              "TREATInitial"            
    [4] "TREATLow"                 "I(IV^(1/3)):TREATInitial" "I(IV^(1/3)):TREATLow"    
    [7] "sigma"                   
    
    constable.mcmc <- do.call(mcmc.list, plyr:::alply(s[, , colnames(as.matrix(constable.rstanarm))], 2,
        as.mcmc))
    plot(constable.mcmc)
    
    plot of chunk tut7.5bQ2.6d
    plot of chunk tut7.5bQ2.6d
    raftery.diag(constable.mcmc)
    
    $`1`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`2`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    $`3`
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    autocorr.diag(constable.mcmc)
    
            (Intercept)  I(IV^(1/3)) TREATInitial     TREATLow I(IV^(1/3)):TREATInitial
    Lag 0   1.000000000  1.000000000   1.00000000  1.000000000              1.000000000
    Lag 1   0.094390624  0.093078031   0.09427246  0.088553480              0.094495420
    Lag 5   0.014550240  0.017891288   0.02036189 -0.003087282              0.025476705
    Lag 10 -0.018253834 -0.015818047  -0.01071217 -0.007707348             -0.006127881
    Lag 50  0.003953687  0.002826433   0.01488185  0.010416070              0.012186204
           I(IV^(1/3)):TREATLow        sigma
    Lag 0           1.000000000  1.000000000
    Lag 1           0.085966027  0.019297999
    Lag 5           0.004263184 -0.001089781
    Lag 10         -0.007911326  0.013127724
    Lag 50          0.002241020  0.019541951
    
    library(rstanarm)
    library(coda)
    stan_ac(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d1
    stan_rhat(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d1
    stan_ess(constable.rstanarm, regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d1
# using bayesplot
    library(bayesplot)
    detach("package:reshape")
    mcmc_trace(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d2
    mcmc_dens(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d2
    mcmc_combo(as.array(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.6d3
    library(rstanarm)
    posterior_vs_prior(constable.rstanarm, color_by = "vs", group_by_parameter = TRUE,
        facet_args = list(scales = "free_y"))
    
    Gradient evaluation took 2.2e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.22 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.048119 seconds (Warm-up)
                   0.080688 seconds (Sampling)
                   0.128807 seconds (Total)
    
    
    Gradient evaluation took 1.5e-05 seconds
    1000 transitions using 10 leapfrog steps per transition would take 0.15 seconds.
    Adjust your expectations accordingly!
    
    
    
     Elapsed Time: 0.038981 seconds (Warm-up)
                   0.124669 seconds (Sampling)
                   0.16365 seconds (Total)
    
    plot of chunk tut7.5bQ2.6d4
    library(coda)
    library(brms)
    constable.mcmc = as.mcmc(constable.brm)
    plot(constable.mcmc)
    
    plot of chunk tut7.5bQ2.6e
    plot of chunk tut7.5bQ2.6e
    raftery.diag(constable.mcmc)
    
    [[1]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    [[2]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    [[3]]
    
    Quantile (q) = 0.025
    Accuracy (r) = +/- 0.005
    Probability (s) = 0.95 
    
    You need a sample size of at least 3746 with these values of q, r and s
    
    autocorr.diag(constable.mcmc)
    
    Error in ts(x, start = start(x), end = end(x), deltat = thin(x)): invalid time series parameters specified
    
    autocorr.diag() fails on the mcmc object returned for the brms fit, so we instead work with the underlying stanfit object below.
    
    library(coda)
    stan_ac(constable.brm$fit)
    
    plot of chunk tut7.5bQ2.6e1
    stan_rhat(constable.brm$fit)
    
    plot of chunk tut7.5bQ2.6e1
    stan_ess(constable.brm$fit)
    
    plot of chunk tut7.5bQ2.6e1
  7. The chains appear well mixed and have converged on what is likely to be a stable posterior.
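    As an additional numerical check, effective sample sizes can be computed directly with coda (a sketch using the MCMCpack samples; analogous checks apply to the other fits):

    library(coda)
    effectiveSize(constable.mcmcpack)  # ideally close to the 10000 retained draws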

  8. Perform model validation
    library(MCMCpack)
    constable.mcmc = as.data.frame(constable.mcmcpack)
    # generate a model matrix
    newdata = constable
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = apply(constable.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = constable$SUTW - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ2.7a1
    # Plot residuals against TREAT level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ2.7a2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ2.7a3
    library(MCMCpack)
    constable.mcmc = as.matrix(constable.mcmcpack)
    # generate a model matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = (coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
        fit[i, ], sqrt(constable.mcmc[i, "sigma2"])))
    newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
        gather(key = Sample, value = Value, -TREAT, -IV)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
    
    plot of chunk tut7.5bQ2.7a4
    ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
        color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = IV, group = IV, color = TREAT))
    
    plot of chunk tut7.5bQ2.7a4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")
    
    plot of chunk tut7.5bQ2.7a5
    mcmc_areas(as.matrix(constable.mcmcpack), regex_pars = "TREAT|IV")
    
    plot of chunk tut7.5bQ2.7a5
    constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
    # generate a model matrix
    newdata = constable
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = apply(constable.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = constable$SUTW - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ2.7b1
    # Plot residuals against TREAT level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ2.7b2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ2.7b3
    constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
    # generate a model matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = (coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
        fit[i, ], constable.mcmc[i, "sigma"]))
    newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
        gather(key = Sample, value = Value, -TREAT, -IV)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
    
    plot of chunk tut7.5bQ2.7b4
    ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
        color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = IV, group = IV, color = TREAT))
    
    plot of chunk tut7.5bQ2.7b4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.7b5
    mcmc_areas(constable.r2jags$BUGSoutput$sims.matrix, regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.7b5
    constable.mcmc = as.matrix(constable.rstan)
    # generate a model matrix
    newdata = constable
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = apply(constable.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = constable$SUTW - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ2.7c1
    # Plot residuals against TREAT level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ2.7c2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ2.7c3
    constable.mcmc = as.matrix(constable.rstan)
    # generate a model matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = (coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
        fit[i, ], constable.mcmc[i, "sigma"]))
    newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
        gather(key = Sample, value = Value, -TREAT, -IV)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
    
    plot of chunk tut7.5bQ2.7c4
    ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
        color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = IV, group = IV, color = TREAT))
    
    plot of chunk tut7.5bQ2.7c4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(constable.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.7c5
    mcmc_areas(as.matrix(constable.rstan), regex_pars = "beta|sigma")
    
    plot of chunk tut7.5bQ2.7c5
    constable.mcmc = as.matrix(constable.rstanarm)
    # generate a model matrix
    newdata = constable
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = apply(constable.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = (constable$SUTW) - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ2.7d1
    # Plot residuals against TREAT level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ2.7d2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ2.7d3
    constable.mcmc = as.matrix(constable.rstanarm)
    # generate a model matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = (coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
        fit[i, ], constable.mcmc[i, "sigma"]))
    newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
        gather(key = Sample, value = Value, -TREAT, -IV)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
    
    plot of chunk tut7.5bQ2.7d4
    ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
        color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = IV, group = IV, color = TREAT))
    
    plot of chunk tut7.5bQ2.7d4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.7d5
    mcmc_areas(as.matrix(constable.rstanarm), regex_pars = "Intercept|TREAT|IV|sigma")
    
    plot of chunk tut7.5bQ2.7d5
    constable.mcmc = as.matrix(constable.brm)
    # generate a model matrix
    newdata = constable
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("b_", colnames(constable.mcmc))
    coefs = apply(constable.mcmc[, wch], 2, median)
    fit = as.vector(coefs %*% t(Xmat))
    resid = constable$SUTW - fit
    ggplot() + geom_point(data = NULL, aes(y = resid, x = fit))
    
    plot of chunk tut7.5bQ2.7e1
    # Plot residuals against TREAT level
    newdata = newdata %>% cbind(fit, resid)
    newdata.melt = newdata %>% gather(key = Pred, value = Value, TREAT, IV)
    ggplot(newdata.melt) + geom_point(aes(y = resid, x = Value)) + facet_wrap(~Pred, scales = "free")
    
    plot of chunk tut7.5bQ2.7e2
    # Standardized residuals
    sresid = resid/sd(resid)
    ggplot() + geom_point(data = NULL, aes(y = sresid, x = fit))
    
    plot of chunk tut7.5bQ2.7e3
    constable.mcmc = as.matrix(constable.brm)
    # generate a model matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, newdata)
    ## get median parameter estimates
    wch = grep("b_", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = (coefs %*% t(Xmat))
    ## draw samples from this model
    yRep = sapply(1:nrow(constable.mcmc), function(i) rnorm(nrow(constable),
        fit[i, ], constable.mcmc[i, "sigma"]))
    newdata = data.frame(TREAT = constable$TREAT, IV = constable$IV, yRep) %>%
        gather(key = Sample, value = Value, -TREAT, -IV)
    ggplot(newdata) + geom_violin(aes(y = Value, x = TREAT, fill = "Model"),
        alpha = 0.5) + geom_violin(data = constable, aes(y = SUTW, x = TREAT,
        fill = "Obs"), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = TREAT), position = position_jitter(width = 0.1, height = 0), color = "black")
    
    plot of chunk tut7.5bQ2.7e4
    ggplot(newdata) + geom_violin(aes(y = Value, x = IV, fill = "Model", group = IV,
        color = TREAT), alpha = 0.5) + geom_point(data = constable, aes(y = SUTW,
        x = IV, group = IV, color = TREAT))
    
    plot of chunk tut7.5bQ2.7e4

    We can also explore the posteriors of each parameter.

    library(bayesplot)
    mcmc_intervals(as.matrix(constable.brm), regex_pars = "b_|sigma")
    
    plot of chunk tut7.5bQ2.7e5
    mcmc_areas(as.matrix(constable.brm), regex_pars = "b_|sigma")
    
    plot of chunk tut7.5bQ2.7e5
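    A more compact posterior predictive check is also available through bayesplot (a sketch, assuming the yRep matrix from the preceding chunk):

    library(bayesplot)
    ## overlay the densities of 100 replicated data sets on the observed SUTW density
    ppc_dens_overlay(y = constable$SUTW, yrep = t(yRep)[1:100, ])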
  9. The model validation diagnostics all seem reasonable, suggesting that the model is likely to be reliable.

  10. Explore parameter estimates
    library(MCMCpack)
    summary(constable.mcmcpack)
    
    Iterations = 1001:11000
    Thinning interval = 1 
    Number of chains = 1 
    Sample size per chain = 10000 
    
    1. Empirical mean and standard deviation for each variable,
       plus standard error of the mean:
    
                                   Mean        SD  Naive SE Time-series SE
    (Intercept)              -0.0281407 3.212e-02 3.212e-04      3.212e-04
    I(IV^(1/3))               0.0467453 1.256e-02 1.256e-04      1.277e-04
    TREATInitial             -0.0004898 3.950e-02 3.950e-04      3.950e-04
    TREATLow                  0.0627394 3.903e-02 3.903e-04      3.903e-04
    I(IV^(1/3)):TREATInitial -0.0113577 1.512e-02 1.512e-04      1.512e-04
    I(IV^(1/3)):TREATLow     -0.0376004 1.480e-02 1.480e-04      1.480e-04
    sigma2                    0.0004486 8.176e-05 8.176e-07      8.955e-07
    
    2. Quantiles for each variable:
    
                                   2.5%        25%        50%        75%      97.5%
    (Intercept)              -0.0904732 -0.0498242 -0.0280381 -0.0068701  0.0348912
    I(IV^(1/3))               0.0220741  0.0384528  0.0468257  0.0552233  0.0709941
    TREATInitial             -0.0774232 -0.0271221 -0.0004452  0.0258165  0.0769013
    TREATLow                 -0.0146895  0.0369222  0.0626721  0.0889185  0.1394793
    I(IV^(1/3)):TREATInitial -0.0409856 -0.0213530 -0.0114781 -0.0011979  0.0179675
    I(IV^(1/3)):TREATLow     -0.0663009 -0.0474367 -0.0376417 -0.0278228 -0.0084968
    sigma2                    0.0003162  0.0003897  0.0004405  0.0004971  0.0006353
    
    library(broom)
    tidyMCMC(constable.mcmcpack, conf.int = TRUE, conf.method = "HPDinterval")
    
                          term      estimate    std.error      conf.low     conf.high
    1              (Intercept) -0.0281407318 3.211710e-02 -0.0921183584  0.0328974751
    2              I(IV^(1/3))  0.0467453220 1.256260e-02  0.0216638045  0.0704309315
    3             TREATInitial -0.0004897981 3.950309e-02 -0.0768221293  0.0772328696
    4                 TREATLow  0.0627394109 3.902797e-02 -0.0164210766  0.1365547275
    5 I(IV^(1/3)):TREATInitial -0.0113576982 1.511950e-02 -0.0399768488  0.0185830940
    6     I(IV^(1/3)):TREATLow -0.0376004095 1.480079e-02 -0.0661201076 -0.0083137914
    7                   sigma2  0.0004485999 8.176199e-05  0.0003025742  0.0006110242
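    Because the interaction coefficients represent deviations from the slope of the reference (High) treatment, the treatment-specific slopes can be derived directly from the posterior samples (a sketch using the MCMCpack samples):

    coefs = as.matrix(constable.mcmcpack)
    slopes = cbind(High = coefs[, "I(IV^(1/3))"], Initial = coefs[, "I(IV^(1/3))"] +
        coefs[, "I(IV^(1/3)):TREATInitial"], Low = coefs[, "I(IV^(1/3))"] + coefs[,
        "I(IV^(1/3)):TREATLow"])
    tidyMCMC(slopes, conf.int = TRUE, conf.method = "HPDinterval")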
    
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    terms = attr(Xmat, "assign")
    ## Effect of each model term
    for (i in 1:length(terms)) mcmcpvalue(constable.mcmcpack[, i])
    ## Effect of covariate
    mcmcpvalue(constable.mcmcpack[, which(terms == 1)])
    
    [1] 6e-04
    
    ## Effect of treatment (marginal)
    mcmcpvalue(constable.mcmcpack[, which(terms == 2)])
    
    [1] 0.096
    
    ## Effect of interaction
    mcmcpvalue(constable.mcmcpack[, which(terms == 3)])
    
    [1] 0.0165
    
    # Overall model
    wch = grep("TREAT|IV", colnames(constable.mcmcpack))
    mcmcpvalue(constable.mcmcpack[, wch])
    
    [1] 0
    
    ## Frequentist for comparison
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    print(constable.r2jags)
    
    Inference for Bugs model at "5", fit using jags,
     3 chains, each with 50000 iterations (first 3000 discarded), n.thin = 10
     n.sims = 14100 iterations saved
              mu.vect sd.vect     2.5%      25%      50%      75%    97.5%  Rhat n.eff
    beta[1]     0.047   0.012    0.022    0.039    0.047    0.055    0.071 1.001 14000
    beta[2]     0.000   0.039   -0.077   -0.026    0.000    0.025    0.076 1.001 14000
    beta[3]     0.063   0.038   -0.013    0.037    0.063    0.088    0.139 1.001 14000
    beta[4]    -0.011   0.015   -0.040   -0.021   -0.012   -0.001    0.018 1.001 14000
    beta[5]    -0.038   0.015   -0.067   -0.047   -0.038   -0.028   -0.009 1.001 14000
    beta0      -0.028   0.032   -0.090   -0.049   -0.028   -0.007    0.035 1.001 14000
    sigma       0.021   0.002    0.018    0.020    0.021    0.022    0.025 1.001  6700
    deviance -354.340   3.998 -360.045 -357.282 -355.026 -352.167 -344.666 1.001 14000
    
    For each parameter, n.eff is a crude measure of effective sample size,
    and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
    
    DIC info (using the rule, pD = var(deviance)/2)
    pD = 8.0 and DIC = -346.3
    DIC is an estimate of expected predictive error (lower deviance is better).
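
    The reported DIC can be reproduced by hand from the saved deviance samples: pD = var(deviance)/2 ≈ 8.0, and DIC = mean(deviance) + pD ≈ -354.3 + 8.0 ≈ -346.3.

    ## Recompute pD and DIC from the posterior deviance samples
    dev = constable.r2jags$BUGSoutput$sims.matrix[, "deviance"]
    var(dev)/2              # pD, approximately 8.0
    mean(dev) + var(dev)/2  # DIC, approximately -346.3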
    
    library(broom)
    tidyMCMC(constable.r2jags$BUGSoutput$sims.matrix, conf.int = TRUE, conf.method = "HPDinterval")
    
          term      estimate   std.error      conf.low     conf.high
    1  beta[1]  4.670045e-02 0.012301012    0.02305671  7.161889e-02
    2  beta[2] -4.557708e-04 0.038871071   -0.07545077  7.762928e-02
    3  beta[3]  6.281161e-02 0.038401967   -0.01312925  1.383286e-01
    4  beta[4] -1.136903e-02 0.014860096   -0.04083542  1.740297e-02
    5  beta[5] -3.762094e-02 0.014575422   -0.06722313 -9.735764e-03
    6    beta0 -2.801120e-02 0.031515102   -0.08943297  3.500509e-02
    7 deviance -3.543402e+02 3.997734559 -360.86935897 -3.465127e+02
    8    sigma  2.089732e-02 0.001885792    0.01735789  2.460693e-02
    
    constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    constable.mcmc = constable.mcmc[, wch]
    str(constable.mcmc)
    
     num [1:14100, 1:6] -0.0208 -0.00522 -0.01208 -0.00336 0.07698 ...
     - attr(*, "dimnames")=List of 2
      ..$ : NULL
      ..$ : chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...
    
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    terms = attr(Xmat, "assign")
    ## Each model term
    for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
    
    [1] 0.3675887
    [1] 0.000141844
    [1] 0.9907801
    [1] 0.1035461
    [1] 0.4405674
    [1] 0.01078014
    
    ## Effect of covariate
    mcmcpvalue(constable.mcmc[, which(terms == 1)])
    
    [1] 0.000141844
    
    ## Effect of treatment (marginal)
    mcmcpvalue(constable.mcmc[, which(terms == 2)])
    
    [1] 0.08524823
    
    ## Effect of interaction
    mcmcpvalue(constable.mcmc[, which(terms == 3)])
    
    [1] 0.01297872
    
    # Overall model
    wch = grep("beta\\[", colnames(constable.mcmc))
    mcmcpvalue(constable.mcmc[, wch])
    
    [1] 0
    
    ## Frequentist for comparison
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    library(broom)
    tidyMCMC(constable.rstan, conf.int = TRUE, conf.method = "HPDinterval", pars = c("beta0", "beta", "sigma"),
        ess = TRUE, rhat = TRUE)
    
         term      estimate   std.error    conf.low   conf.high      rhat  ess
    1   beta0 -0.0278369985 0.032630197 -0.08856619  0.03976331 1.0011292 1778
    2 beta[1]  0.0466281311 0.012731092  0.01859167  0.06908683 1.0010891 2018
    3 beta[2] -0.0007384085 0.039219652 -0.07668034  0.07500954 0.9997629 2152
    4 beta[3]  0.0618907446 0.039193081 -0.01722613  0.13859100 1.0000331 1810
    5 beta[4] -0.0112233626 0.014999941 -0.04061147  0.01841634 0.9998709 2106
    6 beta[5] -0.0372331205 0.014919054 -0.06591119 -0.00682343 1.0001722 1767
    7   sigma  0.0208891691 0.001872266  0.01713790  0.02446211 1.0004836 2571
    
    constable.mcmc = as.matrix(constable.rstan)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    constable.mcmc = constable.mcmc[, wch]
    str(constable.mcmc)
    
     num [1:2700, 1:6] -0.097 -0.0165 -0.0329 -0.0255 0.0274 ...
     - attr(*, "dimnames")=List of 2
      ..$ iterations: NULL
      ..$ parameters: chr [1:6] "beta0" "beta[1]" "beta[2]" "beta[3]" ...
    
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    terms = attr(Xmat, "assign")
    ## Each model term
    for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
    
    [1] 0.387037
    [1] 0.0007407407
    [1] 0.987037
    [1] 0.1166667
    [1] 0.422963
    [1] 0.01444444
    
    ## Effect of covariate
    mcmcpvalue(constable.mcmc[, which(terms == 1)])
    
    [1] 0.0007407407
    
    ## Effect of treatment (marginal)
    mcmcpvalue(constable.mcmc[, which(terms == 2)])
    
    [1] 0.0962963
    
    ## Effect of interaction
    mcmcpvalue(constable.mcmc[, which(terms == 3)])
    
    [1] 0.02074074
    
    # Overall model
    wch = grep("beta\\[", colnames(constable.mcmc))
    mcmcpvalue(constable.mcmc[, wch])
    
    [1] 0
    
    ## Frequentist for comparison
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    summary(constable.rstan)
    
    $summary
                         mean      se_mean          sd         2.5%          25%           50%
    beta[1]      4.662813e-02 2.834136e-04 0.012731092   0.02031882   0.03842780  4.691583e-02
    beta[2]     -7.384085e-04 8.455188e-04 0.039219652  -0.07661515  -0.02554020  5.813345e-05
    beta[3]      6.189074e-02 9.213031e-04 0.039193081  -0.01636681   0.03653035  6.230858e-02
    beta[4]     -1.122336e-02 3.268895e-04 0.014999941  -0.04039101  -0.02063698 -1.126060e-02
    beta[5]     -3.723312e-02 3.549614e-04 0.014919054  -0.06648371  -0.04703486 -3.747664e-02
    cbeta0       7.244843e-02 4.899537e-05 0.002545874   0.06748716   0.07071114  7.243238e-02
    sigma        2.088917e-02 3.692373e-05 0.001872266   0.01745036   0.01961179  2.082794e-02
    beta0       -2.783700e-02 7.738103e-04 0.032630197  -0.09105514  -0.04933693 -2.819851e-02
    log_lik[1]   2.555053e+00 8.279597e-03 0.412625825   1.48002609   2.37565170  2.668924e+00
    log_lik[2]   2.689869e+00 5.886304e-03 0.291712678   1.94169390   2.56823620  2.765803e+00
    log_lik[3]   2.481437e+00 6.408258e-03 0.321730766   1.70879340   2.29899530  2.541077e+00
    log_lik[4]   2.897622e+00 2.274280e-03 0.116633788   2.64138042   2.83049352  2.907077e+00
    log_lik[5]   2.705299e+00 3.510597e-03 0.174263133   2.31349245   2.61010610  2.728462e+00
    log_lik[6]   2.876877e+00 2.286666e-03 0.117374730   2.61123537   2.80959929  2.887540e+00
    log_lik[7]   2.560570e+00 3.857742e-03 0.194705349   2.12570324   2.45110459  2.581614e+00
    log_lik[8]   2.752342e+00 2.840581e-03 0.146866659   2.40683117   2.67032992  2.770447e+00
    log_lik[9]   1.775614e+00 6.846079e-03 0.348432336   0.99574820   1.56920160  1.809863e+00
    log_lik[10]  2.778273e+00 2.698093e-03 0.135681420   2.48780917   2.70187045  2.792084e+00
    log_lik[11]  2.673033e+00 3.005238e-03 0.156156766   2.33437255   2.58537417  2.689725e+00
    log_lik[12]  2.686719e+00 2.928559e-03 0.152172402   2.33092385   2.59809399  2.701096e+00
    log_lik[13]  1.661739e+00 6.924671e-03 0.359816441   0.87329102   1.45046387  1.695174e+00
    log_lik[14]  2.316722e+00 4.462239e-03 0.231864737   1.78428279   2.18156727  2.335296e+00
    log_lik[15]  2.333874e+00 4.626360e-03 0.240392696   1.79242627   2.19062150  2.361844e+00
    log_lik[16]  2.745435e+00 2.852268e-03 0.148208202   2.40700659   2.66093197  2.759778e+00
    log_lik[17]  2.845838e+00 2.312208e-03 0.120145827   2.58647190   2.77648380  2.855412e+00
    log_lik[18]  2.849556e+00 2.509111e-03 0.128577370   2.55917320   2.77590000  2.860467e+00
    log_lik[19]  2.822053e+00 2.724216e-03 0.141554399   2.49033298   2.74328713  2.837016e+00
    log_lik[20]  2.873692e+00 2.581911e-03 0.131199533   2.55664162   2.80698230  2.888037e+00
    log_lik[21]  2.026990e+00 8.336486e-03 0.433176512   1.04071341   1.76431161  2.086384e+00
    log_lik[22]  2.787911e+00 3.709842e-03 0.192769033   2.29265457   2.70351788  2.824156e+00
    log_lik[23]  2.399441e+00 6.861572e-03 0.356537753   1.53604209   2.20306355  2.459279e+00
    log_lik[24]  2.839579e+00 3.259153e-03 0.169350543   2.39523909   2.76589671  2.871430e+00
    log_lik[25]  2.762956e+00 4.962960e-03 0.241860710   2.11040028   2.66516845  2.820364e+00
    log_lik[26]  2.334286e+00 7.549496e-03 0.372101667   1.41557147   2.13302617  2.400365e+00
    log_lik[27]  2.723067e+00 4.487012e-03 0.218326652   2.16329698   2.61613234  2.763296e+00
    log_lik[28]  2.078010e+00 7.032911e-03 0.364625211   1.26525032   1.86876336  2.121607e+00
    log_lik[29]  2.581916e+00 4.920716e-03 0.255687918   1.97016520   2.45489963  2.616756e+00
    log_lik[30]  2.886086e+00 2.276165e-03 0.114568562   2.63526733   2.82031241  2.895578e+00
    log_lik[31]  2.910017e+00 2.060950e-03 0.104986513   2.68691980   2.84838520  2.915109e+00
    log_lik[32]  2.222772e+00 5.339118e-03 0.275582619   1.61533570   2.06599337  2.248303e+00
    log_lik[33]  2.924467e+00 1.925423e-03 0.097708975   2.72776828   2.86256038  2.927915e+00
    log_lik[34]  2.924467e+00 1.925423e-03 0.097708975   2.72776828   2.86256038  2.927915e+00
    log_lik[35]  2.752860e+00 2.808358e-03 0.144358905   2.43182351   2.67047426  2.770488e+00
    log_lik[36]  2.490421e+00 3.964137e-03 0.200794362   2.03246369   2.37362879  2.510642e+00
    log_lik[37]  2.930899e+00 1.871049e-03 0.095021752   2.73934583   2.86974024  2.932586e+00
    log_lik[38]  2.929999e+00 1.879522e-03 0.095393674   2.73761400   2.86804181  2.931895e+00
    log_lik[39]  2.438143e+00 4.251045e-03 0.219655037   1.93589288   2.30486686  2.461310e+00
    log_lik[40]  2.927682e+00 1.885400e-03 0.096231668   2.73082310   2.86582223  2.929051e+00
    log_lik[41]  2.872912e+00 2.539450e-03 0.131822831   2.56366237   2.80429922  2.889157e+00
    log_lik[42]  2.615441e+00 5.050475e-03 0.262430360   1.97208172   2.47776371  2.660553e+00
    log_lik[43]  2.864143e+00 2.761726e-03 0.143503495   2.50170516   2.79416131  2.883797e+00
    log_lik[44]  2.851867e+00 2.988698e-03 0.155297296   2.45692210   2.77823353  2.876845e+00
    log_lik[45]  2.873560e+00 2.681272e-03 0.139322967   2.54704307   2.80559996  2.891678e+00
    log_lik[46]  2.689669e+00 5.963575e-03 0.280196813   1.97318990   2.56913396  2.756082e+00
    log_lik[47]  2.435259e+00 6.821403e-03 0.324916984   1.68842917   2.25753450  2.485678e+00
    log_lik[48]  2.699319e+00 5.452140e-03 0.238778108   2.11922450   2.58832085  2.747008e+00
    log_lik[49]  2.279966e+00 8.200409e-03 0.392277547   1.34804999   2.06478509  2.343175e+00
    log_lik[50]  2.719443e+00 3.563200e-03 0.177344190   2.31249922   2.62214007  2.743720e+00
    log_lik[51]  2.418619e+00 4.698342e-03 0.240938206   1.86752765   2.27902509  2.445047e+00
    log_lik[52]  1.504658e+00 7.710691e-03 0.400659237   0.63779990   1.25467856  1.541890e+00
    log_lik[53]  2.672471e+00 3.090797e-03 0.160602534   2.32001149   2.57212431  2.685993e+00
    log_lik[54]  1.854814e+00 6.997527e-03 0.360937583   1.04480966   1.63603052  1.886166e+00
    log_lik[55]  2.869673e+00 2.582922e-03 0.128647935   2.56622994   2.80382516  2.884795e+00
    log_lik[56]  2.714235e+00 3.056682e-03 0.158474265   2.35245828   2.62540402  2.731998e+00
    log_lik[57]  1.609668e+00 7.477827e-03 0.388559294   0.76105837   1.36902866  1.643380e+00
    log_lik[58]  2.883513e+00 2.143228e-03 0.109107631   2.64053696   2.82097839  2.890654e+00
    log_lik[59]  2.903621e+00 2.067792e-03 0.103244226   2.68945368   2.83949878  2.906093e+00
    log_lik[60]  2.841091e+00 2.293749e-03 0.119186689   2.57997266   2.77081249  2.847705e+00
    log_lik[61]  2.171238e+00 5.368653e-03 0.277867834   1.58346702   1.99630633  2.198158e+00
    log_lik[62]  2.894702e+00 2.153447e-03 0.107158693   2.67477624   2.83273941  2.900048e+00
    log_lik[63] -2.080309e+00 2.132534e-02 1.075508483  -4.42410661  -2.74679165 -1.978252e+00
    log_lik[64]  2.529744e-01 1.288737e-02 0.665509296  -1.22137507  -0.16784435  3.028655e-01
    log_lik[65]  2.866611e+00 2.735208e-03 0.134152799   2.55952547   2.79899747  2.881208e+00
    log_lik[66]  1.357742e+00 1.521447e-02 0.735355609  -0.30509519   0.93351831  1.446848e+00
    log_lik[67]  1.958808e+00 9.352149e-03 0.462405731   0.96385176   1.67520242  2.014435e+00
    log_lik[68]  2.829446e+00 3.852587e-03 0.185710982   2.35089393   2.75399152  2.865516e+00
    log_lik[69]  1.858093e+00 1.656846e-02 0.786121382   0.08716029   1.44708394  2.012602e+00
    log_lik[70]  2.020418e+00 9.992458e-03 0.517707061   0.79164539   1.72680316  2.100529e+00
    log_lik[71]  2.828951e+00 3.433464e-03 0.178408006   2.35566306   2.75449274  2.862395e+00
    log_lik[72]  2.224567e+00 6.792642e-03 0.329938655   1.44633068   2.04324904  2.270276e+00
    lp__         2.394017e+02 4.082506e-02 2.031752421 234.41983863 238.38380889  2.397764e+02
                          75%         97.5%    n_eff      Rhat
    beta[1]       0.055023423   0.071624769 2017.855 1.0010891
    beta[2]       0.023974137   0.074990062 2151.597 0.9997629
    beta[3]       0.088135179   0.140132781 1809.730 1.0000331
    beta[4]      -0.001792672   0.018702992 2105.607 0.9998709
    beta[5]      -0.027204509  -0.007036234 1766.527 1.0001722
    cbeta0        0.074176438   0.077529500 2700.000 0.9999105
    sigma         0.022074894   0.024931207 2571.130 1.0004836
    beta0        -0.006626736   0.037656284 1778.158 1.0011292
    log_lik[1]    2.854702141   3.027767860 2483.673 0.9998988
    log_lik[2]    2.895481029   3.046566686 2455.982 0.9999209
    log_lik[3]    2.723139669   2.933952105 2520.607 0.9994487
    log_lik[4]    2.972367347   3.102625886 2630.032 1.0005403
    log_lik[5]    2.826250193   2.983700261 2464.047 0.9997244
    log_lik[6]    2.955006090   3.080157288 2634.777 1.0002897
    log_lik[7]    2.694386076   2.885570014 2547.355 0.9994033
    log_lik[8]    2.852596334   2.992124738 2673.204 0.9994858
    log_lik[9]    2.015621018   2.348219252 2590.318 0.9992977
    log_lik[10]   2.870571012   3.011641778 2528.873 0.9998255
    log_lik[11]   2.780840934   2.935968260 2700.000 0.9993728
    log_lik[12]   2.791360985   2.936695879 2700.000 0.9991006
    log_lik[13]   1.914075113   2.262740896 2700.000 0.9992730
    log_lik[14]   2.481417212   2.695023365 2700.000 0.9989629
    log_lik[15]   2.504695666   2.736091080 2700.000 0.9989709
    log_lik[16]   2.845698153   2.998183094 2700.000 0.9993760
    log_lik[17]   2.929761685   3.048195661 2700.000 0.9996054
    log_lik[18]   2.933501427   3.068342788 2625.968 0.9996591
    log_lik[19]   2.917454089   3.057175764 2700.000 0.9994765
    log_lik[20]   2.958389389   3.088677398 2582.152 0.9995303
    log_lik[21]   2.339246000   2.712177499 2700.000 0.9991631
    log_lik[22]   2.916649381   3.066762554 2700.000 0.9992182
    log_lik[23]   2.659415848   2.913365312 2700.000 0.9994161
    log_lik[24]   2.949364739   3.078818204 2700.000 0.9996608
    log_lik[25]   2.923960888   3.063099634 2374.921 0.9992894
    log_lik[26]   2.596365553   2.875991184 2429.334 0.9990705
    log_lik[27]   2.877477863   3.028095503 2367.550 0.9991650
    log_lik[28]   2.331170183   2.673624000 2687.962 0.9990338
    log_lik[29]   2.763275814   2.961246284 2700.000 0.9990731
    log_lik[30]   2.963759598   3.084771224 2533.514 0.9998388
    log_lik[31]   2.979865899   3.100417988 2594.970 1.0002251
    log_lik[32]   2.419334769   2.679983872 2664.186 0.9990859
    log_lik[33]   2.989942674   3.104349686 2575.233 1.0003795
    log_lik[34]   2.989942674   3.104349686 2575.233 1.0003795
    log_lik[35]   2.851052734   2.995707580 2642.301 0.9994472
    log_lik[36]   2.632252355   2.819382873 2565.699 0.9994201
    log_lik[37]   2.994852604   3.109550116 2579.145 1.0004301
    log_lik[38]   2.994345912   3.109560401 2575.990 1.0004253
    log_lik[39]   2.592934426   2.801667081 2669.874 0.9992269
    log_lik[40]   2.993273134   3.109223738 2605.128 1.0003846
    log_lik[41]   2.959792583   3.082061406 2694.648 0.9998075
    log_lik[42]   2.806068088   2.982019372 2700.000 0.9992635
    log_lik[43]   2.959054769   3.085870780 2700.000 0.9997016
    log_lik[44]   2.954728408   3.084005662 2700.000 0.9996167
    log_lik[45]   2.964382735   3.087830485 2700.000 0.9997871
    log_lik[46]   2.884750873   3.039018211 2207.563 1.0007124
    log_lik[47]   2.673113795   2.900754398 2268.806 1.0009325
    log_lik[48]   2.864892011   3.029914666 1918.029 1.0007769
    log_lik[49]   2.564203145   2.851315200 2288.316 1.0009860
    log_lik[50]   2.841307097   2.998910586 2477.156 1.0004831
    log_lik[51]   2.590685805   2.803870352 2629.796 1.0003989
    log_lik[52]   1.785915792   2.185808724 2700.000 1.0004247
    log_lik[53]   2.786300204   2.939195545 2700.000 0.9999469
    log_lik[54]   2.109215956   2.466082445 2660.573 1.0002311
    log_lik[55]   2.951541567   3.081437035 2480.750 1.0001909
    log_lik[56]   2.826197665   2.975459589 2687.924 0.9999721
    log_lik[57]   1.883403557   2.264911382 2700.000 1.0004782
    log_lik[58]   2.956303334   3.078375450 2591.634 0.9998096
    log_lik[59]   2.973835591   3.091447248 2492.975 1.0004020
    log_lik[60]   2.921379864   3.054351592 2700.000 1.0001861
    log_lik[61]   2.371166175   2.625066849 2678.834 0.9994100
    log_lik[62]   2.967469714   3.092051497 2476.205 1.0004038
    log_lik[63]  -1.345084019  -0.151002795 2543.522 0.9998128
    log_lik[64]   0.716672150   1.386447846 2666.733 1.0004144
    log_lik[65]   2.954903918   3.080417715 2405.574 1.0002802
    log_lik[66]   1.901290620   2.479112735 2336.044 1.0002470
    log_lik[67]   2.299771383   2.682749103 2444.689 1.0001665
    log_lik[68]   2.947306682   3.076343209 2323.648 1.0001059
    log_lik[69]   2.455324591   2.901446564 2251.204 1.0005565
    log_lik[70]   2.404917313   2.781645283 2684.253 0.9993194
    log_lik[71]   2.945926717   3.076691532 2700.000 0.9994939
    log_lik[72]   2.449956379   2.748148303 2359.331 0.9991587
    lp__        240.880868913 242.260342463 2476.783 1.0000042
    
    $c_summary
    , , chains = chain:1
    
                 stats
    parameter             mean          sd         2.5%          25%           50%           75%
      beta[1]       0.04643878 0.012628851   0.02049142   0.03819352   0.047001702   0.054672036
      beta[2]      -0.00140730 0.038707451  -0.07566133  -0.02657238  -0.001206666   0.023579548
      beta[3]       0.06144959 0.038584649  -0.01440674   0.03616748   0.062178000   0.086730348
      beta[4]      -0.01092832 0.014820893  -0.03970598  -0.01989798  -0.010928983  -0.001653754
      beta[5]      -0.03703277 0.014717248  -0.06681856  -0.04624826  -0.037645450  -0.027113155
      cbeta0        0.07233788 0.002567254   0.06744530   0.07047948   0.072270528   0.074185837
      sigma         0.02088342 0.001801060   0.01756746   0.01968722   0.020763381   0.022048136
      beta0        -0.02752243 0.032411022  -0.09009245  -0.04939535  -0.028054042  -0.006953943
      log_lik[1]    2.56619719 0.386706491   1.59400980   2.37328301   2.672661552   2.859236864
      log_lik[2]    2.69769984 0.272845128   1.98620346   2.56874004   2.761938188   2.897393450
      log_lik[3]    2.47594474 0.323693862   1.69910023   2.28717682   2.547969182   2.728703862
      log_lik[4]    2.89727496 0.111618302   2.64978004   2.83629549   2.906427271   2.970347414
      log_lik[5]    2.70914643 0.170436041   2.32168344   2.61641677   2.726831739   2.829876539
      log_lik[6]    2.87563467 0.114646737   2.60723928   2.81059665   2.885256035   2.957128443
      log_lik[7]    2.56457166 0.192918143   2.12902567   2.45110459   2.584004526   2.697265641
      log_lik[8]    2.75010690 0.147052720   2.40386344   2.66460552   2.771962688   2.852821975
      log_lik[9]    1.77134842 0.349560716   0.99003727   1.56550220   1.815295936   2.012138440
      log_lik[10]   2.78006235 0.133421540   2.47801156   2.70383799   2.792799223   2.868294665
      log_lik[11]   2.67475691 0.155625778   2.32203922   2.58018861   2.692522610   2.782818989
      log_lik[12]   2.68498164 0.155023970   2.29977060   2.59330457   2.698320431   2.798590880
      log_lik[13]   1.65985037 0.360418884   0.89068235   1.46004939   1.697094096   1.914231916
      log_lik[14]   2.31487288 0.235761992   1.77381209   2.17536525   2.334570894   2.487175516
      log_lik[15]   2.33468537 0.244450693   1.79728694   2.18363611   2.363513939   2.507809026
      log_lik[16]   2.74531350 0.146942403   2.40098995   2.66206681   2.759728108   2.845661447
      log_lik[17]   2.84473866 0.121651793   2.56637788   2.77018750   2.855964719   2.930368125
      log_lik[18]   2.84836050 0.124991501   2.56720774   2.77502286   2.860786724   2.930722856
      log_lik[19]   2.82067855 0.138656474   2.49797805   2.74079294   2.836743786   2.914811821
      log_lik[20]   2.87160993 0.127822155   2.55664162   2.80458157   2.887362248   2.956733309
      log_lik[21]   2.02326072 0.442043884   1.01995598   1.76598398   2.091522685   2.337897774
      log_lik[22]   2.78483581 0.191962506   2.29043026   2.70465406   2.825989913   2.914067188
      log_lik[23]   2.40144099 0.360625988   1.52170229   2.21428410   2.454859425   2.663272843
      log_lik[24]   2.83878972 0.171493897   2.36936858   2.77060961   2.870977392   2.951985745
      log_lik[25]   2.76133288 0.244915410   2.08467745   2.66901967   2.814460740   2.927555797
      log_lik[26]   2.33706705 0.373495823   1.40039185   2.14087735   2.401893557   2.586771140
      log_lik[27]   2.72367597 0.217117568   2.14375907   2.62772142   2.759618192   2.880211910
      log_lik[28]   2.07252272 0.377822224   1.22616249   1.87548557   2.126163599   2.335521250
      log_lik[29]   2.57746553 0.270483601   1.91450699   2.46143195   2.616645250   2.757665637
      log_lik[30]   2.88675406 0.110410301   2.64244539   2.82040964   2.894386823   2.965726700
      log_lik[31]   2.90888211 0.105058582   2.68203084   2.85311175   2.915896937   2.980100965
      log_lik[32]   2.21823719 0.281967292   1.59731029   2.06529730   2.244658537   2.418590396
      log_lik[33]   2.92415553 0.094569850   2.73343181   2.86704316   2.927491276   2.989958024
      log_lik[34]   2.92415553 0.094569850   2.73343181   2.86704316   2.927491276   2.989958024
      log_lik[35]   2.75061357 0.145843757   2.43537000   2.66769711   2.765627025   2.848613027
      log_lik[36]   2.49607600 0.198002437   2.04899646   2.38959501   2.512395924   2.637258523
      log_lik[37]   2.93143450 0.090363117   2.75250085   2.87302732   2.934560232   2.994320317
      log_lik[38]   2.93030317 0.090979234   2.74790741   2.87207759   2.932461862   2.992971635
      log_lik[39]   2.43531206 0.213065398   1.98449071   2.30651265   2.453205797   2.588139651
      log_lik[40]   2.92917402 0.091618923   2.74163152   2.87227354   2.933409764   2.992376304
      log_lik[41]   2.87407193 0.128092826   2.57449305   2.80683064   2.888796562   2.955843418
      log_lik[42]   2.61463105 0.254516015   2.01089367   2.49005984   2.656709687   2.797789416
      log_lik[43]   2.86815347 0.141087264   2.53640461   2.80010385   2.885622298   2.967183895
      log_lik[44]   2.85616001 0.152884332   2.47186398   2.78796614   2.878885033   2.960758216
      log_lik[45]   2.87559553 0.138250730   2.56917960   2.81058096   2.890960303   2.963947123
      log_lik[46]   2.69008281 0.283917921   1.96586691   2.57628890   2.763585968   2.887108630
      log_lik[47]   2.43747472 0.328414163   1.65197874   2.26339289   2.486732406   2.676129202
      log_lik[48]   2.69839958 0.236417416   2.17502125   2.58295372   2.745340511   2.861814041
      log_lik[49]   2.27883673 0.386355982   1.45222305   2.06555808   2.343264621   2.557833239
      log_lik[50]   2.71678324 0.179708493   2.32029715   2.61505324   2.742927293   2.842417539
      log_lik[51]   2.42328589 0.242035474   1.86492726   2.28534136   2.445765955   2.598111981
      log_lik[52]   1.51516071 0.405765066   0.63754932   1.24500374   1.561765287   1.811060194
      log_lik[53]   2.66809741 0.165060831   2.31515588   2.56318601   2.684335816   2.788183906
      log_lik[54]   1.84906107 0.357617969   1.07283639   1.63513126   1.878543706   2.100243272
      log_lik[55]   2.87035773 0.125928891   2.56220684   2.81045538   2.887266935   2.951675898
      log_lik[56]   2.71733895 0.155961154   2.35872958   2.63572326   2.737815662   2.825749716
      log_lik[57]   1.61918583 0.393637151   0.74655674   1.37246160   1.661039538   1.904720334
      log_lik[58]   2.88490353 0.104276129   2.64891234   2.82659054   2.895265125   2.953385111
      log_lik[59]   2.90174367 0.103735279   2.68693494   2.83570346   2.904555372   2.973073249
      log_lik[60]   2.83763561 0.121571323   2.58481459   2.76487096   2.840841214   2.926235407
      log_lik[61]   2.16193728 0.280500856   1.57032379   1.98537353   2.188761272   2.363179124
      log_lik[62]   2.89219948 0.107328018   2.67744335   2.83274848   2.896589033   2.967400958
      log_lik[63]  -2.10034498 1.041646594  -4.24981910  -2.73987793  -2.014045595  -1.380158421
      log_lik[64]   0.27561537 0.665348554  -1.20843591  -0.15119014   0.330771393   0.725514176
      log_lik[65]   2.86342990 0.134861235   2.55853032   2.79991943   2.876403988   2.951796612
      log_lik[66]   1.33790401 0.735276393  -0.38787265   0.92206097   1.431817388   1.869504079
      log_lik[67]   1.97541155 0.463248298   0.98445109   1.70116814   2.043817962   2.308607555
      log_lik[68]   2.82591211 0.188062849   2.33717742   2.75523581   2.863449237   2.944952099
      log_lik[69]   1.88260766 0.782023975   0.09272054   1.50201925   2.030936022   2.469296567
      log_lik[70]   2.03136356 0.502394575   0.76271761   1.72487556   2.107985446   2.397895212
      log_lik[71]   2.83371694 0.175854009   2.36938926   2.76275895   2.868649328   2.955230203
      log_lik[72]   2.22923323 0.331626335   1.42320658   2.05182700   2.271105437   2.446813604
      lp__        239.44026955 2.010619423 234.67689436 238.46085590 239.797177966 240.872201552
                 stats
    parameter             97.5%
      beta[1]       0.070997930
      beta[2]       0.073470942
      beta[3]       0.135432823
      beta[4]       0.017911495
      beta[5]      -0.006989883
      cbeta0        0.077166216
      sigma         0.024781110
      beta0         0.036654853
      log_lik[1]    3.035155201
      log_lik[2]    3.058167044
      log_lik[3]    2.914011898
      log_lik[4]    3.087978727
      log_lik[5]    2.996974110
      log_lik[6]    3.070112311
      log_lik[7]    2.889723644
      log_lik[8]    2.988011154
      log_lik[9]    2.346405109
      log_lik[10]   3.015262487
      log_lik[11]   2.935126827
      log_lik[12]   2.934023523
      log_lik[13]   2.262333605
      log_lik[14]   2.697797765
      log_lik[15]   2.735874665
      log_lik[16]   2.998391043
      log_lik[17]   3.039732820
      log_lik[18]   3.068541307
      log_lik[19]   3.056813475
      log_lik[20]   3.088247309
      log_lik[21]   2.719985017
      log_lik[22]   3.056647660
      log_lik[23]   2.918513926
      log_lik[24]   3.070462984
      log_lik[25]   3.058959782
      log_lik[26]   2.870613272
      log_lik[27]   3.021389854
      log_lik[28]   2.663906666
      log_lik[29]   2.957058564
      log_lik[30]   3.081060162
      log_lik[31]   3.099254939
      log_lik[32]   2.662048937
      log_lik[33]   3.102196023
      log_lik[34]   3.102196023
      log_lik[35]   3.000591811
      log_lik[36]   2.810342407
      log_lik[37]   3.102119633
      log_lik[38]   3.102970987
      log_lik[39]   2.792410045
      log_lik[40]   3.105380151
      log_lik[41]   3.087018410
      log_lik[42]   2.973726612
      log_lik[43]   3.071301206
      log_lik[44]   3.072751994
      log_lik[45]   3.093960956
      log_lik[46]   3.023221570
      log_lik[47]   2.887093200
      log_lik[48]   3.018906758
      log_lik[49]   2.855439411
      log_lik[50]   3.002832613
      log_lik[51]   2.803885796
      log_lik[52]   2.193960653
      log_lik[53]   2.939089632
      log_lik[54]   2.466902790
      log_lik[55]   3.072813188
      log_lik[56]   2.959424332
      log_lik[57]   2.264911382
      log_lik[58]   3.066635705
      log_lik[59]   3.087659949
      log_lik[60]   3.045881107
      log_lik[61]   2.622928278
      log_lik[62]   3.082103318
      log_lik[63]  -0.240078709
      log_lik[64]   1.404206011
      log_lik[65]   3.074284684
      log_lik[66]   2.444088025
      log_lik[67]   2.705141768
      log_lik[68]   3.070300109
      log_lik[69]   2.907808508
      log_lik[70]   2.770486038
      log_lik[71]   3.068707350
      log_lik[72]   2.761004676
      lp__        242.332331928
    
    , , chains = chain:2
    
                 stats
    parameter              mean          sd         2.5%          25%           50%           75%
      beta[1]       0.046573687 0.012968365   0.01922452   0.03835562  4.661458e-02   0.055590980
      beta[2]      -0.000679594 0.040203025  -0.07731360  -0.02586551  5.623988e-04   0.024419305
      beta[3]       0.061865889 0.039153730  -0.01625011   0.03530993  6.245849e-02   0.088713322
      beta[4]      -0.011300228 0.015393923  -0.04019140  -0.02072802 -1.139336e-02  -0.001914545
      beta[5]      -0.037263246 0.014880247  -0.06671254  -0.04719578 -3.782046e-02  -0.027202808
      cbeta0        0.072487794 0.002452994   0.06759219   0.07086659  7.248487e-02   0.074086053
      sigma         0.020822161 0.001911947   0.01719854   0.01954794  2.080912e-02   0.021998851
      beta0        -0.027567000 0.033285659  -0.09202049  -0.04997486 -2.714554e-02  -0.006274045
      log_lik[1]    2.555705813 0.403423762   1.48454389   2.35885186  2.666857e+00   2.857311348
      log_lik[2]    2.692566472 0.282362063   1.98030750   2.55752521  2.773079e+00   2.895380478
      log_lik[3]    2.488818904 0.314253339   1.73848551   2.30260169  2.550341e+00   2.718012127
      log_lik[4]    2.904301284 0.114464995   2.66369293   2.83579996  2.910541e+00   2.973724689
      log_lik[5]    2.708181240 0.165508750   2.34289436   2.60738262  2.726828e+00   2.825767111
      log_lik[6]    2.882838378 0.116550615   2.64664631   2.81232773  2.891636e+00   2.959361921
      log_lik[7]    2.562381074 0.185216998   2.15783856   2.44856916  2.578104e+00   2.694194514
      log_lik[8]    2.757078048 0.143584590   2.44330464   2.67191527  2.772100e+00   2.858925630
      log_lik[9]    1.772948432 0.336986219   1.01470755   1.56772350  1.812002e+00   2.003120904
      log_lik[10]   2.782270219 0.129327353   2.51908851   2.70783879  2.789442e+00   2.872401573
      log_lik[11]   2.676493721 0.149165473   2.37028892   2.58833726  2.685417e+00   2.781310637
      log_lik[12]   2.689346179 0.147548781   2.34898298   2.59842165  2.703917e+00   2.790301885
      log_lik[13]   1.655117646 0.350212508   0.89035564   1.44754192  1.692204e+00   1.897835567
      log_lik[14]   2.315735602 0.223570596   1.82856620   2.18855504  2.339754e+00   2.470968679
      log_lik[15]   2.336257552 0.234659918   1.80962277   2.20220904  2.359357e+00   2.499221976
      log_lik[16]   2.750067765 0.144799562   2.43936329   2.66846252  2.761700e+00   2.847352381
      log_lik[17]   2.849237344 0.119040432   2.59302233   2.77905012  2.855412e+00   2.936732183
      log_lik[18]   2.854873665 0.128383725   2.56594111   2.78634984  2.867374e+00   2.939324668
      log_lik[19]   2.827445694 0.141079131   2.49274560   2.75050895  2.844707e+00   2.921729538
      log_lik[20]   2.879222346 0.132562882   2.56239744   2.81618183  2.892368e+00   2.960501675
      log_lik[21]   2.031276845 0.430813964   1.04375922   1.77254493  2.092124e+00   2.340981437
      log_lik[22]   2.794210646 0.192898810   2.29975020   2.70992941  2.829427e+00   2.922225572
      log_lik[23]   2.394610738 0.354565258   1.57687928   2.19509281  2.461157e+00   2.655566303
      log_lik[24]   2.842089822 0.170187987   2.42147977   2.75955151  2.876108e+00   2.954115976
      log_lik[25]   2.766117805 0.248386281   2.10687229   2.67176434  2.825790e+00   2.924098495
      log_lik[26]   2.331365276 0.375907485   1.38641093   2.13389389  2.390509e+00   2.596039001
      log_lik[27]   2.724524902 0.222618176   2.17864248   2.62328313  2.766477e+00   2.875725092
      log_lik[28]   2.079046902 0.359767402   1.25986051   1.88166778  2.115781e+00   2.314875490
      log_lik[29]   2.586032398 0.249523499   1.99385199   2.45626700  2.620800e+00   2.763710609
      log_lik[30]   2.888643896 0.117231901   2.63984452   2.82861591  2.899498e+00   2.961617625
      log_lik[31]   2.913703139 0.107853256   2.68263237   2.84652764  2.916332e+00   2.982253251
      log_lik[32]   2.223400348 0.275288841   1.61533570   2.06813725  2.253168e+00   2.415660064
      log_lik[33]   2.927525467 0.101472925   2.72150855   2.86208301  2.929968e+00   2.992745671
      log_lik[34]   2.927525467 0.101472925   2.72150855   2.86208301  2.929968e+00   2.992745671
      log_lik[35]   2.755647376 0.147836897   2.41523010   2.67709073  2.772998e+00   2.855836564
      log_lik[36]   2.487393541 0.206046024   1.98245468   2.36167166  2.510462e+00   2.632465902
      log_lik[37]   2.933274090 0.098615500   2.74204186   2.86895264  2.934076e+00   2.997768181
      log_lik[38]   2.932564812 0.099111750   2.73933259   2.86591848  2.934297e+00   2.996539613
      log_lik[39]   2.437828466 0.228237669   1.90492161   2.29833236  2.456494e+00   2.601064487
      log_lik[40]   2.929237732 0.099076377   2.73652812   2.86610808  2.928029e+00   2.993847562
      log_lik[41]   2.872977576 0.137155080   2.53364805   2.80200486  2.891932e+00   2.961984770
      log_lik[42]   2.613639764 0.273410265   1.94461724   2.46231211  2.656621e+00   2.815443167
      log_lik[43]   2.863288232 0.147422553   2.49647431   2.79417469  2.882816e+00   2.954996633
      log_lik[44]   2.850717739 0.159741131   2.43437337   2.77547577  2.877347e+00   2.949145748
      log_lik[45]   2.872785300 0.143342279   2.52580455   2.80503759  2.892572e+00   2.962509449
      log_lik[46]   2.685380832 0.292611053   1.93400117   2.55999152  2.750716e+00   2.886388763
      log_lik[47]   2.427475556 0.333365670   1.63538610   2.24720149  2.475169e+00   2.674218088
      log_lik[48]   2.703845523 0.238349124   2.11995209   2.58779645  2.756249e+00   2.868074998
      log_lik[49]   2.284398132 0.394152205   1.31242504   2.05321112  2.354677e+00   2.571721537
      log_lik[50]   2.725743277 0.175421428   2.32753495   2.62855619  2.751185e+00   2.842482819
      log_lik[51]   2.410985515 0.243619157   1.83048111   2.27471731  2.440028e+00   2.582200688
      log_lik[52]   1.484840583 0.407022483   0.58639078   1.23938840  1.509267e+00   1.760637242
      log_lik[53]   2.679988030 0.157482077   2.34833440   2.57916554  2.699494e+00   2.793792204
      log_lik[54]   1.860775940 0.356882336   1.03674010   1.63799099  1.898505e+00   2.115788390
      log_lik[55]   2.870298842 0.134041249   2.56742267   2.80068961  2.882107e+00   2.952984959
      log_lik[56]   2.711573351 0.159903051   2.33805445   2.62600057  2.723940e+00   2.823512093
      log_lik[57]   1.591186467 0.394174023   0.73181106   1.35362082  1.613820e+00   1.866826900
      log_lik[58]   2.884902478 0.110680249   2.63966371   2.82448235  2.888689e+00   2.957854051
      log_lik[59]   2.909263246 0.105483239   2.69454185   2.84375911  2.913201e+00   2.975829342
      log_lik[60]   2.848098240 0.120460802   2.60433367   2.77054882  2.859614e+00   2.924853777
      log_lik[61]   2.179722887 0.267678945   1.63529342   2.01854716  2.206163e+00   2.368992763
      log_lik[62]   2.900948499 0.110290993   2.67519424   2.83559986  2.908828e+00   2.972706274
      log_lik[63]  -2.089037883 1.055614388  -4.46485297  -2.74823615 -1.978084e+00  -1.388993058
      log_lik[64]   0.217412591 0.691331758  -1.30244131  -0.19300339  2.577232e-01   0.718684971
      log_lik[65]   2.873034012 0.138028791   2.57703490   2.80208973  2.890364e+00   2.958817045
      log_lik[66]   1.365285258 0.724725875  -0.19645973   0.93482568  1.448547e+00   1.898831201
      log_lik[67]   1.943082513 0.468814524   0.92274043   1.65337128  2.000942e+00   2.292893347
      log_lik[68]   2.835324941 0.191329297   2.38977540   2.75220845  2.876349e+00   2.954466999
      log_lik[69]   1.837905780 0.800885947   0.03459476   1.41330361  2.013041e+00   2.437525180
      log_lik[70]   2.011678229 0.546238574   0.75603394   1.71100922  2.105416e+00   2.420740559
      log_lik[71]   2.827298672 0.183832251   2.34311443   2.74559818  2.865355e+00   2.942688040
      log_lik[72]   2.219976843 0.331687049   1.45466437   2.03431183  2.269058e+00   2.444227358
      lp__        239.409255794 2.010666043 234.21417225 238.40189389  2.397621e+02 240.927403580
                 stats
    parameter             97.5%
      beta[1]       0.071992668
      beta[2]       0.073293156
      beta[3]       0.141115378
      beta[4]       0.019578524
      beta[5]      -0.007556109
      cbeta0        0.077377172
      sigma         0.024884828
      beta0         0.039992168
      log_lik[1]    3.026332813
      log_lik[2]    3.047029543
      log_lik[3]    2.936209844
      log_lik[4]    3.118233931
      log_lik[5]    2.975066210
      log_lik[6]    3.095420634
      log_lik[7]    2.872583557
      log_lik[8]    2.988620973
      log_lik[9]    2.327503062
      log_lik[10]   3.010633233
      log_lik[11]   2.928746939
      log_lik[12]   2.931229033
      log_lik[13]   2.219096585
      log_lik[14]   2.685523533
      log_lik[15]   2.739687069
      log_lik[16]   2.991905960
      log_lik[17]   3.061505579
      log_lik[18]   3.070723784
      log_lik[19]   3.060320927
      log_lik[20]   3.094216870
      log_lik[21]   2.706689967
      log_lik[22]   3.069935435
      log_lik[23]   2.914197728
      log_lik[24]   3.096854114
      log_lik[25]   3.072929553
      log_lik[26]   2.874959521
      log_lik[27]   3.027264044
      log_lik[28]   2.673131999
      log_lik[29]   2.954345189
      log_lik[30]   3.091642472
      log_lik[31]   3.112538394
      log_lik[32]   2.674622119
      log_lik[33]   3.120917517
      log_lik[34]   3.120917517
      log_lik[35]   2.991665016
      log_lik[36]   2.821154493
      log_lik[37]   3.123357094
      log_lik[38]   3.121133963
      log_lik[39]   2.820126271
      log_lik[40]   3.112857625
      log_lik[41]   3.092796353
      log_lik[42]   3.000980740
      log_lik[43]   3.093727037
      log_lik[44]   3.092411004
      log_lik[45]   3.094175887
      log_lik[46]   3.048142404
      log_lik[47]   2.897845283
      log_lik[48]   3.034282903
      log_lik[49]   2.857102923
      log_lik[50]   3.005570431
      log_lik[51]   2.803074764
      log_lik[52]   2.164799968
      log_lik[53]   2.945753353
      log_lik[54]   2.481880445
      log_lik[55]   3.091733647
      log_lik[56]   2.979217346
      log_lik[57]   2.244341122
      log_lik[58]   3.084134002
      log_lik[59]   3.107411797
      log_lik[60]   3.063353061
      log_lik[61]   2.617976030
      log_lik[62]   3.107239376
      log_lik[63]  -0.104810738
      log_lik[64]   1.361214817
      log_lik[65]   3.101222371
      log_lik[66]   2.505238319
      log_lik[67]   2.673459163
      log_lik[68]   3.088420640
      log_lik[69]   2.888233643
      log_lik[70]   2.812263455
      log_lik[71]   3.088213914
      log_lik[72]   2.745370189
      lp__        242.154798132
    
    , , chains = chain:3
    
                 stats
    parameter              mean          sd         2.5%          25%           50%           75%
      beta[1]      4.687192e-02 0.012603092   0.02224220   0.03867635   0.047139790   0.054893417
      beta[2]     -1.283318e-04 0.038763274  -0.07783005  -0.02446199   0.000084502   0.023693054
      beta[3]      6.235675e-02 0.039868629  -0.01668995   0.03842740   0.061796315   0.088541641
      beta[4]     -1.144154e-02 0.014789280  -0.04132340  -0.02073492  -0.011456795  -0.001800738
      beta[5]     -3.740335e-02 0.015170379  -0.06604108  -0.04750805  -0.037285619  -0.027253067
      cbeta0       7.251963e-02 0.002613811   0.06749783   0.07076701   0.072521665   0.074274167
      sigma        2.096193e-02 0.001901251   0.01773725   0.01956603   0.020908996   0.022155558
      beta0       -2.842157e-02 0.032212332  -0.08978096  -0.04848694  -0.029013968  -0.007174124
      log_lik[1]   2.543255e+00 0.445648129   1.35643247   2.38973617   2.668732449   2.848637326
      log_lik[2]   2.679340e+00 0.318002708   1.83242494   2.57231879   2.767129757   2.891120727
      log_lik[3]   2.479547e+00 0.327324202   1.66949962   2.31774815   2.533466420   2.720535050
      log_lik[4]   2.891291e+00 0.123267181   2.61890325   2.82533162   2.903477493   2.971991740
      log_lik[5]   2.698571e+00 0.186173125   2.27612565   2.61092779   2.729292028   2.824701722
      log_lik[6]   2.872157e+00 0.120721951   2.60140898   2.80731748   2.886327032   2.950186436
      log_lik[7]   2.554756e+00 0.205521180   2.08956405   2.45378128   2.586371941   2.691147705
      log_lik[8]   2.749841e+00 0.149942242   2.39220996   2.67530446   2.766726334   2.848744030
      log_lik[9]   1.782544e+00 0.358691448   0.96038763   1.57342869   1.801725763   2.028949554
      log_lik[10]  2.772485e+00 0.143839111   2.46159142   2.69159289   2.793381958   2.871859762
      log_lik[11]  2.667849e+00 0.163393600   2.32331294   2.58386800   2.689725491   2.778114534
      log_lik[12]  2.685829e+00 0.153970898   2.33092385   2.60561674   2.700734799   2.788799366
      log_lik[13]  1.670250e+00 0.368810375   0.81809852   1.45175679   1.697262192   1.924279780
      log_lik[14]  2.319557e+00 0.236269756   1.77811673   2.18246032   2.333584566   2.484271179
      log_lik[15]  2.330679e+00 0.242191082   1.77962269   2.18436460   2.361122876   2.501892752
      log_lik[16]  2.740925e+00 0.152790818   2.38029758   2.65480512   2.757621960   2.843398770
      log_lik[17]  2.843539e+00 0.119788649   2.60376944   2.77674104   2.854694013   2.921465129
      log_lik[18]  2.845433e+00 0.132216436   2.54584447   2.77123839   2.854735799   2.931767124
      log_lik[19]  2.818036e+00 0.144849657   2.48360608   2.73542575   2.830595022   2.914239450
      log_lik[20]  2.870244e+00 0.133116362   2.55836437   2.80230385   2.884702813   2.958389389
      log_lik[21]  2.026432e+00 0.426974261   1.06753753   1.76148862   2.075696186   2.334955456
      log_lik[22]  2.784688e+00 0.193502497   2.29078868   2.69561585   2.821072439   2.914967101
      log_lik[23]  2.402272e+00 0.354735536   1.53525342   2.21494043   2.463363730   2.664450834
      log_lik[24]  2.837859e+00 0.166489089   2.43289367   2.76790740   2.867090885   2.942032049
      log_lik[25]  2.761416e+00 0.232218915   2.16651161   2.65500375   2.818873499   2.922452774
      log_lik[26]  2.334425e+00 0.367239856   1.43915107   2.12329551   2.406861297   2.610997989
      log_lik[27]  2.720999e+00 0.215406481   2.18526727   2.60539194   2.764038407   2.877389917
      log_lik[28]  2.082459e+00 0.356254341   1.30625966   1.85106525   2.123343001   2.347841754
      log_lik[29]  2.582251e+00 0.246605185   1.99841557   2.44865824   2.614345599   2.765280355
      log_lik[30]  2.882860e+00 0.115999558   2.62788423   2.81367354   2.894781170   2.963137365
      log_lik[31]  2.907466e+00 0.101980319   2.70196050   2.84611255   2.911904471   2.977204132
      log_lik[32]  2.226680e+00 0.269593263   1.63323880   2.06213572   2.246627350   2.423137703
      log_lik[33]  2.921720e+00 0.096980106   2.72719266   2.85846907   2.925002858   2.988224740
      log_lik[34]  2.921720e+00 0.096980106   2.72719266   2.85846907   2.925002858   2.988224740
      log_lik[35]  2.752319e+00 0.139375641   2.45220917   2.66904890   2.767407925   2.850811976
      log_lik[36]  2.487794e+00 0.198335059   2.06437099   2.37395622   2.506328349   2.625296854
      log_lik[37]  2.927989e+00 0.095929992   2.73202238   2.86684606   2.929645461   2.993729564
      log_lik[38]  2.927129e+00 0.095941733   2.73326851   2.86616420   2.927792276   2.993193641
      log_lik[39]  2.441287e+00 0.217589122   1.91279392   2.31437840   2.476516993   2.590618562
      log_lik[40]  2.924636e+00 0.097867797   2.71596945   2.86192363   2.926170520   2.992888492
      log_lik[41]  2.871686e+00 0.130185605   2.56676868   2.80296213   2.887305574   2.962437543
      log_lik[42]  2.618054e+00 0.259268166   1.97570786   2.48792198   2.670616488   2.808522364
      log_lik[43]  2.860988e+00 0.141985036   2.51742669   2.78987478   2.882226215   2.954328525
      log_lik[44]  2.848723e+00 0.153247799   2.46276145   2.77345460   2.875510662   2.949810676
      log_lik[45]  2.872298e+00 0.136415664   2.56090920   2.80057521   2.892332344   2.966459442
      log_lik[46]  2.693543e+00 0.263516998   2.00993637   2.57235205   2.755530615   2.880694694
      log_lik[47]  2.440828e+00 0.312830395   1.75240830   2.25904706   2.490531141   2.665107608
      log_lik[48]  2.695712e+00 0.241731074   2.08650310   2.59475265   2.739669276   2.859903144
      log_lik[49]  2.276663e+00 0.396646685   1.30603706   2.07779517   2.335814769   2.560675260
      log_lik[50]  2.715804e+00 0.176903798   2.28948005   2.62036002   2.736187641   2.836878428
      log_lik[51]  2.421584e+00 0.237196957   1.89738930   2.27718525   2.447709815   2.593513923
      log_lik[52]  1.513972e+00 0.388635526   0.67569262   1.27511861   1.557145771   1.781665415
      log_lik[53]  2.669327e+00 0.159077357   2.27953195   2.57883574   2.678221201   2.780516762
      log_lik[54]  1.854605e+00 0.368501379   1.02196362   1.63704496   1.883634141   2.119129762
      log_lik[55]  2.868364e+00 0.125936387   2.56736765   2.80216209   2.882735413   2.950846082
      log_lik[56]  2.713792e+00 0.159650706   2.35490186   2.61368570   2.735357061   2.827847699
      log_lik[57]  1.618631e+00 0.377404851   0.79196877   1.38615520   1.660955990   1.877974517
      log_lik[58]  2.880732e+00 0.112270531   2.63823331   2.80849040   2.885498452   2.959566422
      log_lik[59]  2.899855e+00 0.100322020   2.68656831   2.83802576   2.902549832   2.972149943
      log_lik[60]  2.837540e+00 0.115255717   2.57048769   2.77478079   2.841990214   2.914173713
      log_lik[61]  2.172053e+00 0.285151335   1.56670605   1.98080320   2.206015395   2.383092413
      log_lik[62]  2.890959e+00 0.103594113   2.66722047   2.82970932   2.896851054   2.965754884
      log_lik[63] -2.051545e+00 1.127862154  -4.48890840  -2.74414429  -1.920156284  -1.241284998
      log_lik[64]  2.658952e-01 0.638058407  -1.14200598  -0.14728861   0.329610001   0.713418734
      log_lik[65]  2.863370e+00 0.129342612   2.55939992   2.79486933   2.877800212   2.951068869
      log_lik[66]  1.370036e+00 0.746314536  -0.27054195   0.93559717   1.461945778   1.951360927
      log_lik[67]  1.957931e+00 0.454997252   0.98142421   1.67520242   2.000501556   2.306393491
      log_lik[68]  2.827101e+00 0.177525467   2.35745483   2.75316732   2.859642402   2.943337063
      log_lik[69]  1.853766e+00 0.775452868   0.13718003   1.41781292   1.995595237   2.456305549
      log_lik[70]  2.018212e+00 0.503666454   0.85765690   1.74248777   2.088647125   2.386342436
      log_lik[71]  2.825837e+00 0.175513384   2.36819786   2.74830371   2.854341721   2.941178275
      log_lik[72]  2.224492e+00 0.326780516   1.46098246   2.03768262   2.272001753   2.459073299
      lp__         2.393557e+02 2.074657630 234.39765993 238.34523662 239.750981231 240.847796260
                 stats
    parameter             97.5%
      beta[1]       0.071101532
      beta[2]       0.078602922
      beta[3]       0.140317772
      beta[4]       0.018162768
      beta[5]      -0.007173446
      cbeta0        0.077935288
      sigma         0.025175510
      beta0         0.036585080
      log_lik[1]    3.020214142
      log_lik[2]    3.039543731
      log_lik[3]    2.944433765
      log_lik[4]    3.095518023
      log_lik[5]    2.981648625
      log_lik[6]    3.065689222
      log_lik[7]    2.889227799
      log_lik[8]    3.000572582
      log_lik[9]    2.378855814
      log_lik[10]   3.009334319
      log_lik[11]   2.939300034
      log_lik[12]   2.945842961
      log_lik[13]   2.308624793
      log_lik[14]   2.699801767
      log_lik[15]   2.731453705
      log_lik[16]   2.994657001
      log_lik[17]   3.045674943
      log_lik[18]   3.065330036
      log_lik[19]   3.056970236
      log_lik[20]   3.081258086
      log_lik[21]   2.707667044
      log_lik[22]   3.067973384
      log_lik[23]   2.906512509
      log_lik[24]   3.066921951
      log_lik[25]   3.057859099
      log_lik[26]   2.883942507
      log_lik[27]   3.034807944
      log_lik[28]   2.673624000
      log_lik[29]   2.972272327
      log_lik[30]   3.082047715
      log_lik[31]   3.088764392
      log_lik[32]   2.687800554
      log_lik[33]   3.092161010
      log_lik[34]   3.092161010
      log_lik[35]   2.989691164
      log_lik[36]   2.824570429
      log_lik[37]   3.100103357
      log_lik[38]   3.097497426
      log_lik[39]   2.791071280
      log_lik[40]   3.097253257
      log_lik[41]   3.074381737
      log_lik[42]   2.977064743
      log_lik[43]   3.095820345
      log_lik[44]   3.092828391
      log_lik[45]   3.076685914
      log_lik[46]   3.039502292
      log_lik[47]   2.907380515
      log_lik[48]   3.028609563
      log_lik[49]   2.837183843
      log_lik[50]   2.985369808
      log_lik[51]   2.805323449
      log_lik[52]   2.217762713
      log_lik[53]   2.932874831
      log_lik[54]   2.444394038
      log_lik[55]   3.071198074
      log_lik[56]   2.986448837
      log_lik[57]   2.291878056
      log_lik[58]   3.070533537
      log_lik[59]   3.084007544
      log_lik[60]   3.047210898
      log_lik[61]   2.627840652
      log_lik[62]   3.083295756
      log_lik[63]  -0.135317133
      log_lik[64]   1.394156822
      log_lik[65]   3.074725146
      log_lik[66]   2.480596310
      log_lik[67]   2.685455658
      log_lik[68]   3.069122790
      log_lik[69]   2.911229857
      log_lik[70]   2.775577287
      log_lik[71]   3.080374891
      log_lik[72]   2.729579098
      lp__        242.277502989
    
    library(broom)
    tidyMCMC(constable.rstanarm$stanfit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
    
                          term     estimate   std.error     conf.low     conf.high      rhat  ess
    1              (Intercept)  -0.02662339 0.030640340  -0.08652069   0.032339963 0.9990184 2266
    2              I(IV^(1/3))   0.04617694 0.011955033   0.02348520   0.069677622 0.9990403 2279
    3             TREATInitial  -0.00262214 0.037692592  -0.07602074   0.073892281 0.9990874 2277
    4                 TREATLow   0.06178365 0.037596699  -0.01092534   0.133310013 0.9989826 2298
    5 I(IV^(1/3)):TREATInitial  -0.01050189 0.014390980  -0.03897242   0.018017304 0.9991925 2267
    6     I(IV^(1/3)):TREATLow  -0.03725927 0.014237953  -0.06442732  -0.009792636 0.9990557 2308
    7                    sigma   0.02083855 0.001867566   0.01735171   0.024641950 0.9999474 2569
    8                 mean_PPD   0.07249442 0.003450626   0.06591716   0.079287256 0.9993001 2598
    9            log-posterior 169.80857221 1.999673060 165.95405422 172.982238703 1.0003394 2513
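
    The mean_PPD row is rstanarm's quick posterior predictive diagnostic: the posterior mean of the average simulated response, which should sit close to the observed mean of the response if the model is adequate. A one-line sanity check:

    ## Observed mean of the response, for comparison with mean_PPD above
    mean(constable$SUTW)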
    
    constable.mcmc = as.matrix(constable.rstanarm)
    str(constable.mcmc)
    
     num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ...
     - attr(*, "dimnames")=List of 2
      ..$ iterations: NULL
      ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...
    
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    terms = attr(Xmat, "assign")
    ## Each model term
    for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
    
    [1] 0.3711111
    [1] 0.0003703704
    [1] 0.9322222
    [1] 0.09259259
    [1] 0.4592593
    [1] 0.01111111
    
    ## Effect of covariate
    mcmcpvalue(constable.mcmc[, which(terms == 1)])
    
    [1] 0.0003703704
    
    ## Effect of treatment (marginal)
    mcmcpvalue(constable.mcmc[, which(terms == 2)])
    
    [1] 0.08888889
    
    ## Effect of interaction
    mcmcpvalue(constable.mcmc[, which(terms == 3)])
    
    [1] 0.01555556
    
    # Overall model
    wch = grep("TREAT|IV", colnames(constable.mcmc))
    mcmcpvalue(constable.mcmc[, wch])
    
    [1] 0
    
    ## Frequentist for comparison
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    summary(constable.brm)
    
     Family: gaussian(identity) 
    Formula: SUTW ~ I(IV^(1/3)) * TREAT 
       Data: constable (Number of observations: 72) 
    Samples: 3 chains, each with iter = 5000; warmup = 500; thin = 5; 
             total post-warmup samples = 2700
        ICs: LOO = NA; WAIC = NA; R2 = NA
     
    Population-Level Effects: 
                         Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
    Intercept               -0.03      0.03    -0.09     0.03       2365    1
    IIVE1D3                  0.05      0.01     0.02     0.07       2361    1
    TREATInitial             0.00      0.04    -0.07     0.07       2375    1
    TREATLow                 0.06      0.04    -0.01     0.14       2432    1
    IIVE1D3:TREATInitial    -0.01      0.01    -0.04     0.02       2369    1
    IIVE1D3:TREATLow        -0.04      0.01    -0.07    -0.01       2417    1
    
    Family Specific Parameters: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
    sigma     0.02         0     0.02     0.02       2583    1
    
    Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
    is a crude measure of effective sample size, and Rhat is the potential 
    scale reduction factor on split chains (at convergence, Rhat = 1).
    
    library(broom)
    tidyMCMC(constable.brm$fit, conf.int = TRUE, conf.method = "HPDinterval", ess = TRUE, rhat = TRUE)
    
                        term      estimate   std.error    conf.low    conf.high      rhat  ess
    1            b_Intercept -0.0280889091 0.030786671 -0.08771759  0.030012311 1.0014465 2365
    2              b_IIVE1D3  0.0467122190 0.012000835  0.02321772  0.068893899 1.0012960 2361
    3         b_TREATInitial -0.0009689161 0.038416903 -0.07024299  0.077346297 1.0008347 2375
    4             b_TREATLow  0.0629078547 0.038275435 -0.01974641  0.129499738 1.0000511 2432
    5 b_IIVE1D3:TREATInitial -0.0111653298 0.014644996 -0.04043846  0.016701214 1.0009657 2369
    6     b_IIVE1D3:TREATLow -0.0376073240 0.014462735 -0.06485635 -0.008515056 1.0000974 2417
    7                  sigma  0.0209160338 0.001855845  0.01739081  0.024625223 0.9993802 2583
    
    constable.mcmc = as.matrix(constable.rstanarm)
    str(constable.mcmc)
    
     num [1:2700, 1:7] -0.00323 -0.08829 -0.03744 0.04064 -0.03022 ...
     - attr(*, "dimnames")=List of 2
      ..$ iterations: NULL
      ..$ parameters: chr [1:7] "(Intercept)" "I(IV^(1/3))" "TREATInitial" "TREATLow" ...
    
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    terms = attr(Xmat, "assign")
    ## Empirical P-value for each model coefficient
    for (i in 1:length(terms)) print(mcmcpvalue(constable.mcmc[, i]))
    
    [1] 0.3711111
    [1] 0.0003703704
    [1] 0.9322222
    [1] 0.09259259
    [1] 0.4592593
    [1] 0.01111111
    
    ## Effect of covariate
    mcmcpvalue(constable.mcmc[, which(terms == 1)])
    
    [1] 0.0003703704
    
    ## Effect of treatment (marginal)
    mcmcpvalue(constable.mcmc[, which(terms == 2)])
    
    [1] 0.08888889
    
    ## Effect of interaction
    mcmcpvalue(constable.mcmc[, which(terms == 3)])
    
    [1] 0.01555556
    
    ## Overall model
    wch = grep("TREAT|IV", colnames(constable.mcmc))
    mcmcpvalue(constable.mcmc[, wch])
    
    [1] 0
    
    ## Frequentist for comparison
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
  11. The presence of an interaction means that we cannot simply draw conclusions about the effect of treatment without reference to initial volume. That is, we need to delve further into the interaction. One approach we took for multiple linear regression was to explore the effect of one predictor at set levels of the other predictor.

    Whilst we could take a similar approach here (see the sketch below), Constable took a different route. Rather than exploring the effects of the different treatments at a handful (3-5) of set initial volumes, he elected to estimate the range of initial volumes over which each pair of treatments differed. We will take the same approach here, just in a Bayesian framework.
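
    For completeness, the "set levels" alternative falls straight out of the posterior samples. The sketch below is purely illustrative (the initial volumes 5, 20 and 40 ml are arbitrary picks, not values from Constable's design) and contrasts the High and Low treatments via the model matrix:

    ## Illustrative sketch only: High vs Low difference at a few set initial volumes
    coefs = constable.mcmcpack[, 1:6]
    for (iv in c(5, 20, 40)) {
        nd = data.frame(IV = iv, TREAT = factor(c("High", "Low"), levels = levels(constable$TREAT)))
        Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = nd)
        ## posterior of the (Low minus High) difference at this initial volume
        eff = as.vector(coefs %*% (Xmat[2, ] - Xmat[1, ]))
        print(tidyMCMC(as.mcmc(eff), conf.int = TRUE, conf.method = "HPDinterval"))
    }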

  12. Explore the range of initial volumes over which pairwise treatment differences occur.
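    The workhorse of this exercise is elementary geometry: two lines $y=a_1+m_1x$ and $y=a_2+m_2x$ intersect where $$a_1+m_1x = a_2+m_2x \quad\Longrightarrow\quad x=\frac{a_2-a_1}{m_1-m_2}$$ Applying this to every posterior draw of the intercepts and slopes yields a posterior distribution of intersection points; the HPD limits of that distribution (cubed, since the model operates on the cube root of initial volume) bound the volumes over which a treatment pair plausibly differs.
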
    library(MCMCpack)
    constable.mcmc = constable.mcmcpack
    ## It is easier to first determine the range of IV over which the treatment pairs do NOT differ;
    ## this interval can then be subtracted from the full domain. Let's consider the three treatments
    ## (High, Initial and Low), starting with the intercept and slope of each group's line.
    a.High = as.vector(constable.mcmc[, 1])
    a.Initial = as.vector(constable.mcmc[, 3]) + a.High
    a.Low = as.vector(constable.mcmc[, 4]) + a.High
    
    b.High = as.vector(constable.mcmc[, 2])
    b.Initial = as.vector(constable.mcmc[, 5]) + b.High
    b.Low = as.vector(constable.mcmc[, 6]) + b.High
    
    ## Generate a function that finds the intersection of two lines
    inter <- function(m1, a1, m2, a2) {
        data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
    }
    
    ## We will also define the minimum and maximum of the initial volume domain to explore.
    minx = min(constable$IV)
    maxx = max(constable$IV)
    
    ## Start with High vs Initial
    intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
    ## Narrow this to the likely domain
    intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
    comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
    
    intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
    comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
    
         lower    upper
    1 9.026002 25.59764
    
    ## Now High vs Low
    intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
    comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
    
         lower upper
    1 10.62578  47.5
    
    ## Finally Initial vs Low
    intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
    comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
    
         lower upper
    1 23.49624  47.5
    
    constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    constable.mcmc = constable.mcmc[, wch]
    
    ## It is easier to first determine the range of IV over which the treatment pairs do NOT differ;
    ## this interval can then be subtracted from the full domain. Let's consider the three treatments
    ## (High, Initial and Low), starting with the intercept and slope of each group's line.
    a.High = as.vector(constable.mcmc[, 1])
    a.Initial = as.vector(constable.mcmc[, 3]) + a.High
    a.Low = as.vector(constable.mcmc[, 4]) + a.High
    
    b.High = as.vector(constable.mcmc[, 2])
    b.Initial = as.vector(constable.mcmc[, 5]) + b.High
    b.Low = as.vector(constable.mcmc[, 6]) + b.High
    
    ## Generate a function that finds the intersection of two lines
    inter <- function(m1, a1, m2, a2) {
        data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
    }
    
    ## We will also define the minimum and maximum of the initial volume domain to explore.
    minx = min(constable$IV)
    maxx = max(constable$IV)
    
    ## Start with High vs Initial
    intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
    ## Narrow this to the likely domain
    intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
    comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
    
    intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
    comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
    
         lower    upper
    1 8.816773 31.58686
    
    ## Now High vs Low
    intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
    comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
    
         lower upper
    1 10.39011  47.5
    
    ## Finally Initial vs Low
    intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
    comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
    
         lower upper
    1 22.97178  47.5
    
    constable.mcmc = as.matrix(constable.rstan)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    constable.mcmc = constable.mcmc[, wch]
    
    ## It is easier to first determine the range of IV over which the treatment pairs do NOT differ;
    ## this interval can then be subtracted from the full domain. Let's consider the three treatments
    ## (High, Initial and Low), starting with the intercept and slope of each group's line.
    a.High = as.vector(constable.mcmc[, 1])
    a.Initial = as.vector(constable.mcmc[, 3]) + a.High
    a.Low = as.vector(constable.mcmc[, 4]) + a.High
    
    b.High = as.vector(constable.mcmc[, 2])
    b.Initial = as.vector(constable.mcmc[, 5]) + b.High
    b.Low = as.vector(constable.mcmc[, 6]) + b.High
    
    ## Generate a function that finds the intersection of two lines
    inter <- function(m1, a1, m2, a2) {
        data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
    }
    
    ## We will also define the minimum and maximum of the initial volume domain to explore.
    minx = min(constable$IV)
    maxx = max(constable$IV)
    
    ## Start with High vs Initial
    intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
    ## Narrow this to the likely domain
    intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
    comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
    
    intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
    comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
    
         lower    upper
    1 9.191592 28.29357
    
    ## Now High vs Low
    intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
    comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
    
         lower upper
    1 10.52219  47.5
    
    ## Finally Initial vs Low
    intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
    comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
    
         lower upper
    1 23.28955  47.5
    
    constable.mcmc = as.matrix(constable.brm)
    wch = grep("b_", colnames(constable.mcmc))
    constable.mcmc = constable.mcmc[, wch]
    
    ## It is easier to first determine the range of IV over which the treatment pairs do NOT differ;
    ## this interval can then be subtracted from the full domain. Let's consider the three treatments
    ## (High, Initial and Low), starting with the intercept and slope of each group's line.
    a.High = as.vector(constable.mcmc[, 1])
    a.Initial = as.vector(constable.mcmc[, 3]) + a.High
    a.Low = as.vector(constable.mcmc[, 4]) + a.High
    
    b.High = as.vector(constable.mcmc[, 2])
    b.Initial = as.vector(constable.mcmc[, 5]) + b.High
    b.Low = as.vector(constable.mcmc[, 6]) + b.High
    
    ## Generate a function that finds the intersection of two lines
    inter <- function(m1, a1, m2, a2) {
        data.frame(x = ((a2 - a1)/(m1 - m2)), y = m1 * ((a2 - a1)/(m1 - m2)) + a1)
    }
    
    ## We will also define the minimum and maximum of the initial volume domain to explore.
    minx = min(constable$IV)
    maxx = max(constable$IV)
    
    ## Start with High vs Initial
    intersect.HvsI = inter(b.High, a.High, b.Initial, a.Initial)
    ## Narrow this to the likely domain
    intersect.HvsI.1 <- subset(intersect.HvsI, x < 15^(1/3) & x > minx^(1/3))$x
    comp.HvsI.1 = tidyMCMC(as.mcmc(intersect.HvsI.1^3), conf.int = TRUE, conf.method = "HPDinterval")
    
    intersect.HvsI.2 <- subset(intersect.HvsI, x < 45.5^(1/3) & x > 17^(1/3))$x
    comp.HvsI.2 = tidyMCMC(as.mcmc(intersect.HvsI.2^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsI = data.frame(lower = comp.HvsI.1$conf.high, upper = comp.HvsI.2$conf.low))
    
         lower    upper
    1 8.654387 31.48765
    
    ## Now High vs Low
    intersect.HvsL = inter(b.High, a.High, b.Low, a.Low)$x
    comp.HvsL = tidyMCMC(as.mcmc(intersect.HvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.HvsL = data.frame(lower = comp.HvsL$conf.high, upper = maxx))
    
         lower upper
    1 10.36942  47.5
    
    ## Finally Initial vs Low
    intersect.IvsL = inter(b.Initial, a.Initial, b.Low, a.Low)$x
    comp.IvsL = tidyMCMC(as.mcmc(intersect.IvsL^3), conf.int = TRUE, conf.method = "HPDinterval")
    (comp.IvsL = data.frame(lower = comp.IvsL$conf.high, upper = maxx))
    
         lower upper
    1 22.98159  47.5
    
  13. Generate a summary figure.
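    Each version below follows the same recipe: build a model matrix $\mathbf{X}$ over a fine grid of initial volumes and multiply it against the $s \times p$ matrix of posterior coefficient draws $\mathbf{B}$, $$\mathbf{F} = \mathbf{B}\mathbf{X}^\top$$ so that each row of $\mathbf{F}$ is one posterior draw of the entire fitted trend; tidyMCMC() then summarises each column of $\mathbf{F}$ into a point estimate and HPD ribbon for plotting.
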
    library(MCMCpack)
    constable.mcmc = constable.mcmcpack
    newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
    Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    
    newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
    ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) + geom_point(data = constable, aes(y = SUTW,
        x = IV, shape = TREAT, fill = TREAT)) + geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1,
        1], xend = comp.HvsI[1, 2])) + annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]),
        label = "High vs Initial") + geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1,
        1], xend = comp.HvsL[1, 2])) + annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]),
        label = "High vs Low") + geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1,
        2])) + annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
        geom_line() + geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) + scale_y_continuous("Suture width (mm)") +
        scale_x_continuous("Initial volume (ml)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3,
            domain = c(0, Inf))) + scale_shape_manual("Food regime", values = c(21, 21, 21)) + scale_fill_manual("Food regime",
        values = c("black", "grey40", "white")) + scale_linetype_manual("Food regime", values = c("solid",
        "dashed", "dotted")) + theme_classic() + theme(legend.justification = c(0, 1), legend.position = c(0,
        1), legend.key.width = unit(1, "cm"), axis.title.y = element_text(vjust = 2, size = rel(1.25)), axis.title.x = element_text(vjust = -2,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    plot of chunk tut7.5bQ2.10a
    constable.mcmc = constable.r2jags$BUGSoutput$sims.matrix
    newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
    Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    
    newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
    ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
        geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
        geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
        annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
        geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
        annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
        geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
        annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
        geom_line() +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
        scale_y_continuous("Suture width (mm)") +
        scale_x_continuous("Initial volume (ml)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
        scale_shape_manual("Food regime", values = c(21, 21, 21)) +
        scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
        scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
        theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)),
            axis.title.x = element_text(vjust = -2, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    plot of chunk tut7.5bQ2.10b
    constable.mcmc = as.matrix(constable.rstan)
    newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
    Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    
    newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
    ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
        geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
        geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
        annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
        geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
        annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
        geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
        annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
        geom_line() +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
        scale_y_continuous("Suture width (mm)") +
        scale_x_continuous("Initial volume (ml)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
        scale_shape_manual("Food regime", values = c(21, 21, 21)) +
        scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
        scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
        theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)),
            axis.title.x = element_text(vjust = -2, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    plot of chunk tut7.5bQ2.10c
    constable.mcmc = as.matrix(constable.rstanarm)
    newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
    Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    
    newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
    ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
        geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
        geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
        annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
        geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
        annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
        geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
        annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
        geom_line() +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
        scale_y_continuous("Suture width (mm)") +
        scale_x_continuous("Initial volume (ml)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
        scale_shape_manual("Food regime", values = c(21, 21, 21)) +
        scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
        scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
        theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)),
            axis.title.x = element_text(vjust = -2, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    plot of chunk tut7.5bQ2.10d
    constable.mcmc = as.matrix(constable.brm)
    newdata <- with(constable, expand.grid(TREAT = levels(TREAT), IV = seq(min(IV), max(IV), len = 100)))
    Xmat <- model.matrix(~I(IV^(1/3)) * TREAT, data = newdata)
    wch = grep("b_", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    
    newdata = newdata %>% bind_cols(tidyMCMC(fit, conf.int = TRUE, conf.method = "HPDinterval"))
    ggplot(newdata, aes(y = estimate, x = IV, linetype = TREAT)) +
        geom_point(data = constable, aes(y = SUTW, x = IV, shape = TREAT, fill = TREAT)) +
        geom_segment(data = NULL, aes(y = 0.02, yend = 0.02, x = comp.HvsI[1, 1], xend = comp.HvsI[1, 2])) +
        annotate(geom = "text", y = 0.025, x = rowMeans(comp.HvsI[1, 1:2]), label = "High vs Initial") +
        geom_segment(data = NULL, aes(y = 0.01, yend = 0.01, x = comp.HvsL[1, 1], xend = comp.HvsL[1, 2])) +
        annotate(geom = "text", y = 0.015, x = rowMeans(comp.HvsL[1, 1:2]), label = "High vs Low") +
        geom_segment(data = NULL, aes(y = 0, yend = 0, x = comp.IvsL[1, 1], xend = comp.IvsL[1, 2])) +
        annotate(geom = "text", y = 0.005, x = rowMeans(comp.IvsL[1, 1:2]), label = "Initial vs Low") +
        geom_line() +
        geom_ribbon(aes(ymin = conf.low, ymax = conf.high), fill = "grey", alpha = 0.2) +
        scale_y_continuous("Suture width (mm)") +
        scale_x_continuous("Initial volume (ml)", trans = trans_new("root", function(x) x^(1/3), function(x) x^3, domain = c(0, Inf))) +
        scale_shape_manual("Food regime", values = c(21, 21, 21)) +
        scale_fill_manual("Food regime", values = c("black", "grey40", "white")) +
        scale_linetype_manual("Food regime", values = c("solid", "dashed", "dotted")) +
        theme_classic() +
        theme(legend.justification = c(0, 1), legend.position = c(0, 1), legend.key.width = unit(1, "cm"),
            axis.title.y = element_text(vjust = 2, size = rel(1.25)),
            axis.title.x = element_text(vjust = -2, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 2, 2), "lines"))
    
    plot of chunk tut7.5bQ2.10e
  14. Explore $R^2$
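    The quantity computed below is a posterior analogue of $R^2$ in the variance-decomposition spirit of Gelman and colleagues: for each posterior draw, $$R^2 = \frac{\mathrm{var}(\hat{y})}{\mathrm{var}(\hat{y}) + \mathrm{var}(y-\hat{y})}$$ where $\hat{y}$ are that draw's fitted values and $y-\hat{y}$ its residuals. Summarising across draws then gives an estimate and credible interval directly comparable to the frequentist multiple $R^2$.
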
    library(MCMCpack)
    library(broom)
    constable.mcmc <- constable.mcmcpack
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    wch = grep("Intercept|IV|TREAT", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, constable$SUTW, "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term estimate  std.error  conf.low conf.high
    1 var1 0.488892 0.06173019 0.3642251  0.596765
    
    # for comparison with frequentist
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    library(broom)
    constable.mcmc <- constable.r2jags$BUGSoutput$sims.matrix
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, constable$SUTW, "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.4889044 0.05956118 0.3673575 0.5925647
    
    # for comparison with frequentist
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    library(broom)
    constable.mcmc <- as.matrix(constable.rstan)
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    wch = c(which(colnames(constable.mcmc) == "beta0"), grep("beta\\[", colnames(constable.mcmc)))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, constable$SUTW, "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.4882446 0.06035995 0.3734576 0.6005657
    
    # for comparison with frequentist
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    library(broom)
    constable.mcmc <- as.matrix(constable.rstanarm)
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    wch = grep("Intercept|TREAT|IV", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, constable$SUTW, "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.4892397 0.05991533 0.3732155 0.5966039
    
    # for comparison with frequentist
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09
    
    library(broom)
    constable.mcmc <- as.matrix(constable.brm)
    Xmat = model.matrix(~I(IV^(1/3)) * TREAT, data = constable)
    wch = grep("b_", colnames(constable.mcmc))
    coefs = constable.mcmc[, wch]
    fit = coefs %*% t(Xmat)
    resid = sweep(fit, 2, constable$SUTW, "-")
    var_f = apply(fit, 1, var)
    var_e = apply(resid, 1, var)
    R2 = var_f/(var_f + var_e)
    tidyMCMC(as.mcmc(R2), conf.int = TRUE, conf.method = "HPDinterval")
    
      term  estimate  std.error  conf.low conf.high
    1 var1 0.4888112 0.06064694 0.3706096 0.5948545
    
    # for comparison with frequentist
    summary(lm(SUTW ~ I(IV^(1/3)) * TREAT, data = constable))
    
    Call:
    lm(formula = SUTW ~ I(IV^(1/3)) * TREAT, data = constable)
    
    Residuals:
          Min        1Q    Median        3Q       Max 
    -0.047764 -0.012727  0.001153  0.012959  0.065303 
    
    Coefficients:
                               Estimate Std. Error t value Pr(>|t|)    
    (Intercept)              -0.0282017  0.0307338  -0.918 0.362163    
    I(IV^(1/3))               0.0467744  0.0119981   3.898 0.000229 ***
    TREATInitial             -0.0005899  0.0380584  -0.015 0.987680    
    TREATLow                  0.0630730  0.0373498   1.689 0.095994 .  
    I(IV^(1/3)):TREATInitial -0.0113263  0.0145247  -0.780 0.438302    
    I(IV^(1/3)):TREATLow     -0.0377210  0.0141608  -2.664 0.009700 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 0.02048 on 66 degrees of freedom
    Multiple R-squared:  0.4958,	Adjusted R-squared:  0.4576 
    F-statistic: 12.98 on 5 and 66 DF,  p-value: 8.431e-09