

Tutorial 9.1 - Linear mixed effects models

25 Apr 2018

Overview

For the purpose of reorientation and overall context, I present (again) the following representation of the linear model.
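In matrix form (for $n$ observations), this representation is: $$ \begin{align} Y &\sim{} Normal(\boldsymbol{\mu}, \mathbf{V})\\ \boldsymbol{\mu} &= \bf{X}\boldsymbol{\beta}\\ \mathbf{V} &= \sigma^2\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix} \end{align} $$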
Of note, one of the primary assumptions of linear modelling is that the residuals are all independent of one another. In the representation above, this is depicted by all the zeros in the off-diagonals of the covariance matrix.

If landscapes, individuals (subjects) etc. were homogeneous, scientific experiments would be much easier to conduct. One of the greatest challenges in designing experiments (particularly those in the field or those involving organisms) is to minimize unexplained variability (noise) so as to maximize the likelihood of detecting any signals (effects). One common strategy is to collect additional covariates in an attempt to standardize or control for other influences. However, this is a statistical means of controlling for predictable sources of noise.

When the source of heterogeneity is difficult to identify or measure, collecting additional covariates may not be effective. For example, in designing a field experiment that will involve measuring some property of a natural system (such as the abundance of a taxon within quadrats), we may be concerned that the underlying landscape varies enormously and that this variation could mask any treatment effects. That is, the unexplained variability will reduce the power to detect effects.

One way to compensate for this is to provision a very large number of replicates. However, this will necessarily require substantially more resources and effort. An alternative is to group together sets of sampling units at sufficiently small scales that the underlying conditions are relatively homogeneous. For example, rather than distribute the sampling units randomly (or haphazardly) throughout the landscape, clusters of sampling units (called blocks or plots) could be distributed throughout the landscape. Similarly, rather than subject each individual to a single treatment, each individual could be subjected to each of the treatments sequentially. In this way, the unexplained variability is based on variation within a cluster or individual - clusters and individuals partially act as their own control.

[Figures: schematic experimental designs - single factor vs nested (top row); single factor vs randomized complete block (bottom row)]

The series of figures above illustrates some of the issues addressed by hierarchical designs. The top left figure represents an example of a single factor design in which there are three sites (replicates) of the treatment factor (Burnt or Unburnt) and within each site there is a single haphazardly positioned quadrat from which some response was observed. If the conditions within a site are highly heterogeneous (perhaps the sites are not simple homogeneous patches of forest but rather mixtures of forest types over undulating terrain), then obviously the locations of the quadrats within the landscapes are likely to be very influential. For example, if we were observing the abundance of ground dwelling spiders (as measured by the number of ground spiders captured in a single pit trap per site), yet each site had a mixture of habitats, ground cover and soil types, then the measured abundance of ground spiders is likely to depend on exactly where within a site the pit trap is located.

Given this heterogeneity, and considering there are only three replicates of each treatment, the amount of noise (unexplained variation) is likely to be very large and make it difficult to detect an effect of treatment. One solution would be to use more sites (perhaps many more) so that, once averaged over a much larger set, the signal (treatment effect) starts to become apparent. However, this would require much more effort and resources and may not even be possible (for example, it may not be possible to obtain more burnt sites).

Another option is to use sub-replicates within each site. That is, rather than deploy a single pit trap, deploy many more. The top right figure shows the same Burnt and Unburnt sites, yet with four quadrats instead of one. If these sub-replicates are positioned randomly throughout the heterogeneous landscape (thereby covering a range of conditions) and we average the observations across the sub-replicates within each site, then hopefully we will have reduced the noise (unexplained variability).

Now let's focus on the bottom left figure above. This depicts a two factor (shape and color) design with each shape/color combination replicated four times. Importantly, note that unlike the previous nested example, in this example the scale of the treatments is the scale of the sampling unit. For example, in the Burnt/Unburnt example, a forest fire operates at a much larger scale than the typical sampling unit (a pit trap). However, in the current example, the treatment (such as bait type, shaded/sheltered or not, etc.) can be applied at the scale of the sampling unit (pit trap, stream leaf pack etc.). Again note that the quadrats (sampling units) are randomly positioned across the landscape. If the landscape in which the quadrats are positioned is again heterogeneous, the treatment effects could be masked or otherwise swamped by the extra variation due to location. For example, if the quadrats were placed in a stream (which we know to be highly heterogeneous), the observed responses might be more influenced by where they are measured than by any applied treatments.

In this case by grouping together each treatment combination into a smaller spatial scale (see lower right figure above) in which the conditions are likely to be more homogeneous, we should substantially reduce the amount of unexplained variation and thus increase the power to detect treatment effects. Importantly, to avoid introducing other systematic confounding effects, it is important that the treatments themselves be randomized within each group (block).

Such nested, blocking and repeated measures designs (collectively referred to as hierarchical designs, since sampling units are grouped within other 'larger scale' spatial or temporal units) are popular ways to reduce unexplained variability and thus increase power.

Obviously it is important that the groups of sampling units not be so close to one another that they directly influence one another. Nevertheless, sampling units within groups (=blocks) are not strictly independent of one another. The unique underlying characteristics of an individual subject or block mean that the multiple observations within it will likely be more similar to one another than they are to observations collected from other subjects or blocks. Each subject or block has its own unique baseline (e.g. heart rate) upon which the treatments act, and it is the same baseline for all observations within the group.

This dependency issue can be tackled in much the same way as a lack of independence due to temporal or spatial autocorrelation (see Tutorial 8.3a) - that is, by incorporating a variance structure that approximates the patterns of dependency. Indeed, when fitting a hierarchical model, we are essentially fitting a model that incorporates a compound symmetry correlation structure within each group. Yet in order to leverage some other statistical information, rather than interact directly with the correlation structure, we typically engage a procedure that allows us to define the hierarchical structure of the data, and the model takes care of the correlation structure for us.
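As a sketch, for a group (block) containing four observations, the compound symmetry structure implies a within-group correlation matrix of the form $$ \mathbf{R} = \begin{pmatrix} 1 & \rho & \rho & \rho\\ \rho & 1 & \rho & \rho\\ \rho & \rho & 1 & \rho\\ \rho & \rho & \rho & 1 \end{pmatrix} $$ where $\rho$ is the common correlation between any two observations within the same group (observations in different groups are assumed uncorrelated).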

Central to any discussion of hierarchical models in a frequentist paradigm is the notion of fixed and random factors or effects.

  • Fixed factors - these are factors (continuous or categorical) whose measured levels you are specifically interested in. For example, we might have a treatment with levels 'high', 'medium' and 'low'. These are the only levels that we are interested in - we are not interested in extrapolating to other possible levels. Fixed effects are usually expressed as differences (or slopes).
  • Random factors - these are factors (usually categorical) that represent groups of sampling units (such as sites, subjects etc.). We are typically not interested in the effect of any one level of a random factor (such as how one individual compares to another), as the levels of random factors (the selected subjects or blocks) are assumed to be random representatives of a larger population. Instead, we are more interested in estimating how variable the groups are (how varied the landscape or subjects were). Note that the individual replicates are always considered random factors (since they are intended to be randomly selected representatives whose collective variability is of more interest than their individual differences).

Hierarchical models contain a mixture of 'fixed' and 'random' effects and are therefore also referred to as Mixed Effects Models. Recall that the typical linear model is of the form: $$ \begin{align} Y &\sim{} Normal(\mu, \sigma^2)\\ \mu &= \bf{X}\boldsymbol{\beta} \end{align} $$ where $\boldsymbol{\beta}$ represents a vector of standard regression coefficients (such as intercept and slopes) being estimated and $\bf{X}$ represents the predictor (model) matrix.
The typical mixed effects model is of the form: $$ \begin{align} Y &\sim{} Normal(\mu, \sigma^2)\\ \mu &= \bf{X}\boldsymbol{\beta} + \bf{Z}\boldsymbol{\gamma}\\ \boldsymbol{\gamma} &\sim{} MVNormal(0, \boldsymbol{\Sigma})\\ \boldsymbol{\Sigma} &= f(\theta) \end{align} $$ where $\boldsymbol{\gamma}$ represents a vector of coefficients associated with the random effects and $\bf{Z}$ represents the random effects (model) matrix.

Before proceeding any further, it might be a good idea to generate some data to help us visualize the concepts. We will generate a response (y) and continuous predictor (x) from 10 sampling units in each of 6 groups or blocks (block). A unit increase in x will be associated with a 1.5 unit increase in y. The baseline (intercept) of each block will be drawn from a normal distribution with a mean of 230 and a standard deviation of 20.

set.seed(1)
n.groups <- 6
n.sample <- 10
n <- n.groups * n.sample
block <- gl(n = n.groups, k = n.sample, lab = paste("Block",
    1:n.groups, sep = ""))
x <- runif(n, 0, 70)
mn <- mean(x)
sd <- sd(x)
cx <- (x - mn)  #/sd
# Xmat <- model.matrix(~block * cx - 1 - cx)  # separate intercepts and slopes per block (not used here)
Xmat <- model.matrix(~-1 + block + x)  # separate intercepts per block, common slope
intercept.mean <- 230
intercept.sd <- 20
slope.mean <- 1.5
# slope.sd <- 0.3
intercept.effects <- rnorm(n = n.groups, mean = intercept.mean,
    sd = intercept.sd)
# slope.effects <- rnorm(n=n.groups,
# mean=slope.mean, sd=slope.sd) #intercepts and
# slopes
slope.effects <- slope.mean
all.effects <- c(intercept.effects, slope.effects)
lin.pred <- Xmat %*% all.effects  # linear predictor
eps <- rnorm(n = n, mean = 0, sd = 10)
y <- lin.pred + eps
data.hier <- data.frame(y = y, x = cx + mn, block = block)
head(data.hier)
         y        x  block
1 281.1091 18.58561 Block1
2 295.6535 26.04867 Block1
3 328.3234 40.09974 Block1
4 360.1672 63.57455 Block1
5 276.7050 14.11774 Block1
6 348.9709 62.88728 Block1
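To connect these data to the mixed effects formulation above, note that the random effects model matrix $\bf{Z}$ for a random intercepts model is just the matrix of block indicator variables. A minimal sketch:

Z <- model.matrix(~-1 + block, data.hier)  # one indicator column per block
head(Z, 3)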

Let us ignore the grouping (dependency) structure for a moment and proceed with a simple linear regression (we will perform this via gls() so that we can directly compare it to models that incorporate correlation structures).

library(car)
scatterplot(y ~ x, data.hier)
library(nlme)
data.hier.gls <- gls(y ~ x, data.hier, method = "REML")

As always, we will explore the residuals. Firstly, we will explore the usual residual plot, then we will plot residuals against block. If there was no dependency structure (observations in groups were no more correlated to one another than they are to units in other groups), the spread of residuals should be similarly centered around zero for each block.

plot(data.hier.gls)
plot(residuals(data.hier.gls, type = "normalized") ~
    data.hier$block)
Conclusions - whilst the residual plot does not immediately point to any obvious issues (although the residuals do show some degree of clumping, which might suggest some form of dependency), the boxplots of residuals for each block indicate that some blocks differ substantially from others.

Detecting such issues in the residuals is perhaps not surprising - after all, this design is employed in situations where substantial differences between blocks are expected. The model fitted above was not appropriate for two reasons.

Firstly, it failed to account for the differences between blocks and therefore the unexplained variation was inflated. We might be tempted to add the grouping variable (block) to the model in an attempt to account for some of this unexplained variability. In other words, could we not treat the analysis as an ANCOVA?

library(ggplot2)
ggplot(data.hier, aes(y = y, x = x, fill = block, color = block)) +
    geom_smooth(method = "lm") + geom_point() + theme_classic()
After all, the figure above shows that the slopes are similar for each group - so it looks like x and block act additively on y.
data.hier.gls1 <- gls(y ~ x + block, data.hier, method = "REML")
plot(data.hier.gls1)
plot(residuals(data.hier.gls1, type = "normalized") ~
    data.hier$block)

At first glance, this does seem to account for the different blocks. It is worth pointing out, however, that accounting for the block effects in this way comes at a substantial cost in degrees of freedom. Since there are six blocks, we have invested an additional five degrees of freedom.

Unfortunately (and secondly), this modelling option also completely neglects the issue of dependency: observations within blocks are still treated as if they were independent, when they are not.

A more appropriate alternative is to model in a correlation structure that accounts for the expected dependency. For a simple example such as this, we can construct a model that incorporates a compound symmetry correlation structure within each block. The somewhat peculiar 1|block notation in the corCompSymm() function defines the grouping: observations within the same block share a common correlation (Rho), whilst observations in different blocks are assumed uncorrelated.

data.hier.gls2 <- gls(y ~ x, data.hier, correlation = corCompSymm(form = ~1 |
    block), method = "REML")
plot(residuals(data.hier.gls2, type = "normalized") ~
    fitted(data.hier.gls2))
plot(residuals(data.hier.gls2, type = "normalized") ~
    data.hier$block)

Whilst the above technique is adequate for enabling appropriate estimates and tests of the main effect of interest (the relationship between y and x), it does not allow us to explore the breakdown of variance across the different hierarchical levels. It also makes it hard to impose additional correlation structures (such as those used to address temporal or spatial autocorrelation) on the data within each group.

So could we not just fit an ANCOVA along with the correlation structure?

data.hier.gls3 <- gls(y ~ x + block, data.hier, correlation = corCompSymm(form = ~1 |
    block), method = "REML")

There are two issues with this model. Firstly, a degree of freedom is lost for each estimate associated with the block factor. Since we are not specifically interested in the 'fixed' effects of block (that is, we are not interested in how each block differs from a reference block), generating such estimates only serves to rob the model of valuable degrees of freedom. Rather than estimate separate treatment effects for the levels of block, a more sensible approach is to estimate just the added variance (a single parameter) due to the blocking effect - that is, treat it as a 'random' effect.

Secondly, by incorporating block in the model as above, the intercept will represent the value of Y when X is zero for the first block. Again, we typically wish to generalize across all possible blocks. Thus it would be more sensible for the intercept to represent the value of Y when X is zero for the average block. Although we could achieve this by re-coding the blocking factor with something like sum contrasts, it turns out that there is a more natural solution.
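For completeness, a hedged sketch of the sum contrasts approach just mentioned (blockS is a hypothetical re-coded copy of the blocking factor):

# re-code block with sum-to-zero contrasts so the intercept reflects the
# 'average' block (a sketch only - the mixed model below is preferable)
data.hier$blockS <- C(data.hier$block, contr.sum)
data.hier.glsS <- gls(y ~ x + blockS, data.hier, method = "REML")
coef(data.hier.glsS)[1:2]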

Optimization

Virtually all parameter estimation involves optimization: a process in which a function is (typically) minimized over a range of parameter values. For many statistical modelling purposes, the function that we wish to optimize (the objective function) is a likelihood function, and rather than minimize this function, we seek to maximize it. Optimization techniques are conventionally defined as minimizers and will be described as such here. That said, maximizing a function is just the same as minimizing the negative of the function.

The basic idea is that, assuming the model is correct and there is sufficient data, the likelihood function will be maximized near the true values of the parameters. Moreover, the log-likelihood should be approximately quadratic in close proximity to those values. Hence, optimizing here involves finding the values of the parameters that maximize the log-likelihood function. For simple models, this is relatively straightforward. Indeed, for simple Gaussian models there is a closed form solution that can be obtained via linear algebra. However, if the model is false, or sample sizes are small, or the function has local optima or plateaus, things become increasingly difficult.

Consider for a moment a direct search method of optimizing a single parameter function. You might start by evaluating the function over a vector of possible parameter values and seeing which one yields the minimum (or maximum) log-likelihood. Obviously, the accuracy of this approach depends on the granularity of the vector. If the vector only has five values, then unless you happen to be fortunate enough for the optimum to fall right on one of those values, your estimates are not going to be all that accurate. On the other hand, if your vector is a sequence of one million values, then you are likely to get a very accurate estimate - although it will take substantially longer, since you have to evaluate the function one million times instead of five. Furthermore, if the range of values represented by the vector is too narrow, there is a chance that the true optimum lies outside this range. Hence, the accuracy of the optimization depends on an appropriate choice of range and resolution, and improvements in either will reduce speed.
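A minimal sketch of such a direct search, estimating the mean of a Normal distribution (with known standard deviation) from simulated data:

set.seed(2)
dat <- rnorm(100, mean = 5, sd = 2)
LL <- function(m) sum(dnorm(dat, mean = m, sd = 2, log = TRUE))  # log-likelihood
mus <- seq(0, 10, length = 5)  # coarse grid: fast but inaccurate
mus[which.max(sapply(mus, LL))]
mus <- seq(0, 10, length = 10001)  # fine grid: accurate, but many more evaluations
mus[which.max(sapply(mus, LL))]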

If we take this approach a step further, we could start with a relatively wide, low resolution search grid and then iteratively 'zoom' in to yield progressively more accurate estimates. This approach to optimization is relatively simple, yet very slow, and when there are more than a couple of parameters, the number of parameter combinations to evaluate grows as a power of the number of parameters. This of course compounds the speed and resource issues. Hence, for high dimensional likelihoods, more sophisticated solutions are required.

If, on the other hand, we assume that the likelihood function is relatively smooth and has only a single optimum, we can leverage certain geometric properties in more sophisticated optimizations. For example, the derivative (gradient) of a function is equal to zero at the function's optimum. Similarly, for a multi-parameter likelihood function, the partial derivatives (derivatives with respect to each single parameter, holding all other parameters constant) form a vector of zeros at the optimum. Finding the roots (the parameter values at which the derivative is zero) is still an iterative process, but the search can be made more efficient by using the second order derivatives.

The Newton (or Newton-Raphson) method iteratively updates the parameter estimate by linear extrapolation of the gradient towards its root ($x_1 = x_0 - f'(x_0)/f''(x_0)$, where $f$ is the log-likelihood) until it gets close enough to zero. When there are multiple parameters, the reciprocal of the second derivative is replaced by the inverse of the Hessian matrix (the matrix of second order partial derivatives). This approach works reasonably well provided the initial parameter estimates are not wildly inaccurate and the functions themselves are differentiable. There are also quasi-Newton alternatives that build up approximations of the Hessian from changes in the gradient over iterations.
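A sketch of the Newton-Raphson iteration for the same single parameter log-likelihood as above, using simple numerical derivatives (the update is applied to the first derivative, whose root is the maximum):

set.seed(2)
dat <- rnorm(100, mean = 5, sd = 2)
LL <- function(m) sum(dnorm(dat, mean = m, sd = 2, log = TRUE))
d1 <- function(m, h = 1e-04) (LL(m + h) - LL(m - h))/(2 * h)  # gradient
d2 <- function(m, h = 1e-04) (LL(m + h) - 2 * LL(m) + LL(m - h))/h^2  # curvature
m <- 0  # (deliberately poor) starting value
for (i in 1:50) {
    m.new <- m - d1(m)/d2(m)  # x1 = x0 - f'(x0)/f''(x0)
    if (abs(m.new - m) < 1e-08) break
    m <- m.new
}
m  # converges on the sample mean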

Some approaches constrain the parameter estimates to bounding boxes (lower and upper bounds on each parameter) in order to restrict the search space and thus speed up optimization. Consequently, there are a large number of optimizers available, each designed to address different difficult use cases. Rather than go through all of the optimizers, I will focus on just the general optimizers that are believed to be good compromises between accuracy, speed and complexity (a brief demonstration follows the list).

  • Nelder-Mead - is a direct search algorithm. Whilst it can be very slow, it is generally very reliable. Typically very slow when the number of parameters exceeds 10.
  • BFGS - is a Quasi-Newton (Hessian-based) algorithm originally devised by Broyden, Fletcher, Goldfarb and Shanno in 1970.
  • L-BFGS-B - (Limited memory BFGS - Box constraints) is a box constrained version of BFGS. It is most useful when there are many parameters or when the parameter estimates must be kept within known bounds.
  • nlminb - is a more modern Quasi-Newton (Hessian-based) algorithm that accommodates box constraints. It is thought to be more robust than L-BFGS-B, particularly in more marginal cases where L-BFGS-B fails to converge.
  • bobyqa - (Bound Optimization by Quadratic Approximation) is a quadratic approximation algorithm with support for box constraints that does not use derivatives.
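As a sketch, several of these general purpose optimizers are available directly via base R's optim() and nlminb(); for a well behaved likelihood they should agree to several decimal places (the standard deviation is optimized on the log scale so that it remains positive):

set.seed(2)
dat <- rnorm(100, mean = 5, sd = 2)
# negative log-likelihood; p = c(mean, log(sd))
negLL <- function(p) -sum(dnorm(dat, mean = p[1], sd = exp(p[2]), log = TRUE))
optim(c(0, 0), negLL, method = "Nelder-Mead")$par
optim(c(0, 0), negLL, method = "BFGS")$par
optim(c(0, 0), negLL, method = "L-BFGS-B", lower = c(-10, -5), upper = c(10, 5))$par
nlminb(c(0, 0), negLL)$par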

There is no one best optimizer. An optimizer can fail for a number of reasons:

  • iterations might reach a singular gradient and be forced to stop
  • iterations might exceed the maximum number allowed without converging
  • estimates might stray outside the valid parameter space - all variances must be positive and covariance matrices must be positive definite

ML vs REML

Estimation for linear mixed effects models is via Maximum Likelihood (ML). That is, optimization finds the parameter values that maximize the (log) likelihood of the data. However, the ML method underestimates the variance (random effects) parameters: it yields biased estimates of random effects and unbiased estimates of fixed effects. Restricted Maximum Likelihood (REML) estimates of variance parameters are independent of the fixed effects estimates and are therefore considered unbiased. In effect, REML finds the parameter values that maximize the (log) likelihood of the residuals. The REML method yields unbiased estimates of both fixed and random effects, yet is unsuitable for comparing fixed components across models.

  • When comparing two models that differ only in their fixed structure, it is important that these models are fit using ML (a sketch follows this list).
  • When comparing two models that differ only in their random structure, it is important that these models are fit using REML.
  • When exploring parameter estimates of a final model, the model should be fit using REML.
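As a minimal sketch (using the lme() function from the nlme package, which is introduced below, and the data.hier data generated earlier), a comparison of fixed structures might look like:

# fit the competing fixed structures with ML (not REML) before comparing
m1 <- lme(y ~ x, random = ~1 | block, data.hier, method = "ML")
m0 <- lme(y ~ 1, random = ~1 | block, data.hier, method = "ML")
anova(m0, m1)
# ...then refit the preferred model with REML for the final parameter estimates
m1 <- update(m1, method = "REML")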

Linear mixed effects models in R

It is here that linear mixed effects models (LMMs) come in. In the following sections, I will introduce three packages for performing LMM. These are not the only packages, yet they do represent the major streams of routines. Briefly, these are:

  • nlme package, originally written for S, a very widely used LMM implementation.
  • lme4 package, the more modern (G)LMM implementation. Faster and handles crossed random effects, but has no 'R-side' effects (no autocorrelation or variance structures).
  • glmmTMB package, an alternative LMM implementation.

Each of the above offers a different underlying engine and different capabilities, and therefore the choice of package will depend on the nature of the data and the desired model. Some of the major distinctions between these packages are summarised in the following table:

                                  nlme                                           lme4                             glmmTMB
REML                              Y                                              Y                                N
optimizer                         nlminb (default) or optim (BFGS or L-BFGS-B)   bobyqa (default) or Nelder_Mead  nlminb
spatial/temporal autocorrelation  Y                                              N                                Y
variance structures               Y                                              N                                Y
crossed random effects            N                                              Y                                Y


Each of these packages has its own pros and cons, and therefore a working understanding of each implementation is still worthwhile.

We will start by fitting the linear mixed effects model.
data.hier.lme <- lme(y ~ x, random = ~1 | block, data.hier,
    method = "REML")

The hierarchical random effects structure is defined by the random= argument. In this case, random=~1|block indicates that block is a random effect and that the intercept should be allowed to vary per block. If we wished to allow both the intercept and the slope to vary per block, the argument would instead be something like random=~x|block. Note that the scatterplot above indicated that the slopes were similar for each group, so there was no real need for us to fit a random intercepts/slopes model.

Nevertheless, we could explore whether there is a statistical basis to use the more complex model, by comparing the random intercept and random intercept/slope models via AIC.

data.hier.lme1 <- lme(y ~ x, random = ~x | block, data.hier, method = "REML")
Error in lme.formula(y ~ x, random = ~x | block, data.hier, method = "REML"): nlminb problem, convergence error code = 1
  message = iteration limit reached without convergence (10)
# Try the BFGS optimization algorithm instead
data.hier.lme1 <- lme(y ~ x, random = ~x | block, data.hier, method = "REML",
    control = lmeControl(opt = "optim", method = "BFGS"))
# compare the two models
anova(data.hier.lme, data.hier.lme1)
               Model df      AIC      BIC
data.hier.lme      1  4 458.9521 467.1938
data.hier.lme1     2  6 461.3996 473.7623
                  logLik   Test L.Ratio p-value
data.hier.lme  -225.4760                       
data.hier.lme1 -224.6998 1 vs 2 1.55245  0.4601
Conclusions -
  • the more complex model failed to converge under the default nlminb optimization. That is, the iterative optimization process reached its maximum number of iterations without successive iterations being similar enough to satisfy a convergence-based stopping rule. The BFGS optimization appears to have been successful.
  • there is little support to indicate that the more complex random intercepts/slopes model is better than the simpler random intercepts model

After fitting a model (any model), the model should always be validated to ensure that it is appropriate and has fit the data well. At the very least, this should involve an exploration of the residuals.

plot(data.hier.lme)

The fitted model is itself a list of class lme. Although it is possible to explore the contents of this list in order to extract the important components or perform additional analyses, it is usually more convenient to access the information via extractor functions and methods.

str(data.hier.lme)
List of 18
 $ modelStruct :List of 1
  ..$ reStruct:List of 1
  .. ..$ block:Classes 'pdLogChol', 'pdSymm', 'pdMat'  atomic [1:1] 0.71
  .. .. .. ..- attr(*, "formula")=Class 'formula'  language ~1
  .. .. .. .. .. ..- attr(*, ".Environment")=<environment: 0x74921e0> 
  .. .. .. ..- attr(*, "Dimnames")=List of 2
  .. .. .. .. ..$ : chr "(Intercept)"
  .. .. .. .. ..$ : chr "(Intercept)"
  .. ..- attr(*, "settings")= num [1:4] 1 1 0 4
  .. ..- attr(*, "class")= chr "reStruct"
  .. ..- attr(*, "plen")= Named int 1
  .. .. ..- attr(*, "names")= chr "block"
  ..- attr(*, "settings")= num [1:4] 1 0 1 4
  ..- attr(*, "class")= chr [1:3] "lmeStructInt" "lmeStruct" "modelStruct"
  ..- attr(*, "pmap")= logi [1, 1] TRUE
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr "reStruct"
  ..- attr(*, "fixedSigma")= logi FALSE
 $ dims        :List of 5
  ..$ N    : int 60
  ..$ Q    : int 1
  ..$ qvec : Named num [1:3] 1 0 0
  .. ..- attr(*, "names")= chr [1:3] "block" "" ""
  ..$ ngrps: Named int [1:3] 6 1 1
  .. ..- attr(*, "names")= chr [1:3] "block" "X" "y"
  ..$ ncol : Named num [1:3] 1 2 1
  .. ..- attr(*, "names")= chr [1:3] "block" "" ""
 $ contrasts   : Named list()
 $ coefficients:List of 2
  ..$ fixed : Named num [1:2] 232.82 1.46
  .. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
  ..$ random:List of 1
  .. ..$ block: num [1:6, 1] 26.15 1.01 7.5 -1.87 -29.01 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:6] "Block1" "Block2" "Block3" "Block4" ...
  .. .. .. ..$ : chr "(Intercept)"
 $ varFix      : num [1:2, 1:2] 61.20549 -0.14586 -0.14586 0.00407
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "(Intercept)" "x"
  .. ..$ : chr [1:2] "(Intercept)" "x"
 $ sigma       : num 8.91
 $ apVar       : num [1:2, 1:2] 0.104932 -0.000227 -0.000227 0.009434
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "reStruct.block" "lSigma"
  .. ..$ : chr [1:2] "reStruct.block" "lSigma"
  ..- attr(*, "Pars")= Named num [1:2] 2.9 2.19
  .. ..- attr(*, "names")= chr [1:2] "reStruct.block" "lSigma"
  ..- attr(*, "natural")= logi TRUE
 $ logLik      : num -225
 $ numIter     : NULL
 $ groups      :'data.frame':	60 obs. of  1 variable:
  ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ call        : language lme.formula(fixed = y ~ x, data = data.hier,      random = ~1 | block, method = "REML")
 $ terms       :Classes 'terms', 'formula'  language y ~ x
  .. ..- attr(*, "variables")= language list(y, x)
  .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:2] "y" "x"
  .. .. .. ..$ : chr "x"
  .. ..- attr(*, "term.labels")= chr "x"
  .. ..- attr(*, "order")= int 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..- attr(*, "predvars")= language list(y, x)
  .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
  .. .. ..- attr(*, "names")= chr [1:2] "y" "x"
 $ method      : chr "REML"
 $ fitted      : num [1:60, 1:2] 260 271 291 326 253 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. ..$ : chr [1:2] "fixed" "block"
 $ residuals   : num [1:60, 1:2] 21.2 24.8 37 34.6 23.3 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. ..$ : chr [1:2] "fixed" "block"
  ..- attr(*, "std")= num [1:60] 8.91 8.91 8.91 8.91 8.91 ...
 $ fixDF       :List of 2
  ..$ X    : Named num [1:2] 53 53
  .. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
  ..$ terms: Named num [1:2] 53 53
  .. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
  ..- attr(*, "assign")=List of 2
  .. ..$ (Intercept): int 1
  .. ..$ x          : int 2
  ..- attr(*, "varFixFact")= num [1:2, 1:2] 7.4818 -2.2866 0 0.0638
 $ na.action   : NULL
 $ data        :'data.frame':	60 obs. of  3 variables:
  ..$ y    : num [1:60] 281 296 328 360 277 ...
  ..$ x    : num [1:60] 18.6 26 40.1 63.6 14.1 ...
  ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "class")= chr "lme"
The most useful extractor functions and methods for an lme object are:
Generic Method Package Description
ACF ACF.lme nlme Autocorrelation Function for lme Residuals
AIC AIC stats Akaike's An Information Criterion
AICc AICc bbmle Log likelihoods and model selection for mle2 objects
AICc AICc MuMIn Second-order Akaike Information Criterion
allEffects allEffects effects Functions For Constructing Effect Displays
allEffects allEffects RSiena Internal data frame used to construct effect objects.
anova anova.lme nlme Compare Likelihoods of Fitted Objects
augment augment.lme broom Tidying methods for mixed effects models
coef coef.lme nlme Extract lme Coefficients
Effect Effect.lme effects Functions For Constructing Effect Displays for Many Modeling Paradigms
fitted fitted.lme nlme Extract lme Fitted Values
fixef fixef.lme nlme Extract Fixed Effects
glance glance.lme broom Tidying methods for mixed effects models
glht glht lsmeans 'lsmeans' support for 'glht'
glht glht multcomp General Linear Hypotheses
intervals intervals.lme nlme Confidence Intervals on lme Parameters
logLik logLik.lme nlme Log-Likelihood of an lme Object
lsmeans lsmeans emmeans Wrappers for alternative naming of EMMs
lsmeans lsmeans lmerTest Calculates Least Squares Means and Confidence Intervals for the factors of a fixed part of mixed effects model of lmer object.
lsmeans lsmeans lsmeans Least-squares means (or predicted marginal means)
plot plot.lme nlme Plot an lme or nls object
predict predict.lme nlme Predictions from an lme Object
ranef ranef.lme nlme Extract lme Random Effects
residuals residuals.lme nlme Extract lme Residuals
r.squaredGLMM r.squaredGLMM.lme MuMIn Pseudo-R-squared for Generalized Mixed-Effect models
sigma sigma.lme nlme Fitted lme Object
summary summary.lme nlme Summarize an lme Object
tidy tidy.lme broom Tidying methods for mixed effects models
vcov vcov.lme stats Calculate Variance-Covariance Matrix for a Fitted Model Object

# Utility: list the methods available for a given class, along with the
# package and help-page title of each
methodsInfo = function(class = "lm") {
    require(tidyverse)
    meth = attr(methods(class = class), "info")
    .generic = meth$generic
    nms = rownames(meth)
    dat1 = list()
    for (i in seq_along(nms)) {
        a = help.search(paste0("^", nms[i], "$"), ignore.case = FALSE)
        aa = a$matches
        if (nrow(aa) == 0) {
            ## get the namespace of the full function
            nmspc = gsub("namespace:", "", getAnywhere(nms[i])$where[2])
            a = help.search(paste0("^", .generic[i],
                "$"), ignore.case = FALSE, field = "alias")
            aa = a$match %>% filter(Entry == .generic[i],
                Package == nmspc)
        }
        d = aa %>% dplyr::select(Title, Package, Entry,
            Field)
        d = d %>% distinct() %>% mutate(Generic = .generic[i]) %>%
            dplyr::select(Package, Generic, Entry,
                Title)
        dat1[[i]] = d
    }
    d = do.call("rbind", dat1)
    rownames(d) = NULL
    d %>% arrange(Package)
}
b = methodsInfo("lme")
b
   Package          Generic                Entry
1    broom          augment          augment.lme
2    broom           glance           glance.lme
3    broom             tidy             tidy.lme
4      car            Anova            Anova.lme
5      car      deltaMethod      deltaMethod.lme
6      car linearHypothesis linearHypothesis.lme
7      car       matchCoefs       matchCoefs.lme
8  effects           Effect           Effect.lme
9  lsmeans        lsm.basis            lsm.basis
10 lsmeans     recover.data         recover.data
11    MASS       extractAIC       extractAIC.lme
12    MASS            terms            terms.lme
13   MuMIn           coeffs               coeffs
14   MuMIn        coefTable        coefTable.lme
15   MuMIn      getAllTerms          getAllTerms
16   MuMIn    r.squaredGLMM        r.squaredGLMM
17    nlme              ACF              ACF.lme
18    nlme            anova            anova.lme
19    nlme          augPred          augPred.lme
20    nlme             coef             coef.lme
21    nlme      comparePred      comparePred.lme
22    nlme           fitted           fitted.lme
23    nlme            fixef                fixef
24    nlme          getData          getData.lme
25    nlme getGroupsFormula getGroupsFormula.lme
26    nlme        getGroups        getGroups.lme
27    nlme      getResponse          getResponse
28    nlme        getVarCov        getVarCov.lme
29    nlme        intervals        intervals.lme
30    nlme           logLik           logLik.lme
31    nlme            pairs            pairs.lme
32    nlme             plot             plot.lme
33    nlme          predict          predict.lme
34    nlme           qqnorm           qqnorm.lme
35    nlme            ranef            ranef.lme
36    nlme        residuals        residuals.lme
37    nlme            sigma            sigma.lme
38    nlme         simulate         simulate.lme
39    nlme          summary          summary.lme
40    nlme           update           update.lme
41    nlme          VarCorr          VarCorr.lme
42    nlme        Variogram        Variogram.lme
43   stats             vcov             vcov.lme
                                                                                      Title
1                                                  Tidying methods for mixed effects models
2                                                  Tidying methods for mixed effects models
3                                                  Tidying methods for mixed effects models
4                                               Anova Tables for Various Statistical Models
5  Estimate and Standard Error of a Nonlinear Function of Estimated Regression Coefficients
6                                                                    Test Linear Hypothesis
7                                                                    Test Linear Hypothesis
8                    Functions For Constructing Effect Displays for Many Modeling Paradigms
9                                           Support functions for creating a reference grid
10                                          Support functions for creating a reference grid
11                                            Choose a model by AIC in a Stepwise Algorithm
12                                            Choose a model by AIC in a Stepwise Algorithm
13                                                                  Model utility functions
14                                                                  Model utility functions
15                                                                  Model utility functions
16                                     Pseudo-R-squared for Generalized Mixed-Effect models
17                                               Autocorrelation Function for lme Residuals
18                                                    Compare Likelihoods of Fitted Objects
19                                                                    Augmented Predictions
20                                                                 Extract lme Coefficients
21                                                                      Compare Predictions
22                                                                Extract lme Fitted Values
23                                                                    Extract Fixed Effects
24                                                                  Extract lme Object Data
25                                                                 Extract Grouping Formula
26                                                                Extract lme Object Groups
27                                                 Extract Response Variable from an Object
28                                                       Extract variance-covariance matrix
29                                                   Confidence Intervals on lme Parameters
30                                                          Log-Likelihood of an lme Object
31                                                              Pairs Plot of an lme Object
32                                                                Plot an lme or nls object
33                                                           Predictions from an lme Object
34                            Normal Plot of Residuals or Random Effects from an lme Object
35                                                               Extract lme Random Effects
36                                                                    Extract lme Residuals
37                                                                        Fitted lme Object
38                                                       Simulate Results from 'lme' Models
39                                                                  Summarize an lme Object
40                                                              Linear Mixed-Effects Models
41                                              Extract variance and correlation components
42                                Calculate Semi-variogram for Residuals from an lme Object
43                           Calculate Variance-Covariance Matrix for a Fitted Model Object

For the sake of comparison, let's explore the output from the gls() and lme() routines.

# output of the gls with correlation structure for
# comparison
summary(data.hier.gls2)
Generalized least squares fit by REML
  Model: y ~ x 
  Data: data.hier 
       AIC      BIC   logLik
  458.9521 467.1938 -225.476

Correlation Structure: Compound symmetry
 Formula: ~1 | block 
 Parameter estimate(s):
      Rho 
0.8052553 

Coefficients:
               Value Std.Error  t-value p-value
(Intercept) 232.8193  7.823394 29.75937       0
x             1.4591  0.063789 22.87392       0

 Correlation: 
  (Intr)
x -0.292

Standardized residuals:
        Min          Q1         Med          Q3 
-2.19174920 -0.59481155  0.05261311  0.59571239 
        Max 
 1.83321624 

Residual standard error: 20.18017 
Degrees of freedom: 60 total; 58 residual
# lme mixed effects model
summary(data.hier.lme)
Linear mixed-effects model fit by REML
 Data: data.hier 
       AIC      BIC   logLik
  458.9521 467.1938 -225.476

Random effects:
 Formula: ~1 | block
        (Intercept) Residual
StdDev:    18.10888 8.905485

Fixed effects: y ~ x 
               Value Std.Error DF  t-value
(Intercept) 232.8193  7.823393 53 29.75937
x             1.4591  0.063789 53 22.87392
            p-value
(Intercept)       0
x                 0
 Correlation: 
  (Intr)
x -0.292

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3 
-2.09947262 -0.57994305 -0.04874031  0.56685096 
        Max 
 2.49464217 

Number of Observations: 60
Number of Groups: 6 
Conclusions - we see that although each routine yields a slightly different output, the estimates are identical.
  • the intercept represents the estimated value of y when x is equal to zero. As with regular regression, this is only really of some interest when the continuous predictor(s) are centered. Note that in the mixed effects context, this intercept represents the mean of the individual block intercepts (which we indicated could vary).
  • the slope represents the common estimated rate of change in y per unit of change in x across all blocks (both observed and unobserved).
  • the Correlation: the value of -0.292 is the approximate correlation between the estimates of the intercept and the slope. Even the authors of the routine admit that this value is difficult to interpret, yet they suggest that it might sometimes speak to issues of multicollinearity when there are multiple predictors, or that it might be difficult to interpret the effect of a predictor if there is a high correlation between its estimate and another estimate.
  • the lme summary includes estimates of the Random effects: these are the standard deviation of the individual block intercepts (between block variability) and the standard deviation of the observations around the individual block trends (within block variability, which in this case are the residuals). It is clear in this example that there is more variability between blocks than within blocks (see the sketch below). Without the blocks, all the variability between blocks would be unaccounted for and simply considered residual unexplained noise. So the blocking has substantially reduced the unexplained variability and no doubt led to more powerful fixed effects tests.
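As a small consistency check (a sketch using the values from the summary above), the proportion of the total variance attributable to blocks should match the compound symmetry Rho (0.8052553) estimated by gls() earlier:

vars <- c(18.10888, 8.905485)^2  # between-block and residual variances
vars[1]/sum(vars)  # ~0.805, the intraclass correlation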

There are also a variety of other ways to summarize the linear mixed effects model. Below, I have included:

  • an anova table that partitions the sums of squares into explained and unexplained components - not very useful
  • extract just the fixed effects coefficients
  • extract just the random effects coefficients. Note, these are how much each individual block intercept differs from the overall intercept, and they sum to (approximately) zero
  • 95% confidence intervals of the estimated parameters. These are very useful, as they help when describing the magnitude of effects
  • variance components expressed as proportions of the total variability, cf. the standard deviation units displayed in the regular summary output above.
anova(data.hier.lme, type = "marginal")
            numDF denDF  F-value p-value
(Intercept)     1    53 885.6203  <.0001
x               1    53 523.2164  <.0001
fixef(data.hier.lme)
(Intercept)           x 
 232.819291    1.459101 
ranef(data.hier.lme)
       (Intercept)
Block1   26.149932
Block2    1.013458
Block3    7.499362
Block4   -1.867983
Block5  -29.007062
Block6   -3.787707
intervals(data.hier.lme)
Approximate 95% confidence intervals

 Fixed effects:
                 lower       est.      upper
(Intercept) 217.127551 232.819291 248.511031
x             1.331156   1.459101   1.587045
attr(,"label")
[1] "Fixed effects:"

 Random Effects:
  Level: block 
                   lower     est.    upper
sd((Intercept)) 9.597555 18.10888 34.16822

 Within-group standard error:
    lower      est.     upper 
 7.361789  8.905485 10.772878 
vc <- as.numeric(as.matrix(VarCorr(data.hier.lme))[,
    1])
vc/sum(vc)
[1] 0.8052553 0.1947447

Finally, there are also additional packages that offer simple and consistent output formats across a wide variety of tests.

library(broom)
tidy(data.hier.lme, effects = "fixed")
         term   estimate  std.error statistic
1 (Intercept) 232.819291 7.82339349  29.75937
2           x   1.459101 0.06378882  22.87392
       p.value
1 9.380593e-35
2 3.941855e-29
glance(data.hier.lme)
     sigma   logLik      AIC      BIC deviance
1 8.905485 -225.476 458.9521 467.1938       NA

In contrast to lme(), the lmer() routine does not specify separate fixed and random formulae. Instead, the random effects are specified within the model formula, as something like (1|block) for a random intercept, or (x|block) for a random intercept and slope.
library(lme4)
data.hier.lmer <- lmer(y ~ x + (1 | block), data = data.hier,
    REML = TRUE)
data.hier.lmer1 <- lmer(y ~ x + (x | block), data = data.hier,
    REML = TRUE)
AIC(data.hier.lmer, data.hier.lmer1)
                df      AIC
data.hier.lmer   4 458.9521
data.hier.lmer1  6 461.2017
anova(data.hier.lmer, data.hier.lmer1)
Data: data.hier
Models:
data.hier.lmer: y ~ x + (1 | block)
data.hier.lmer1: y ~ x + (x | block)
                Df    AIC    BIC  logLik deviance
data.hier.lmer   4 461.04 469.42 -226.52   453.04
data.hier.lmer1  6 463.22 475.79 -225.61   451.22
                 Chisq Chi Df Pr(>Chisq)
data.hier.lmer                          
data.hier.lmer1 1.8206      2     0.4024
Conclusions - we would conclude so far that the more complex random intercept/slope model is not an improvement, and so we would go with the simpler random intercept only model. Note that the AIC values reported by anova() differ from those reported by AIC() above because anova() refits the REML models using ML before comparing them.
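Note also that merMod objects have no intervals() method; confint() plays the analogous role. A minimal sketch (profile intervals can take a few moments to compute):

# 95% confidence intervals for the variance components and fixed effects
confint(data.hier.lmer, method = "profile")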

After fitting a model (any model), the model should always be validated to ensure that it is appropriate and has fit the data well. At the very least, this should involve an exploration of the residuals.

plot(data.hier.lmer)

The fitted model is itself an object of class merMod (specifically, lmerMod). Although it is possible to explore the contents of this object in order to extract the important components or perform additional analyses, it is usually more convenient to access the information via extractor functions and methods.

str(data.hier.lmer)
Formal class 'lmerMod' [package "lme4"] with 13 slots
  ..@ resp   :Reference class 'lmerResp' [package "lme4"] with 9 fields
  .. ..$ Ptr    :<externalptr> 
  .. ..$ mu     : num [1:60] 286 297 317 352 280 ...
  .. ..$ offset : num [1:60] 0 0 0 0 0 0 0 0 0 0 ...
  .. ..$ sqrtXwt: num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ sqrtrwt: num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ weights: num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ wtres  : num [1:60] -4.98 -1.32 10.84 8.44 -2.86 ...
  .. ..$ y      : num [1:60] 281 296 328 360 277 ...
  .. ..$ REML   : int 2
  .. ..and 28 methods, of which 14 are  possibly relevant:
  .. ..  allInfo, copy#envRefClass, initialize,
  .. ..  initialize#lmResp, initializePtr,
  .. ..  initializePtr#lmResp, objective, ptr,
  .. ..  ptr#lmResp, setOffset, setResp,
  .. ..  setWeights, updateMu, wrss
  ..@ Gp     : int [1:2] 0 6
  ..@ call   : language lmer(formula = y ~ x + (1 | block), data = data.hier,      REML = TRUE)
  ..@ frame  :'data.frame':	60 obs. of  3 variables:
  .. ..$ y    : num [1:60] 281 296 328 360 277 ...
  .. ..$ x    : num [1:60] 18.6 26 40.1 63.6 14.1 ...
  .. ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "terms")=Classes 'terms', 'formula'  language y ~ x + (1 + block)
  .. .. .. ..- attr(*, "variables")= language list(y, x, block)
  .. .. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:3] "y" "x" "block"
  .. .. .. .. .. ..$ : chr [1:2] "x" "block"
  .. .. .. ..- attr(*, "term.labels")= chr [1:2] "x" "block"
  .. .. .. ..- attr(*, "order")= int [1:2] 1 1
  .. .. .. ..- attr(*, "intercept")= int 1
  .. .. .. ..- attr(*, "response")= int 1
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. .. .. ..- attr(*, "predvars")= language list(y, x, block)
  .. .. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "factor"
  .. .. .. .. ..- attr(*, "names")= chr [1:3] "y" "x" "block"
  .. .. .. ..- attr(*, "predvars.fixed")= language list(y, x)
  .. .. .. ..- attr(*, "varnames.fixed")= chr [1:2] "y" "x"
  .. .. .. ..- attr(*, "predvars.random")= language list(y, block)
  .. ..- attr(*, "formula")=Class 'formula'  language y ~ x + (1 | block)
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  ..@ flist  :List of 1
  .. ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "assign")= int 1
  ..@ cnms   :List of 1
  .. ..$ block: chr "(Intercept)"
  ..@ lower  : num 0
  ..@ theta  : num 2.03
  ..@ beta   : num [1:2] 232.82 1.46
  ..@ u      : num [1:6] 12.86 0.498 3.688 -0.919 -14.265 ...
  ..@ devcomp:List of 2
  .. ..$ cmp : Named num [1:10] 22.5 10.2 4212.8 387 4599.8 ...
  .. .. ..- attr(*, "names")= chr [1:10] "ldL2" "ldRX2" "wrss" "ussq" ...
  .. ..$ dims: Named int [1:12] 60 60 2 58 6 1 1 1 0 2 ...
  .. .. ..- attr(*, "names")= chr [1:12] "N" "n" "p" "nmp" ...
  ..@ pp     :Reference class 'merPredD' [package "lme4"] with 18 fields
  .. ..$ Lambdat:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. ..@ i       : int [1:6] 0 1 2 3 4 5
  .. .. .. ..@ p       : int [1:7] 0 1 2 3 4 5 6
  .. .. .. ..@ Dim     : int [1:2] 6 6
  .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : NULL
  .. .. .. ..@ x       : num [1:6] 2.03 2.03 2.03 2.03 2.03 ...
  .. .. .. ..@ factors : list()
  .. ..$ LamtUt :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. ..@ i       : int [1:60] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. .. ..@ p       : int [1:61] 0 1 2 3 4 5 6 7 8 9 ...
  .. .. .. ..@ Dim     : int [1:2] 6 60
  .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. .. .. ..@ x       : num [1:60] 2.03 2.03 2.03 2.03 2.03 ...
  .. .. .. ..@ factors : list()
  .. ..$ Lind   : int [1:6] 1 1 1 1 1 1
  .. ..$ Ptr    :<externalptr> 
  .. ..$ RZX    : num [1:6, 1:2] 3.12 3.12 3.12 3.12 3.12 ...
  .. ..$ Ut     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. ..@ i       : int [1:60] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. .. ..@ p       : int [1:61] 0 1 2 3 4 5 6 7 8 9 ...
  .. .. .. ..@ Dim     : int [1:2] 6 60
  .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. ..$ : chr [1:6] "Block1" "Block2" "Block3" "Block4" ...
  .. .. .. .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. .. .. ..@ x       : num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..@ factors : list()
  .. ..$ Utr    : num [1:6] 6424 5916 5759 5795 5386 ...
  .. ..$ V      : num [1:60, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ VtV    : num [1:2, 1:2] 60 0 2151 98116
  .. ..$ Vtr    : num [1:2] 17107 642529
  .. ..$ X      : num [1:60, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. .. .. ..$ : chr [1:2] "(Intercept)" "x"
  .. .. ..- attr(*, "assign")= int [1:2] 0 1
  .. .. ..- attr(*, "msgScaleX")= chr(0) 
  .. ..$ Xwts   : num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ Zt     :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
  .. .. .. ..@ i       : int [1:60] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. .. ..@ p       : int [1:61] 0 1 2 3 4 5 6 7 8 9 ...
  .. .. .. ..@ Dim     : int [1:2] 6 60
  .. .. .. ..@ Dimnames:List of 2
  .. .. .. .. ..$ : chr [1:6] "Block1" "Block2" "Block3" "Block4" ...
  .. .. .. .. ..$ : chr [1:60] "1" "2" "3" "4" ...
  .. .. .. ..@ x       : num [1:60] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..@ factors : list()
  .. ..$ beta0  : num [1:2] 0 0
  .. ..$ delb   : num [1:2] 232.82 1.46
  .. ..$ delu   : num [1:6] 12.86 0.498 3.688 -0.919 -14.265 ...
  .. ..$ theta  : num 2.03
  .. ..$ u0     : num [1:6] 0 0 0 0 0 0
  .. ..and 45 methods, of which 31 are  possibly relevant:
  .. ..  b, beta, CcNumer, copy#envRefClass,
  .. ..  initialize, initializePtr, installPars,
  .. ..  L, ldL2, ldRX2, linPred, P, ptr, RX,
  .. ..  RXdiag, RXi, setBeta0, setDelb, setDelu,
  .. ..  setTheta, setZt, solve, solveU, sqrL, u,
  .. ..  unsc, updateDecomp, updateL,
  .. ..  updateLamtUt, updateRes, updateXwts
  ..@ optinfo:List of 7
  .. ..$ optimizer: chr "bobyqa"
  .. ..$ control  :List of 1
  .. .. ..$ iprint: int 0
  .. ..$ derivs   :List of 2
  .. .. ..$ gradient: num 3.58e-08
  .. .. ..$ Hessian : num [1, 1] 4.21
  .. ..$ conv     :List of 2
  .. .. ..$ opt : int 0
  .. .. ..$ lme4: list()
  .. ..$ feval    : int 44
  .. ..$ warnings : list()
  .. ..$ val      : num 2.03
The most useful extractor functions and methods for a merMod object are:
Generic Method Package Description
AIC AIC stats Akaike's An Information Criterion
AICc AICc bbmle Log likelihoods and model selection for mle2 objects
AICc AICc MuMIn Second-order Akaike Information Criterion
allEffects allEffects effects Functions For Constructing Effect Displays
allEffects allEffects RSiena Internal data frame used to construct effect objects.
anova anova.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
augment augment.merMod broom Tidying methods for mixed effects models
coef coef.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
confint confint.merMod lme4 Compute Confidence Intervals for Parameters of a [ng]lmer Fit
Effect Effect.merMod effects Functions For Constructing Effect Displays for Many Modeling Paradigms
fitted fitted.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
fixef fixef.merMod lme4 Extract fixed-effects estimates
glance glance.merMod broom Tidying methods for mixed effects models
glht glht lsmeans 'lsmeans' support for 'glht'
glht glht multcomp General Linear Hypotheses
logLik logLik.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
lsmeans lsmeans emmeans Wrappers for alternative naming of EMMs
lsmeans lsmeans lmerTest Calculates Least Squares Means and Confidence Intervals for the factors of a fixed part of mixed effects model of lmer object.
lsmeans lsmeans lsmeans Least-squares means (or predicted marginal means)
plot plot.merMod lme4 Diagnostic Plots for 'merMod' Fits
predict predict.merMod lme4 Predictions from a model at new data values
ranef ranef.merMod lme4 Extract the modes of the random effects
residuals residuals.merMod lme4 residuals of merMod objects
r.squaredGLMM r.squaredGLMM.merMod MuMIn Pseudo-R-squared for Generalized Mixed-Effect models
sigma sigma.merMod lme4 Extract Residual Standard Deviation 'Sigma'
summary summary.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
tidy tidy.merMod broom Tidying methods for mixed effects models
vcov vcov.merMod lme4 Class "merMod" of Fitted Mixed-Effect Models
## methodsInfo() as defined earlier
b = methodsInfo("merMod")
b
   Package          Generic
1      arm       extractAIC
2    broom          augment
3    broom           glance
4    broom             tidy
5      car            Anova
6      car      deltaMethod
7      car linearHypothesis
8      car       matchCoefs
9  effects           Effect
10    lme4            anova
11    lme4      as.function
12    lme4             coef
13    lme4          confint
14    lme4         deviance
15    lme4      df.residual
16    lme4            drop1
17    lme4       extractAIC
18    lme4           family
19    lme4           fitted
20    lme4            fixef
21    lme4          formula
22    lme4          fortify
23    lme4             getL
24    lme4            getME
25    lme4        hatvalues
26    lme4           isGLMM
27    lme4            isLMM
28    lme4           isNLMM
29    lme4           isREML
30    lme4           logLik
31    lme4      model.frame
32    lme4     model.matrix
33    lme4            ngrps
34    lme4             nobs
35    lme4             plot
36    lme4          predict
37    lme4            print
38    lme4          profile
39    lme4            ranef
40    lme4            refit
41    lme4          refitML
42    lme4        residuals
43    lme4             show
44    lme4            sigma
45    lme4         simulate
46    lme4          summary
47    lme4            terms
48    lme4           update
49    lme4          VarCorr
50    lme4             vcov
51    lme4          weights
52 lsmeans        lsm.basis
53 lsmeans     recover.data
54   MuMIn           coeffs
55   MuMIn    r.squaredGLMM
                     Entry
1        extractAIC.merMod
2           augment.merMod
3            glance.merMod
4              tidy.merMod
5             Anova.merMod
6       deltaMethod.merMod
7  linearHypothesis.merMod
8        matchCoefs.merMod
9            Effect.merMod
10            anova.merMod
11      as.function.merMod
12             coef.merMod
13          confint.merMod
14         deviance.merMod
15      df.residual.merMod
16            drop1.merMod
17       extractAIC.merMod
18           family.merMod
19           fitted.merMod
20            fixef.merMod
21          formula.merMod
22          fortify.merMod
23      getL,merMod-method
24            getME.merMod
25        hatvalues.merMod
26           isGLMM.merMod
27            isLMM.merMod
28           isNLMM.merMod
29           isREML.merMod
30           logLik.merMod
31      model.frame.merMod
32     model.matrix.merMod
33            ngrps.merMod
34             nobs.merMod
35             plot.merMod
36          predict.merMod
37            print.merMod
38          profile.merMod
39            ranef.merMod
40            refit.merMod
41          refitML.merMod
42        residuals.merMod
43      show,merMod-method
44            sigma.merMod
45         simulate.merMod
46          summary.merMod
47            terms.merMod
48           update.merMod
49          VarCorr.merMod
50             vcov.merMod
51          weights.merMod
52               lsm.basis
53            recover.data
54                  coeffs
55           r.squaredGLMM
                                                                                      Title
1                                                    Extract AIC and DIC from a 'mer' model
2                                                  Tidying methods for mixed effects models
3                                                  Tidying methods for mixed effects models
4                                                  Tidying methods for mixed effects models
5                                               Anova Tables for Various Statistical Models
6  Estimate and Standard Error of a Nonlinear Function of Estimated Regression Coefficients
7                                                                    Test Linear Hypothesis
8                                                                    Test Linear Hypothesis
9                    Functions For Constructing Effect Displays for Many Modeling Paradigms
10                                             Class "merMod" of Fitted Mixed-Effect Models
11                                             Class "merMod" of Fitted Mixed-Effect Models
12                                             Class "merMod" of Fitted Mixed-Effect Models
13                            Compute Confidence Intervals for Parameters of a [ng]lmer Fit
14                                             Class "merMod" of Fitted Mixed-Effect Models
15                                             Class "merMod" of Fitted Mixed-Effect Models
16                    Drop all possible single fixed-effect terms from a mixed effect model
17                                             Class "merMod" of Fitted Mixed-Effect Models
18                                             Class "merMod" of Fitted Mixed-Effect Models
19                                             Class "merMod" of Fitted Mixed-Effect Models
20                                                          Extract fixed-effects estimates
21                                             Class "merMod" of Fitted Mixed-Effect Models
22                                          add information to data based on a fitted model
23                  Extract or Get Generalized Components from a Fitted Mixed Effects Model
24                  Extract or Get Generalized Components from a Fitted Mixed Effects Model
25                                                      Diagonal elements of the hat matrix
26                                                          Check characteristics of models
27                                                          Check characteristics of models
28                                                          Check characteristics of models
29                                                          Check characteristics of models
30                                             Class "merMod" of Fitted Mixed-Effect Models
31                                             Class "merMod" of Fitted Mixed-Effect Models
32                                             Class "merMod" of Fitted Mixed-Effect Models
33                                             Class "merMod" of Fitted Mixed-Effect Models
34                                             Class "merMod" of Fitted Mixed-Effect Models
35                                                       Diagnostic Plots for 'merMod' Fits
36                                              Predictions from a model at new data values
37                                             Class "merMod" of Fitted Mixed-Effect Models
38                                                        Profile method for merMod objects
39                                                  Extract the modes of the random effects
40                                         Refit a (merMod) Model with a Different Response
41                                            Refit a Model by Maximum Likelihood Criterion
42                                                              residuals of merMod objects
43                                             Class "merMod" of Fitted Mixed-Effect Models
44                                              Extract Residual Standard Deviation 'Sigma'
45                                                  Simulate Responses From 'merMod' Object
46                                             Class "merMod" of Fitted Mixed-Effect Models
47                                             Class "merMod" of Fitted Mixed-Effect Models
48                                             Class "merMod" of Fitted Mixed-Effect Models
49                                              Extract Variance and Correlation Components
50                                             Class "merMod" of Fitted Mixed-Effect Models
51                                             Class "merMod" of Fitted Mixed-Effect Models
52                                          Support functions for creating a reference grid
53                                          Support functions for creating a reference grid
54                                                                  Model utility functions
55                                     Pseudo-R-squared for Generalized Mixed-Effect models

summary(data.hier.lmer)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 | block)
   Data: data.hier

REML criterion at convergence: 451

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.09947 -0.57994 -0.04874  0.56685  2.49464 

Random effects:
 Groups   Name        Variance Std.Dev.
 block    (Intercept) 327.93   18.109  
 Residual              79.31    8.905  
Number of obs: 60, groups:  block, 6

Fixed effects:
             Estimate Std. Error t value
(Intercept) 232.81929    7.82339   29.76
x             1.45910    0.06379   22.87

Correlation of Fixed Effects:
  (Intr)
x -0.292

Conclusions - we see that although each routine yields slightly different output, the estimates are essentially identical.

  • the intercept represents the estimated value of y when x is equal to zero. As with regular regression, this is only really of interest when the continuous predictor(s) are centered (see the sketch after this list). Note that in the mixed effects context, this intercept represents the mean of the individual block intercepts (which we indicated could vary).
  • the slope represents the common estimated rate of change in y per unit of change in x across all blocks (both observed and unobserved).
  • the Correlation: the value of -0.292275 indicates the approximate degree of correlation between the intercept and slope estimates. Even the authors of the routine admit that this value is difficult to interpret; they suggest it can sometimes flag multicollinearity when there are multiple predictors, or warn that the effect of one predictor may be difficult to interpret when its estimate is highly correlated with another.
  • the lmer summary includes estimates of the Random effects. These are provided as both the standard deviation and the variance of the individual block intercepts (between-block variability) and of the observations around the individual block trends (within-block variability, which in this case is the residual). It is clear in this example that there is more variability between blocks than within blocks. Without the blocks, all the between-block variability would be unaccounted for and simply treated as residual, unexplained noise. So the blocking has substantially reduced the unexplained variability and has no doubt led to more powerful fixed effects tests.
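
As a quick illustration of the centering point above, a minimal sketch of refitting with a centered predictor (xc and data.hier.lmer.cen are names introduced here for illustration; assumes lme4 is loaded and data.hier is as defined earlier). The intercept then estimates the expected y at the average x:

data.hier$xc <- data.hier$x - mean(data.hier$x)  # centered copy of x
data.hier.lmer.cen <- lmer(y ~ xc + (1 | block), data = data.hier)
fixef(data.hier.lmer.cen)  # intercept is now the expected y at mean(x)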

There are also a variety of other ways to summarize the linear mixed effects model. Below, I have included:

  • an anova table that partitions the sums of squares into explained and unexplained components - not very useful
  • extract just the fixed effects coefficients
  • extract just the random effects coefficients. Note, these indicate how much each individual block intercept differs from the overall intercept and should sum to approximately zero
  • 95% confidence interval of the estimated parameters. These are very useful as they help when describing the magnitude of effects
  • variance components expressed as proportions of total variability (the block proportion is the intraclass correlation), cf. the standard deviation units displayed in the regular summary output above.
anova(data.hier.lmer, type = "marginal")
Analysis of Variance Table
  Df Sum Sq Mean Sq F value
x  1  41495   41495  523.22
fixef(data.hier.lmer)
(Intercept)           x 
 232.819291    1.459101 
ranef(data.hier.lmer)
$block
       (Intercept)
Block1   26.149932
Block2    1.013458
Block3    7.499362
Block4   -1.867983
Block5  -29.007062
Block6   -3.787707
confint(data.hier.lmer)
                 2.5 %     97.5 %
.sig01       10.000903  33.426060
.sigma        7.388234  10.789617
(Intercept) 216.495897 249.098759
x             1.332559   1.584745
vc <- as.data.frame(VarCorr(data.hier.lmer))[, "vcov"]
vc/sum(vc)
[1] 0.8052553 0.1947447

Finally, there are also additional packages that offer simple and consistent output formats across a wide variety of tests.

library(broom)
tidy(data.hier.lmer, effects = "fixed")
         term   estimate  std.error statistic
1 (Intercept) 232.819291 7.82339353  29.75937
2           x   1.459101 0.06378882  22.87392
glance(data.hier.lmer)
     sigma   logLik      AIC      BIC deviance
1 8.905485 -225.476 458.9521 467.3294 453.1143
  df.residual
1          56

You might have noticed an absence of p-values from the output above. This is not a mistake, it is deliberate. The lme4 package authors maintain that there is sufficient debate and disagreement over how to calculate degrees of freedom in the presence of random effects that they have elected not to provide residual degrees of freedom, and therefore p-values.
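
One common workaround is a likelihood-ratio test between nested models refitted by maximum likelihood - a minimal sketch only, assuming the data.hier.lmer object fitted above (the .ml and .null names are introduced here for illustration):

data.hier.lmer.ml <- update(data.hier.lmer, REML = FALSE)  # refit with ML
data.hier.lmer.null <- update(data.hier.lmer.ml, . ~ . - x)  # drop the fixed effect of x
anova(data.hier.lmer.null, data.hier.lmer.ml)  # chi-square test of the x effect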

As a pragmatic alternative, there is another package (lmerTest) that temporarily replaces the lmer() function as defined in lme4 with one of its own. Its version supports a number of popular ways of estimating residual degrees of freedom and thus p-values. To take effect, the model must be re-fitted after the package is loaded.

library(lmerTest)
data.lmer = update(data.hier.lmer)
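# NB: it is the refitted object (data.lmer) whose summary gains
# Satterthwaite-based p-values; the summary below is of the original
# lmerMod fit and therefore still lacks them.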
summary(data.hier.lmer)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 | block)
   Data: data.hier

REML criterion at convergence: 451

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.09947 -0.57994 -0.04874  0.56685  2.49464 

Random effects:
 Groups   Name        Variance Std.Dev.
 block    (Intercept) 327.93   18.109  
 Residual              79.31    8.905  
Number of obs: 60, groups:  block, 6

Fixed effects:
             Estimate Std. Error t value
(Intercept) 232.81929    7.82339   29.76
x             1.45910    0.06379   22.87

Correlation of Fixed Effects:
  (Intr)
x -0.292

Mixed effects models can also be fit using the Template Model Builder automatic differentiation engine via the glmmTMB() function from the package of the same name. glmmTMB() is able to fit similar models to lmer(), yet can also incorporate more complex features such as zero inflation and temporal autocorrelation. Random effects are assumed to be Gaussian on the scale of the linear predictor and are integrated out via the Laplace approximation. On the downside, REML is not yet available for this technique, and nor is Gauss-Hermite quadrature (which can be useful when dealing with small sample sizes and non-Gaussian errors).
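
To give a flavour of that extra machinery, a hedged sketch only (not fitted here; counts stands in for a hypothetical count response, while ziformula and the nbinom2 family are glmmTMB arguments):

# glmmTMB(counts ~ x + (1 | block), ziformula = ~1,
#     family = nbinom2, data = data.hier)  # zero-inflated negative binomial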

Importantly, glmmTMB prefers that, for more complex models (multiple predictors or random intercepts/slopes), the continuous predictors are first scaled. If you do scale, remember that the coefficients are then expressed in scaled units when interpreting the results (see the back-transformation sketch after the model comparison below).

data.hier = data.hier %>% mutate(cx = as.vector(scale(x)))
library(glmmTMB)
data.hier.glmmTMB <- glmmTMB(y ~ x + (1 | block), data = data.hier)
data.hier.glmmTMB1 <- glmmTMB(y ~ cx + (cx | block),
    data = data.hier)
AIC(data.hier.glmmTMB, data.hier.glmmTMB1)
                   df      AIC
data.hier.glmmTMB   4 461.0447
data.hier.glmmTMB1  6 463.2242
anova(data.hier.glmmTMB, data.hier.glmmTMB1)
Data: data.hier
Models:
data.hier.glmmTMB: y ~ x + (1 | block), zi=~0, disp=~1
data.hier.glmmTMB1: y ~ cx + (cx | block), zi=~0, disp=~1
                   Df    AIC    BIC  logLik
data.hier.glmmTMB   4 461.04 469.42 -226.52
data.hier.glmmTMB1  6 463.22 475.79 -225.61
                   deviance  Chisq Chi Df
data.hier.glmmTMB    453.04              
data.hier.glmmTMB1   451.22 1.8206      2
                   Pr(>Chisq)
data.hier.glmmTMB            
data.hier.glmmTMB1     0.4024
Conclusions - we would conclude so far that the more complex random intercept/slope model is not an improvement (it has the higher AIC and a non-significant likelihood-ratio test), and so we would go with the simpler random intercept only model.
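
Had the scaled-predictor model been retained, a minimal sketch of recovering the slope in the original units of x (assumes the objects fitted above; fixef() on a glmmTMB fit returns a list whose $cond component holds the conditional-model coefficients):

fixef(data.hier.glmmTMB1)$cond[["cx"]]/sd(data.hier$x)  # slope per unit of x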

After fitting a model (any model), the model should always be validated to ensure that it is appropriate and has fit the data well. At the very least, this should involve an exploration of the residuals.

ggplot(data = NULL) + geom_point(aes(y = residuals(data.hier.glmmTMB,
    type = "pearson", scaled = TRUE), x = fitted(data.hier.glmmTMB)))
[plot of chunk tut9.1S3.1a: Pearson residuals against fitted values]
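
Beyond residuals against fitted values, one further check worth sketching (assuming the objects above) is a normal quantile plot of the estimated block effects - although with only six blocks this is indicative at best:

qqnorm(ranef(data.hier.glmmTMB)$cond$block[, "(Intercept)"])  # block-level effects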

The fitted model is itself a list of class glmmTMB. Although it is possible to explore the contents of this list in order to extract the important components or perform additional analyses, it is usually more convenient to access the information via extractor functions and methods.

str(data.hier.glmmTMB)
List of 7
 $ obj      :List of 10
  ..$ par     : Named num [1:4] 1 1 0 0
  .. ..- attr(*, "names")= chr [1:4] "beta" "beta" "betad" "theta"
  ..$ fn      :function (x = last.par[-random], ...)  
  ..$ gr      :function (x = last.par[-random], ...)  
  ..$ he      :function (x = last.par[-random], ...)  
  ..$ hessian : logi FALSE
  ..$ method  : chr "BFGS"
  ..$ retape  :function ()  
  ..$ env     :<environment: 0x9824038> 
  ..$ report  :function (par = last.par)  
  ..$ simulate:function (par = last.par, complete = FALSE)  
 $ fit      :List of 7
  ..$ par        : Named num [1:4] 232.83 1.46 4.35 2.8
  .. ..- attr(*, "names")= chr [1:4] "beta" "beta" "betad" "theta"
  ..$ objective  : num 227
  ..$ convergence: int 0
  ..$ iterations : int 41
  ..$ evaluations: Named int [1:2] 46 42
  .. ..- attr(*, "names")= chr [1:2] "function" "gradient"
  ..$ message    : chr "relative convergence (4)"
  ..$ parfull    : Named num [1:10] 232.83 1.46 26.04 1.01 7.47 ...
  .. ..- attr(*, "names")= chr [1:10] "beta" "beta" "b" "b" ...
 $ sdr      :List of 10
  ..$ value          : num(0) 
  ..$ sd             : num(0) 
  ..$ cov            : logi[0 , 0 ] 
  ..$ par.fixed      : Named num [1:4] 232.83 1.46 4.35 2.8
  .. ..- attr(*, "names")= chr [1:4] "beta" "beta" "betad" "theta"
  ..$ cov.fixed      : num [1:4, 1:4] 51.75932 -0.14317 0.00266 -0.01236 -0.14317 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:4] "beta" "beta" "betad" "theta"
  .. .. ..$ : chr [1:4] "beta" "beta" "betad" "theta"
  ..$ pdHess         : logi TRUE
  ..$ gradient.fixed : num [1:4] -2.73e-07 2.70e-05 4.72e-05 -2.15e-05
  ..$ par.random     : Named num [1:6] 26.04 1.01 7.47 -1.86 -28.88 ...
  .. ..- attr(*, "names")= chr [1:6] "b" "b" "b" "b" ...
  ..$ diag.cov.random: num [1:6] 51.9 51.7 51.8 51.6 52.1 ...
  ..$ env            :<environment: 0x97423b8> 
  ..- attr(*, "class")= chr "sdreport"
 $ call     : language glmmTMB(formula = y ~ x + (1 | block), data = data.hier,      ziformula = ~0, dispformula = ~1)
 $ frame    :'data.frame':	60 obs. of  3 variables:
  ..$ y    : num [1:60] 281 296 328 360 277 ...
  ..$ x    : num [1:60] 18.6 26 40.1 63.6 14.1 ...
  ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "terms")=Classes 'terms', 'formula'  language y ~ x + (1 + block) + 0 + 1
  .. .. ..- attr(*, "variables")= language list(y, x, block)
  .. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:3] "y" "x" "block"
  .. .. .. .. ..$ : chr [1:2] "x" "block"
  .. .. ..- attr(*, "term.labels")= chr [1:2] "x" "block"
  .. .. ..- attr(*, "order")= int [1:2] 1 1
  .. .. ..- attr(*, "intercept")= int 1
  .. .. ..- attr(*, "response")= int 1
  .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. .. ..- attr(*, "predvars")= language list(y, x, block)
  .. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "factor"
  .. .. .. ..- attr(*, "names")= chr [1:3] "y" "x" "block"
 $ modelInfo:List of 7
  ..$ nobs   : int 60
  ..$ respCol: Named int 1
  .. ..- attr(*, "names")= chr "y"
  ..$ grpVar : chr "block"
  ..$ family :List of 11
  .. ..$ family    : chr "gaussian"
  .. ..$ link      : chr "identity"
  .. ..$ linkfun   :function (mu)  
  .. ..$ linkinv   :function (eta)  
  .. ..$ variance  :function (mu)  
  .. ..$ dev.resids:function (y, mu, wt)  
  .. ..$ aic       :function (y, n, mu, wt, dev)  
  .. ..$ mu.eta    :function (eta)  
  .. ..$ initialize:  expression({     n <- rep.int(1, nobs)     if (is.null(etastart) && is.null(start) &&          is.null(mustart) && ((family$link ==          "inverse" && any(y == 0)) || (family$link ==          "log" && any(y <= 0))))          stop("cannot find valid starting values: please specify some")     mustart <- y })
  .. ..$ validmu   :function (mu)  
  .. ..$ valideta  :function (eta)  
  .. ..- attr(*, "class")= chr "family"
  ..$ reTrms :List of 2
  .. ..$ cond:List of 3
  .. .. ..$ cnms :List of 1
  .. .. .. ..$ block: chr "(Intercept)"
  .. .. ..$ flist:List of 1
  .. .. .. ..$ block: Factor w/ 6 levels "Block1","Block2",..: 1 1 1 1 1 1 1 1 1 1 ...
  .. .. .. ..- attr(*, "assign")= int 1
  .. .. ..$ terms:List of 1
  .. .. .. ..$ fixed:Classes 'terms', 'formula'  language y ~ x
  .. .. .. .. .. ..- attr(*, "variables")= language list(y, x)
  .. .. .. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. .. .. ..$ : chr [1:2] "y" "x"
  .. .. .. .. .. .. .. ..$ : chr "x"
  .. .. .. .. .. ..- attr(*, "term.labels")= chr "x"
  .. .. .. .. .. ..- attr(*, "order")= int 1
  .. .. .. .. .. ..- attr(*, "intercept")= int 1
  .. .. .. .. .. ..- attr(*, "response")= int 1
  .. .. .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. .. .. .. .. ..- attr(*, "predvars")= language list(y, x)
  .. .. .. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
  .. .. .. .. .. .. ..- attr(*, "names")= chr [1:2] "y" "x"
  .. ..$ zi  :List of 1
  .. .. ..$ terms:List of 1
  .. .. .. ..$ fixed:Classes 'terms', 'formula'  language ~0
  .. .. .. .. .. ..- attr(*, "variables")= language list()
  .. .. .. .. .. ..- attr(*, "factors")= int(0) 
  .. .. .. .. .. ..- attr(*, "term.labels")= chr(0) 
  .. .. .. .. .. ..- attr(*, "order")= int(0) 
  .. .. .. .. .. ..- attr(*, "intercept")= int 0
  .. .. .. .. .. ..- attr(*, "response")= int 0
  .. .. .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. .. .. .. .. ..- attr(*, "predvars")= language list()
  .. .. .. .. .. ..- attr(*, "dataClasses")= Named chr(0) 
  .. .. .. .. .. .. ..- attr(*, "names")= chr(0) 
  ..$ reStruc:List of 2
  .. ..$ condReStruc:List of 1
  .. .. ..$ 1 | block:List of 4
  .. .. .. ..$ blockReps    : num 6
  .. .. .. ..$ blockSize    : num 1
  .. .. .. ..$ blockNumTheta: num 1
  .. .. .. ..$ blockCode    : Named num 1
  .. .. .. .. ..- attr(*, "names")= chr "us"
  .. ..$ ziReStruc  : list()
  ..$ allForm:List of 4
  .. ..$ combForm   :Class 'formula'  language y ~ x + (1 + block) + 0 + 1
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..$ formula    :Class 'formula'  language y ~ x + (1 | block)
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..$ ziformula  :Class 'formula'  language ~0
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..$ dispformula:Class 'formula'  language ~1
  .. .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
 $ fitted   : NULL
 - attr(*, "class")= chr "glmmTMB"
The most useful extractor functions and methods for a glmmTMB fit are:
Generic    Method             Package  Description
AIC        AIC                stats    Akaike's An Information Criterion
confint    confint.glmmTMB    glmmTMB  Calculate confidence intervals
fixef      fixef.glmmTMB      glmmTMB  Extract fixed-effects estimates
predict    predict.glmmTMB    glmmTMB  prediction
ranef      ranef.glmmTMB      glmmTMB  Extract Random Effects
residuals  residuals.glmmTMB  glmmTMB  Compute residuals for a glmmTMB object
sigma      sigma.glmmTMB      glmmTMB  Extract residual standard deviation or dispersion parameter
vcov       vcov.glmmTMB       glmmTMB  Calculate Variance-Covariance Matrix for a Fitted glmmTMB model
Using the methodsInfo() helper defined earlier:
b = methodsInfo("glmmTMB")
b
   Package   Generic             Entry
1  glmmTMB   confint   confint.glmmTMB
2  glmmTMB    family    family_glmmTMB
3  glmmTMB     fixef     fixef.glmmTMB
4  glmmTMB     getME     getME.glmmTMB
5  glmmTMB   predict   predict.glmmTMB
6  glmmTMB   profile   profile.glmmTMB
7  glmmTMB     ranef     ranef.glmmTMB
8  glmmTMB residuals residuals.glmmTMB
9  glmmTMB     sigma     sigma.glmmTMB
10 glmmTMB  simulate  simulate.glmmTMB
11 glmmTMB   VarCorr   VarCorr.glmmTMB
12 glmmTMB      vcov      vcov.glmmTMB
                                                                    Title
1                                          Calculate confidence intervals
2                                            Family functions for glmmTMB
3                                         Extract fixed-effects estimates
4  Extract or Get Generalize Components from a Fitted Mixed Effects Model
5                                                              prediction
6                          Compute likelihood profiles for a fitted model
7                                                  Extract Random Effects
8                                  Compute residuals for a glmmTMB object
9             Extract residual standard deviation or dispersion parameter
10                                   Simulate from a glmmTMB fitted model
11                            Extract variance and correlation components
12        Calculate Variance-Covariance Matrix for a Fitted glmmTMB model

summary(data.hier.glmmTMB)
 Family: gaussian  ( identity )
Formula:          y ~ x + (1 | block)
Data: data.hier

     AIC      BIC   logLik deviance df.resid 
   461.0    469.4   -226.5    453.0       56 

Random effects:

Conditional model:
 Groups   Name        Variance Std.Dev.
 block    (Intercept) 271.98   16.492  
 Residual              77.84    8.823  
Number of obs: 60, groups:  block, 6

Dispersion estimate for gaussian family (sigma^2): 77.8 

Conditional model:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 232.8302     7.1944   32.36   <2e-16
x             1.4588     0.0632   23.08   <2e-16
               
(Intercept) ***
x           ***
---
Signif. codes:  
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Conclusions - we see that although each routine yields slightly different output, the estimates are essentially identical.

  • the intercept represents the estimated value of y when x is equal to zero. As with regular regression, this is only really of interest when the continuous predictor(s) are centered. Note that in the mixed effects context, this intercept represents the mean of the individual block intercepts (which we indicated could vary).
  • the slope represents the common estimated rate of change in y per unit of change in x across all blocks (both observed and unobserved).
  • the Correlation: the value of -0.3148839 (not printed in the glmmTMB summary, but derivable from the fixed-effects variance-covariance matrix via vcov()) indicates the approximate degree of correlation between the intercept and slope estimates. Even the authors of the routine admit that this value is difficult to interpret; they suggest it can sometimes flag multicollinearity when there are multiple predictors, or warn that the effect of one predictor may be difficult to interpret when its estimate is highly correlated with another.
  • the glmmTMB summary includes estimates of the Random effects. These are provided as both the standard deviation and the variance of the individual block intercepts (between-block variability) and of the observations around the individual block trends (within-block variability, which in this case is the residual). It is clear in this example that there is more variability between blocks than within blocks. Without the blocks, all the between-block variability would be unaccounted for and simply treated as residual, unexplained noise. So the blocking has substantially reduced the unexplained variability and has no doubt led to more powerful fixed effects tests.

There are also a variety of other ways to summarize the linear mixed effects model. Below, I have included:

  • extract just the fixed effects coefficients
  • extract just the random effects coefficients. Note, these indicate how much each individual block intercept differs from the overall intercept and should sum to approximately zero
  • 95% confidence interval of the estimated parameters. These are very useful as they help when describing the magnitude of effects
  • variance components expressed as proportions of total variability, cf. the standard deviation units displayed in the regular summary output above.
fixef(data.hier.glmmTMB)
Conditional model:
(Intercept)            x  
    232.830        1.459  
ranef(data.hier.glmmTMB)
$block
       (Intercept)
Block1   26.037974
Block2    1.010065
Block3    7.465094
Block4   -1.859556
Block5  -28.880027
Block6   -3.773475
confint(data.hier.glmmTMB)
                                    2.5 %
cond.(Intercept)               218.729398
cond.x                           1.334932
cond.Std.Dev.block.(Intercept)   9.214285
sigma                            7.306332
                                   97.5 %
cond.(Intercept)               246.930908
cond.x                           1.582663
cond.Std.Dev.block.(Intercept)  29.517157
sigma                           10.654091
                                 Estimate
cond.(Intercept)               232.830153
cond.x                           1.458797
cond.Std.Dev.block.(Intercept)  16.491801
sigma                            8.822830
vc = VarCorr(data.hier.glmmTMB)[[1]]
vc = data.frame(attr(vc$block, "stddev"), attr(vc,
    "sc"))
vc/sum(vc)
            attr.vc.block...stddev..
(Intercept)                0.6514731
            attr.vc...sc..
(Intercept)      0.3485269
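
Note that these proportions are computed from the standard deviations rather than the variances, so they are not directly comparable to the variance-scale split from the lmer fit above. A sketch of the variance-scale version (assuming the vc object just created):

vc2 <- vc^2  # square the standard deviations to get variances
vc2/sum(vc2)  # proportions of total variance, cf. the lmer result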


  Frequentist pooled variances t-test

t.test(y ~ x, data.hier, var.equal = TRUE)
Error in t.test.formula(y ~ x, data.hier, var.equal = TRUE): grouping factor must have exactly 2 levels
# OR
data.hier.lm <- lm(y ~ x, data.hier)
summary(data.hier.lm)
Call:
lm(formula = y ~ x, data = data.hier)

Residuals:
    Min      1Q  Median      3Q     Max 
-44.389 -10.359   1.126  10.674  37.272 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 235.1592     5.2948   44.41  < 2e-16
x             1.3938     0.1309   10.64 2.91e-15
               
(Intercept) ***
x           ***
---
Signif. codes:  
0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.98 on 58 degrees of freedom
Multiple R-squared:  0.6614,	Adjusted R-squared:  0.6556 
F-statistic: 113.3 on 1 and 58 DF,  p-value: 2.908e-15

End of instructions
