

Tutorial 7.3a - Multiple linear regression

15 Jun 2019


Preliminaries

The following packages will be used in this tutorial:

library(car)
library(effects)
library(emmeans)
library(ggfortify)
library(ggeffects)
library(GGally)
library(MuMIn)
library(scales)
library(broom)
library(tidyverse)

Multiple and complex regression analyses can be useful for situations in which patterns in a response variable cannot be adequately described by a single straight line resulting from a single predictor and/or a simple linear equation.

Overview

Multiple regression is an extension of simple linear regression whereby a response variable is modeled against a linear combination of two or more simultaneously measured continuous predictor variables. There are two main purposes of multiple linear regression:

  • To develop a better predictive model (equation) than is possible from models based on single independent variables.
  • To investigate the relative individual effects of each of the multiple independent variables above and beyond (standardized across) the effects of the other variables.

Although the relationship between response variable and the additive effect of all the predictor variables is represented overall by a single multidimensional plane (surface), the individual effects of each of the predictor variables on the response variable (standardized across the other variables) can be depicted by single partial regression lines. The slope of any single partial regression line (partial regression slope) thereby represents the rate of change or effect of that specific predictor variable (holding all the other predictor variables constant at their respective mean values) on the response variable. In essence, it is the effect of one predictor variable at one specific level (the means) of all the other predictor variables (i.e. when each of the other predictors is set to its average).

Multiple regression models can be constructed additively (containing only the predictor variables themselves) or in a multiplicative design (which incorporate interactions between predictor variables in addition to the predictor variables themselves). Multiplicative models are used primarily for testing inferences about the effects of various predictor variables and their interactions on the response variable in much the same way as factorial ANOVA. Additive models by contrast are used for generating predictive models and estimating the relative importance of individual predictor variables more so than hypothesis testing.

Linear models

Additive model

$$y_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+...+\beta_jx_{ij}+\epsilon_i$$ where $\beta_0$ is the population y-intercept (value of $y$ when all partial slopes equal zero), $\beta_1$, $\beta_2$, etc are the partial population slopes of $Y$ on $X_1$, $X_2$, etc respectively holding the other $X$ constant. $\epsilon_i$ is the random unexplained error or residual component.

The additive model assumes that the effect of one predictor variable (partial slope) is independent of the levels of the other predictor variables.

Multiplicative model

$$y_i=\beta_0+\beta_1x_{i1}+\beta_2x_{i2}+\beta_3x_{i1}x_{i2}+...+\epsilon_i$$ where $\beta_3x_{i1}x_{i2}$ is the interactive effect of $X_1$ and $X_2$ on $Y$ and it examines the degree to which the effect of one of the predictor variables depends on the levels of the other predictor variable(s).
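
To make the interpretation of the interaction explicit, the multiplicative model can be re-arranged for a given value of $x_2$: $$y_i=(\beta_0+\beta_2x_{i2})+(\beta_1+\beta_3x_{i2})x_{i1}+\epsilon_i$$ That is, the slope of $y$ on $x_1$ is no longer a single constant, but rather $\beta_1+\beta_3x_{i2}$, and therefore changes linearly across the range of $x_2$ (and symmetrically for the slope on $x_2$). This re-arrangement underlies the exploration of interactions later in this tutorial.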

Since, for Gaussian (normal distribution) regression, the range of values for any given predictor (such as temperature) is generally substantially away from zero, the value of the y-intercept (the value of y when x equals zero) is usually of little interest. It commonly represents a value of y for a value of x that may well be either impossible or highly unlikely. However, if the predictor variable(s) are centered (the scale is shifted such that the mean of the predictor is zero), then the y-intercept represents the value of y in the middle of the range of x - and is thus more meaningful.
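
As a minimal illustration (with toy data, not the tutorial data generated below), centering the predictor changes only the interpretation of the intercept; the estimated slope is untouched:

set.seed(1)
x <- runif(20, 20, 30)  # a predictor (e.g. temperature) whose range is well away from zero
y <- 2 + 0.5 * x + rnorm(20)
coef(lm(y ~ x))  # intercept = expected y at x = 0 (an implausible value of x)
coef(lm(y ~ I(x - mean(x))))  # intercept = expected y at the mean of x; slope unchanged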

Null hypotheses

There are separate null hypotheses associated with each of the estimated parameters (including the intercept). $$H_0: \beta_0=0 \quad\mathsf{(the~population~y-intercept~equals~zero)}$$ This test is rarely of much interest as it only tests the likelihood that the background level of the response variable is equal to zero (rarely an ecologically meaningful comparison). $$H_0: \beta_1=0 \quad\mathsf{(the~partial~population~slope~of~Y~on~X_1~equals~zero)}$$ $$H_0: \beta_2=0 \quad\mathsf{(the~partial~population~slope~of~Y~on~X_2~equals~zero)}$$ Technically, these tests examine respectively whether or not there is sufficient data to conclude there is a relationship between the dependent variable and one of the independent variables (holding the other independent variables constant) in the population. Less formally, they are used to conclude whether or not there is evidence of the respective relationships.

For multiplicative models

$$H_0: \beta_3=0 \quad\mathsf{(the~partial~population~slope~of~the~interactive~effect~of~X_1~and~X_2~equals~zero)}$$ The estimated interaction parameter ($\beta_3$ in this case) represents the degree to which the partial slope of one of the covariates deviates throughout the range of the other covariate(s). The associated hypothesis test thus examines whether or not the effect of one covariate (independent variable) on the dependent variable (holding the others constant) depends on the other independent variables.

Scenario and Data

Let's say we had set up a natural experiment in which we measured a response ($y$) from each of 100 sampling units ($n=100$) across a landscape. At the same time, we also measured two other continuous covariates ($x1$ and $x2$) from each of the sampling units. As this section is mainly about the generation of artificial data (and not specifically about what to do with the data), understanding the actual details is optional and the section can be safely skipped. Consequently, I have folded (toggled) this section away.

set.seed(3)
n = 100
intercept = 5
temp = runif(n)
nitro = runif(n) + 0.8 * temp
int.eff = 2
temp.eff <- 0.85
nitro.eff <- 0.5
res = rnorm(n, 0, 1)
coef <- c(int.eff, temp.eff, nitro.eff, int.eff)
mm <- model.matrix(~temp * nitro)

y <- t(coef %*% t(mm)) + res
data <- data.frame(y, x1 = temp, x2 = nitro, cx1 = scale(temp,
    scale = F), cx2 = scale(nitro, scale = F))

With this sort of data, we are primarily interested in investigating whether there is a relationship between the continuous response variable and the components of the linear predictor (the continuous predictors). We could model the relationship via either:

  • an additive model in which the effects of each predictor contribute in an additive way to the response - we do not allow for an interaction as we consider an interaction either not of great importance or likely to be absent.
  • a multiplicative model in which the effects of each predictor and their interaction contribute to the response - we allow for the impact of one predictor to vary across the range of the other predictor.

Centering data

When a linear model contains a covariate (continuous predictor variable) in addition to another predictor (continuous or categorical), it is nearly always advisable that the continuous predictor variables be centered prior to the analysis. Centering is a process by which the mean of a variable is subtracted from each of the values such that the scale of the variable is shifted so as to be centered around 0. Hence the mean of the new centered variable will be 0, yet it will retain the same variance.

Raw data: plot of chunk centering
Centered X: plot of chunk centering1

There are multiple reasons for centering data:

  • Firstly, it provides some biological meaning to the y-intercept. Recall that the y-intercept is the value of Y when X is equal to zero. If X is centered, then the y-intercept represents the value of Y at the mid-point (mean) of the X range. The y-intercept of an un-centered X typically represents an unrealistic value of Y (as an X of 0 is often beyond the reasonable range of values).
  • Secondly, in multiplicative models (in which predictors and their interactions are included), main effects and interaction terms built from centered predictors will not be correlated to one another (see below)
  • Thirdly, for more complex models, centering the covariates can increase the likelihood that the modelling engine will converge (arrive at a numerically stable and reliable outcome).
Note, centering will not affect the (partial) slope estimates.

In R, centering is easily achieved with the scale function. Alternatively, the mean of the variable can be subtracted from each observation (this equivalence is demonstrated after the output below).

data <- within(data, {
    cx1 <- as.numeric(scale(x1, scale = FALSE))
    cx2 <- as.numeric(scale(x2, scale = FALSE))
})
head(data)
         y        x1        x2         cx1         cx2
1 3.513305 0.1680415 0.9007709 -0.31604197  0.02986272
2 5.090382 0.8075164 1.3281453  0.32343291  0.45723717
3 4.036943 0.3849424 0.5170847 -0.09914114 -0.35382350
4 4.006436 0.3277343 0.9741312 -0.15634918  0.10322304
5 5.381677 0.6021007 1.0869787  0.11801718  0.21607055
6 4.530071 0.6043941 0.8240744  0.12031056 -0.04683372
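
As a quick check of the equivalence mentioned above, subtracting the mean by hand reproduces the centered values generated by scale():

with(data, all.equal(cx1, x1 - mean(x1)))
[1] TRUE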

Exploratory data analysis and initial assumption checking

The assumptions are:
  1. All of the observations are independent - this must be addressed at the design and collection stages or specifically accounted for in the variance-covariance structure.
  2. The response variable (and thus the residuals) should be normally distributed. A boxplot of the entire variable is usually useful for diagnosing major issues with normality.
  3. The response variable should be equally varied (variance should not be related to mean as these are supposed to be estimated separately). Scatterplots with linear smoothers can be useful for exploring the spread of observations around the trendline. The spread of observations around the trendline should not increase (or decrease) along its length.
  4. The predictor variables should be uniformly or normally distributed. Again, boxplots can be useful.
  5. The relationships between the linear predictors (right hand side of the regression formula) and the response variable should be linear. Scatterplots with smoothers can be useful for identifying possible non-linearity.
  6. (Multi)collinearity - see below
  7. The number of predictor variables must be less than the number of observations otherwise the linear model will be over-parameterized (more parameters to estimate than there are independent data from which estimations are calculated).
That is, if two or more predictors are correlated with one another, then it is difficult to estimate the 'effects' of any of these individual predictors. In essence, they rob one another in the model such that some will be underestimated and others will be overestimated.

(Multi)collinearity - a predictor variable must not be correlated to the combination of other predictor variables (known collectively as the linear predictor). Multicollinearity has major detrimental effects on model fitting:

  • instability of the estimated partial regression slopes (small changes in the data or variable inclusion can cause dramatic changes in parameter estimates)
  • inflated standard errors and confidence intervals of model parameters, thereby increasing the type II error rate (reducing power) of parameter hypothesis tests
Multicollinearity can be diagnosed with the following:
  • investigate pairwise correlations between all the predictor variables either by a correlation matrix or a scatterplot matrix
  • calculate the tolerance ($1-r^2$ of the relationship between a predictor variable and all the other predictor variables) for each of the predictor variables. Tolerance is a measure of the degree of collinearity: values less than 0.2 should be examined and values less than 0.1 given serious attention. Variance inflation factors (VIF) are the inverse of tolerance, and thus values greater than 5 (or worse, 10) indicate collinearity. Many practitioners advocate a VIF cut-off of 3, which equates to an $R^2$ of approximately 0.667.
  • PCA (principal components analysis) eigenvalues (from a correlation matrix of all the predictor variables) close to zero indicate collinearity, and the component loadings may be useful in determining which predictor variables cause collinearity (a short sketch of these diagnostics follows this list).
There are several approaches to dealing with collinearity (however the first two of these are likely to result in biased parameter estimates):
  • remove the highly correlated predictor variable(s), starting with the least biologically interesting variable(s)
  • PCA (principal components analysis) or similar multivariate regression - regress the response variable against the principal components resulting from a correlation matrix of all the predictor variables. These principal components are by definition completely independent of one another, but the resulting parameter estimates must be back-calculated in order to have any biological meaning.
  • apply a regression tree - regression trees recursively partition (subset) the data according to the individual variable that explains the greatest remaining variance. Since at each iteration each predictor variable is effectively evaluated in isolation, (multi)collinearity is not an issue.
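
As a rough sketch of the tolerance/VIF and PCA-eigenvalue diagnostics listed above, computed directly on the two centered predictors in these data:

# tolerance = 1 - R^2 from regressing each predictor on the other predictor(s)
tol <- c(cx1 = 1 - summary(lm(cx1 ~ cx2, data = data))$r.squared,
    cx2 = 1 - summary(lm(cx2 ~ cx1, data = data))$r.squared)
tol  # values < 0.2 warrant closer scrutiny
1/tol  # the equivalent variance inflation factors (VIF)
# eigenvalues of the predictor correlation matrix - values near zero indicate collinearity
eigen(cor(data[, c("cx1", "cx2")]))$values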

Normality, homogeneity of variance and linearity - scatterplot matrix

# define a boxplot panel function
panel.bxp <- function(x, ...) {
    usr <- par("usr")
    on.exit(par(usr))
    par(usr = c(usr[1:2], 0, 2))
    boxplot(x, add = TRUE, horizontal = T)
}
pairs(~y + cx1 + cx2, data = data, lower.panel = panel.smooth,
    diag.panel = panel.bxp, upper.panel = NULL, gap = 0)
plot of chunk ws7.3aQ3.1
# OR the ggplot way
library(GGally)
ggpairs(data[, c("y", "cx1", "cx2")], lower = list(continuous = "smooth"),
    diag = list(continuous = "density"), axisLabels = "show")
plot of chunk ws7.3aQ3.1a
  • There is no evidence of non-normality in either the response or the covariates (whilst the distribution of cx2 is a little strange, it is nonetheless symmetrical).
  • An increase in cx1 appears to be associated with an increase in $y$ and a linear fit would seem sensible (the smoother maintains a constant direction)
  • There is a fairly even spread of values around the smoother (suggesting homogeneity of variance holds).
  • An increase in cx2 does not appear to be associated with a consistent change in $y$, yet a linear fit is no less sensible than any other.
  • cx1 and cx2 do not appear to be strongly correlated to one another

(Multi)collinearity

In R, collinearity can be explored either by examining the correlations between individual pairs of predictors (as in the scatterplot matrix above) or via the vif() function (car package). Note, when a model includes multiple predictors and their interactions, the individual predictors will be correlated with their interactions.

library(car)
# additive model - scaled predictors
vif(lm(y ~ cx1 + cx2, data))
     cx1      cx2 
1.576619 1.576619 
# multiplicative model - raw predictors
vif(lm(y ~ x1 * x2, data))
       x1        x2     x1:x2 
 9.773606  4.962009 18.953891 
# multiplicative model - scaled predictors
vif(lm(y ~ cx1 * cx2, data))
     cx1      cx2  cx1:cx2 
1.589502 1.580201 1.024682 

  • For the additive model, the variance-inflation factors are less than 5, suggesting that collinearity is not an issue.
  • For the multiplicative model (with raw predictors), the variance-inflation factors are far greater than 5, suggesting that collinearity is likely to be a huge issue with the raw data fitted with a multiplicative model.
  • For the multiplicative model (with scaled predictors), the variance-inflation factors are less than 5 (or even the more conservative value of 3), suggesting that collinearity is not an issue.

Model fitting or statistical analysis

In R, multiple linear regression models are fit using the lm() function. The most important (most commonly used) parameters/arguments for multiple regression are:

  • formula: the linear model relating the response variable to the linear predictor
  • data: the data frame containing the data

  • additive model
    data.add.lm <- lm(y ~ cx1 + cx2, data)
    
  • multiplicative model
    data.mult.lm <- lm(y ~ cx1 + cx2 + cx1:cx2, data)
    # OR
    data.mult.lm <- lm(y ~ cx1 * cx2, data)
    
There are a number of extractor functions (functions that extract or derive specific information from a model) available including:
Extractor               Description
residuals()             Extracts the residuals from the model
fitted()                Extracts the predicted (expected) response values (on the link scale) at the observed levels of the linear predictor
predict()               Extracts the predicted (expected) response values (on either the link, response or terms (linear predictor) scale)
coef()                  Extracts the model coefficients
summary()               Summarizes the important output and characteristics of the model
anova()                 Computes an analysis of variance (variance partitioning) from the model
plot()                  Generates a series of diagnostic plots from the model
influence.measures()    Calculates a range of regression diagnostics including leverage and Cook's D
allEffects()            Estimates highest level interaction effects from a fitted model. From the effects package. Useful in combination with plot
AIC()                   Extracts Akaike's Information Criterion

Model evaluation

Prior to exploring the model parameters, it is prudent to confirm that the model did indeed fit the assumptions and was an appropriate fit to the data.

As part of linear model fitting, a suite of diagnostic measures can be calculated, each of which provides an indication of the appropriateness of the model for the data and of the influence (and outlyingness) of each point on the resulting model.

Leverage

Leverage is a measure of how much of an outlier each point is in x-space (on the x-axis) and thus only applies to the predictor variables. Values greater than $2\times p/n$ (where $p$ is the number of model parameters ($p=2$ for simple linear regression; 3 for the additive and 4 for the multiplicative model here) and $n$ is the number of observations) should be investigated as potential outliers.
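
As a minimal check (using the additive model fitted above), the leverage values can be extracted with hatvalues() and compared against this threshold:

h <- hatvalues(data.add.lm)
p <- length(coef(data.add.lm))  # number of estimated parameters (3 for the additive model)
n <- nrow(data)
which(h > 2 * p/n)  # observations with unusually high leverage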

Residuals

As the residuals are the differences between the observed and predicted values along a vertical plane, they provide a measure of how much of an outlier each point is in y-space (on the y-axis). Outliers are identified by relatively large residual values. Residuals can also be standardized and studentized, the latter of which can be compared across different models and follow a t distribution, enabling the probability of obtaining a given residual to be determined. The patterns of residuals against predicted y values (residual plot) are also useful diagnostic tools for investigating linearity and homogeneity of variance assumptions.
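
As a quick sketch (again for the additive model), standardized and studentized residuals are available via rstandard() and rstudent(); absolute values much greater than about 2 flag potential outliers in y-space:

head(rstandard(data.add.lm))  # standardized (internally studentized) residuals
head(rstudent(data.add.lm))  # studentized (externally studentized) residuals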

Cook's D

Cook's D statistic is a measure of the influence of each point on the fitted model (estimated slope) and incorporates both leverage and residuals. Values > 1 (or even approaching 1) correspond to highly influential observations.
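
Again as a brief sketch, Cook's D values can be extracted with cooks.distance() and screened against a threshold of 1 (they are also tabulated in the influence.measures() output further below):

cooksD <- cooks.distance(data.add.lm)
any(cooksD > 1)  # ideally FALSE - no observation is overly influential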

Let's explore the diagnostics - particularly the residuals.

# Additive model
par(mfrow = c(2, 3))
plot(data.add.lm, ask = F, which = 1:6)
plot of chunk tut7.3aS5.1
# Multiplicative model
par(mfrow = c(2, 3))
plot(data.mult.lm, ask = F, which = 1:6)
plot of chunk tut7.3aS5.1
## Additive model
autoplot(data.add.lm)
plot of chunk tut7.3aS5.1b
## Multiplicative model
autoplot(data.mult.lm)
plot of chunk tut7.3aS5.1b
# Additive model
influence.measures(data.add.lm)
Influence measures of
 lm(formula = y ~ cx1 + cx2, data = data) :

      dfb.1_   dfb.cx1   dfb.cx2    dffit cov.r   cook.d
1    0.05496 -0.080304  5.25e-02  0.09744 1.055 3.19e-03
2   -0.03139 -0.011902 -2.72e-02 -0.05475 1.061 1.01e-03
3    0.09076  0.033825 -9.75e-02  0.13691 1.029 6.26e-03
4    0.04603 -0.042590  3.72e-02  0.06433 1.045 1.39e-03
5    0.08067  0.002073  4.06e-02  0.09629 1.026 3.10e-03
6    0.03639  0.023196 -1.81e-02  0.04346 1.042 6.35e-04
7    0.06507  0.005618 -1.17e-01  0.15734 1.082 8.30e-03
8   -0.17037  0.051253  6.45e-02 -0.21453 0.960 1.51e-02
9   -0.08625 -0.100633  1.29e-01 -0.15784 1.044 8.33e-03
10   0.02363  0.018197 -1.40e-02  0.03007 1.047 3.04e-04
11  -0.05598 -0.024339  3.30e-02 -0.06522 1.036 1.43e-03
12  -0.04275 -0.041954  6.52e-02 -0.07806 1.061 2.05e-03
13  -0.05618 -0.050095  6.99e-02 -0.09019 1.049 2.73e-03
14   0.12664  0.119852 -1.55e-01  0.20305 1.009 1.37e-02
15  -0.03702 -0.000204 -6.53e-02 -0.09015 1.092 2.73e-03
16  -0.03224 -0.012583 -3.07e-02 -0.05926 1.064 1.18e-03
17   0.00152 -0.002631  1.73e-03  0.00304 1.075 3.12e-06
18   0.01556  0.002327  1.19e-02  0.02300 1.054 1.78e-04
19  -0.00605 -0.010142  5.22e-03 -0.01186 1.073 4.74e-05
20  -0.00308  0.001476  4.67e-04 -0.00382 1.048 4.91e-06
21  -0.13925  0.166109 -1.10e-01 -0.21708 0.997 1.56e-02
22   0.04341 -0.086055  4.82e-02  0.09650 1.079 3.13e-03
23   0.04617 -0.077863  5.30e-02  0.09082 1.066 2.77e-03
24  -0.00279  0.000998  3.39e-03 -0.00582 1.079 1.14e-05
25   0.06122 -0.055452  2.17e-02  0.08392 1.039 2.36e-03
26  -0.08076 -0.073832  7.47e-03 -0.11896 1.034 4.74e-03
27   0.02497 -0.001355  1.56e-02  0.03113 1.046 3.26e-04
28  -0.06588 -0.081909  5.70e-03 -0.11865 1.052 4.72e-03
29  -0.16460  0.116860 -2.51e-01 -0.30364 0.985 3.02e-02
30   0.08473  0.090886 -4.39e-02  0.12503 1.032 5.23e-03
31   0.05713  0.006562 -3.86e-02  0.07206 1.038 1.74e-03
32   0.00614 -0.001891 -1.54e-05  0.00658 1.043 1.46e-05
33   0.15662 -0.074310 -1.04e-01  0.25526 0.984 2.14e-02
34  -0.15974 -0.082679  1.59e-01 -0.22635 0.975 1.68e-02
35   0.00279 -0.003831  3.43e-03  0.00494 1.065 8.22e-06
36  -0.22468  0.113773 -3.46e-02 -0.25548 0.898 2.09e-02
37   0.10028  0.193093 -1.31e-01  0.21835 1.051 1.59e-02
38  -0.05517 -0.007116  8.37e-02 -0.11418 1.068 4.38e-03
39  -0.12895 -0.004978 -4.85e-02 -0.14432 0.993 6.90e-03
40  -0.07301  0.061080 -7.71e-03 -0.10202 1.035 3.49e-03
41  -0.10400  0.055209  6.10e-03 -0.12776 1.013 5.44e-03
42  -0.20164 -0.232856  1.03e-01 -0.31159 0.936 3.14e-02
43  -0.08698  0.108746 -5.48e-02 -0.13993 1.035 6.55e-03
44  -0.01900  0.012188 -2.78e-02 -0.03414 1.065 3.92e-04
45   0.07188 -0.071790  9.72e-02  0.12198 1.046 4.98e-03
46   0.04558 -0.002854 -4.09e-02  0.07034 1.050 1.66e-03
47  -0.09203  0.129092 -2.79e-02 -0.17061 1.041 9.72e-03
48  -0.05885  0.104474 -6.92e-02 -0.12015 1.065 4.85e-03
49   0.17006 -0.116546  5.90e-02  0.20666 0.959 1.40e-02
50   0.01065  0.015622 -1.03e-02  0.01893 1.065 1.21e-04
51  -0.12313  0.162253 -1.23e-01 -0.20613 1.014 1.41e-02
52   0.27403  0.033407 -3.99e-01  0.54950 0.865 9.46e-02
53   0.07540  0.006550  1.26e-01  0.18005 1.076 1.09e-02
54   0.06972  0.119259 -3.31e-02  0.14656 1.064 7.20e-03
55   0.10615  0.058275  8.04e-02  0.18903 1.030 1.19e-02
56   0.00181  0.002906 -1.24e-03  0.00349 1.071 4.09e-06
57  -0.00701 -0.004934  8.57e-03 -0.01108 1.058 4.13e-05
58  -0.06882 -0.013504  1.05e-01 -0.14029 1.061 6.60e-03
59  -0.09808  0.079713  2.97e-02 -0.15982 1.029 8.52e-03
60   0.24873  0.122348 -4.37e-01  0.53380 0.904 9.04e-02
61  -0.03663  0.008481 -7.02e-02 -0.08999 1.094 2.72e-03
62   0.11057 -0.127593 -6.97e-03  0.19917 1.028 1.32e-02
63   0.07126  0.078309 -2.45e-02  0.10970 1.040 4.03e-03
64   0.14691 -0.276780  2.00e-01  0.31599 1.015 3.29e-02
65   0.17537  0.018974  1.98e-01  0.31631 0.973 3.27e-02
66   0.05501 -0.060135  5.39e-02  0.08441 1.047 2.39e-03
67   0.12775 -0.024794  2.09e-01  0.27662 1.031 2.54e-02
68  -0.12072 -0.101373  1.36e-01 -0.18347 1.010 1.12e-02
69   0.04999 -0.050322  5.51e-02  0.07739 1.049 2.01e-03
70   0.13042 -0.150583  1.30e-02  0.22203 1.009 1.63e-02
71   0.01165  0.007906  1.77e-03  0.01631 1.052 8.95e-05
72  -0.07039 -0.110563  9.28e-02 -0.13505 1.055 6.11e-03
73   0.02971  0.025083  1.61e-02  0.05526 1.066 1.03e-03
74  -0.00544 -0.001608 -9.46e-03 -0.01427 1.108 6.86e-05
75  -0.06183 -0.046836  6.85e-02 -0.09255 1.043 2.87e-03
76   0.09134  0.005707  1.82e-02  0.09546 1.017 3.04e-03
77   0.20275 -0.253784  1.19e-01  0.32763 0.937 3.47e-02
78  -0.15914  0.239984 -1.62e-01 -0.28872 0.989 2.74e-02
79  -0.17534 -0.255279  1.77e-01 -0.31101 0.972 3.16e-02
80  -0.09973 -0.049729 -4.09e-02 -0.14266 1.022 6.79e-03
81   0.00252  0.003611 -2.47e-03  0.00442 1.064 6.58e-06
82  -0.10007 -0.084251  6.06e-02 -0.13137 1.018 5.75e-03
83   0.03255  0.015136 -4.10e-02  0.05371 1.057 9.71e-04
84   0.09210 -0.031634 -1.50e-01  0.23377 1.076 1.83e-02
85  -0.05641 -0.097669  3.86e-02 -0.11567 1.067 4.49e-03
86  -0.06549 -0.081108  2.68e-02 -0.10793 1.047 3.91e-03
87  -0.12962  0.200250 -1.69e-01 -0.24599 1.018 2.00e-02
88   0.04607  0.039184 -6.25e-02  0.07769 1.055 2.03e-03
89   0.17760  0.124772 -8.14e-02  0.21718 0.952 1.54e-02
90  -0.05092 -0.075904  2.27e-02 -0.09594 1.061 3.09e-03
91  -0.08708  0.103630  1.88e-02 -0.16964 1.048 9.62e-03
92  -0.07103  0.099397 -9.32e-02 -0.12905 1.051 5.58e-03
93   0.04831  0.011828 -3.04e-02  0.05767 1.039 1.12e-03
94   0.06870  0.049419  2.80e-02  0.11151 1.044 4.17e-03
95   0.26778  0.130837  2.62e-01  0.52231 0.871 8.57e-02
96  -0.11687  0.045321  5.09e-02 -0.15934 1.008 8.43e-03
97  -0.00518  0.006167 -6.37e-03 -0.00871 1.062 2.56e-05
98  -0.03021  0.043080 -4.64e-02 -0.05849 1.069 1.15e-03
99  -0.09996  0.061999  3.60e-02 -0.14951 1.024 7.45e-03
100 -0.08128 -0.079058  5.71e-02 -0.11399 1.031 4.35e-03
       hat inf
1   0.0314    
2   0.0304    
3   0.0228    
4   0.0195    
5   0.0142    
6   0.0143    
7   0.0585    
8   0.0159    
9   0.0335    
10  0.0162    
11  0.0136    
12  0.0333    
13  0.0258    
14  0.0257    
15  0.0593    
16  0.0338    
17  0.0401    
18  0.0218    
19  0.0385    
20  0.0154    
21  0.0243    
22  0.0494    
23  0.0387    
24  0.0436    
25  0.0188    
26  0.0217    
27  0.0155    
28  0.0324    
29  0.0340    
30  0.0218    
31  0.0159    
32  0.0115    
33  0.0266    
34  0.0201    
35  0.0314    
36  0.0129   *
37  0.0474    
38  0.0428    
39  0.0125    
40  0.0195    
41  0.0151    
42  0.0239    
43  0.0259    
44  0.0323    
45  0.0288    
46  0.0238    
47  0.0344    
48  0.0417    
49  0.0148    
50  0.0316    
51  0.0280    
52  0.0402   *
53  0.0570    
54  0.0442    
55  0.0317    
56  0.0371    
57  0.0250    
58  0.0416    
59  0.0266    
60  0.0461   *
61  0.0603   *
62  0.0324    
63  0.0237    
64  0.0463    
65  0.0325    
66  0.0235    
67  0.0469    
68  0.0231    
69  0.0240    
70  0.0290    
71  0.0196    
72  0.0368    
73  0.0346    
74  0.0689   *
75  0.0224    
76  0.0109    
77  0.0261    
78  0.0329    
79  0.0315    
80  0.0205    
81  0.0307    
82  0.0172    
83  0.0272    
84  0.0644    
85  0.0420    
86  0.0272    
87  0.0360    
88  0.0284    
89  0.0150    
90  0.0355    
91  0.0379    
92  0.0330    
93  0.0142    
94  0.0264    
95  0.0380   *
96  0.0186    
97  0.0283    
98  0.0375    
99  0.0224    
100 0.0197    
# Multiplicative model
influence.measures(data.mult.lm)
Influence measures of
 lm(formula = y ~ cx1 * cx2, data = data) :

      dfb.1_   dfb.cx1   dfb.cx2  dfb.c1.2    dffit cov.r
1    0.08238 -0.098107  0.068949 -0.043295  0.13160 1.060
2   -0.02099 -0.015304 -0.041455 -0.039682 -0.09580 1.072
3    0.08806  0.036729 -0.102033 -0.013545  0.14495 1.028
4    0.08072 -0.054767  0.054168 -0.049401  0.10249 1.052
5    0.09928  0.005977  0.048161 -0.040225  0.11699 1.024
6    0.06625  0.037504 -0.024731 -0.039350  0.07504 1.051
7   -0.00437 -0.001207 -0.016173  0.021433  0.02954 1.189
8   -0.14862  0.051463  0.065216  0.004465 -0.21802 0.938
9   -0.09088 -0.083703  0.097259  0.063936 -0.13840 1.070
10   0.05179  0.034457 -0.022949 -0.031718  0.06128 1.059
11  -0.05121 -0.021037  0.023813  0.028282 -0.05723 1.054
12  -0.03466 -0.029972  0.043143  0.019135 -0.05614 1.080
13  -0.05081 -0.038795  0.048894  0.030940 -0.07207 1.070
14   0.19262  0.151795 -0.176840 -0.124490  0.26961 0.986
15   0.03190  0.018158 -0.155239 -0.207408 -0.30827 1.132
16  -0.01645 -0.016734 -0.051077 -0.058656 -0.11949 1.077
17   0.01942 -0.027085  0.019025 -0.010610  0.03426 1.090
18   0.01119  0.001977  0.010675  0.001184  0.02071 1.065
19   0.00618  0.009451 -0.004605 -0.002862  0.01117 1.087
20  -0.00081  0.000422  0.000132  0.000092 -0.00112 1.059
21  -0.15092  0.145425 -0.105088  0.079312 -0.21533 1.003
22   0.04892 -0.097523  0.056066 -0.012538  0.11178 1.087
23   0.07762 -0.102375  0.074749 -0.045014  0.13253 1.073
24   0.01882  0.030843  0.073270 -0.122852 -0.16894 1.136
25   0.06948 -0.060454  0.025498 -0.020781  0.09696 1.043
26  -0.07278 -0.073133  0.006802  0.010916 -0.11724 1.039
27   0.03662 -0.000595  0.022148 -0.014064  0.04519 1.056
28  -0.04658 -0.092807  0.008385 -0.035310 -0.14400 1.054
29  -0.16561  0.107730 -0.246245  0.057425 -0.30009 0.976
30   0.11573  0.111914 -0.048868 -0.058686  0.15855 1.027
31   0.06255  0.008755 -0.042252 -0.016150  0.08210 1.043
32   0.01841 -0.004413  0.000339 -0.008005  0.01965 1.056
33   0.04735 -0.077683 -0.098933  0.131464  0.26241 1.000
34  -0.16728 -0.085046  0.148517  0.072010 -0.22765 0.970
35   0.03448 -0.032122  0.032011 -0.024819  0.05093 1.085
36  -0.23868  0.101319 -0.038471  0.100549 -0.26840 0.872
37   0.18057  0.257770 -0.161105 -0.135266  0.30992 1.038
38  -0.00569  0.000290  0.144957 -0.133689 -0.23170 1.078
39  -0.13516 -0.010299 -0.048045  0.063148 -0.14841 0.998
40  -0.06246  0.062428 -0.007826 -0.001319 -0.10450 1.039
41  -0.09562  0.052504  0.005169  0.017290 -0.12679 1.014
42  -0.21697 -0.232289  0.094774  0.098850 -0.31600 0.920
43  -0.08283  0.099258 -0.052551  0.025986 -0.13382 1.043
44  -0.01245  0.007480 -0.018080  0.003914 -0.02237 1.078
45   0.12395 -0.085904  0.130785 -0.083373  0.17985 1.045
46   0.02184 -0.003768 -0.033403  0.016642  0.05859 1.065
47  -0.05276  0.159987 -0.029643 -0.074587 -0.21663 1.034
48  -0.05420  0.076911 -0.054360  0.030191 -0.09678 1.084
49   0.20718 -0.118156  0.068838 -0.094258  0.24439 0.923
50   0.03966  0.046688 -0.027931 -0.026229  0.05998 1.081
51  -0.13723  0.134517 -0.112448  0.085385 -0.20048 1.030
52   0.03663  0.000954 -0.381573  0.329041  0.60274 0.867
53  -0.00932 -0.003098  0.043090  0.060856  0.08956 1.163
54   0.03745  0.101964 -0.030357  0.026719  0.13153 1.077
55   0.02933  0.040344  0.062201  0.083310  0.17670 1.054
56   0.00752  0.012394 -0.005119 -0.001979  0.01485 1.083
57   0.00494  0.003285 -0.005257 -0.002290  0.00731 1.072
58  -0.01705 -0.008225  0.159026 -0.126702 -0.24081 1.063
59  -0.04542  0.107133  0.041502 -0.105497 -0.22307 1.018
60   0.08006  0.094485 -0.419463  0.218064  0.54541 0.894
61   0.01089  0.032514 -0.150528 -0.150288 -0.25197 1.121
62   0.02319 -0.111151 -0.010246  0.097280  0.18766 1.057
63   0.08061  0.089413 -0.026166 -0.024899  0.12512 1.042
64   0.22687 -0.308988  0.240957 -0.152440  0.40011 0.982
65   0.07488  0.007001  0.177833  0.117264  0.31642 0.975
66   0.09805 -0.076394  0.076848 -0.064087  0.13235 1.051
67   0.02456 -0.031181  0.166663  0.120283  0.25777 1.062
68  -0.13371 -0.096308  0.115798  0.082014 -0.18121 1.026
69   0.09481 -0.065885  0.081749 -0.064296  0.12800 1.055
70   0.05898 -0.143413  0.008126  0.077114  0.21626 1.019
71   0.01182  0.009145  0.002079 -0.000936  0.01879 1.063
72  -0.06763 -0.076379  0.057596  0.054239 -0.10311 1.090
73   0.00102  0.003319  0.002190  0.005213  0.00987 1.095
74   0.08592  0.000247 -0.147223 -0.307552 -0.39291 1.231
75  -0.05821 -0.039144  0.051499  0.033110 -0.07902 1.062
76   0.12495  0.012482  0.024325 -0.064766  0.12912 1.010
77   0.20754 -0.260459  0.127618 -0.052088  0.34762 0.894
78  -0.17737  0.212606 -0.154484  0.098645 -0.28516 0.993
79  -0.21268 -0.243349  0.153181  0.146893 -0.31753 0.982
80  -0.08280 -0.050622 -0.041949 -0.008647 -0.14831 1.019
81   0.03024  0.034308 -0.021274 -0.020529  0.04487 1.083
82  -0.10730 -0.077515  0.048488  0.065721 -0.12994 1.036
83   0.02947  0.015886 -0.042736 -0.001347  0.05619 1.067
84  -0.01025 -0.006035 -0.016079  0.033204  0.04017 1.309
85  -0.04921 -0.097059  0.038225  0.003394 -0.11510 1.075
86  -0.05982 -0.076989  0.024350  0.014970 -0.10217 1.056
87  -0.15282  0.159347 -0.148686  0.111757 -0.23696 1.042
88   0.07151  0.054958 -0.080770 -0.037504  0.10932 1.061
89   0.24386  0.151791 -0.084195 -0.142089  0.28143 0.907
90  -0.04305 -0.079274  0.024057 -0.004234 -0.10118 1.069
91   0.00600  0.174162  0.038409 -0.217110 -0.33424 1.044
92  -0.06951  0.064238 -0.067281  0.051846 -0.10371 1.080
93   0.06544  0.017081 -0.036167 -0.028075  0.07657 1.045
94   0.03370  0.039461  0.022488  0.029899  0.10008 1.059
95   0.02392  0.088073  0.222652  0.340343  0.58503 0.880
96  -0.08969  0.050613  0.055503 -0.028703 -0.17196 0.998
97   0.02296 -0.018363  0.021359 -0.016483  0.03262 1.083
98  -0.00694  0.006215 -0.007512  0.005512 -0.01068 1.099
99  -0.06262  0.076671  0.044344 -0.064143 -0.18326 1.014
100 -0.08345 -0.068078  0.043390  0.052741 -0.10577 1.052
      cook.d    hat inf
1   4.35e-03 0.0352    
2   2.31e-03 0.0367    
3   5.26e-03 0.0230    
4   2.64e-03 0.0254    
5   3.43e-03 0.0162    
6   1.42e-03 0.0197    
7   2.20e-04 0.1234   *
8   1.16e-02 0.0159    
9   4.82e-03 0.0426    
10  9.47e-04 0.0221    
11  8.26e-04 0.0180    
12  7.95e-04 0.0377    
13  1.31e-03 0.0316    
14  1.80e-02 0.0327    
15  2.38e-02 0.1083   *
16  3.60e-03 0.0445    
17  2.96e-04 0.0444    
18  1.08e-04 0.0219    
19  3.15e-05 0.0412    
20  3.17e-07 0.0155    
21  1.15e-02 0.0281    
22  3.15e-03 0.0500    
23  4.42e-03 0.0437    
24  7.19e-03 0.0924   *
25  2.36e-03 0.0197    
26  3.45e-03 0.0219    
27  5.15e-04 0.0172    
28  5.21e-03 0.0345    
29  2.22e-02 0.0353    
30  6.29e-03 0.0252    
31  1.70e-03 0.0166    
32  9.75e-05 0.0138    
33  1.71e-02 0.0355    
34  1.28e-02 0.0223    
35  6.55e-04 0.0412    
36  1.73e-02 0.0150   *
37  2.39e-02 0.0586    
38  1.35e-02 0.0642    
39  5.48e-03 0.0153    
40  2.74e-03 0.0195    
41  4.02e-03 0.0154    
42  2.43e-02 0.0265    
43  4.49e-03 0.0269    
44  1.26e-04 0.0333    
45  8.10e-03 0.0367    
46  8.66e-04 0.0259    
47  1.17e-02 0.0390    
48  2.36e-03 0.0462    
49  1.46e-02 0.0173    
50  9.08e-04 0.0391    
51  1.00e-02 0.0342    
52  8.63e-02 0.0573   *
53  2.03e-03 0.1059   *
54  4.35e-03 0.0461    
55  7.83e-03 0.0408    
56  5.57e-05 0.0377    
57  1.35e-05 0.0277    
58  1.45e-02 0.0575    
59  1.24e-02 0.0342    
60  7.13e-02 0.0548    
61  1.59e-02 0.0937    
62  8.83e-03 0.0444    
63  3.93e-03 0.0247    
64  3.93e-02 0.0541    
65  2.46e-02 0.0377    
66  4.40e-03 0.0308    
67  1.66e-02 0.0599    
68  8.20e-03 0.0290    
69  4.12e-03 0.0321    
70  1.16e-02 0.0332    
71  8.91e-05 0.0196    
72  2.68e-03 0.0509    
73  2.46e-05 0.0480    
74  3.87e-02 0.1779   *
75  1.57e-03 0.0272    
76  4.16e-03 0.0146    
77  2.92e-02 0.0267    
78  2.01e-02 0.0374    
79  2.48e-02 0.0400    
80  5.50e-03 0.0205    
81  5.08e-04 0.0388    
82  4.23e-03 0.0232    
83  7.97e-04 0.0272    
84  4.08e-04 0.2035   *
85  3.34e-03 0.0421    
86  2.63e-03 0.0278    
87  1.40e-02 0.0463    
88  3.01e-03 0.0322    
89  1.92e-02 0.0201    
90  2.58e-03 0.0356    
91  2.78e-02 0.0656    
92  2.71e-03 0.0440    
93  1.48e-03 0.0165    
94  2.52e-03 0.0289    
95  8.16e-02 0.0575    
96  7.35e-03 0.0191    
97  2.69e-04 0.0380    
98  2.88e-05 0.0511    
99  8.37e-03 0.0255    
100 2.81e-03 0.0262    
Conclusions (for either the additive or multiplicative model):
  • there are no obvious patterns in the residuals (first plot), or at least there are no obvious trends remaining that would be indicative of non-linearity or non-homogeneity of variance
  • The data do not appear to deviate substantially from normality according to the Q-Q normal plot
  • None of the observations are considered overly influential (Cook's D values not approaching 1).

In addition to the above residual plot, in which the residuals are plotted against the fitted (predicted) values, it is also a good idea to plot the residuals against each of the original covariates.

# Additive model
par(mfrow = c(1, 2))
plot(data$cx1, resid(data.add.lm))
plot(data$cx2, resid(data.add.lm))
plot of chunk tut7.3aS5.1d
# Multiplicative model
par(mfrow = c(1, 2))
plot(data$cx1, resid(data.mult.lm))
plot(data$cx2, resid(data.mult.lm))
plot of chunk tut7.3aS5.1d
## Additive model
data.aug = augment(data.add.lm, data)
head(data.aug)
# A tibble: 6 x 12
      y    x1    x2     cx1     cx2 .fitted .se.fit .resid
  <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>   <dbl>  <dbl>
1  3.51 0.168 0.901 -0.316   0.0299    2.91   0.201  0.604
2  5.09 0.808 1.33   0.323   0.457     5.44   0.197 -0.346
3  4.04 0.385 0.517 -0.0991 -0.354     3.03   0.171  1.00 
4  4.01 0.328 0.974 -0.156   0.103     3.49   0.158  0.513
5  5.38 0.602 1.09   0.118   0.216     4.48   0.135  0.901
6  4.53 0.604 0.824  0.120  -0.0468    4.12   0.135  0.408
# … with 4 more variables: .hat <dbl>, .sigma <dbl>,
#   .cooksd <dbl>, .std.resid <dbl>
data.aug %>% gather(key = Var, value = Value, cx1, cx2) %>% ggplot(aes(y = .resid,
    x = Value)) + geom_point() + facet_grid(~Var)
plot of chunk tut7.3aS5.1e
## Multiplicative model
augment(data.mult.lm, data) %>% gather(key = Var, value = Value,
    cx1, cx2) %>% ggplot(aes(y = .resid, x = Value)) + geom_point() +
    facet_grid(~Var)
plot of chunk tut7.3aS5.1e

Partial effects plots

Finally, we would usually like to visualize the trend(s) graphically so as to help assess the goodness of the model. Since the multiple regression estimates partial effects (effects of one term holding the others constant), graphical summaries are partial effects plots.

Within the effects package, there are a number of functions that can be used to generate partial effects:

  • effect(term=, mod=) is used to generate the partial effect of a single term
  • allEffects(mod=) is used to generate the partial effects of all terms in the model
  • Effect(focal.predictors=, mod=) is used to generate the partial effects of a single predictor across a range of values of the other main effects (thereby showing how the trend in one predictor alters according to the level of the other predictor(s)).

Each of these functions returns an eff object that contains the predicted values of $y$ for a range of $x$ values. The effects package also contains a plotting function that takes the eff object and plots it.

The emmeans package also has routines for estimating and plotting partial effects. Furthermore, the ggeffects package provides ggplot-based routines (ggeffect(), ggemmeans() and ggpredict()) that draw on the effects and emmeans machinery as well as the predict() function. Each of these will be demonstrated below for both the additive and multiplicative models.

  • additive model
    • single partial effect plot
      library(effects)
      plot(effect("cx1", data.add.lm, partial.residuals = TRUE))
      
      plot of chunk tut7.3aS7.1
      library(ggeffects)
      ggeffect(data.add.lm, ~cx1) %>% plot
      
      plot of chunk tut7.3aS7.1b
      library(ggeffects)
      ggemmeans(data.add.lm, ~cx1) %>% plot
      
      plot of chunk tut7.3aS7.1c
      library(ggeffects)
      ggpredict(data.add.lm, ~cx1) %>% plot(rawdata = TRUE)
      
      plot of chunk tut7.3aS7.1d
    • all partial effect plots
      library(effects)
      plot(allEffects(data.add.lm, partial.residuals = TRUE))
      
      plot of chunk tut7.3aS7.2
      library(ggeffects)
      library(gridExtra)
      grid.arrange(ggeffect(data.add.lm, ~cx1) %>% plot, ggeffect(data.add.lm,
          ~cx2) %>% plot, nrow = 1)
      
      plot of chunk tut7.3aS7.2b
      library(ggeffects)
      library(gridExtra)
      grid.arrange(ggemmeans(data.add.lm, ~cx1) %>% plot, ggemmeans(data.add.lm,
          ~cx2) %>% plot, nrow = 1)
      
      plot of chunk tut7.3aS7.2c
      library(ggeffects)
      library(gridExtra)
      grid.arrange(ggpredict(data.add.lm, ~cx1) %>% plot(rawdata = TRUE),
          ggpredict(data.add.lm, ~cx2) %>% plot(rawdata = TRUE), nrow = 1)
      
      plot of chunk tut7.3aS7.2d
  • multiplicative model
    • single partial effect plot
      library(effects)
      plot(effect("cx1", data.mult.lm, partial.residuals = TRUE))
      
      plot of chunk tut7.3aS7.3
      library(ggeffects)
      ggeffect(data.mult.lm, ~cx1) %>% plot
      
      plot of chunk tut7.3aS7.3b
      library(ggeffects)
      ggemmeans(data.mult.lm, ~cx1) %>% plot
      
      plot of chunk tut7.3aS7.3c
      library(ggeffects)
      ggpredict(data.mult.lm, ~cx1) %>% plot(rawdata = TRUE)
      
      plot of chunk tut7.3aS7.3d
    • all partial effect plots
      library(effects)
      plot(allEffects(data.mult.lm, xlevels = 5, partial.residuals = TRUE))
      
      plot of chunk tut7.3aS7.4
      library(ggeffects)
      ggeffect(data.mult.lm, terms = c("cx1", "cx2 [-1,-0.5,0,0.5,1]")) %>%
          plot
      
      plot of chunk tut7.3aS7.4b
      library(ggeffects)
      ggemmeans(data.mult.lm, terms = c("cx1", "cx2 [-1,-0.5,0,0.5,1]")) %>%
          plot
      
      plot of chunk tut7.3aS7.4c
      library(ggeffects)
      ggpredict(data.mult.lm, terms = c("cx1", "cx2 [-1,-0.5,0,0.5,1]")) %>%
          plot
      
      plot of chunk tut7.3aS7.4d
    • interactions in partial effects
      library(effects)
      plot(Effect(focal.predictors = c("cx1", "cx2"), data.mult.lm,
          partial.residuals = TRUE))
      
      plot of chunk tut7.3aS7.5
      library(ggeffects)
      ggeffect(data.mult.lm, ~cx1 + cx2) %>% plot
      
      plot of chunk tut7.3aS7.5b
      library(ggeffects)
      ggemmeans(data.mult.lm, ~cx1 + cx2) %>% plot
      
      plot of chunk tut7.3aS7.5c
      library(ggeffects)
      ggpredict(data.mult.lm, ~cx1 + cx2) %>% plot
      
      plot of chunk tut7.3aS7.5d

It is clear from the above figures, that the effect of cx1 on $y$ changes across the range of cx2 (suggestive of an interaction).

You might have noticed that the x-axis on all of the above plots reflects the centered data. Often we would prefer to visualize the effects in terms of the raw data range; that is, we wish to back-transform the values onto the original scale. Furthermore, these are not particularly polished figures.

Recall that the purpose of the above figures is to assess the validity of the fitted model and thus scales and aesthetics are not overly critical. We will produce nicer looking figures in Section 8.

Exploring the model parameters, test hypotheses

If there was any evidence that the assumptions had been violated or the model was not an appropriate fit, then we would need to reconsider the model and start the process again. In this case, there is no evidence that the test will be unreliable so we can proceed to explore the test statistics. The main statistic of interest is the t-statistic ($t$) for each of the partial slope parameters.

  • additive model
    summary(data.add.lm)
    
    Call:
    lm(formula = y ~ cx1 + cx2, data = data)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -2.45927 -0.78738 -0.00688  0.71051  2.88492 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept)   3.8234     0.1131  33.793  < 2e-16 ***
    cx1           3.0250     0.4986   6.067 2.52e-08 ***
    cx2           1.3878     0.4281   3.242  0.00163 ** 
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 1.131 on 97 degrees of freedom
    Multiple R-squared:  0.5361,	Adjusted R-squared:  0.5265 
    F-statistic: 56.05 on 2 and 97 DF,  p-value: < 2.2e-16
    
    # OR via confidence intervals of the parameters
    confint(data.add.lm)
    
                    2.5 %   97.5 %
    (Intercept) 3.5988464 4.047962
    cx1         2.0353869 4.014675
    cx2         0.5381745 2.237488
    
    ## tidy version
    tidy(data.add.lm, conf.int = TRUE)
    
    # A tibble: 3 x 7
      term        estimate std.error statistic  p.value conf.low conf.high
      <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
    1 (Intercept)     3.82     0.113     33.8  1.86e-55    3.60       4.05
    2 cx1             3.03     0.499      6.07 2.52e- 8    2.04       4.01
    3 cx2             1.39     0.428      3.24 1.63e- 3    0.538      2.24
    

    Conclusions:

    • At the mean level of cx2 (and thus $x2$), an increase in cx1 (and thus $x1$) is associated with a significant linear increase in $y$. Every 1 unit increase in $x1$ results in a 3.03 unit increase in $y$.
    • There is also inferential evidence of a positive relationship between $y$ and $x2$ (holding $x1$ constant).
    • The 95% confidence intervals give an indication of the uncertainty in the effect sizes. Confidence bounds that do not contain 0 (zero) suggest evidence that the parameter is 'significantly' different from zero.
    • The $R^2$ value is 0.54 suggesting that approximately 54% of the total variance in $y$ can be explained by its linear relationship with $x1$ and $x2$. The $R^2$ value is thus a measure of the strength of the relationship.
    • The adjusted $R^2$ value ($R^2$ value penalized according to the number of predictor terms) is 0.53. In multiple linear regression, the adjusted $R^2$ is the more appropriate value for comparing models as it is a measure of the explanatory power given the model complexity.

  • multiplicative model
    summary(data.mult.lm)
    
    Call:
    lm(formula = y ~ cx1 * cx2, data = data)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -2.34877 -0.85435  0.06905  0.71265  2.57068 
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept)   3.6710     0.1315  27.924  < 2e-16 ***
    cx1           2.9292     0.4914   5.961 4.15e-08 ***
    cx2           1.3445     0.4207   3.196  0.00189 ** 
    cx1:cx2       2.6651     1.2305   2.166  0.03281 *  
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 1.111 on 96 degrees of freedom
    Multiple R-squared:  0.5577,	Adjusted R-squared:  0.5439 
    F-statistic: 40.35 on 3 and 96 DF,  p-value: < 2.2e-16
    
    # OR via confidence intervals of the parameters
    confint(data.mult.lm)
    
                    2.5 %   97.5 %
    (Intercept) 3.4100621 3.931972
    cx1         1.9537887 3.904642
    cx2         0.5094571 2.179451
    cx1:cx2     0.2224680 5.107650
    
    ## tidy version
    tidy(data.mult.lm, conf.int = TRUE)
    
    # A tibble: 4 x 7
      term        estimate std.error statistic  p.value conf.low conf.high
      <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
    1 (Intercept)     3.67     0.131     27.9  7.06e-48    3.41       3.93
    2 cx1             2.93     0.491      5.96 4.15e- 8    1.95       3.90
    3 cx2             1.34     0.421      3.20 1.89e- 3    0.509      2.18
    4 cx1:cx2         2.67     1.23       2.17 3.28e- 2    0.222      5.11
    

    Conclusions:

    • There is inferential evidence of a significant interaction, implying that the 'effects' of cx1 are not consistent across all levels of cx2 and vice versa. This is consistent with what was suggested by the partial regression plots above.
    • At the mean level of cx2 (and thus $x2$), an increase in cx1 (and thus $x1$) is associated with a significant linear increase in $y$. Every 1 unit increase in $x1$ results in a 2.93 unit increase in $y$.
    • There is also inferential evidence of a relationship between $y$ and $x2$ (holding $x1$ constant at its mean).
    • The 95% confidence intervals give an indication of the uncertainty in the effect sizes. Confidence bounds that do not contain 0 (zero) suggest evidence that the parameter is 'significantly' different from zero.
    • The $R^2$ value is 0.56 suggesting that approximately 56% of the total variance in $y$ can be explained by its linear relationship with $x1$ and $x2$ (and their interaction). The $R^2$ value is thus a measure of the strength of the relationship.
    • The adjusted $R^2$ value ($R^2$ value penalized according to the number of predictor terms) is 0.54. In multiple linear regression, the adjusted $R^2$ is the more appropriate value for comparing models as it is a measure of the explanatory power given the model complexity (a quick arithmetic check follows these conclusions).
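
As a quick arithmetic check of the adjusted $R^2$ values reported above (with $n=100$ observations and $k$ predictor terms, adjusted $R^2 = 1-(1-R^2)(n-1)/(n-k-1)$):

1 - (1 - 0.5361) * (100 - 1)/(100 - 2 - 1)  # additive model (k = 2), ~0.527
1 - (1 - 0.5577) * (100 - 1)/(100 - 3 - 1)  # multiplicative model (k = 3), ~0.544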

Exploring interactions (statistically)

The nature of significant interactions (e.g. cx1 and cx2 on Y) can be further explored by re-fitting the multiple linear model to explore the partial effects of one of the predictor variables (e.g. cx1 ) for a specific set of levels of the other interacting predictor variable(s) (e.g. the mean of cx2 as well as this mean $\pm$ 1 and or 2 standard deviations). For such subsequent main effects tests, ignore the effect of the interaction, which will be identical to that previously tested, and focus purely on the individual partial slope ($\beta_1$).

In the absence of any other criterion to split the analysis up, we will explore the effects of $cx1$ at 5 different levels of $cx2$ (the mean level as well as 1 and 2 standard deviations above and below the mean).

The first step in the process is to calculate a constant for each of the five levels that represents the value to add to the mean to center the data around the new level.

# 2 standard deviations below the mean
(M2 <- mean(data$cx2) - 2 * sd(data$cx2))
[1] -0.667053
# 1 standard deviation below the mean
(M1 <- mean(data$cx2) - 1 * sd(data$cx2))
[1] -0.3335265
# mean of cx2
(M <- mean(data$cx2))
[1] -1.500048e-17
# 1 standard deviation above the mean
(P1 <- mean(data$cx2) + 1 * sd(data$cx2))
[1] 0.3335265
# 2 standard deviations above the mean
(P2 <- mean(data$cx2) + 2 * sd(data$cx2))
[1] 0.667053
Next, we refit the model five times, each reflecting the different centering. From these models, we just want to extract the partial stats associated with the $cx1$ term.
(m2 <- summary(data.lmM2 <- lm(y ~ cx1 * c(cx2 - M2), data = data))$coef["cx1", ])
  Estimate Std. Error    t value   Pr(>|t|) 
 1.1514799  0.9939162  1.1585282  0.2495225 
(m1 <- summary(data.lmM2 <- lm(y ~ cx1 * c(cx2 - M1), data = data))$coef["cx1", ])
   Estimate  Std. Error     t value    Pr(>|t|) 
2.040347707 0.668005930 3.054385621 0.002918947 
(m <- summary(data.lmM2 <- lm(y ~ cx1 * c(cx2 - M), data = data))$coef["cx1", ])
    Estimate   Std. Error      t value     Pr(>|t|) 
2.929216e+00 4.914028e-01 5.960926e+00 4.149832e-08 
(p1 <- summary(data.lmM2 <- lm(y ~ cx1 * c(cx2 - P1), data = data))$coef["cx1", ])
    Estimate   Std. Error      t value     Pr(>|t|) 
3.818083e+00 6.112313e-01 6.246544e+00 1.145367e-08 
(p2 <- summary(data.lmM2 <- lm(y ~ cx1 * c(cx2 - P2), data = data))$coef["cx1", ])
    Estimate   Std. Error      t value     Pr(>|t|) 
4.706951e+00 9.179395e-01 5.127736e+00 1.521105e-06 
rbind(`-2sd` = m2, `-1sd` = m1, `Mean cx2` = m, `+1sd` = p1, `+2sd` = p2)
         Estimate Std. Error  t value     Pr(>|t|)
-2sd     1.151480  0.9939162 1.158528 2.495225e-01
-1sd     2.040348  0.6680059 3.054386 2.918947e-03
Mean cx2 2.929216  0.4914028 5.960926 4.149832e-08
+1sd     3.818083  0.6112313 6.246544 1.145367e-08
+2sd     4.706951  0.9179395 5.127736 1.521105e-06
emtrends(data.mult.lm, ~cx1 | cx2, at = list(cx2 = c(M2, M1, M, P1, P2)), var = "cx1") %>% as.data.frame
           cx1           cx2 cx1.trend        SE df   lower.CL upper.CL
1 7.233797e-18 -6.670530e-01  1.151480 0.9939162 96 -0.8214281 3.124388
2 7.233797e-18 -3.335265e-01  2.040348 0.6680059 96  0.7143664 3.366329
3 7.233797e-18 -1.500048e-17  2.929216 0.4914028 96  1.9537887 3.904642
4 7.233797e-18  3.335265e-01  3.818083 0.6112313 96  2.6047988 5.031368
5 7.233797e-18  6.670530e-01  4.706951 0.9179395 96  2.8848557 6.529047

Graphical Summary

  • additive model
    ## define the new prediction values (cx1)
    data.list = with(data, list(cx1 = seq(min(cx1), max(cx1), len = 100),
        cx2 = 0))
    data.grid = ref_grid(data.add.lm, ~cx1, cov.reduce = FALSE, at = data.list)
    newdata = confint(data.grid)
    ## back transform the focal predictor
    newdata = newdata %>% mutate(x1 = cx1 + mean(data$x1))
    ## Calculate the partial residuals
    presid = ref_grid(data.add.lm, ~cx1, cov.reduce = function(x) x,
        at = list(cx2 = 0)) %>% predict + as.vector(resid(data.add.lm))
    ## convert to data frame and backtransform predictor
    partial.obs = data.frame(presid = presid, x1 = recover_data(data.add.lm)$cx1 +
        mean(data$x1))
    library(ggplot2)
    p1 <- ggplot(newdata, aes(y = prediction, x = x1)) + geom_point(data = partial.obs,
        aes(y = presid), color = "grey") + geom_ribbon(aes(ymin = lower.CL,
        ymax = upper.CL), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Y") + scale_x_continuous("X1") + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    ## same for cx2
    data.list = with(data, list(cx2 = seq(min(cx2), max(cx2), len = 100),
        cx1 = 0))
    data.grid = ref_grid(data.add.lm, ~cx2, cov.reduce = FALSE, at = data.list)
    newdata = confint(data.grid)
    ## back transform the focal predictor
    newdata = newdata %>% mutate(x2 = cx2 + mean(data$x2))
    ## Calculate the partial residuals
    presid = ref_grid(data.add.lm, ~cx2, cov.reduce = function(x) x,
        at = list(cx1 = 0)) %>% predict + as.vector(resid(data.add.lm))
    ## convert to data frame and backtransform predictor
    partial.obs = data.frame(presid = presid, x2 = recover_data(data.add.lm)$cx2 +
        mean(data$x2))
    library(ggplot2)
    p2 <- ggplot(newdata, aes(y = prediction, x = x2)) + geom_point(data = partial.obs,
        aes(y = presid), color = "grey") + geom_ribbon(aes(ymin = lower.CL,
        ymax = upper.CL), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Y") + scale_x_continuous("X2") + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 2)
    
    plot of chunk tut7.3aS8.1a
    # for cx1. Generate a sequence of cx1 values to predict and
    # set cx2 = 0 since this is its mean
    newdata <- data.frame(cx1 = seq(min(data$cx1), max(data$cx1),
        len = 100), cx2 = 0)
    fit <- predict(data.add.lm, newdata = newdata, interval = "confidence")
    fit <- cbind(newdata, fit)
    fit$x1 <- fit$cx1 + mean(data$x1)
    head(fit)
    
             cx1 cx2      fit      lwr      upr          x1
    1 -0.4755165   0 2.384952 1.863528 2.906376 0.008566967
    2 -0.4655705   0 2.415039 1.902481 2.927597 0.018512978
    3 -0.4556245   0 2.445126 1.941398 2.948855 0.028458988
    4 -0.4456785   0 2.475213 1.980276 2.970150 0.038404998
    5 -0.4357325   0 2.505300 2.019114 2.991486 0.048351008
    6 -0.4257865   0 2.535387 2.057909 3.012864 0.058297018
    
    # calculated the partial observed values
    partial.obs <- data.frame(cx1 = data$cx1, cx2 = 0)
    partial.obs <- cbind(partial.obs, fit = predict(data.add.lm,
        newdata = partial.obs))
    partial.obs$fit <- partial.obs$fit + resid(data.add.lm)
    partial.obs$x1 <- partial.obs$cx1 + mean(data$x1)
    head(partial.obs)
    
              cx1 cx2      fit        x1
    1 -0.31604197   0 3.471860 0.1680415
    2  0.32343291   0 4.455814 0.8075164
    3 -0.09914114   0 4.527990 0.3849424
    4 -0.15634918   0 3.863180 0.3277343
    5  0.11801718   0 5.081808 0.6021007
    6  0.12031056   0 4.595068 0.6043941
    
    library(ggplot2)
    p1 <- ggplot(fit, aes(y = fit, x = x1)) + geom_point(data = partial.obs,
        color = "grey") + geom_ribbon(aes(ymin = lwr, ymax = upr),
        fill = "blue", alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
        scale_x_continuous("X1") + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    # for cx2. Generate a sequence of cx2 values to predict and
    # set cx1 = 0 since this is its mean
    newdata <- data.frame(cx2 = seq(min(data$cx2), max(data$cx2),
        len = 100), cx1 = 0)
    fit <- predict(data.add.lm, newdata = newdata, interval = "confidence")
    fit <- cbind(newdata, fit)
    fit$x2 <- fit$cx2 + mean(data$x2)
    head(fit)
    
             cx2 cx1      fit      lwr      upr        x2
    1 -0.7657351   0 2.760693 2.072418 3.448968 0.1051730
    2 -0.7499253   0 2.782634 2.107043 3.458226 0.1209829
    3 -0.7341154   0 2.804576 2.141639 3.467513 0.1367927
    4 -0.7183056   0 2.826517 2.176203 3.476831 0.1526026
    5 -0.7024957   0 2.848459 2.210734 3.486183 0.1684125
    6 -0.6866859   0 2.870400 2.245230 3.495570 0.1842223
    
    # calculated the partial observed values
    partial.obs <- data.frame(cx2 = data$cx2, cx1 = 0)
    partial.obs <- cbind(partial.obs, fit = predict(data.add.lm,
        newdata = partial.obs))
    partial.obs$fit <- partial.obs$fit + resid(data.add.lm)
    partial.obs$x2 <- partial.obs$cx2 + mean(data$x2)
    head(partial.obs)
    
              cx2 cx1      fit        x2
    1  0.02986272   0 4.469341 0.9007709
    2  0.45723717   0 4.111988 1.3281453
    3 -0.35382350   0 4.336848 0.5170847
    4  0.10322304   0 4.479398 0.9741312
    5  0.21607055   0 5.024672 1.0869787
    6 -0.04683372   0 4.166128 0.8240744
    
    library(ggplot2)
    p2 <- ggplot(fit, aes(y = fit, x = x2)) + geom_point(data = partial.obs,
        color = "grey") + geom_ribbon(aes(ymin = lwr, ymax = upr),
        fill = "blue", alpha = 0.2) + geom_line() + scale_y_continuous("Y") +
        scale_x_continuous("X2") + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 2)
    
    plot of chunk tut7.3aS8.1
    # for cx1. Generate a sequence of cx1 values to predict and
    # set cx2 = 0 since this is its mean
    newdata <- data.frame(cx1 = seq(min(data$cx1), max(data$cx1),
        len = 100), cx2 = 0)
    Xmat = model.matrix(~cx1 + cx2, data = newdata)
    coefs = coef(data.add.lm)
    fit = as.vector(coefs %*% t(Xmat))
    se = sqrt(diag(Xmat %*% vcov(data.add.lm) %*% t(Xmat)))
    q = qt(0.975, df = df.residual(data.add.lm))
    fit = data.frame(fit = fit, lwr = fit - q * se, upr = fit + q *
        se) %>% mutate(x1 = newdata$cx1 + mean(data$x1))
    ## partial residuals
    newdata <- data.frame(cx1 = data$cx1, cx2 = 0)
    Xmat = model.matrix(~cx1 + cx2, data = newdata)
    coefs = coef(data.add.lm)
    presid = as.vector(coefs %*% t(Xmat))
    partial.obs = data.frame(presid = presid + resid(data.add.lm)) %>%
        mutate(x1 = newdata$cx1 + mean(data$x1))
    
    library(ggplot2)
    p1 <- ggplot(fit, aes(y = fit, x = x1)) + geom_point(data = partial.obs,
        aes(y = presid), color = "grey") + geom_ribbon(aes(ymin = lwr,
        ymax = upr), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Y") + scale_x_continuous("X1") + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    ## now for cx2
    newdata <- data.frame(cx2 = seq(min(data$cx2), max(data$cx2),
        len = 100), cx1 = 0)
    Xmat = model.matrix(~cx1 + cx2, data = newdata)
    coefs = coef(data.add.lm)
    fit = as.vector(coefs %*% t(Xmat))
    se = sqrt(diag(Xmat %*% vcov(data.add.lm) %*% t(Xmat)))
    q = qt(0.975, df = df.residual(data.add.lm))
    fit = data.frame(fit = fit, lwr = fit - q * se, upr = fit + q *
        se) %>% mutate(x2 = newdata$cx2 + mean(data$x2))
    ## partial residuals
    newdata <- data.frame(cx2 = data$cx2, cx1 = 0)
    Xmat = model.matrix(~cx1 + cx2, data = newdata)
    coefs = coef(data.add.lm)
    presid = as.vector(coefs %*% t(Xmat))
    partial.obs = data.frame(presid = presid + resid(data.add.lm)) %>%
        mutate(x2 = newdata$cx2 + mean(data$x2))
    
    library(ggplot2)
    p2 <- ggplot(fit, aes(y = fit, x = x2)) + geom_point(data = partial.obs,
        aes(y = presid), color = "grey") + geom_ribbon(aes(ymin = lwr,
        ymax = upr), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Y") + scale_x_continuous("X2") + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 2)
    
    plot of chunk tut7.3aS8.1b
  • multiplicative model
    ## for cx1. Generate a sequence of cx1 values to predict and
    ## set cx2 to be equal to the mean and 1 and 2 standard
    ## deviations above and below the mean
    data.list = with(data, list(cx1 = seq(min(cx1), max(cx1), len = 100),
        cx2 = mean(data$cx2) + sd(data$cx2) * c(-2, -1, 0, 1, 2)))
    data.grid = ref_grid(data.mult.lm, ~cx1 | cx2, at = data.list,
        cov.reduce = function(x) x)
    newdata = confint(data.grid)
    ## back transform the focal predictor
    newdata = newdata %>% mutate(x1 = cx1 + mean(data$x1))
    ## Calculate the partial residuals
    presid = ref_grid(data.mult.lm, ~cx1, cov.reduce = function(x) x,
        at = list(cx2 = mean(data$cx2) + sd(data$cx2) * c(-2, -1,
            0, 1, 2))) %>% summary %>% mutate(presid = prediction +
        as.vector(resid(data.mult.lm)))
    ## convert to data frame and backtransform the predictors (the residuals
    ## were already added to the predictions in the previous step)
    partial.obs = presid %>% mutate(x1 = cx1 + mean(data$x1), x2 = cx2 +
        mean(data$x2))
    
    library(ggplot2)
    p1 <- ggplot(newdata, aes(y = prediction, x = x1, group = factor(cx2),
        fill = factor(cx2), color = factor(cx2))) + geom_point(data = partial.obs,
        aes(y = presid), alpha = 0.2) + geom_ribbon(aes(ymin = lower.CL,
        ymax = upper.CL), color = NA, alpha = 0.2) + geom_line() +
        scale_y_continuous("Y") + scale_x_continuous("X1") + scale_fill_discrete("X2",
        breaks = mean(data$cx2) + sd(data$cx2) * c(-2, -1, 0, 1,
            2), labels = lapply(-2:2, function(i) bquote(.(i) * sigma))) +
        scale_color_discrete("X2", breaks = mean(data$cx2) + sd(data$cx2) *
            c(-2, -1, 0, 1, 2), labels = lapply(-2:2, function(i) bquote(.(i) *
            sigma))) + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    # Alternative presentation: facet the cx1 trends by the level of cx2
    newdata$ccx2 <- factor(newdata$cx2, labels = paste("X2:~", c(-2,
        -1, 0, 1, 2), "*sigma"))
    partial.obs$ccx2 <- factor(partial.obs$cx2, labels = paste("X2:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    p2 <- ggplot(newdata, aes(y = prediction, x = x1)) + geom_point(data = partial.obs,
        aes(y = presid)) + geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL),
        alpha = 0.2, color = NA) + geom_line() + scale_y_continuous("Y") +
        scale_x_continuous("X1") + facet_grid(~ccx2, labeller = label_parsed) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 1)
    
    plot of chunk tut7.3aS8.2b
    ## for cx1. Generate a sequence of cx1 values to predict and
    ## set cx2 to be equal to the mean and 1 and 2 standard
    ## deviations above and below the mean
    newdata <- expand.grid(cx1 = seq(min(data$cx1), max(data$cx1),
        len = 100), cx2 = mean(data$cx2) + sd(data$cx2) * c(-2, -1,
        0, 1, 2))
    fit <- predict(data.mult.lm, newdata = newdata, interval = "confidence")
    fit <- cbind(newdata, fit)
    fit$x1 <- fit$cx1 + mean(data$x1)
    head(fit)
    
             cx1       cx2      fit      lwr      upr          x1
    1 -0.4755165 -0.667053 2.226647 1.357498 3.095796 0.008566967
    2 -0.4655705 -0.667053 2.238100 1.384078 3.092122 0.018512978
    3 -0.4556245 -0.667053 2.249552 1.410471 3.088634 0.028458988
    4 -0.4456785 -0.667053 2.261005 1.436668 3.085342 0.038404998
    5 -0.4357325 -0.667053 2.272458 1.462658 3.082257 0.048351008
    6 -0.4257865 -0.667053 2.283910 1.488429 3.079391 0.058297018
    
    # calculate the partial observed values
    partial.obs <- expand.grid(cx1 = data$cx1, cx2 = mean(data$cx2) +
        sd(data$cx2) * c(-2, -1, 0, 1, 2))
    partial.obs <- cbind(partial.obs, fit = predict(data.mult.lm,
        newdata = partial.obs))
    partial.obs$fit <- partial.obs$fit + resid(data.mult.lm)
    partial.obs$x1 <- partial.obs$cx1 + mean(data$x1)
    head(partial.obs)
    
              cx1       cx2      fit        x1
    1 -0.31604197 -0.667053 3.163325 0.1680415
    2  0.32343291 -0.667053 2.609724 0.8075164
    3 -0.09914114 -0.667053 3.698581 0.3849424
    4 -0.15634918 -0.667053 3.291794 0.3277343
    5  0.11801718 -0.667053 3.916596 0.6021007
    6  0.12031056 -0.667053 3.497351 0.6043941
    
    library(ggplot2)
    p1 <- ggplot(fit, aes(y = fit, x = x1, group = factor(cx2), fill = factor(cx2),
        color = factor(cx2))) + geom_point(data = partial.obs, alpha = 0.2) +
        geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, color = NA) +
        geom_line() + scale_y_continuous("Y") + scale_x_continuous("X1") +
        scale_fill_discrete("X2", breaks = mean(data$cx2) + sd(data$cx2) *
            c(-2, -1, 0, 1, 2), labels = lapply(-2:2, function(i) bquote(.(i) *
            sigma))) + scale_color_discrete("X2", breaks = mean(data$cx2) +
        sd(data$cx2) * c(-2, -1, 0, 1, 2), labels = lapply(-2:2,
        function(i) bquote(.(i) * sigma))) + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    # Alternative presentation: facet the cx1 trends by the level of cx2
    fit$ccx2 <- factor(fit$cx2, labels = paste("X2:~", c(-2, -1,
        0, 1, 2), "*sigma"))
    partial.obs$ccx2 <- factor(partial.obs$cx2, labels = paste("X2:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    p2 <- ggplot(fit, aes(y = fit, x = x1)) + geom_point(data = partial.obs) +
        geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, color = NA) +
        geom_line() + scale_y_continuous("Y") + scale_x_continuous("X1") +
        facet_grid(~ccx2, labeller = label_parsed) + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"), strip.background = element_blank())
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 1)
    
    plot of chunk tut7.3aS8.2
    # for cx1. Generate a sequence of cx1 values to predict and
    # set cx2 to the mean and 1 and 2 standard deviations above and below it
    newdata <- expand.grid(cx1 = seq(min(data$cx1), max(data$cx1),
        len = 100), cx2 = mean(data$cx2) + sd(data$cx2) * c(-2, -1,
        0, 1, 2))
    Xmat = model.matrix(~cx1 * cx2, data = newdata)
    coefs = coef(data.mult.lm)
    fit = as.vector(coefs %*% t(Xmat))
    se = sqrt(diag(Xmat %*% vcov(data.mult.lm) %*% t(Xmat)))
    q = qt(0.975, df = df.residual(data.mult.lm))
    fit = data.frame(fit = fit, lwr = fit - q * se, upr = fit + q *
        se) %>% mutate(cx2 = newdata$cx2, x1 = newdata$cx1 + mean(data$x1),
        x2 = newdata$cx2 + mean(data$x2))
    ## partial residuals
    newdata <- expand.grid(cx1 = data$cx1, cx2 = mean(data$cx2) +
        sd(data$cx2) * c(-2, -1, 0, 1, 2))
    Xmat = model.matrix(~cx1 * cx2, data = newdata)
    coefs = coef(data.mult.lm)
    presid = as.vector(coefs %*% t(Xmat))
    partial.obs = data.frame(presid = presid + rep(resid(data.mult.lm),
        5)) %>% mutate(cx2 = newdata$cx2, x1 = newdata$cx1 + mean(data$x1),
        x2 = newdata$cx2 + mean(data$x2))
    
    library(ggplot2)
    p1 <- ggplot(fit, aes(y = fit, x = x1, group = factor(cx2), fill = factor(cx2),
        color = factor(cx2))) + geom_point(data = partial.obs, aes(y = presid),
        alpha = 0.2) + geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2,
        color = NA) + geom_line() + scale_y_continuous("Y") + scale_x_continuous("X1") +
        scale_fill_discrete("X2", breaks = mean(data$cx2) + sd(data$cx2) *
            c(-2, -1, 0, 1, 2), labels = lapply(-2:2, function(i) bquote(.(i) *
            sigma))) + scale_color_discrete("X2", breaks = mean(data$cx2) +
        sd(data$cx2) * c(-2, -1, 0, 1, 2), labels = lapply(-2:2,
        function(i) bquote(.(i) * sigma))) + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    # Alternative presentation: facet the cx1 trends by the level of cx2
    fit$ccx2 <- factor(fit$cx2, labels = paste("X2:~", c(-2, -1,
        0, 1, 2), "*sigma"))
    partial.obs$ccx2 <- factor(partial.obs$cx2, labels = paste("X2:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    p2 <- ggplot(fit, aes(y = fit, x = x1)) + geom_point(data = partial.obs,
        aes(y = presid)) + geom_ribbon(aes(ymin = lwr, ymax = upr),
        alpha = 0.2, color = NA) + geom_line() + scale_y_continuous("Y") +
        scale_x_continuous("X1") + facet_grid(~ccx2, labeller = label_parsed) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    library(gridExtra)
    grid.arrange(p1, p2, ncol = 1)
    
    plot of chunk tut7.3aS8.2c

Model Selection

Not all the predictor variables in a multiple linear model necessarily contribute substantially to explaining variation in the response variable. Those that do not are unlikely to have much biological impact on the response and therefore could be omitted from the final regression equation (along with all the other unmeasured variables).

Furthermore, we may wish to determine which of a range of linear and non-linear models best fits the collected data. For the purpose of explaining a response variable, the 'best' regression model is arguably the model that contains only a subset of important predictor variables and is therefore the model that explains the greatest amount of response variability with the fewest predictor terms (parsimony). Typically, the 'best' model will contain the fewest predictor variables, as greater numbers of extraneous predictor variables increase the model complexity and sources of uncertainty and thus decrease the precision of resulting predictions.

There are several criteria that can be used to assess the efficiency or fit of a model that are penalized by the number of predictor terms. These criteria are calculated and compared for a set of competing models thereby providing an objective basis on which to select the 'best' regression model.

$MS_{residuals}$
represents the mean amount of variation unexplained by the model, and therefore the lowest value indicates the best fit.
$Adjusted~r^2$
(the proportion of variation in the response variable explained by the model, adjusted for both sample size and the number of model terms) is calculated as $adj.~r^2 = 1 - \frac{MS_{residual}}{MS_{total}}$. Larger values indicate better fit. Adjusted $r^2$ and $MS_{residuals}$ should not be used to compare between linear and non-linear models.
$Mallow's~C_p$
is an index resulting from the comparison of the specific model to a model that contains all the possible terms. Models with the lowest value and/or values closest to their respective $p$ (the number of model terms, including the y-intercept) indicate best fit.
$Akaike~Information~Criteria~(AIC)$
there are several different versions of AIC, each of which adds a different constant (designed to penalize according to the number of parameters and sample size) to a likelihood function to produce a relative measure of the information content of a model. Smaller values indicate more parsimonious models. As a rule of thumb, if the difference between two AIC values (delta AIC) is greater than 2, the model with the lower AIC represents a significant improvement in parsimony.
$Schwarz~Bayesian~Information~Criteria~(BIC~or~SIC)$
is outwardly similar to AIC. The constant added to the likelihood function penalizes models with more predictor terms more heavily (and thus selects simpler models) than AIC. It is for this reason that BIC is favored by many workers; however, others argue strongly in favor of AIC, claiming that the theoretical basis for BIC may not be relevant for most biological applications. The original basis for BIC was for situations in which there were either no effects, or else a mixture of major and no effects with no intermediate or tapering effects. Furthermore, it assumes that the true model (against which all others are compared) is among the set being assessed.
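To make these criteria concrete, here is a minimal sketch (assuming the centered additive and multiplicative models, data.add.lm and data.mult.lm, fitted earlier in this tutorial are still in the workspace) that tabulates adjusted $r^2$, AIC, AICc and BIC for both models. Larger adjusted $r^2$ and smaller AIC/AICc/BIC indicate the more parsimonious fit.

library(MuMIn)
## collect the candidate models fitted earlier
cand <- list(additive = data.add.lm, multiplicative = data.mult.lm)
## tabulate several penalized measures of fit for each candidate
data.frame(adj.r2 = sapply(cand, function(m) summary(m)$adj.r.squared),
    AIC = sapply(cand, AIC), AICc = sapply(cand, AICc),
    BIC = sapply(cand, BIC))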
Traditionally, the set of competing linear models was generated by stepwise procedures in which terms were progressively added to or dropped from a model on the basis of importance (as assessed via p-values of partial slopes). Whilst such procedures reduce the number of models that are assessed and compared (it is for the associated reductions in computational intensity that such procedures were originally developed), it is possible that the 'best' model is never assessed. Modern computing now allows all combinations to be assessed rapidly, thereby obviating the need for such selection procedures.

We will use the dredge() function within the MuMIn package to fit all combinations of models and then rank all the models on the basis of AICc. AICc is AIC corrected for small sample sizes. Note, for more recent versions of MuMIn, it is necessary that the model be fit with na.action=na.fail.

library(MuMIn)
data.mult.lm <- lm(y ~ cx1 * cx2, data, na.action = na.fail)
dredge(data.mult.lm, rank = "AICc", trace = TRUE)
0 : lm(formula = y ~ 1, data = data, na.action = na.fail)
1 : lm(formula = y ~ cx1 + 1, data = data, na.action = na.fail)
2 : lm(formula = y ~ cx2 + 1, data = data, na.action = na.fail)
3 : lm(formula = y ~ cx1 + cx2 + 1, data = data, na.action = na.fail)
7 : lm(formula = y ~ cx1 + cx2 + cx1:cx2 + 1, data = data, na.action = na.fail)
Global model call: lm(formula = y ~ cx1 * cx2, data = data, na.action = na.fail)
---
Model selection table 
  (Int)   cx1   cx2 cx1:cx2 df   logLik  AICc delta weight
8 3.671 2.929 1.344   2.665  5 -150.334 311.3  0.00  0.779
4 3.823 3.025 1.388          4 -152.719 313.9  2.55  0.217
2 3.823 4.003                3 -157.863 322.0 10.67  0.004
3 3.823       2.958          3 -168.803 343.9 32.55  0.000
1 3.823                      2 -191.124 386.4 75.07  0.000
Models ranked by AICc(x) 

The model with the lowest (=best) AICc was the full model (the model containing $cx1$ and $cx2$ as well as their interaction). The next best model (comprising the additive effects of $cx1$ and $cx2$) is more than 2 AICc units greater than the best model, and thus we would consider the full model the best.

The model selection table also includes delta and weight columns: delta indicates the difference in AICc between each model and the best model, and the weights can be used in model averaging.

Exhaustive model exploration is increasingly criticized as a 'fishing expedition'. In particular, it is argued that, mathematically, some models will inevitably come out as 'statistically' more parsimonious than others. This does not guarantee that these models are ecologically important. Rather, they could be just an artifact of the specific combination of (perhaps slightly unusual) data and an overly simplistic statistical model. Recall that all models are gross oversimplifications.

The more modern approach is to proffer a small set (fewer than about 15) of candidate models that each encapsulate some plausible response-predictor mechanism and to explore (via information criteria) the relative statistical strength of each of these candidates. This helps ensure that the selected models have some ecological grounding. It is an acknowledgement that whilst no model is correct, some might provide useful insights into a complex ecological system.

Model averaging

Typically, there are multiple plausible alternative models that incorporate different combinations of predictor variables and that yield similar degrees of fit (based on AIC, AICc, QAIC, BIC, etc). Each alternative model will result in different parameter estimates for the predictor variables. Furthermore, conclusions about the relative importance of each of the predictor variables are likely to depend on which model is selected.

Model averaging is a technique that calculates weighted averages of the parameter estimates for each predictor variable across a range of possible models (usually models yielding deltas within a certain range - typically 4 units of AIC). In so doing, model selection uncertainty can be incorporated into estimates of parameter precision. Furthermore, through model averaging, we are able to obtain an estimate of the relative importance of each of the predictor variables on the response.

library(MuMIn)
data.mult.lm <- lm(y ~ cx1 * cx2, data = data, na.action = na.fail)
ma <- model.avg(get.models(dredge(data.mult.lm, rank = "AICc"), subset = delta <= 4))
ma
Call:
model.avg(object = get.models(dredge(data.mult.lm, rank = "AICc"), 
    subset = delta <= 4))

Component models: 
'123' '12' 

Coefficients: 
       (Intercept)      cx1      cx2  cx1:cx2
full      3.704258 2.950116 1.353916 2.083722
subset    3.704258 2.950116 1.353916 2.665059
importance(ma)
                     cx1  cx2  cx1:cx2
Sum of weights:      1.00 1.00 0.78   
N containing models:    2    2    1   
summary(ma)
Call:
model.avg(object = get.models(dredge(data.mult.lm, rank = "AICc"), 
    subset = delta <= 4))

Component model call: 
lm(formula = y ~ <2 unique rhs>, data = data, na.action = na.fail)

Component models: 
    df  logLik   AICc delta weight
123  5 -150.33 311.31  0.00   0.78
12   4 -152.72 313.86  2.55   0.22

Term codes: 
    cx1     cx2 cx1:cx2 
      1       2       3 

Model-averaged coefficients:  
(full average) 
            Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)   3.7043     0.1424      0.1438  25.756  < 2e-16 ***
cx1           2.9501     0.4946      0.5008   5.890  < 2e-16 ***
cx2           1.3539     0.4227      0.4280   3.163  0.00156 ** 
cx1:cx2       2.0837     1.5477      1.5575   1.338  0.18093    
 
(conditional average) 
            Estimate Std. Error Adjusted SE z value Pr(>|z|)    
(Intercept)   3.7043     0.1424      0.1438  25.756  < 2e-16 ***
cx1           2.9501     0.4946      0.5008   5.890  < 2e-16 ***
cx2           1.3539     0.4227      0.4280   3.163  0.00156 ** 
cx1:cx2       2.6651     1.2305      1.2462   2.138  0.03248 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

With a delta of $<=4$, only the first two models are included for this fabricated example. The above output produces two alternative model averaging outcomes:

  • full average in which averages are generated assuming that the term is present in all models. When it is not present, it is given an estimate of 0. Hence, in the above example, the model-averaged estimate for the interaction (cx1:cx2) is weighted.mean(c(2.665, 0), w = c(0.779, 0.217)) - see the check after this list. The full average is considered a form of shrinkage estimator.
  • conditional average in which averages for a model term are only calculated from the models in which it is present. Conditional averages are thought to bias the estimates away from zero.
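As a rough check on this distinction, the two averaged estimates of the interaction can be approximately reproduced by hand from the component-model estimates and weights printed above (values as rounded in the output, so expect small discrepancies):

## full average: the model lacking the interaction contributes an estimate of 0
weighted.mean(c(2.665059, 0), w = c(0.779, 0.217))
## conditional average: only the model containing the interaction is used
weighted.mean(2.665059, w = 0.779)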

Candidate models

The above approaches of naively exploring all possible combinations of models have been criticized as 'fishing' expeditions. It is argued that the very nature of frequentist statistics means that with enough tests, at least some combinations of predictors will appear to fit the data well (or even significantly). As this is just an inevitable artifact of the large number of possible model combinations, the entire exercise is arguably fundamentally flawed.

Opponents argue that a better approach is to propose a set of up to 10-15 a priori candidate models each of which represents a sensible ecological interpretation. These and only these models are compared. Furthermore, rather than necessarily isolating a single 'best' model, this approach advocates potentially discussing the 'best' couple or few models - particularly if they are distinct from one another in their predictor constituents.
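As a minimal sketch of this candidate-set approach (reusing the fabricated data and centered predictors cx1 and cx2 from earlier, with each formula standing in for what would ordinarily be a distinct a priori ecological hypothesis), a small named list of candidate models can be compared directly rather than dredged:

library(MuMIn)
## each model represents a separate a priori hypothesis
candidate.models <- list(
    x1.only = lm(y ~ cx1, data = data, na.action = na.fail),
    x2.only = lm(y ~ cx2, data = data, na.action = na.fail),
    additive = lm(y ~ cx1 + cx2, data = data, na.action = na.fail),
    multiplicative = lm(y ~ cx1 * cx2, data = data, na.action = na.fail))
## rank only these candidates by AICc
model.sel(candidate.models, rank = "AICc")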




Worked Examples

Multiple and non-linear regression references
  • Logan (2010) - Chpt 9
  • Quinn & Keough (2002) - Chpt 6

Multiple Linear Regression

Paruelo & Lauenroth (1996) analyzed the geographic distribution and the effects of climate variables on the relative abundance of a number of plant functional types (PFTs) including shrubs, forbs, succulents (e.g. cacti), C3 grasses and C4 grasses. They used data from 73 sites across temperate central North America (see pareulo.syd) and calculated the relative abundance of C3 grasses at each site as the response variable.

Download Paruelo data set
Format of paruelo.csv data file
C3    LAT    LONG    MAP    MAT    JJAMAP    DJFMAP
...   ...    ...     ...    ...    ...       ...

C3      Relative abundance of C3 grasses at each site - response variable
LAT     Latitude in centesimal degrees - predictor variable
LONG    Longitude in centesimal degrees - predictor variable
MAP     Mean annual precipitation (mm) - predictor variable
MAT     Mean annual temperature (°C) - predictor variable
JJAMAP  Proportion of MAP that fell in June, July and August - predictor variable
DJFMAP  Proportion of MAP that fell in December, January and February - predictor variable

Open
the paruelo data file. HINT.
Show code
paruelo <- read.table("../downloads/data/paruelo.csv", header = T, sep = ",", strip.white = T)
head(paruelo)
    C3   LAT   LONG MAP  MAT JJAMAP DJFMAP
1 0.65 46.40 119.55 199 12.4   0.12   0.45
2 0.65 47.32 114.27 469  7.5   0.24   0.29
3 0.76 45.78 110.78 536  7.2   0.24   0.20
4 0.75 43.95 101.87 476  8.2   0.35   0.15
5 0.33 46.90 102.82 484  4.8   0.40   0.14
6 0.03 38.87  99.38 623 12.0   0.40   0.11
  1. In the table below, list the assumptions of multiple linear regression along with how violations of each assumption are diagnosed and/or the risks of violations are minimized.
    Assumption    Diagnostic/Risk Minimization
    I.
    II.
    III.
    IV.
    V.


    Q1-2. Construct a scatterplot matrix
    to investigate these assumptions (HINT)
    Show scatterplotMatrix (car) code
    library(car)
    scatterplotMatrix(~C3 + LAT + LONG + MAP + MAT + JJAMAP + DJFMAP,
        data = paruelo, diagonal = list(method = "boxplot"))
    
    plot of chunk ws7_3a-Q1-2
    Show splom (lattice) code
    library(lattice)
    splom.lat <- splom(paruelo, type = c("p", "r"))
    print(splom.lat)
    
    plot of chunk ws7_3a-Q1-2a
    Show ggpairs (GGally) code
    library(GGally)
    ggpairs(paruelo, lower = list(continuous = "smooth"), diag = list(continuous = "density"),
        axisLabels = "none")
    
    plot of chunk ws7_3a-Q1-2b
    1. Focussing on the assumptions of Normality, Homogeneity of Variance and Linearity, is there evidence of violations of these assumptions (y or n)?
    2. C3 abundance is clearly non-normal. Since C3 abundance is relative abundance (which logically must range from 0 to 1), arguably the most appropriate approach would be to model these data with a binomial (or perhaps beta) distribution. Indeed, this is the approach that we will take in Tutorial 10.4 and Tutorial 10.5a. A simpler approach that can be applied within simple OLS regression is to attempt to normalize the response variable via a scale transformation.

      Since the C3 relative abundances have values of zero, the authors elected to perform a square-root transformation. Generally speaking, this can be a very dangerous course of action if back-transformations from the fitted model are required, because squaring a mixture of negative and positive fitted values (or values either side of 1) can produce misleading back-transformed estimates.

      This example therefore potentially serves as a good illustration of the dangers of root transformations. Try applying a temporary square root transformation (HINT). Does this improve some of these specific assumptions (y or n)?

      Whilst many model fitting and graphing routines are able to perform transformations inline, for more complex examples it is often advisable to also create transformed versions of the variables.
      Show scatterplotMatrix (car) code
      library(car)
      scatterplotMatrix(~sqrt(C3)+LAT+LONG+MAP+MAT+JJAMAP+log10(DJFMAP), data=paruelo, diagonal=list(method='boxplot'))
      
      plot of chunk Q1-2a
      Show ggpairs (GGally) code
      library(GGally)
      paruelo = paruelo %>% mutate(sqrtC3=sqrt(C3), lDJFMAP=log10(DJFMAP))
      ggpairs(subset(paruelo, select=c('sqrtC3','LAT','LONG','MAP','MAT','lDJFMAP')),lower=list(continuous='smooth'),
      diag=list(continuous="density"), axisLabels="none")
      
      plot of chunk Q1-2ab
    3. Is there any evidence that the assumption of Collinearity is likely to be violated (y or n)?
    4. (Multi)collinearity
      occurs when one or more of the predictor variables are correlated with the other predictor variables; therefore, this assumption can also be examined by calculating the pairwise correlation coefficients between all the predictor variables (HINT). Which predictor variables are highly correlated?
      Show code
      cor(paruelo[, -1])
      
                      LAT        LONG          MAP          MAT      JJAMAP       DJFMAP       sqrtC3
      LAT      1.00000000  0.09655281 -0.246505816 -0.838590413  0.07417497 -0.065124848  0.685561202
      LONG     0.09655281  1.00000000 -0.733687027 -0.213109100 -0.49155774  0.770743994 -0.023618820
      MAP     -0.24650582 -0.73368703  1.000000000  0.355090766  0.11225905 -0.404512409  0.001697706
      MAT     -0.83859041 -0.21310910  0.355090766  1.000000000 -0.08077131  0.001478037 -0.536007103
      JJAMAP   0.07417497 -0.49155774  0.112259049 -0.080771307  1.00000000 -0.791540381 -0.053650973
      DJFMAP  -0.06512485  0.77074399 -0.404512409  0.001478037 -0.79154038  1.000000000 -0.090151384
      sqrtC3   0.68556120 -0.02361882  0.001697706 -0.536007103 -0.05365097 -0.090151384  1.000000000
      lDJFMAP -0.07720843  0.76030635 -0.399637550 -0.003065519 -0.79235477  0.984628714 -0.071967657
                   lDJFMAP
      LAT     -0.077208433
      LONG     0.760306349
      MAP     -0.399637550
      MAT     -0.003065519
      JJAMAP  -0.792354765
      DJFMAP   0.984628714
      sqrtC3  -0.071967657
      lDJFMAP  1.000000000
      

  2. Whilst the above diagnostics are useful at identifying potential (multi)collinearity issues, they do not examine collinearity directly. (Multi)collinearity can be diagnosed more directly via tolerance and variance inflation factor (VIF) measures.

    1. Calculate the VIF values for each of the predictor variables (HINT).
    2. Calculate the tolerance values for each of the predictor variables (HINT).
      Show code
      library(car)
      vif(lm(sqrt(paruelo$C3) ~ LAT + LONG + MAP + MAT + JJAMAP + log10(DJFMAP), data = paruelo))
      
                LAT          LONG           MAP           MAT        JJAMAP log10(DJFMAP) 
           3.560103      4.988318      2.794157      3.752353      3.194724      5.467330 
      
      library(car)
      1/vif(lm(sqrt(paruelo$C3) ~ LAT + LONG + MAP + MAT + JJAMAP + log10(DJFMAP), data = paruelo))
      
                LAT          LONG           MAP           MAT        JJAMAP log10(DJFMAP) 
          0.2808908     0.2004684     0.3578897     0.2664995     0.3130161     0.1829046 
      
      Predictor    VIF    Tolerance
      LAT
      LONG
      MAP
      MAT
      JJAMAP
      log10(DJFMAP)
    3. Is there any evidence of (multi)collinearity, and if so, which variables are responsible for violations of this assumption?
  3. We obviously cannot easily incorporate all 6 predictors into the one model, because of the collinearity problem. Paruelo and Lauenroth (1996) separated the predictors into two groups for their analyses. One group included LAT and LONG and the other included MAP, MAT, JJAMAP and DJFMAP. We will focus on the relationship between the square root relative abundance of C3 plants and latitude and longitude. This relationship will investigate the geographic pattern in abundance of C3 plants.

  4. Just like Paruelo and Lauenroth (1996), we will fit the multiplicative model for LAT and LONG.
    1. Write out the full multiplicative model
    2. Check the assumptions of this linear model. In particular, check collinearity. HINT
      Show scatterplotMatrix (car) code
      # via car's scatterplotMatrix function
      library(car)
      scatterplotMatrix(~sqrt(C3) + LAT + LONG, data = paruelo, diagonal = list(method = "boxplot"))
      
      plot of chunk Q1-4b
      vif(lm(sqrt(C3) ~ LAT + LONG, data = paruelo))
      
          LAT    LONG 
      1.00941 1.00941 
      
      Show ggpairs (GGally) code
      ggpairs(subset(paruelo, select = c("sqrtC3", "LAT", "LONG")),
          lower = list(continuous = "smooth"), diag = list(continuous = "density"),
          axisLabels = "none")
      
      plot of chunk Q1-4bb
      vif(lm(sqrt(C3) ~ LAT + LONG, data = paruelo))
      
          LAT    LONG 
      1.00941 1.00941 
      
    3. Obviously, this model is likely to violate the collinearity assumption: LAT and LONG will each be correlated with the LAT:LONG interaction term. It turns out that if we center the variables, the individual terms will no longer be correlated with the interaction (a quick check of this follows the code below).

      We can either generate separate centered versions of the LAT and LONG variables ( HINT) and (HINT) or, we can perform this operation inline when fitting the model.

      The advantage of centering the variables inline is that the model will be aware of the scaling procedure and will capture the information necessary to perform the back-transformations. On the downside, this additional information can cause problems for some routines. Consequently, we will generate separate centered versions, yet hope not to have to use them.

      Show code
      paruelo$cLAT <- as.vector(scale(paruelo$LAT, scale = F))
      paruelo$cLONG <- as.vector(scale(paruelo$LONG, scale = F))
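      The sketch below (an illustrative check rather than part of the original analysis) confirms that centering removes, or at least greatly reduces, the correlation between each predictor and the product term used to form the interaction.

      ## correlations among the raw predictors and their product
      with(paruelo, cor(cbind(LAT, LONG, LAT * LONG)))
      ## the same for the centered predictors - the correlations involving
      ## the product term should now be close to zero
      with(paruelo, cor(cbind(cLAT, cLONG, cLAT * cLONG)))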
      
  5. Fit a linear multiplicative model on the centered LAT and LONG (HINT)
    Show code
    paruelo.lm <- lm(sqrt(C3) ~ scale(LAT, scale = FALSE) * scale(LONG,
        scale = FALSE), data = paruelo)
    paruelo.lm1 <- lm(sqrt(C3) ~ cLAT * cLONG, data = paruelo)
    # library(sjmisc) paruelo.lm1 <- lm(sqrt(C3) ~
    # center(LAT)*center(LONG), data=paruelo)
    
    1. Examine the diagnostic plots (HINT), especially the residual plot, to confirm there is no further evidence of violations of the analysis assumptions.
      Show base code
      # setup a 2x6 plotting device with small margins
      par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
      plot(paruelo.lm, ask = F, which = 1:6)
      
      plot of chunk Q1-5a
      influence.measures(paruelo.lm)
      
      Influence measures of
       lm(formula = sqrt(C3) ~ scale(LAT, scale = FALSE) * scale(LONG,      scale = FALSE), data = paruelo) :
      
           dfb.1_ dfb.s.LAs.F dfb.s.LOs.F dfb.ss.Fs.F    dffit cov.r   cook.d    hat inf
      1  -0.02402   -0.083059   -0.084079   -0.121467 -0.15232 1.379 5.88e-03 0.2347   *
      2  -0.02221   -0.064489   -0.040666   -0.073805 -0.09560 1.227 2.32e-03 0.1389   *
      3   0.08130    0.134833    0.066766    0.117108  0.18920 1.083 9.00e-03 0.0556    
      4   0.18842    0.102837   -0.149242   -0.074613  0.27640 0.957 1.87e-02 0.0317    
      5  -0.06669   -0.070090    0.046655    0.023640 -0.11683 1.091 3.45e-03 0.0448    
      6  -0.14663    0.020076    0.158421    0.006884 -0.21885 1.001 1.19e-02 0.0305    
      7  -0.06266    0.095315   -0.006186    0.047476 -0.11206 1.102 3.17e-03 0.0510    
      8  -0.14641    0.034868    0.221772   -0.071379 -0.30117 1.016 2.25e-02 0.0520    
      9  -0.03667    0.026450    0.024624   -0.005908 -0.05703 1.088 8.24e-04 0.0317    
      10 -0.14355   -0.014721    0.043712    0.016311 -0.15090 0.989 5.65e-03 0.0153    
      11 -0.06802   -0.077774    0.052091    0.030960 -0.12872 1.101 4.19e-03 0.0534    
      12 -0.09834    0.126501    0.005986    0.046997 -0.15816 1.064 6.29e-03 0.0388    
      13  0.11115   -0.093743   -0.111372    0.092157  0.24977 1.061 1.56e-02 0.0580    
      14  0.08474    0.046585   -0.103485   -0.077405  0.16432 1.104 6.81e-03 0.0622    
      15 -0.10466    0.005614    0.159085    0.003719 -0.19200 1.063 9.25e-03 0.0460    
      16  0.02997    0.006031   -0.022676   -0.007923  0.03832 1.082 3.72e-04 0.0234    
      17 -0.22231   -0.242042   -0.256005   -0.259471 -0.46135 0.867 5.07e-02 0.0464    
      18 -0.15718   -0.195191   -0.167613   -0.197664 -0.33555 0.980 2.77e-02 0.0483    
      19 -0.03827    0.067695    0.013203   -0.004632 -0.08826 1.133 1.97e-03 0.0703    
      20  0.14881    0.082543   -0.079749   -0.027279  0.19334 0.993 9.27e-03 0.0238    
      21  0.02694   -0.052154    0.013621   -0.042509  0.06315 1.184 1.01e-03 0.1064   *
      22  0.01333    0.036162    0.002225    0.019532  0.03980 1.165 4.02e-04 0.0911    
      23  0.03874   -0.041008   -0.026165    0.016163  0.07567 1.106 1.45e-03 0.0475    
      24  0.18953   -0.068780   -0.263543    0.120259  0.39649 0.949 3.83e-02 0.0522    
      25 -0.30655   -0.141051   -0.348442   -0.174237 -0.50904 0.721 5.92e-02 0.0333   *
      26 -0.01163    0.020739   -0.004288    0.014771 -0.02450 1.151 1.52e-04 0.0797    
      27 -0.06640   -0.035776    0.047618    0.021757 -0.09331 1.073 2.20e-03 0.0287    
      28  0.15820   -0.100195    0.156872   -0.087023  0.24687 1.004 1.51e-02 0.0371    
      29  0.23271   -0.199428    0.178733   -0.171942  0.36130 0.913 3.16e-02 0.0383    
      30 -0.05886   -0.051722    0.023280    0.001329 -0.08413 1.075 1.79e-03 0.0278    
      31  0.18206   -0.039000    0.294547   -0.009475  0.34876 0.977 2.98e-02 0.0503    
      32  0.04158    0.010092    0.004277    0.002215  0.04358 1.068 4.81e-04 0.0147    
      33 -0.11614    0.057069   -0.084317    0.045564 -0.15421 1.033 5.95e-03 0.0260    
      34  0.17140   -0.010972    0.149470    0.000787  0.22867 0.961 1.29e-02 0.0241    
      35  0.16157   -0.063790    0.207065   -0.047488  0.27074 0.999 1.81e-02 0.0405    
      36 -0.16599    0.051605    0.046986    0.028324 -0.17882 0.964 7.89e-03 0.0163    
      37 -0.09520    0.123426   -0.073860    0.112439 -0.18180 1.101 8.32e-03 0.0642    
      38 -0.12248    0.136168    0.016438    0.044325 -0.18258 1.034 8.33e-03 0.0325    
      39  0.21589    0.005332    0.135038    0.008527  0.25689 0.889 1.59e-02 0.0190    
      40 -0.15732   -0.003358   -0.079732   -0.002338 -0.17776 0.972 7.81e-03 0.0172    
      41 -0.08534    0.010451   -0.044041    0.007825 -0.09659 1.047 2.35e-03 0.0177    
      42 -0.03339    0.004442   -0.028421    0.002067 -0.04413 1.081 4.93e-04 0.0240    
      43  0.11153   -0.008546    0.093501   -0.001341  0.14632 1.030 5.36e-03 0.0234    
      44  0.06621    0.080305   -0.003529    0.027570  0.10631 1.074 2.85e-03 0.0321    
      45  0.14322    0.175378   -0.007707    0.060327  0.23128 0.999 1.33e-02 0.0324    
      46  0.00242    0.002970   -0.000135    0.001018  0.00392 1.096 3.89e-06 0.0325    
      47  0.04762    0.057578   -0.003518    0.018818  0.07632 1.084 1.47e-03 0.0321    
      48  0.09643    0.105320   -0.013197    0.027624  0.14620 1.048 5.37e-03 0.0294    
      49  0.21782    0.063120   -0.324557   -0.197238  0.43369 0.971 4.59e-02 0.0654    
      50 -0.00148    0.001425    0.002076   -0.003405 -0.00566 1.218 8.12e-06 0.1296   *
      51 -0.00449    0.005425    0.006061   -0.012462 -0.01985 1.262 1.00e-04 0.1598   *
      52  0.02355   -0.011871   -0.037630    0.035697  0.06961 1.160 1.23e-03 0.0885    
      53 -0.00616    0.005595    0.008294   -0.011850 -0.02099 1.190 1.12e-04 0.1093   *
      54  0.11316   -0.020147    0.059511   -0.015058  0.12914 1.024 4.18e-03 0.0181    
      55  0.08401   -0.006408    0.050234   -0.003848  0.09836 1.049 2.44e-03 0.0188    
      56  0.01263    0.000401    0.008831    0.000749  0.01556 1.081 6.14e-05 0.0203    
      57  0.23558    0.012600    0.162103    0.017720  0.28935 0.857 2.00e-02 0.0201    
      58  0.15162    0.005056    0.085545    0.005265  0.17570 0.979 7.64e-03 0.0181    
      59 -0.10917    0.059550   -0.109799    0.050149 -0.16684 1.050 6.98e-03 0.0349    
      60 -0.01928    0.010518   -0.019393    0.008857 -0.02947 1.097 2.20e-04 0.0349    
      61  0.07137   -0.085274   -0.078631    0.125075  0.23769 1.156 1.42e-02 0.1077    
      62 -0.05053    0.001789   -0.075856   -0.007046 -0.09174 1.096 2.13e-03 0.0434    
      63  0.00374    0.000609   -0.000752   -0.000272  0.00388 1.076 3.83e-06 0.0148    
      64 -0.03243   -0.005963    0.008964    0.002938 -0.03431 1.072 2.98e-04 0.0155    
      65 -0.17360   -0.059882    0.041757    0.006784 -0.19020 0.951 8.89e-03 0.0164    
      66 -0.21254   -0.161201    0.312816    0.383538 -0.59693 1.136 8.80e-02 0.1617    
      67  0.05944    0.032095   -0.092820   -0.099439  0.15603 1.217 6.16e-03 0.1367   *
      68 -0.03086    0.033212   -0.039691    0.034038 -0.06400 1.142 1.04e-03 0.0743    
      69 -0.08462    0.114424   -0.120547    0.127015 -0.20622 1.172 1.07e-02 0.1129    
      70 -0.12444    0.139820   -0.161230    0.144966 -0.26402 1.097 1.75e-02 0.0789    
      71 -0.09511    0.010274    0.128949   -0.001656 -0.16322 1.063 6.70e-03 0.0398    
      72  0.01563    0.003182   -0.030673   -0.027216  0.04310 1.251 4.71e-04 0.1534   *
      73 -0.06408   -0.180207    0.005698   -0.072658 -0.19407 1.154 9.51e-03 0.0995    
      
      Show autoplot code
      autoplot(paruelo.lm, which = 1:6)
      
      plot of chunk Q1-5ab
    2. All the diagnostics seem reasonable and present no indications that the model fitting and resulting hypothesis tests are likely to be unreliable. Complete the following table (HINT).
      Show code
      summary(paruelo.lm)
      
      Call:
      lm(formula = sqrt(C3) ~ scale(LAT, scale = FALSE) * scale(LONG, 
          scale = FALSE), data = paruelo)
      
      Residuals:
           Min       1Q   Median       3Q      Max 
      -0.51312 -0.13427 -0.01134  0.14086  0.38940 
      
      Coefficients:
                                                             Estimate Std. Error t value Pr(>|t|)    
      (Intercept)                                           0.4282658  0.0234347  18.275  < 2e-16 ***
      scale(LAT, scale = FALSE)                             0.0436937  0.0048670   8.977 3.28e-13 ***
      scale(LONG, scale = FALSE)                           -0.0028773  0.0036842  -0.781   0.4375    
      scale(LAT, scale = FALSE):scale(LONG, scale = FALSE)  0.0022824  0.0007471   3.055   0.0032 ** 
      ---
      Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
      
      Residual standard error: 0.1991 on 69 degrees of freedom
      Multiple R-squared:  0.5403,	Adjusted R-squared:  0.5203 
      F-statistic: 27.03 on 3 and 69 DF,  p-value: 1.128e-11
      
      confint(paruelo.lm)
      
                                                                   2.5 %      97.5 %
      (Intercept)                                           0.3815148539 0.475016788
      scale(LAT, scale = FALSE)                             0.0339842271 0.053403155
      scale(LONG, scale = FALSE)                           -0.0102270077 0.004472489
      scale(LAT, scale = FALSE):scale(LONG, scale = FALSE)  0.0007920205 0.003772861
      
      tidy(paruelo.lm)
      
      # A tibble: 4 x 5
        term                                                 estimate std.error statistic  p.value
        <chr>                                                   <dbl>     <dbl>     <dbl>    <dbl>
      1 (Intercept)                                           0.428    0.0234      18.3   3.79e-28
      2 scale(LAT, scale = FALSE)                             0.0437   0.00487      8.98  3.28e-13
      3 scale(LONG, scale = FALSE)                           -0.00288  0.00368     -0.781 4.37e- 1
      4 scale(LAT, scale = FALSE):scale(LONG, scale = FALSE)  0.00228  0.000747     3.06  3.20e- 3
      
      Coefficient    Estimate    t-value    P-value
      Intercept
      cLAT
      cLONG
      cLAT:cLONG
  6. Examine the partial regression plots of LAT and LONG (HINT).
    Show added variable plots (car) code
    # via the car package
    avPlots(paruelo.lm, ask = F, indentify.points = F)
    
    plot of chunk Q1-6
    avPlots(paruelo.lm1, ask = F, indentify.points = F)
    
    plot of chunk Q1-6
    Show effects (effects) code
    library(effects)
    plot(allEffects(paruelo.lm, xlevels = 5, partial.residuals = TRUE), ask = F)
    
    plot of chunk Q1-6a
    plot(allEffects(paruelo.lm1, xlevels = 5, partial.residuals = TRUE), ask = F)
    
    plot of chunk Q1-6a
    Show ggeffect (ggeffects) code
    library(ggeffects)
    ## Unfortunately, ggeffect is one of the functions that is unable to properly deal with inline
    ## centering, hence the need for fitting the model on pre-centered variables.  It does however
    ## mean that our axes are presented on the centered scale...
    ggeffect(paruelo.lm1, ~cLAT | cLONG) %>% plot(facet = TRUE)
    
    plot of chunk Q1-6b
    Show ggpredict (ggeffects) code
    library(ggeffects)
    ## Unfortunately, ggpredict is one of the functions that is unable to properly deal with inline
    ## centering, hence the need for fitting the model on pre-centered variables.  It does however
    ## mean that our axes are presented on the centered scale...
    ggpredict(paruelo.lm1, ~cLAT | cLONG) %>% plot(facet = TRUE)
    
    plot of chunk Q1-6c
    Show ggemmeans (ggeffects) code
    library(ggeffects)
    ggemmeans(paruelo.lm, terms = c("LAT [29:53]", "LONG [90:120 by=7]")) %>% plot(facet = TRUE)
    
    plot of chunk Q1-6d
    ## OR
    ggemmeans(paruelo.lm1, terms = c("cLAT", "cLONG [n=2]")) %>% plot(facet = TRUE)
    
    plot of chunk Q1-6d

    There is clearly an interaction between LAT and LONG. This indicates that the degree to which latitude affects the relative abundance of C3 plants depends on longitude and vice versa. The effects plots expressed for Latitude suggest that the slope of the relationship between relative C3 abundance and Latitude increases with increasing Longitude. The effects plots expressed for Longitude suggest that the relationship between relative C3 abundance and Longitude switches from negative at low Latitudes to positive at high Latitudes.

  7. To investigate these patterns further, we will examine the simple effects of latitude at a specific range of longitudes. The levels of longitude that we will use are the mean longitude value as well as the mean plus or minus 1 SD and plus or minus 2 SD.
    1. Calculate the five levels of longitude on which the simple effects of latitude will be investigated. (HINT, HINT, etc).
      Show code
      # 2 standard deviations below the mean longitude
      LongM2 <- mean(paruelo$cLONG) - 2 * sd(paruelo$cLONG)
      # 1 standard deviation below the mean longitude
      LongM1 <- mean(paruelo$cLONG) - 1 * sd(paruelo$cLONG)
      # mean longitude
      LongM <- mean(paruelo$cLONG)
      # 1 standard deviation above the mean longitude
      LongP1 <- mean(paruelo$cLONG) + 1 * sd(paruelo$cLONG)
      # 2 standard deviations above the mean longitude
      LongP2 <- mean(paruelo$cLONG) + 2 * sd(paruelo$cLONG)
      
  8. Investigate the simple effects of latitude on the relative abundance of C3 plants for each of these levels of longitude (HINT, HINT, etc).
    Show emtrends code
    emtrends(paruelo.lm1, ~cLAT | cLONG, at = list(cLONG = c(LongM2,
        LongM1, LongM, LongP1, LongP2)), var = "cLAT") %>% as.data.frame
    
             cLAT         cLONG cLAT.trend          SE df     lower.CL   upper.CL
    1 3.89291e-16 -1.287083e+01 0.01431678 0.008837028 69 -0.003312612 0.03194616
    2 3.89291e-16 -6.435417e+00 0.02900523 0.005270175 69  0.018491523 0.03951894
    3 3.89291e-16 -6.424270e-15 0.04369369 0.004867032 69  0.033984227 0.05340316
    4 3.89291e-16  6.435417e+00 0.05838215 0.008113745 69  0.042195671 0.07456863
    5 3.89291e-16  1.287083e+01 0.07307061 0.012418103 69  0.048297168 0.09784404
    
    Show manual code
    LatM2 <- summary(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM2), data = paruelo))$coef["cLAT",
        ]
    LatM1 <- summary(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM1), data = paruelo))$coef["cLAT",
        ]
    LatM <- summary(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM), data = paruelo))$coef["cLAT",
        ]
    LatP1 <- summary(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongP1), data = paruelo))$coef["cLAT",
        ]
    LatP2 <- summary(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongP2), data = paruelo))$coef["cLAT",
        ]
    rbind(`Lattitude at Long-2sd` = LatM2, `Lattitude at Long-1sd` = LatM1, `Lattitude at mean Long` = LatM,
        `Lattitude at Long+1sd` = LatP1, `Lattitude at Long+2sd` = LatP2)
    
                             Estimate  Std. Error  t value     Pr(>|t|)
    Lattitude at Long-2sd  0.01431678 0.008837028 1.620090 1.097748e-01
    Lattitude at Long-1sd  0.02900523 0.005270175 5.503657 5.926168e-07
    Lattitude at mean Long 0.04369369 0.004867032 8.977481 3.279384e-13
    Lattitude at Long+1sd  0.05838215 0.008113745 7.195463 5.877401e-10
    Lattitude at Long+2sd  0.07307061 0.012418103 5.884200 1.301574e-07
    
    # or via confidence intervals
    LatM2 <- confint(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM2), data = paruelo))["cLAT", ]
    LatM1 <- confint(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM1), data = paruelo))["cLAT", ]
    LatM <- confint(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongM), data = paruelo))["cLAT", ]
    LatP1 <- confint(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongP1), data = paruelo))["cLAT", ]
    LatP2 <- confint(paruelo.lmM2 <- lm(sqrt(C3) ~ cLAT * c(cLONG - LongP2), data = paruelo))["cLAT", ]
    rbind(`Lattitude at Long-2sd` = LatM2, `Lattitude at Long-1sd` = LatM1, `Lattitude at mean Long` = LatM,
        `Lattitude at Long+1sd` = LatP1, `Lattitude at Long+2sd` = LatP2)
    
                                  2.5 %     97.5 %
    Lattitude at Long-2sd  -0.003312612 0.03194616
    Lattitude at Long-1sd   0.018491523 0.03951894
    Lattitude at mean Long  0.033984227 0.05340316
    Lattitude at Long+1sd   0.042195671 0.07456863
    Lattitude at Long+2sd   0.048297168 0.09784404
    
  9. Finally, let's produce a summary figure (or two).
    Show predict code
    newdata <- expand.grid(LAT = seq(min(paruelo$LAT), max(paruelo$LAT),
        len = 100), LONG = mean(paruelo$LONG) + sd(paruelo$LONG) *
        c(-2, -1, 0, 1, 2))
    fit <- predict(paruelo.lm, newdata = newdata, interval = "confidence")  #^2
    fit <- cbind(newdata, fit)
    head(fit)
    
           LAT    LONG       fit        lwr       upr
    1 29.00000 93.5293 0.3063215 0.09947146 0.5131716
    2 29.23364 93.5293 0.3096665 0.10635971 0.5129732
    3 29.46727 93.5293 0.3130114 0.11322587 0.5127969
    4 29.70091 93.5293 0.3163563 0.12006878 0.5126438
    5 29.93455 93.5293 0.3197012 0.12688716 0.5125153
    6 30.16818 93.5293 0.3230461 0.13367966 0.5124126
    
    # calculate the partial observed values
    partial.obs <- expand.grid(LAT = paruelo$LAT, LONG = mean(paruelo$LONG) +
        sd(paruelo$LONG) * c(-2, -1, 0, 1, 2))
    partial.obs <- cbind(partial.obs, fit = predict(paruelo.lm, newdata = partial.obs))
    partial.obs$fit <- (partial.obs$fit + resid(paruelo.lm))  #^2
    head(partial.obs)
    
        LAT    LONG       fit
    1 46.40 93.5293 0.5071849
    2 47.32 93.5293 0.5243126
    3 45.78 93.5293 0.6979391
    4 43.95 93.5293 0.8168116
    5 46.90 93.5293 0.4570809
    6 38.87 93.5293 0.2065210
    
    fit$lLONG <- factor(fit$LONG, labels = paste("Longitude:~", c(-2,
        -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    library(ggplot2)
    ggplot(fit, aes(y = fit, x = LAT)) + geom_point(data = partial.obs) +
        geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, color = NA) +
        geom_line() + scale_y_continuous(expression(Abundance ~ of ~
        C3 ~ grasses ~ (sqrt(phantom(x))))) + scale_x_continuous(expression(Latitude ~
        (degree * S))) + facet_grid(~lLONG, labeller = label_parsed) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    plot of chunk Q1-8
    Show emmeans code
    paruelo.list = with(paruelo, list(LAT = seq(min(LAT), max(LAT),
        len = 100), LONG = mean(LONG) + sd(LONG) * c(-2, -1, 0, 1,
        2)))
    paruelo.grid = ref_grid(paruelo.lm, ~LAT | LONG, at = paruelo.list,
        cov.reduce = function(x) x)
    newdata = confint(paruelo.grid)
    ## calculate the partial residuals
    partial.obs = ref_grid(paruelo.lm, ~LAT, cov.reduce = function(x) x,
        at = list(LONG = mean(paruelo$LONG) + sd(paruelo$LONG) *
            c(-2, -1, 0, 1, 2))) %>% summary %>% mutate(presid = prediction +
        as.vector(resid(paruelo.lm)))
    
    newdata$lLONG <- factor(newdata$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    ggplot(newdata, aes(y = prediction, x = LAT)) + geom_line() +
        geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL), color = "grey",
            alpha = 0.3) + facet_grid(~lLONG, labeller = label_parsed) +
        scale_y_continuous(expression(Abundance ~ of ~ C3 ~ grasses ~
            (sqrt(phantom(x))))) + scale_x_continuous(expression(Latitude ~
        (degree * S))) + geom_point(data = partial.obs, aes(y = presid)) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    plot of chunk Q1-8b
    Show manual code
    newdata = with(paruelo, expand.grid(LAT = seq(min(LAT), max(LAT),
        len = 100), LONG = mean(LONG) + sd(LONG) * c(-2, -1, 0, 1,
        2)))
    Xmat = model.matrix(~scale(LAT, scale = FALSE) * scale(LONG,
        scale = FALSE), data = newdata)
    coefs = coef(paruelo.lm)
    fit = as.vector(coefs %*% t(Xmat))
    se = sqrt(diag(Xmat %*% vcov(paruelo.lm) %*% t(Xmat)))
    q = qt(0.975, df = df.residual(paruelo.lm))
    newdata = newdata %>% bind_cols(data.frame(fit = fit, lwr = fit -
        q * se, upr = fit + q * se))
    ## partial residuals
    partial.obs <- with(paruelo, expand.grid(LAT = LAT, LONG = mean(LONG) +
        sd(LONG) * c(-2, -1, 0, 1, 2)))
    Xmat = model.matrix(~scale(LAT, scale = FALSE) * scale(LONG,
        scale = FALSE), data = partial.obs)
    presid = as.vector(coefs %*% t(Xmat))
    partial.obs = partial.obs %>% bind_cols(data.frame(presid = presid +
        rep(resid(paruelo.lm), 5)))
    
    newdata$lLONG <- factor(newdata$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    ggplot(newdata, aes(y = fit, x = LAT)) + geom_line() + geom_ribbon(aes(ymin = lwr,
        ymax = upr), color = "grey", alpha = 0.3) + facet_grid(~lLONG,
        labeller = label_parsed) + scale_y_continuous(expression(Abundance ~
        of ~ C3 ~ grasses ~ (sqrt(phantom(x))))) + scale_x_continuous(expression(Latitude ~
        (degree * S))) + geom_point(data = partial.obs, aes(y = presid)) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    plot of chunk Q1-8c
  10. Note the figures above are on a square-root scale AND some trends and confidence intervals extend into negative square-root C3 abundances. Clearly this is not logical and is an artifact of applying a square-root transformation to normalize data in which there were numerous response values close to zero.

    If we were to back-transform these relationships, we would be confronted with a parabola-like relationship. This is obviously an artefact of fitting a model to root-transformed data and is highly unlikely to reflect any genuine ecological trends. To illustrate, let's repeat the above and apply the back-transformations.

    Show predict code
    newdata <- expand.grid(LAT = seq(min(paruelo$LAT), max(paruelo$LAT),
        len = 100), LONG = mean(paruelo$LONG) + sd(paruelo$LONG) *
        c(-2, -1, 0, 1, 2))
    fit <- predict(paruelo.lm, newdata = newdata, interval = "confidence")^2
    fit <- cbind(newdata, fit)
    head(fit)
    
           LAT    LONG        fit         lwr       upr
    1 29.00000 93.5293 0.09383288 0.009894572 0.2633451
    2 29.23364 93.5293 0.09589331 0.011312387 0.2631415
    3 29.46727 93.5293 0.09797612 0.012820099 0.2629606
    4 29.70091 93.5293 0.10008130 0.014416512 0.2628037
    5 29.93455 93.5293 0.10220886 0.016100351 0.2626719
    6 30.16818 93.5293 0.10435880 0.017870251 0.2625667
    
    # calculate the partial observed values
    partial.obs <- expand.grid(LAT = paruelo$LAT, LONG = mean(paruelo$LONG) +
        sd(paruelo$LONG) * c(-2, -1, 0, 1, 2))
    partial.obs <- cbind(partial.obs, fit = predict(paruelo.lm, newdata = partial.obs))
    partial.obs$fit <- (partial.obs$fit + resid(paruelo.lm))^2
    head(partial.obs)
    
        LAT    LONG        fit
    1 46.40 93.5293 0.25723651
    2 47.32 93.5293 0.27490373
    3 45.78 93.5293 0.48711906
    4 43.95 93.5293 0.66718116
    5 46.90 93.5293 0.20892295
    6 38.87 93.5293 0.04265093
    
    fit$lLONG <- factor(fit$LONG, labels = paste("Longitude:~", c(-2,
        -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    library(ggplot2)
    ggplot(fit, aes(y = fit, x = LAT)) + geom_point(data = partial.obs) +
        geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.2, color = NA) +
        geom_line() + scale_y_continuous(expression(Abundance ~ of ~
        C3 ~ grasses)) + scale_x_continuous(expression(Latitude ~
        (degree * S))) + facet_grid(~lLONG, labeller = label_parsed) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    plot of chunk ws7_3aQ1-9
    Show emmeans code
    paruelo.list = with(paruelo, list(LAT = seq(min(LAT), max(LAT),
        len = 100), LONG = mean(LONG) + sd(LONG) * c(-2, -1, 0, 1,
        2)))
    paruelo.grid = ref_grid(paruelo.lm, ~LAT | LONG, at = paruelo.list,
        cov.reduce = function(x) x)
    newdata = confint(paruelo.grid) %>% mutate_at(vars(prediction,
        lower.CL, upper.CL), function(x) x^2)
    
    ## calculate the partial residuals
    partial.obs = ref_grid(paruelo.lm, ~LAT, cov.reduce = function(x) x,
        at = list(LONG = mean(paruelo$LONG) + sd(paruelo$LONG) *
            c(-2, -1, 0, 1, 2))) %>% summary %>% mutate(presid = (prediction +
        as.vector(resid(paruelo.lm)))^2)
    
    newdata$lLONG <- factor(newdata$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    ggplot(newdata, aes(y = prediction, x = LAT)) + geom_line() +
        geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL), color = "grey",
            alpha = 0.3) + facet_grid(~lLONG, labeller = label_parsed) +
        scale_y_continuous(expression(Abundance ~ of ~ C3 ~ grasses)) +
        scale_x_continuous(expression(Latitude ~ (degree * S))) +
        geom_point(data = partial.obs, aes(y = presid)) + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"), strip.background = element_blank())
    
    plot of chunk ws7_3aQ1-9b
    Show manual code
    newdata = with(paruelo, expand.grid(LAT = seq(min(LAT), max(LAT),
        len = 100), LONG = mean(LONG) + sd(LONG) * c(-2, -1, 0, 1,
        2)))
    Xmat = model.matrix(~scale(LAT, scale = FALSE) * scale(LONG,
        scale = FALSE), data = newdata)
    coefs = coef(paruelo.lm)
    fit = as.vector(coefs %*% t(Xmat))
    se = sqrt(diag(Xmat %*% vcov(paruelo.lm) %*% t(Xmat)))
    q = qt(0.975, df = df.residual(paruelo.lm))
    newdata = newdata %>% bind_cols(data.frame(fit = fit^2, lwr = (fit -
        q * se)^2, upr = (fit + q * se)^2))
    ## partial residuals
    partial.obs <- with(paruelo, expand.grid(LAT = LAT, LONG = mean(LONG) +
        sd(LONG) * c(-2, -1, 0, 1, 2)))
    Xmat = model.matrix(~scale(LAT, scale = FALSE) * scale(LONG,
        scale = FALSE), data = partial.obs)
    presid = as.vector(coefs %*% t(Xmat))
    partial.obs = partial.obs %>% bind_cols(data.frame(presid = (presid +
        rep(resid(paruelo.lm), 5))^2))
    
    newdata$lLONG <- factor(newdata$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    partial.obs$lLONG <- factor(partial.obs$LONG, labels = paste("Longitude:~",
        c(-2, -1, 0, 1, 2), "*sigma"))
    
    ggplot(newdata, aes(y = fit, x = LAT)) + geom_line() + geom_ribbon(aes(ymin = lwr,
        ymax = upr), color = "grey", alpha = 0.3) + facet_grid(~lLONG,
        labeller = label_parsed) + scale_y_continuous(expression(Abundance ~
        of ~ C3 ~ grasses)) + scale_x_continuous(expression(Latitude ~
        (degree * S))) + geom_point(data = partial.obs, aes(y = presid)) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"), strip.background = element_blank())
    
    plot of chunk ws7_3aQ1-9c

  11. Whilst this was not actually a spatial analysis, the predictors were of a spatial nature. It might therefore be interesting to visualize the predictions as a heat map projected onto lat/long space. We could even overlay this onto a map of North America so as to visualize the spatial patterns in C3 grass abundances.
    Show predict code
    ## Lets start by getting the base map of USA
    library(mapdata)
    library(maps)
    library(sf)
    ## Get a map of the USA
    usa = map("usa", plot = FALSE)
    ## convert it to standard features
    usa = st_as_sf(usa)
    plot(usa)
    
    plot of chunk ws7_3aQ1-10a
    newdata <- with(paruelo, expand.grid(LAT = seq(min(LAT), max(LAT),
        len = 100), LONG = seq(min(LONG), max(LONG), len = 100)))
    ## The above grid is defined by the maximum
    ## latitude/longitude bounding box.  However, the sampling
    ## design was not a perfect rectangle and therefore this
    ## bounding box would involve predictions outside the spatial
    ## sampling domain.  We should probably restrict the grid to a
    ## polygon defined by a convex hull around the points.  We can
    ## do this either by cropping/masking out the grid we just
    ## made or generate a regular grid from within the convex
    ## hull.
    paruelo.sf = paruelo %>% mutate(LONG = LONG * -1) %>% st_as_sf(coords = c("LONG",
        "LAT"))
    newdata = paruelo.sf %>% st_union %>% st_convex_hull %>% st_set_crs(st_crs(usa))
    ggplot() + geom_sf(data = usa) + geom_sf(data = newdata, fill = NA)
    
    plot of chunk ws7_3aQ1-10a
    ## We can see that the prediction grid protrudes outside of
    ## the USA, so we might want to crop it so that it is
    ## contained in the USA polygon
    newdata = newdata %>% st_set_crs(st_crs(usa)) %>% st_intersection(st_buffer(usa,
        0))  #st_buffer needed as some usa polygons intersect
    ggplot() + geom_sf(data = usa) + geom_sf(data = newdata)
    
    plot of chunk ws7_3aQ1-10a
    newdata = newdata %>% st_set_crs(NA) %>% st_sample(size = 10000,
        type = "regular", exact = TRUE) %>% st_set_crs(st_crs(usa)) %>%
        st_coordinates %>% as.data.frame %>% dplyr::rename(LAT = Y,
        LONG = X)
    
    
    fit <- predict(paruelo.lm, newdata = newdata, interval = "confidence")
    fit <- cbind(newdata, fit)
    head(fit)
    
           LONG      LAT      fit      lwr      upr
    1 -98.38595 29.15257 5.657915 2.183853 9.131978
    2 -98.18389 29.15257 5.652283 2.181692 9.122874
    3 -97.98184 29.15257 5.646651 2.179531 9.113771
    4 -97.77979 29.15257 5.641019 2.177370 9.104668
    5 -97.57774 29.15257 5.635387 2.175210 9.095565
    6 -97.37569 29.15257 5.629755 2.173049 9.086462
    
    ## Note, the longitudes are positive (and thus easterly), so we
    ## need to multiply them by -1
    
    ggplot() + geom_sf(data = usa) + geom_tile(data = fit, aes(y = LAT,
        x = LONG, fill = fit)) + scale_fill_gradientn(colors = heat.colors(12)) +
        theme_bw()
    
    plot of chunk ws7_3aQ1-10a
  12. Notice the hook in the predictions associated with low Latitude. This is an artifact of the back-transformation.

    Arguably, it would have been more appropriate to fit a generalized linear model with a binomial distribution. We will therefore return to this example in Tutorial 10.5a.
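    As a preview, a minimal sketch of such a model is given below. This assumes the raw response column in the paruelo data is the proportional abundance C3 (a proportion between 0 and 1); because the response is a proportion rather than integer counts, a quasibinomial family with a logit link is used here in place of a strict binomial.

    ## sketch only - assumes paruelo$C3 holds the proportional abundance of C3 grasses
    paruelo.glm <- glm(C3 ~ scale(LAT, scale = FALSE) * scale(LONG, scale = FALSE),
        family = quasibinomial(link = "logit"), data = paruelo)
    summary(paruelo.glm)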

Multiple Linear Regression

Loyn (1987) modeled the abundance of forest birds with six predictor variables (patch area, distance to nearest patch, distance to nearest larger patch, grazing intensity, altitude and years since the patch had been isolated).

Download Loyn data set
Format of loyn.csv data file
ABUND   DIST   LDIST   AREA   GRAZE   ALT   YR.ISOL
...     ...    ...     ...    ...     ...   ...

ABUND     Abundance of forest birds in patch - response variable
DIST      Distance to nearest patch - predictor variable
LDIST     Distance to nearest larger patch - predictor variable
AREA      Size of the patch - predictor variable
GRAZE     Grazing intensity (1 to 5, representing light to heavy) - predictor variable
ALT       Altitude - predictor variable
YR.ISOL   Number of years since the patch was isolated - predictor variable

Open the loyn data file. HINT.
Show code
loyn <- read.table("../downloads/data/loyn.csv", header = T, sep = ",", strip.white = T)
head(loyn)
  ABUND AREA YR.ISOL DIST LDIST GRAZE ALT
1   5.3  0.1    1968   39    39     2 160
2   2.0  0.5    1920  234   234     5  60
3   1.5  0.5    1900  104   311     5 140
4  17.1  1.0    1966   66    66     3 160
5  13.8  1.0    1918  246   246     5 140
6  14.1  1.0    1965  234   285     3 130
  1. In the table below, list the assumptions of multiple linear regression along with how violations of each assumption are diagnosed and/or the risks of violations are minimized.
    Assumption    Diagnostic/Risk Minimization
    I.
    II.
    III.
    IV.
    V.
  2. Construct a scatterplot matrix to investigate these assumptions (HINT)
    Show scatterplotMatrix (car) code
    library(car)
    scatterplotMatrix(~ABUND + DIST + LDIST + AREA + GRAZE + ALT +
        YR.ISOL, data = loyn, diagonal = list(method = "boxplot"))
    
    plot of chunk ws7_3a-Q2-2a
    Show splom (lattice) code
    library(lattice)
    splom.lat <- splom(loyn, type = c("p", "r"))
    print(splom.lat)
    
    plot of chunk ws7_3a-Q2-2b
    Show ggpairs (GGally) code
    library(GGally)
    ggpairs(loyn, lower = list(continuous = "smooth"), diag = list(continuous = "density"),
        axisLabels = "none")
    
    plot of chunk ws7_3a-Q2-2c

    1. Focussing on the assumptions of Normality, Homogeneity of Variance and Linearity, is there evidence of violations of these assumptions (y or n)?
    2. Try applying a temporary log10 transformation to the skewed variables (HINT).
      Show scatterplotMatrix (car) code
      library(car)
      scatterplotMatrix(~ABUND + log10(DIST) + log10(LDIST) + log10(AREA) +
          GRAZE + ALT + YR.ISOL, data = loyn, diagonal = list(method = "boxplot"))
      
      plot of chunk Q2-2aa
      Show ggpairs (GGally) code
      library(GGally)
      ggpairs(with(loyn, data.frame(logDIST = log10(DIST), logLDIST = log10(LDIST),
          logAREA = log10(AREA), GRAZE, ALT, YR.ISOL)), lower = list(continuous = "smooth"),
          diag = list(continuous = "density"), axisLabels = "none")
      
      plot of chunk Q2-2ab
      Does this improve some of these specific assumptions (y or n)?
    3. Is there any evidence that the assumption of Collinearity is likely to be violated (y or n)?
      Show code
      loyn.lm <- lm(ABUND ~ log10(DIST) + log10(LDIST) + log10(AREA) + GRAZE + ALT + YR.ISOL, data = loyn)
      vif(loyn.lm)
      
       log10(DIST) log10(LDIST)  log10(AREA)        GRAZE          ALT      YR.ISOL 
          1.654553     2.009749     1.911514     2.524814     1.467937     1.804769 
      
      1/vif(loyn.lm)
      
       log10(DIST) log10(LDIST)  log10(AREA)        GRAZE          ALT      YR.ISOL 
         0.6043930    0.4975746    0.5231454    0.3960688    0.6812282    0.5540876 
      

      Predictor    VIF    Tolerance
      log10(DIST)
      log10(LDIST)
      log10(AREA)
      GRAZE
      ALT
      YR.ISOL
    4. Collinearity typically occurs when one or more of the predictor variables are correlated with the other predictor variables, so this assumption can also be examined by calculating the pairwise correlation coefficients between all the predictor variables (use transformed versions for those that required it) (HINT).
      Show code
      cor(loyn.lm$model[, 2:7])
      
                   log10(DIST) log10(LDIST) log10(AREA)       GRAZE        ALT     YR.ISOL
      log10(DIST)   1.00000000   0.60386637   0.3021666 -0.14263922 -0.2190070 -0.01957223
      log10(LDIST)  0.60386637   1.00000000   0.3824795 -0.03399082 -0.2740438 -0.16111611
      log10(AREA)   0.30216662   0.38247952   1.0000000 -0.55908864  0.2751428  0.27841452
      GRAZE        -0.14263922  -0.03399082  -0.5590886  1.00000000 -0.4071671 -0.63556710
      ALT          -0.21900701  -0.27404380   0.2751428 -0.40716705  1.0000000  0.23271541
      YR.ISOL      -0.01957223  -0.16111611   0.2784145 -0.63556710  0.2327154  1.00000000
      
      None of the predictor variables are overly correlated with any of the other predictor variables.
  3. Since none of the predictor variables are highly correlated with one another, we can include them all in the linear model fitting.

    In this question, we are not so interested in hypothesis testing. That is, we are not trying to determine if there is a significant effect of one or more predictor variables on the abundance of forest birds. Rather, we wish to either (or both):

    • Construct a predictive model for relating habitat variables to forest bird abundances
    • Determine which are the most important (influential) predictor variables of forest bird abundances
    As a result, a multiplicative model is unnecessary and overly complicated. For this purpose, an additive model (in which each of the predictors is included in the model without interaction terms) is appropriate.
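    For reference, the distinction between the two designs in R's formula syntax is simply additive (+) versus multiplicative (*) terms. A minimal sketch with purely hypothetical data (y, x1 and x2 are illustrative names only):

    ## illustrative data frame
    dat <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
    ## additive design: partial effects of each predictor only
    lm(y ~ x1 + x2, data = dat)
    ## multiplicative design: partial effects plus their interaction (x1 + x2 + x1:x2)
    lm(y ~ x1 * x2, data = dat)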

  4. Fit an additive linear model relating ABUND to each of the predictor variables, but no interactions (HINT)
    Show code
    loyn.lm <- lm(ABUND ~ log10(DIST) + log10(LDIST) + log10(AREA) + GRAZE + ALT + YR.ISOL, data = loyn)
    
    1. Examine the diagnostic plots (HINT), especially the residual plot, to confirm there is no further evidence of violations of the analysis assumptions
      Show base code
      # setup a 2x3 plotting device with small margins
      par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
      plot(loyn.lm, ask = F, which = 1:6)
      
      plot of chunk Q2-3a
      influence.measures(loyn.lm)
      
      Influence measures of
       lm(formula = ABUND ~ log10(DIST) + log10(LDIST) + log10(AREA) +      GRAZE + ALT + YR.ISOL, data = loyn) :
      
            dfb.1_ dfb.l10.D dfb.l10.L dfb.l10.A  dfb.GRAZ   dfb.ALT  dfb.YR.I   dffit cov.r   cook.d
      1  -0.024547   0.08371 -0.022664  0.325348  0.219000 -0.005547  0.008468 -0.4206 1.395 2.55e-02
      2  -0.017509  -0.01656  0.020997  0.012653  0.003658  0.037247  0.016013 -0.0657 1.319 6.29e-04
      3  -0.058912   0.01045 -0.016321  0.048309  0.012241 -0.021952  0.060904 -0.1103 1.288 1.77e-03
      4  -0.024649  -0.01083 -0.015504 -0.047360 -0.005965  0.010247  0.028327  0.0998 1.217 1.45e-03
      5   0.064514   0.17236 -0.075678 -0.091673  0.105181  0.101385 -0.078406  0.3575 1.036 1.81e-02
      6  -0.013955   0.01154  0.003907 -0.027075 -0.003667  0.000920  0.014184  0.0385 1.243 2.16e-04
      7   0.152729  -0.14223  0.018225  0.056318 -0.149535  0.022629 -0.148486 -0.2837 1.250 1.16e-02
      8  -0.012535   0.00936 -0.059652  0.050580 -0.001515  0.058081  0.012466 -0.1520 1.289 3.36e-03
      9   0.240434  -0.10307  0.040989  0.142365 -0.165009 -0.019501 -0.243854 -0.4029 0.937 2.27e-02
      10 -0.049652  -0.03654 -0.001350  0.035981 -0.001217 -0.021708  0.054131 -0.1064 1.286 1.65e-03
      11  0.326311  -0.29680  0.393485 -0.464543 -0.259851  0.555537 -0.347149  0.9263 0.653 1.13e-01
      12 -0.318813  -0.04143  0.184910 -0.047393  0.001434 -0.058520  0.324669 -0.4586 1.179 3.00e-02
      13 -0.003880  -0.00858 -0.003634 -0.013295 -0.006816  0.013654  0.004853  0.0364 1.302 1.93e-04
      14  0.049590  -0.19049 -0.118128  0.391650  0.193189 -0.285935 -0.033841 -0.5334 1.236 4.06e-02
      15  0.028372   0.01892 -0.019252  0.001533  0.002733 -0.003094 -0.029406  0.0490 1.295 3.50e-04
      16 -0.112157   0.03131 -0.058961  0.040297 -0.017552 -0.062221  0.119657 -0.2278 1.199 7.50e-03
      17  0.089920  -0.36895 -0.418556  0.197950 -0.137942 -0.554335 -0.016929  0.9038 0.798 1.10e-01
      18  0.003210   0.44969  0.471153 -0.189555  0.055517  0.413749 -0.081604 -1.0674 0.459 1.43e-01
      19 -0.014516  -0.05622 -0.077388  0.014877  0.026729  0.051520  0.021493  0.2163 1.162 6.75e-03
      20  0.031138   0.01561 -0.029390 -0.019603 -0.059824 -0.066354 -0.026445  0.0950 1.281 1.31e-03
      21 -0.074573   0.03497  0.084655 -0.115333  0.004721  0.153777  0.066229  0.1998 1.318 5.80e-03
      22  0.012462   0.62313 -0.547520  0.034210 -0.102778  0.004117 -0.020621 -0.7615 1.323 8.21e-02
      23  0.007417   0.07767  0.112948 -0.066606  0.000558  0.110277 -0.024123 -0.2259 1.218 7.38e-03
      24  0.057330   0.06179 -0.042947 -0.103211 -0.203842 -0.153868 -0.046055  0.2864 1.253 1.18e-02
      25  0.399742  -0.06158 -0.055316 -0.020120 -0.228704  0.012557 -0.400107  0.4310 1.320 2.67e-02
      26 -0.036316  -0.02478  0.062377 -0.032586  0.009891  0.023836  0.035302  0.0810 1.209 9.54e-04
      27  0.062196  -0.01369 -0.045659  0.029139 -0.023443 -0.004247 -0.061136 -0.1034 1.178 1.55e-03
      28  0.022345  -0.01170  0.005156 -0.004665 -0.032184 -0.032509 -0.020565 -0.0462 1.322 3.11e-04
      29 -0.048454   0.02411 -0.016234  0.017630  0.047216 -0.006470  0.048550  0.0678 1.284 6.70e-04
      30  0.020424   0.00550  0.013709  0.018552  0.021831 -0.019583 -0.022322  0.0809 1.272 9.54e-04
      31  0.181882  -0.18332  0.070644 -0.044548 -0.080909  0.204938 -0.188993 -0.4816 0.632 3.08e-02
      32 -0.004554  -0.00646 -0.010581  0.012558  0.011367  0.000130  0.005339  0.0235 1.380 8.04e-05
      33 -0.192246   0.07100 -0.214312  0.246055  0.262466 -0.178104  0.205265  0.4545 0.968 2.89e-02
      34 -0.167748   0.08351 -0.072012  0.100394  0.263665  0.204739  0.156632  0.3493 1.149 1.75e-02
      35  0.056797  -0.06516  0.082675  0.005839  0.006404 -0.128022 -0.056773 -0.2959 0.929 1.23e-02
      36 -0.136228  -0.06411  0.004602  0.101551  0.082178 -0.132967  0.148678  0.2943 0.930 1.22e-02
      37  0.004153  -0.01116 -0.045478 -0.053435 -0.081659  0.014926  0.001372 -0.1651 1.254 3.96e-03
      38 -0.010846  -0.04066  0.048479 -0.013496 -0.004557  0.016275  0.010883  0.0559 1.679 4.56e-04
      39 -0.222802   0.01356  0.109160  0.039879  0.210745  0.089951  0.214317  0.2799 1.260 1.13e-02
      40 -0.005977   0.01113 -0.036499 -0.056239 -0.058831  0.032006  0.008643 -0.1378 1.287 2.76e-03
      41 -0.031032  -0.08019  0.031978  0.020113  0.089394  0.033669  0.030566 -0.1590 1.221 3.67e-03
      42 -0.029798   0.03356 -0.007763  0.018255  0.027276 -0.002476  0.028465  0.0597 1.236 5.19e-04
      43  0.001060  -0.00833  0.008234 -0.000729  0.011017 -0.007667 -0.001249 -0.0327 1.228 1.56e-04
      44  0.004393  -0.00930  0.000714  0.002619 -0.009448 -0.010691 -0.003254  0.0161 1.376 3.79e-05
      45  0.011664   0.03240 -0.064340  0.046954 -0.013554 -0.053380 -0.008156  0.0934 1.260 1.27e-03
      46 -0.044055   0.00853 -0.126785  0.175238  0.125507  0.057840  0.045178  0.2630 1.157 9.95e-03
      47  0.348086  -0.09628  0.068676  0.328440 -0.113019 -0.256880 -0.347655  0.7168 0.485 6.55e-02
      48 -0.139098   0.50451  0.055716 -0.138456 -0.085820  0.270712  0.103904  0.7515 0.885 7.74e-02
      49  0.000830   0.00124  0.009247 -0.002171 -0.015170 -0.005737 -0.000635  0.0297 1.246 1.28e-04
      50  0.010389   0.02638 -0.015605  0.006676 -0.022689 -0.000897 -0.010726  0.0536 1.242 4.18e-04
      51 -0.048345  -0.13467  0.071349  0.066280  0.022910  0.002208  0.053729  0.1778 1.335 4.60e-03
      52 -0.019314   0.04851 -0.061717 -0.025118  0.088505  0.129007  0.012520 -0.1989 1.421 5.75e-03
      53 -0.000455  -0.00517 -0.030151 -0.009949  0.025032 -0.004643  0.001860 -0.0817 1.251 9.72e-04
      54  0.032378  -0.00621 -0.010783  0.019668 -0.025119 -0.000179 -0.031927  0.0506 1.315 3.73e-04
      55  0.072633  -0.00990 -0.045849 -0.458184 -0.072288 -0.111714 -0.060767 -0.7173 0.853 7.03e-02
      56 -0.524582  -0.31857  0.429361 -0.754208  0.024705 -0.528460  0.569683 -1.3723 1.007 2.54e-01
            hat inf
      1  0.2374    
      2  0.1279    
      3  0.1150    
      4  0.0690    
      5  0.0849    
      6  0.0734    
      7  0.1399    
      8  0.1247    
      9  0.0748    
      10 0.1132    
      11 0.1413    
      12 0.1620    
      13 0.1142    
      14 0.2036    
      15 0.1105    
      16 0.0998    
      17 0.1704    
      18 0.1268   *
      19 0.0804    
      20 0.1078    
      21 0.1513    
      22 0.2888    
      23 0.1079    
      24 0.1419    
      25 0.2095    
      26 0.0591    
      27 0.0487    
      28 0.1281    
      29 0.1053    
      30 0.0998    
      31 0.0472    
      32 0.1630    
      33 0.0962    
      34 0.1183    
      35 0.0452    
      36 0.0449    
      37 0.1085    
      38 0.3127   *
      39 0.1435    
      40 0.1204    
      41 0.0888    
      42 0.0714    
      43 0.0611    
      44 0.1604    
      45 0.0943    
      46 0.0936    
      47 0.0693   *
      48 0.1550    
      49 0.0746    
      50 0.0744    
      51 0.1561    
      52 0.2052    
      53 0.0862    
      54 0.1240    
      55 0.1384    
      56 0.3297   *
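      Since the full influence.measures() table is long, it can help to display only the observations flagged as potentially influential; the summary method performs this filtering:

      # print only the observations flagged as potentially influential
      summary(influence.measures(loyn.lm))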
      
      Show autoplot code
      autoplot(loyn.lm, which = 1:6)
      
      plot of chunk Q2-3aa
      influence.measures(loyn.lm)
      
      Influence measures of
       lm(formula = ABUND ~ log10(DIST) + log10(LDIST) + log10(AREA) +      GRAZE + ALT + YR.ISOL, data = loyn) :
      
            dfb.1_ dfb.l10.D dfb.l10.L dfb.l10.A  dfb.GRAZ   dfb.ALT  dfb.YR.I   dffit cov.r   cook.d
      1  -0.024547   0.08371 -0.022664  0.325348  0.219000 -0.005547  0.008468 -0.4206 1.395 2.55e-02
      2  -0.017509  -0.01656  0.020997  0.012653  0.003658  0.037247  0.016013 -0.0657 1.319 6.29e-04
      3  -0.058912   0.01045 -0.016321  0.048309  0.012241 -0.021952  0.060904 -0.1103 1.288 1.77e-03
      4  -0.024649  -0.01083 -0.015504 -0.047360 -0.005965  0.010247  0.028327  0.0998 1.217 1.45e-03
      5   0.064514   0.17236 -0.075678 -0.091673  0.105181  0.101385 -0.078406  0.3575 1.036 1.81e-02
      6  -0.013955   0.01154  0.003907 -0.027075 -0.003667  0.000920  0.014184  0.0385 1.243 2.16e-04
      7   0.152729  -0.14223  0.018225  0.056318 -0.149535  0.022629 -0.148486 -0.2837 1.250 1.16e-02
      8  -0.012535   0.00936 -0.059652  0.050580 -0.001515  0.058081  0.012466 -0.1520 1.289 3.36e-03
      9   0.240434  -0.10307  0.040989  0.142365 -0.165009 -0.019501 -0.243854 -0.4029 0.937 2.27e-02
      10 -0.049652  -0.03654 -0.001350  0.035981 -0.001217 -0.021708  0.054131 -0.1064 1.286 1.65e-03
      11  0.326311  -0.29680  0.393485 -0.464543 -0.259851  0.555537 -0.347149  0.9263 0.653 1.13e-01
      12 -0.318813  -0.04143  0.184910 -0.047393  0.001434 -0.058520  0.324669 -0.4586 1.179 3.00e-02
      13 -0.003880  -0.00858 -0.003634 -0.013295 -0.006816  0.013654  0.004853  0.0364 1.302 1.93e-04
      14  0.049590  -0.19049 -0.118128  0.391650  0.193189 -0.285935 -0.033841 -0.5334 1.236 4.06e-02
      15  0.028372   0.01892 -0.019252  0.001533  0.002733 -0.003094 -0.029406  0.0490 1.295 3.50e-04
      16 -0.112157   0.03131 -0.058961  0.040297 -0.017552 -0.062221  0.119657 -0.2278 1.199 7.50e-03
      17  0.089920  -0.36895 -0.418556  0.197950 -0.137942 -0.554335 -0.016929  0.9038 0.798 1.10e-01
      18  0.003210   0.44969  0.471153 -0.189555  0.055517  0.413749 -0.081604 -1.0674 0.459 1.43e-01
      19 -0.014516  -0.05622 -0.077388  0.014877  0.026729  0.051520  0.021493  0.2163 1.162 6.75e-03
      20  0.031138   0.01561 -0.029390 -0.019603 -0.059824 -0.066354 -0.026445  0.0950 1.281 1.31e-03
      21 -0.074573   0.03497  0.084655 -0.115333  0.004721  0.153777  0.066229  0.1998 1.318 5.80e-03
      22  0.012462   0.62313 -0.547520  0.034210 -0.102778  0.004117 -0.020621 -0.7615 1.323 8.21e-02
      23  0.007417   0.07767  0.112948 -0.066606  0.000558  0.110277 -0.024123 -0.2259 1.218 7.38e-03
      24  0.057330   0.06179 -0.042947 -0.103211 -0.203842 -0.153868 -0.046055  0.2864 1.253 1.18e-02
      25  0.399742  -0.06158 -0.055316 -0.020120 -0.228704  0.012557 -0.400107  0.4310 1.320 2.67e-02
      26 -0.036316  -0.02478  0.062377 -0.032586  0.009891  0.023836  0.035302  0.0810 1.209 9.54e-04
      27  0.062196  -0.01369 -0.045659  0.029139 -0.023443 -0.004247 -0.061136 -0.1034 1.178 1.55e-03
      28  0.022345  -0.01170  0.005156 -0.004665 -0.032184 -0.032509 -0.020565 -0.0462 1.322 3.11e-04
      29 -0.048454   0.02411 -0.016234  0.017630  0.047216 -0.006470  0.048550  0.0678 1.284 6.70e-04
      30  0.020424   0.00550  0.013709  0.018552  0.021831 -0.019583 -0.022322  0.0809 1.272 9.54e-04
      31  0.181882  -0.18332  0.070644 -0.044548 -0.080909  0.204938 -0.188993 -0.4816 0.632 3.08e-02
      32 -0.004554  -0.00646 -0.010581  0.012558  0.011367  0.000130  0.005339  0.0235 1.380 8.04e-05
      33 -0.192246   0.07100 -0.214312  0.246055  0.262466 -0.178104  0.205265  0.4545 0.968 2.89e-02
      34 -0.167748   0.08351 -0.072012  0.100394  0.263665  0.204739  0.156632  0.3493 1.149 1.75e-02
      35  0.056797  -0.06516  0.082675  0.005839  0.006404 -0.128022 -0.056773 -0.2959 0.929 1.23e-02
      36 -0.136228  -0.06411  0.004602  0.101551  0.082178 -0.132967  0.148678  0.2943 0.930 1.22e-02
      37  0.004153  -0.01116 -0.045478 -0.053435 -0.081659  0.014926  0.001372 -0.1651 1.254 3.96e-03
      38 -0.010846  -0.04066  0.048479 -0.013496 -0.004557  0.016275  0.010883  0.0559 1.679 4.56e-04
      39 -0.222802   0.01356  0.109160  0.039879  0.210745  0.089951  0.214317  0.2799 1.260 1.13e-02
      40 -0.005977   0.01113 -0.036499 -0.056239 -0.058831  0.032006  0.008643 -0.1378 1.287 2.76e-03
      41 -0.031032  -0.08019  0.031978  0.020113  0.089394  0.033669  0.030566 -0.1590 1.221 3.67e-03
      42 -0.029798   0.03356 -0.007763  0.018255  0.027276 -0.002476  0.028465  0.0597 1.236 5.19e-04
      43  0.001060  -0.00833  0.008234 -0.000729  0.011017 -0.007667 -0.001249 -0.0327 1.228 1.56e-04
      44  0.004393  -0.00930  0.000714  0.002619 -0.009448 -0.010691 -0.003254  0.0161 1.376 3.79e-05
      45  0.011664   0.03240 -0.064340  0.046954 -0.013554 -0.053380 -0.008156  0.0934 1.260 1.27e-03
      46 -0.044055   0.00853 -0.126785  0.175238  0.125507  0.057840  0.045178  0.2630 1.157 9.95e-03
      47  0.348086  -0.09628  0.068676  0.328440 -0.113019 -0.256880 -0.347655  0.7168 0.485 6.55e-02
      48 -0.139098   0.50451  0.055716 -0.138456 -0.085820  0.270712  0.103904  0.7515 0.885 7.74e-02
      49  0.000830   0.00124  0.009247 -0.002171 -0.015170 -0.005737 -0.000635  0.0297 1.246 1.28e-04
      50  0.010389   0.02638 -0.015605  0.006676 -0.022689 -0.000897 -0.010726  0.0536 1.242 4.18e-04
      51 -0.048345  -0.13467  0.071349  0.066280  0.022910  0.002208  0.053729  0.1778 1.335 4.60e-03
      52 -0.019314   0.04851 -0.061717 -0.025118  0.088505  0.129007  0.012520 -0.1989 1.421 5.75e-03
      53 -0.000455  -0.00517 -0.030151 -0.009949  0.025032 -0.004643  0.001860 -0.0817 1.251 9.72e-04
      54  0.032378  -0.00621 -0.010783  0.019668 -0.025119 -0.000179 -0.031927  0.0506 1.315 3.73e-04
      55  0.072633  -0.00990 -0.045849 -0.458184 -0.072288 -0.111714 -0.060767 -0.7173 0.853 7.03e-02
      56 -0.524582  -0.31857  0.429361 -0.754208  0.024705 -0.528460  0.569683 -1.3723 1.007 2.54e-01
            hat inf
      1  0.2374    
      2  0.1279    
      3  0.1150    
      4  0.0690    
      5  0.0849    
      6  0.0734    
      7  0.1399    
      8  0.1247    
      9  0.0748    
      10 0.1132    
      11 0.1413    
      12 0.1620    
      13 0.1142    
      14 0.2036    
      15 0.1105    
      16 0.0998    
      17 0.1704    
      18 0.1268   *
      19 0.0804    
      20 0.1078    
      21 0.1513    
      22 0.2888    
      23 0.1079    
      24 0.1419    
      25 0.2095    
      26 0.0591    
      27 0.0487    
      28 0.1281    
      29 0.1053    
      30 0.0998    
      31 0.0472    
      32 0.1630    
      33 0.0962    
      34 0.1183    
      35 0.0452    
      36 0.0449    
      37 0.1085    
      38 0.3127   *
      39 0.1435    
      40 0.1204    
      41 0.0888    
      42 0.0714    
      43 0.0611    
      44 0.1604    
      45 0.0943    
      46 0.0936    
      47 0.0693   *
      48 0.1550    
      49 0.0746    
      50 0.0744    
      51 0.1561    
      52 0.2052    
      53 0.0862    
      54 0.1240    
      55 0.1384    
      56 0.3297   *
      
    2. On initial examination of the fitted model summary, which of the predictor variables might you consider important? (Initially this can be determined by exploring which of the effects are significant.)
      Show code
      summary(loyn.lm)
      
      Call:
      lm(formula = ABUND ~ log10(DIST) + log10(LDIST) + log10(AREA) + 
          GRAZE + ALT + YR.ISOL, data = loyn)
      
      Residuals:
           Min       1Q   Median       3Q      Max 
      -15.6506  -2.9390   0.5289   2.5353  15.2842 
      
      Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
      (Intercept)  -125.69725   91.69228  -1.371   0.1767    
      log10(DIST)    -0.90696    2.67572  -0.339   0.7361    
      log10(LDIST)   -0.64842    2.12270  -0.305   0.7613    
      log10(AREA)     7.47023    1.46489   5.099 5.49e-06 ***
      GRAZE          -1.66774    0.92993  -1.793   0.0791 .  
      ALT             0.01951    0.02396   0.814   0.4195    
      YR.ISOL         0.07387    0.04520   1.634   0.1086    
      ---
      Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
      
      Residual standard error: 6.384 on 49 degrees of freedom
      Multiple R-squared:  0.6849,	Adjusted R-squared:  0.6464 
      F-statistic: 17.75 on 6 and 49 DF,  p-value: 8.443e-11
      
      confint(loyn.lm)
      
                           2.5 %      97.5 %
      (Intercept)  -309.95977340 58.56528176
      log10(DIST)    -6.28402047  4.47010711
      log10(LDIST)   -4.91414111  3.61730123
      log10(AREA)     4.52641155 10.41404091
      GRAZE          -3.53649663  0.20101975
      ALT            -0.02863924  0.06765562
      YR.ISOL        -0.01697045  0.16471247
      
      tidy(loyn.lm)
      
      # A tibble: 7 x 5
        term          estimate std.error statistic    p.value
        <chr>            <dbl>     <dbl>     <dbl>      <dbl>
      1 (Intercept)  -126.       91.7       -1.37  0.177     
      2 log10(DIST)    -0.907     2.68      -0.339 0.736     
      3 log10(LDIST)   -0.648     2.12      -0.305 0.761     
      4 log10(AREA)     7.47      1.46       5.10  0.00000549
      5 GRAZE          -1.67      0.930     -1.79  0.0791    
      6 ALT             0.0195    0.0240     0.814 0.419     
      7 YR.ISOL         0.0739    0.0452     1.63  0.109     
      

      Coefficient    Estimate    t-value    P-value
      Intercept
      log10(DIST)
      log10(LDIST)
      log10(AREA)
      GRAZE
      ALT
      YR.ISOL

    Clearly, not all of the predictor variables have significant partial effects on the abundance of forest birds. In developing a predictive model for relating forest bird abundances to a range of habitat characteristics, not all of the measured variables are useful. Likewise, the measured variables are not likely to be equally influential in determining the abundance of forest birds.

  5. We would now like to be able to find the 'best' regression model whilst remaining mindful of the criticisms against such an approach.
    Calculate the adjusted R2 (HINT), generic AIC (HINT) and generic BIC (HINT) for the full regression containing all six predictor variables.
    Show code
    summary(loyn.lm)$adj.r.squared
    
    [1] 0.6463567
    
    AIC(loyn.lm)
    
    [1] 375.0638
    
    AIC(loyn.lm, k = log(nrow(loyn)))
    
    [1] 391.2666
    
    library(MuMIn)
    AICc(loyn.lm)
    
    [1] 378.1276
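    As a side note, the generic BIC can also be obtained directly with the BIC() function, which applies the same log(n) penalty as the AIC() call above and so should return the same value:

    BIC(loyn.lm)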
    

    Model    Adj. r2    AIC    BIC
    Full

    Q2-5. Compare all possible models and select the 'best' (most parsimonious) model based on AIC and investigate the importance of each of the predictor variables using model averaging (HINT).
    Show code
    library(MuMIn)
    # for more recent versions of MuMIn, it is necessary that the model be fitted with na.action=na.fail
    loyn.lm <- update(loyn.lm, na.action = na.fail)
    loyn.av <- model.avg(get.models(dredge(loyn.lm, rank = "AIC"), subset = delta <= 100))
    summary(loyn.av)
    
    Call:
    model.avg(object = get.models(dredge(loyn.lm, rank = "AIC"), 
        subset = delta <= 100))
    
    Component model call: 
    lm(formula = ABUND ~ <64 unique rhs>, data = loyn, na.action = na.fail)
    
    Component models: 
           df  logLik    AIC delta weight
    236     5 -180.55 371.11  0.00   0.13
    1236    6 -179.76 371.52  0.41   0.10
    2356    6 -180.04 372.07  0.96   0.08
    2346    6 -180.07 372.14  1.04   0.08
    23      4 -182.26 372.51  1.40   0.06
    235     5 -181.35 372.69  1.59   0.06
    136     5 -181.43 372.86  1.75   0.05
    123     5 -181.58 373.15  2.05   0.05
    234     5 -181.58 373.16  2.06   0.05
    12346   7 -179.59 373.17  2.06   0.05
    12356   7 -179.60 373.19  2.09   0.04
    23456   7 -179.91 373.82  2.71   0.03
    36      4 -182.99 373.99  2.88   0.03
    1235    6 -181.09 374.19  3.08   0.03
    2345    6 -181.23 374.45  3.34   0.02
    1234    6 -181.24 374.48  3.37   0.02
    1356    6 -181.32 374.63  3.52   0.02
    356     5 -182.36 374.72  3.61   0.02
    1346    6 -181.38 374.76  3.65   0.02
    123456  8 -179.53 375.06  3.95   0.02
    346     5 -182.64 375.28  4.17   0.02
    12345   7 -181.02 376.04  4.93   0.01
    13456   7 -181.31 376.62  5.51   0.01
    3456    6 -182.32 376.64  5.54   0.01
    13      4 -187.34 382.68 11.57   0.00
    135     5 -186.55 383.10 11.99   0.00
    35      4 -187.61 383.23 12.12   0.00
    134     5 -187.23 384.45 13.34   0.00
    1345    6 -186.54 385.07 13.97   0.00
    345     5 -187.61 385.23 14.12   0.00
    3       3 -189.66 385.32 14.21   0.00
    34      4 -189.01 386.02 14.91   0.00
    2       3 -194.31 394.63 23.52   0.00
    1256    6 -191.46 394.91 23.80   0.00
    125     5 -192.51 395.01 23.91   0.00
    12      4 -193.57 395.15 24.04   0.00
    25      4 -193.84 395.68 24.57   0.00
    26      4 -193.89 395.77 24.66   0.00
    126     5 -193.09 396.17 25.07   0.00
    256     5 -193.11 396.22 25.11   0.00
    24      4 -194.27 396.54 25.43   0.00
    124     5 -193.31 396.63 25.52   0.00
    12456   7 -191.45 396.90 25.80   0.00
    1245    6 -192.50 397.00 25.89   0.00
    1246    6 -192.68 397.36 26.25   0.00
    245     5 -193.77 397.54 26.44   0.00
    246     5 -193.79 397.57 26.46   0.00
    2456    6 -193.05 398.09 26.98   0.00
    156     5 -197.23 404.45 33.34   0.00
    1456    6 -197.12 406.23 35.12   0.00
    146     5 -198.90 407.80 36.70   0.00
    16      4 -200.67 409.34 38.23   0.00
    56      4 -202.12 412.24 41.13   0.00
    6       3 -203.69 413.38 42.27   0.00
    46      4 -202.98 413.96 42.85   0.00
    456     5 -202.11 414.21 43.10   0.00
    15      4 -205.52 419.03 47.92   0.00
    14      4 -205.77 419.54 48.44   0.00
    145     5 -205.16 420.32 49.21   0.00
    1       3 -207.36 420.72 49.61   0.00
    (Null)  2 -211.87 427.74 56.63   0.00
    4       3 -211.42 428.84 57.73   0.00
    5       3 -211.48 428.96 57.85   0.00
    45      4 -211.34 430.68 59.57   0.00
    
    Term codes: 
             ALT        GRAZE  log10(AREA)  log10(DIST) log10(LDIST)      YR.ISOL 
               1            2            3            4            5            6 
    
    Model-averaged coefficients:  
    (full average) 
                   Estimate Std. Error Adjusted SE z value Pr(>|z|)    
    (Intercept)  -102.20213  115.00849   116.06223   0.881    0.379    
    GRAZE          -1.73394    1.18371     1.19557   1.450    0.147    
    log10(AREA)     7.54507    1.44065     1.47013   5.132    3e-07 ***
    YR.ISOL         0.06195    0.05682     0.05735   1.080    0.280    
    ALT             0.01068    0.01960     0.01986   0.538    0.591    
    log10(LDIST)   -0.52644    1.33863     1.36066   0.387    0.699    
    log10(DIST)    -0.51843    1.59104     1.61979   0.320    0.749    
     
    (conditional average) 
                   Estimate Std. Error Adjusted SE z value Pr(>|z|)    
    (Intercept)  -102.20213  115.00849   116.06223   0.881   0.3785    
    GRAZE          -2.11307    0.95206     0.96995   2.179   0.0294 *  
    log10(AREA)     7.54513    1.44050     1.46999   5.133    3e-07 ***
    YR.ISOL         0.08826    0.04772     0.04862   1.815   0.0695 .  
    ALT             0.02531    0.02323     0.02376   1.065   0.2868    
    log10(LDIST)   -1.49522    1.90814     1.95190   0.766   0.4437    
    log10(DIST)    -1.58387    2.45890     2.51559   0.630   0.5289    
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    confint(loyn.av)
    
                         2.5 %       97.5 %
    (Intercept)  -329.67991528 125.27564917
    GRAZE          -4.01414314  -0.21200237
    log10(AREA)     4.66400477  10.42625188
    YR.ISOL        -0.00703386   0.18356296
    ALT            -0.02125843   0.07188093
    log10(LDIST)   -5.32086871   2.33043584
    log10(DIST)    -6.51434345   3.34659824
    
    1. The model that you would consider to be the most parsimonious would be the model containing which terms?
    2. Use the model averaging to estimate the regression coefficients for each term.
      Predictor:    Estimate    SE    Lower 95% CI    Upper 95% CI
      log10 patch distance
      log10 large patch distance
      log10 fragment area
      Grazing intensity
      Altitude
      Years of isolation
      Intercept
  6. Write out the full predictive model relating the abundance of forest birds to habitat characteristics based on:
    • the model averaging
    • refitting the most parsimonious model
      Show code
      loyn.lm1 <- lm(ABUND ~ log10(AREA) + GRAZE + YR.ISOL, data = loyn)
      summary(loyn.lm1)
      
      Call:
      lm(formula = ABUND ~ log10(AREA) + GRAZE + YR.ISOL, data = loyn)
      
      Residuals:
           Min       1Q   Median       3Q      Max 
      -14.5159  -3.8136   0.2027   3.1271  14.5542 
      
      Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
      (Intercept) -134.26065   86.39085  -1.554   0.1262    
      log10(AREA)    7.16617    1.27260   5.631 7.32e-07 ***
      GRAZE         -1.90216    0.87447  -2.175   0.0342 *  
      YR.ISOL        0.07835    0.04340   1.805   0.0768 .  
      ---
      Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
      
      Residual standard error: 6.311 on 52 degrees of freedom
      Multiple R-squared:  0.6732,	Adjusted R-squared:  0.6544 
      F-statistic: 35.71 on 3 and 52 DF,  p-value: 1.135e-12
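      Based on these outputs, the predictive models (with AREA log10 transformed) can be written out approximately as follows. From the model averaging (full average coefficients above):
      $$\text{ABUND} = -102.20 - 0.52\log_{10}(\text{DIST}) - 0.53\log_{10}(\text{LDIST}) + 7.55\log_{10}(\text{AREA}) - 1.73\,\text{GRAZE} + 0.011\,\text{ALT} + 0.062\,\text{YR.ISOL}$$
      and from the refitted most parsimonious model:
      $$\text{ABUND} = -134.26 + 7.17\log_{10}(\text{AREA}) - 1.90\,\text{GRAZE} + 0.078\,\text{YR.ISOL}$$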
      
    Q2-7. Examine the partial effects plots.
    Show added variance plots (car) code
    library(car)
    avPlots(loyn.lm1, ask = F)
    
    plot of chunk Q2-7a
    Show allEffects (effects) code
    library(effects)
    plot(allEffects(loyn.lm1, default.levels = 1000, partial.resid = TRUE),
        ask = F)
    
    plot of chunk Q2-7b
    # Note, the trend with AREA is plotted on a linear scale
    # despite being a log transformed predictor consequently the
    # trend appears non-linear
    
    plot(allEffects(loyn.lm1, default.levels = 1000, partial.resid = TRUE),
        transform.x = list(AREA = c(trans = log10, inverse = function(x) 10^x)),
        ticks.x = list(AREA = list(at = as.vector(c(1, 5) %o% 10^(-1:2)))),
        ask = F)
    
    plot of chunk Q2-7b
    Show ggeffect (ggeffects) code
    library(ggeffects)
    library(gridExtra)  # provides grid.arrange
    g = ggeffect(loyn.lm1) %>% plot
    do.call("grid.arrange", g)
    
    plot of chunk Q2-7c
    Show ggpredict (ggeffects) code
    library(ggeffects)
    g = ggpredict(loyn.lm1) %>% plot
    do.call("grid.arrange", g)
    
    plot of chunk Q2-7d
    Show ggemmeans (ggeffects) code
    library(ggeffects)
    grid.arrange(ggemmeans(loyn.lm1, ~AREA) %>% plot, ggemmeans(loyn.lm1,
        ~GRAZE) %>% plot, ggemmeans(loyn.lm1, ~YR.ISOL) %>% plot)
    
    plot of chunk Q2-7e
  7. How about a more flexible figure.
    Show predict code
    newdata = with(loyn, rbind(data.frame(Var = "AREA", AREA = seq(min(AREA),
        max(AREA), len = 100), GRAZE = mean(GRAZE), YR.ISOL = mean(YR.ISOL)),
        data.frame(Var = "GRAZE", AREA = mean(AREA), GRAZE = seq(min(GRAZE),
            max(GRAZE), len = 100), YR.ISOL = mean(YR.ISOL)), data.frame(Var = "YR.ISOL",
            AREA = mean(AREA), GRAZE = mean(GRAZE), YR.ISOL = seq(min(YR.ISOL),
                max(YR.ISOL), len = 100))))
    fit <- predict(loyn.lm1, newdata = newdata, interval = "confidence")
    fit <- cbind(newdata, fit)
    head(fit)
    
       Var     AREA    GRAZE YR.ISOL       fit        lwr      upr
    1 AREA  0.10000 2.982143 1949.75  5.669721  0.4540249 10.88542
    2 AREA 17.98788 2.982143 1949.75 21.829281 19.9466085 23.71195
    3 AREA 35.87576 2.982143 1949.75 23.977848 21.6553721 26.30032
    4 AREA 53.76364 2.982143 1949.75 25.236854 22.5868714 27.88684
    5 AREA 71.65152 2.982143 1949.75 26.130739 23.2284951 29.03298
    6 AREA 89.53939 2.982143 1949.75 26.824343 23.7179571 29.93073
    
    # calculate the partial observed values
    partial.obs <- with(loyn, rbind(data.frame(Var = "AREA", AREA = AREA,
        GRAZE = mean(GRAZE), YR.ISOL = mean(YR.ISOL)), data.frame(Var = "GRAZE",
        AREA = mean(AREA), GRAZE = GRAZE, YR.ISOL = mean(YR.ISOL)),
        data.frame(Var = "YR.ISOL", AREA = mean(AREA), GRAZE = mean(GRAZE),
            YR.ISOL = YR.ISOL)))
    partial.obs <- cbind(partial.obs, fit = predict(loyn.lm1, newdata = partial.obs))
    partial.obs = partial.obs %>% mutate(presid = fit + resid(loyn.lm1))
    head(partial.obs)
    
       Var AREA    GRAZE YR.ISOL       fit    presid
    1 AREA  0.1 2.982143 1949.75  5.669721  2.001867
    2 AREA  0.5 2.982143 1949.75 10.678656  8.169284
    3 AREA  0.5 2.982143 1949.75 10.678656  9.236347
    4 AREA  1.0 2.982143 1949.75 12.835887 15.860729
    5 AREA  1.0 2.982143 1949.75 12.835887 20.125990
    6 AREA  1.0 2.982143 1949.75 12.835887 12.939082
    
    fit = fit %>% mutate(Value = case_when(Var == "AREA" ~ AREA,
        Var == "GRAZE" ~ GRAZE, Var == "YR.ISOL" ~ YR.ISOL))
    partial.obs = partial.obs %>% mutate(Value = case_when(Var ==
        "AREA" ~ AREA, Var == "GRAZE" ~ GRAZE, Var == "YR.ISOL" ~
        YR.ISOL))
    
    ggplot(fit, aes(y = fit, x = Value)) + geom_line() + geom_ribbon(aes(ymin = lwr,
        ymax = upr), fill = "blue", alpha = 0.3) + geom_point(data = partial.obs,
        aes(y = presid)) + facet_wrap(~Var, scales = "free_x") +
        theme_bw()
    
    plot of chunk Q2-8a
    ## Or if we need a bit more finesse
    ### Area
    p1 <- ggplot(data = fit %>% filter(Var == "AREA"), aes(y = fit,
        x = AREA)) + geom_point(data = partial.obs %>% filter(Var ==
        "AREA"), aes(y = presid, x = AREA), color = "grey") + geom_ribbon(aes(ymin = lwr,
        ymax = upr), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Bird abundance") + scale_x_continuous(expression(Fragment ~
        area ~ (ha)), trans = scales::log_trans(base = 10)) + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    ### Grazing
    p2 <- ggplot(data = fit %>% filter(Var == "GRAZE"), aes(y = fit,
        x = GRAZE)) + geom_point(data = partial.obs %>% filter(Var ==
        "GRAZE"), aes(y = presid, x = GRAZE), color = "grey") + geom_ribbon(aes(ymin = lwr,
        ymax = upr), fill = "blue", alpha = 0.2) + geom_line() +
        scale_y_continuous("Bird abundance") + scale_x_continuous(expression(Grazing ~
        intensity)) + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    ### Year isolated
    p3 <- ggplot(data = fit %>% filter(Var == "YR.ISOL"), aes(y = fit,
        x = YR.ISOL)) + geom_point(data = partial.obs %>% filter(Var ==
        "YR.ISOL"), aes(y = presid, x = YR.ISOL), color = "grey") +
        geom_ribbon(aes(ymin = lwr, ymax = upr), fill = "blue", alpha = 0.2) +
        geom_line() + scale_y_continuous("Bird abundance") + scale_x_continuous(expression(Year ~
        isolated)) + theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    grid.arrange(p1, p2, p3, ncol = 2)
    
    plot of chunk Q2-8a
    Show emmeans code
    ## It is best to generate the partial trends one at a time
    ## AREA
    loyn.list = with(loyn, list(AREA = seq(min(AREA), max(AREA),
        len = 100), GRAZE = mean(GRAZE), YR.ISOL = mean(YR.ISOL)))
    loyn.grid = ref_grid(loyn.lm1, ~AREA, at = loyn.list, cov.reduce = FALSE)
    newdata = confint(loyn.grid)
    head(newdata)
    
          AREA    GRAZE YR.ISOL prediction       SE df   lower.CL upper.CL
    1  0.10000 2.982143 1949.75   5.669721 2.599210 52  0.4540249 10.88542
    2 17.98788 2.982143 1949.75  21.829281 0.938218 52 19.9466085 23.71195
    3 35.87576 2.982143 1949.75  23.977848 1.157392 52 21.6553721 26.30032
    4 53.76364 2.982143 1949.75  25.236854 1.320603 52 22.5868714 27.88684
    5 71.65152 2.982143 1949.75  26.130739 1.446315 52 23.2284951 29.03298
    6 89.53939 2.982143 1949.75  26.824343 1.548048 52 23.7179571 29.93073
    
    # calculate the partial observed values
    partial.obs <- ref_grid(loyn.lm1, ~AREA, cov.reduce = FALSE,
        at = list(AREA = loyn$AREA, GRAZE = mean(loyn$GRAZE), YR.ISOL = mean(loyn$YR.ISOL))) %>%
        summary %>% mutate(presid = prediction + as.vector(resid(loyn.lm1)))
    
    p1 <- ggplot(data = newdata, aes(y = prediction, x = AREA)) +
        geom_point(data = partial.obs, aes(y = presid, x = AREA),
            color = "grey") + geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL),
        fill = "blue", alpha = 0.2) + geom_line() + scale_y_continuous("Bird abundance") +
        scale_x_continuous(expression(Fragment ~ area ~ (ha)), trans = scales::log_trans(base = 10)) +
        theme_classic() + theme(axis.title.y = element_text(vjust = 1,
        size = rel(1.25)), axis.title.x = element_text(vjust = -1,
        size = rel(1.25)), plot.margin = unit(c(0.5, 0.5, 1, 1),
        "lines"))
    
    ### Grazing
    loyn.list = with(loyn, list(GRAZE = seq(min(GRAZE), max(GRAZE),
        len = 100), AREA = mean(AREA), YR.ISOL = mean(YR.ISOL)))
    loyn.grid = ref_grid(loyn.lm1, ~GRAZE, at = loyn.list, cov.reduce = FALSE)
    newdata = confint(loyn.grid)
    head(newdata)
    
          AREA    GRAZE YR.ISOL prediction       SE df lower.CL upper.CL
    1 69.26964 1.000000 1949.75   29.79587 1.728280 52 26.32782 33.26391
    2 69.26964 1.040404 1949.75   29.71901 1.705234 52 26.29721 33.14081
    3 69.26964 1.080808 1949.75   29.64216 1.682615 52 26.26574 33.01857
    4 69.26964 1.121212 1949.75   29.56530 1.660439 52 26.23339 32.89722
    5 69.26964 1.161616 1949.75   29.48845 1.638725 52 26.20010 32.77679
    6 69.26964 1.202020 1949.75   29.41159 1.617491 52 26.16586 32.65733
    
    # calculate the partial observed values
    partial.obs <- ref_grid(loyn.lm1, ~GRAZE, cov.reduce = FALSE,
        at = list(GRAZE = loyn$GRAZE, AREA = mean(loyn$AREA), YR.ISOL = mean(loyn$YR.ISOL))) %>%
        summary %>% mutate(presid = prediction + as.vector(resid(loyn.lm1)))
    
    p2 <- ggplot(data = newdata, aes(y = prediction, x = GRAZE)) +
        geom_point(data = partial.obs, aes(y = presid, x = GRAZE),
            color = "grey") + geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL),
        fill = "blue", alpha = 0.2) + geom_line() + scale_y_continuous("Bird abundance") +
        scale_x_continuous(expression(Grazing ~ intensity)) + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    
    ### Year isolated
    loyn.list = with(loyn, list(YR.ISOL = seq(min(YR.ISOL), max(YR.ISOL),
        len = 100), AREA = mean(AREA), GRAZE = mean(GRAZE)))
    loyn.grid = ref_grid(loyn.lm1, ~YR.ISOL, at = loyn.list, cov.reduce = FALSE)
    newdata = confint(loyn.grid)
    head(newdata)
    
          AREA    GRAZE  YR.ISOL prediction       SE df lower.CL upper.CL
    1 69.26964 2.982143 1890.000   21.34392 2.837729 52 15.64960 27.03824
    2 69.26964 2.982143 1890.869   21.41199 2.805184 52 15.78297 27.04100
    3 69.26964 2.982143 1891.737   21.48005 2.772769 52 15.91608 27.04402
    4 69.26964 2.982143 1892.606   21.54811 2.740490 52 16.04892 27.04731
    5 69.26964 2.982143 1893.475   21.61618 2.708351 52 16.18148 27.05088
    6 69.26964 2.982143 1894.343   21.68424 2.676357 52 16.31374 27.05475
    
    # calculate the partial observed values
    partial.obs <- ref_grid(loyn.lm1, ~YR.ISOL, cov.reduce = FALSE,
        at = list(YR.ISOL = loyn$YR.ISOL, AREA = mean(loyn$AREA),
            GRAZE = mean(loyn$GRAZE))) %>% summary %>% mutate(presid = prediction +
        as.vector(resid(loyn.lm1)))
    
    p3 <- ggplot(data = newdata, aes(y = prediction, x = YR.ISOL)) +
        geom_point(data = partial.obs, aes(y = presid, x = YR.ISOL),
            color = "grey") + geom_ribbon(aes(ymin = lower.CL, ymax = upper.CL),
        fill = "blue", alpha = 0.2) + geom_line() + scale_y_continuous("Bird abundance") +
        scale_x_continuous(expression(Year ~ isolation)) + theme_classic() +
        theme(axis.title.y = element_text(vjust = 1, size = rel(1.25)),
            axis.title.x = element_text(vjust = -1, size = rel(1.25)),
            plot.margin = unit(c(0.5, 0.5, 1, 1), "lines"))
    
    grid.arrange(p1, p2, p3, ncol = 2)
    
    plot of chunk Q2-8b
  8. As an arguably more sensible approach to model selection, let's propose a small set of plausible candidate models for comparison. In the absence of any other motives or criteria, I am going to propose we explore the following:
    • fragment area and grazing intensity - habitat model
    • patch distance - connectivity model
    • years of isolation - history model
    • fragment area, grazing intensity and patch distance - habitat/connectivity model
    Show code
    loyn.lmH <- lm(ABUND ~ log10(AREA) + GRAZE, data = loyn)
    loyn.lmC <- lm(ABUND ~ log10(DIST), data = loyn)
    loyn.lmY <- lm(ABUND ~ YR.ISOL, data = loyn)
    loyn.lmHC <- lm(ABUND ~ log10(DIST) + log10(AREA) + GRAZE, data = loyn)
    library(MuMIn)
    AICc(loyn.lmH, loyn.lmC, loyn.lmY, loyn.lmHC)
    
              df     AICc
    loyn.lmH   4 373.2977
    loyn.lmC   3 429.2976
    loyn.lmY   3 413.8419
    loyn.lmHC  5 374.3646
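    To make the comparison easier to interpret, the AICc values can also be converted into differences from the best candidate and Akaike weights; a minimal sketch, computed manually from the table above:

    loyn.aicc <- AICc(loyn.lmH, loyn.lmC, loyn.lmY, loyn.lmHC)
    ## difference from the best (lowest AICc) candidate model
    loyn.aicc$delta <- loyn.aicc$AICc - min(loyn.aicc$AICc)
    ## Akaike weights: relative support for each candidate model
    loyn.aicc$weight <- exp(-0.5 * loyn.aicc$delta)/sum(exp(-0.5 * loyn.aicc$delta))
    loyn.aicc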
    

Polynomial regression

Rademaker and Cerqueira (2006) compiled data from the literature on the reproductive traits of opossums (Didelphis) so as to investigate latitudinal trends in reproductive output. In particular, they were interested in whether there were any patterns in mean litter size across a latitudinal gradient from 44°N to 34°S. Analyses of data compiled from second-hand sources are called meta-analyses and are very useful at revealing overall trends across a range of studies.

Download Rademaker data set
Format of rademaker.csv data files
SPECIES   LATITUDE   L/Y   MAOP   MLS   REFERENCE
D.v.      44         2     16.8   8.4   Tyndale-Biscoe and Mackenzie (1976)
D.v.      41         1.5   14.1   9.4   Hossler et al. (1994)
D.v.      41.5       ?     ?      8.6   Reynolds (1952)
D.v.      41         2     18     9     Wiseman and Hendrickson (1950)
D.v.      40         2     15.8   7.9   Sanderson (1961)
...       ...        ...   ...    ...   ...

SPECIES     Didelphid species (D.al. = Didelphis albiventris, D.au. = Didelphis aurita, D.m. = Didelphis marsupialis, D.v. = Didelphis virginiana) - Descriptor variable
LATITUDE    Latitude (degrees) of study site - Predictor variable
L/Y         Mean number of litters per year - Response variable
MAOP        Mean annual offspring production - Response variable
MLS         Mean litter size - Response variable
REFERENCE   Original source of data

Open the rademaker data file.
Show code
rademaker <- read.table("../downloads/data/rademaker.csv", header = T, sep = ",", strip.white = T)
head(rademaker)
  SPECIES LATITUDE L.Y MAOP MLS         DB                           REFERENCE
1    D.v.     44.0 2.0 16.8 8.4 M-Feb-Sept Tyndale-Biscoe and Mackenzie (1976)
2    D.v.     41.0 1.5 14.1 9.4 E-Mar-Sept               Hossler et al. (1994)
3    D.v.     41.5  NA   NA 8.6                                Reynolds (1952)
4    D.v.     41.0 2.0 18.0 9.0 M-Feb-Sept      Wiseman and Hendrickson (1950)
5    D.v.     40.0 2.0 15.8 7.9 M-Feb-Sept                    Sanderson (1961)
6    D.v.     39.0 2.0 17.8 8.9 E-Feb-Sept                     Reynolds (1945)

The main variables of interest in this data set are MLS (mean litter size) and LATITUDE. The other variables were included so as to enable you to see how meta data might be collected and collated from a number of other sources.

The relationship between two continuous variables can be analyzed by simple linear regression. Before performing the analysis we need to check the assumptions. To evaluate the assumptions of linearity, normality and homogeneity of variance, construct a scatterplot of MLS against LATITUDE including a lowess smoother and boxplots on the axes. (HINT)
Show code
library(car)
scatterplot(MLS ~ LATITUDE, data = rademaker)
plot of chunk Q3-a

  1. Is there any evidence that any of the regular assumptions of simple linear regression are likely to be violated?
  2. To get an appreciation of what a residual plot would look like when there is some evidence that the linearity assumption has been violated, perform the simple linear regression (by fitting a linear model) purely for the purpose of examining the regression diagnostics (particularly the residual plot).
    Show code
    # fit the model
    rademaker.lm <- lm(MLS ~ LATITUDE, data = rademaker)
    # setup a 2x3 plotting device with small margins
    par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
    plot(rademaker.lm, ask = F, which = 1:6)
    
    plot of chunk Q3-1a
    influence.measures(rademaker.lm)
    
    Influence measures of
     lm(formula = MLS ~ LATITUDE, data = rademaker) :
    
          dfb.1_  dfb.LATI    dffit cov.r   cook.d    hat inf
    1   0.094628  0.201743  0.24356 1.118 3.00e-02 0.0861    
    2   0.185879  0.354286  0.43929 1.007 9.30e-02 0.0773    
    3   0.114708  0.222842  0.27499 1.093 3.79e-02 0.0787    
    4   0.150114  0.286119  0.35477 1.053 6.20e-02 0.0773    
    5   0.057329  0.105129  0.13168 1.131 8.87e-03 0.0745    
    6   0.144545  0.254834  0.32264 1.057 5.15e-02 0.0719    
    7   0.016273  0.028690  0.03632 1.141 6.79e-04 0.0719    
    8   0.041857  0.072336  0.09210 1.133 4.35e-03 0.0705    
    9   0.000616  0.001001  0.00130 1.135 8.68e-07 0.0667    
    10 -0.085085 -0.111051 -0.15544 1.095 1.23e-02 0.0552    
    11 -0.031444 -0.039138 -0.05583 1.116 1.60e-03 0.0531    
    12 -0.031444 -0.039138 -0.05583 1.116 1.60e-03 0.0531    
    13 -0.076087 -0.090187 -0.13132 1.096 8.79e-03 0.0512    
    14  0.049708  0.009612  0.05345 1.084 1.47e-03 0.0279    
    15 -0.278150  0.002187 -0.28314 0.925 3.80e-02 0.0270    
    16 -0.302666  0.002380 -0.30809 0.899 4.44e-02 0.0270    
    17 -0.117656  0.000925 -0.11977 1.057 7.27e-03 0.0270    
    18 -0.267330  0.012377 -0.27036 0.939 3.49e-02 0.0271    
    19 -0.255295  0.011820 -0.25819 0.951 3.21e-02 0.0271    
    20 -0.231553  0.010721 -0.23418 0.973 2.67e-02 0.0271    
    21 -0.317516  0.020735 -0.32026 0.887 4.76e-02 0.0271    
    22 -0.280153  0.018295 -0.28257 0.927 3.79e-02 0.0271    
    23 -0.046707  0.026910 -0.05012 1.097 1.29e-03 0.0380    
    24  0.021524 -0.017109  0.02524 1.115 3.28e-04 0.0500    
    25  0.333540 -0.265117  0.39106 0.947 7.25e-02 0.0500    
    26  0.234031 -0.186021  0.27439 1.027 3.72e-02 0.0500    
    27 -0.082266  0.069901 -0.09893 1.109 5.01e-03 0.0540    
    28  0.118457 -0.103838  0.14428 1.100 1.06e-02 0.0561    
    29 -0.002004  0.001757 -0.00244 1.123 3.07e-06 0.0561    
    30  0.051315 -0.044982  0.06250 1.118 2.01e-03 0.0561    
    31  0.231543 -0.209119  0.28565 1.043 4.04e-02 0.0582    
    32  0.066101 -0.059700  0.08155 1.118 3.41e-03 0.0582    
    33  0.067544 -0.062775  0.08440 1.121 3.65e-03 0.0605    
    34  0.311295 -0.297381  0.39396 0.991 7.48e-02 0.0628    
    35  0.180242 -0.172186  0.22811 1.081 2.62e-02 0.0628    
    36 -0.039381  0.039625 -0.05112 1.134 1.34e-03 0.0677    
    37  0.023962 -0.028153  0.03388 1.160 5.91e-04 0.0873    
    

  3. How would you describe the residual plot?
  4. For this sort of trend that is clearly non-linear (yet the boxplots suggest normal data), transformations are of no use. Therefore, rather than attempt to model the data on a simple linear relationship (straight line), it is better to model the data on a curvilinear relationship (curved line). Note that it is important to make the distinction between relationship (line) linearity and model linearity.

  5. If the assumptions are otherwise met, perform the second order polynomial regression analysis (fit the quadratic polynomial model), examine the output, and use the information to construct the regression equation relating mean litter size to latitude:
    Show code
    rademaker.lm <- lm(MLS ~ LATITUDE + I(LATITUDE^2), data = rademaker)
    rademaker.lm
    
    Call:
    lm(formula = MLS ~ LATITUDE + I(LATITUDE^2), data = rademaker)
    
    Coefficients:
      (Intercept)       LATITUDE  I(LATITUDE^2)  
         5.391575      -0.029758       0.002448  
    
    $$\text{DV} = \text{intercept} + \text{slope}_1\times\text{IV} + \text{slope}_2\times\text{IV}^2$$
    $$\text{Mean litter size} = 5.392 - 0.0298\times\text{latitude} + 0.00245\times\text{latitude}^2$$
  6. In polynomial regression, there are two hypotheses of interest. Firstly, as there are two slopes, we are now interested in whether the individual slopes are equal to one another and to zero (that is, does the overall model explain our data better than just a horizontal line (no trend)?). Secondly, we are interested in whether the second order polynomial model fits the data any better than a simple first order (straight-line) model. Test these null hypotheses.
    Show code
    summary(rademaker.lm)
    
    Call:
    lm(formula = MLS ~ LATITUDE + I(LATITUDE^2), data = rademaker)
    
    Residuals:
        Min      1Q  Median      3Q     Max 
    -2.3335 -0.6117 -0.1748  0.4688  2.4592 
    
    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
    (Intercept)    5.391574   0.290622  18.552  < 2e-16 ***
    LATITUDE      -0.029758   0.008385  -3.549  0.00115 ** 
    I(LATITUDE^2)  0.002448   0.000368   6.653 1.24e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 1.019 on 34 degrees of freedom
    Multiple R-squared:  0.5711,	Adjusted R-squared:  0.5459 
    F-statistic: 22.64 on 2 and 34 DF,  p-value: 5.624e-07
    
    rademaker.lm1 <- lm(MLS ~ LATITUDE, data = rademaker)
    anova(rademaker.lm, rademaker.lm1)
    
    Analysis of Variance Table
    
    Model 1: MLS ~ LATITUDE + I(LATITUDE^2)
    Model 2: MLS ~ LATITUDE
      Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
    1     34 35.322                                  
    2     35 81.311 -1   -45.989 44.267 1.237e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    

    1. The slopes are significantly different from one another and from zero (F = 22.64, df = 2,34, P < 0.001)
    2. The second order polynomial regression model fits the data significantly better than a first order (straight line) regression model (F = 44.27, df = 1,34, P < 0.001)
  7. What are your conclusions (statistical and biological)?

  8. Such an outcome might also be accompanied by a scatterplot that illustrates the relationship between mean litter size and latitude. Construct a scatterplot without a smoother or marginal boxplots (HINT). Include a quadratic (second order) trendline on this graph (HINT).
    Show code
    opar <- par(mfrow = c(2, 2))
    # first plot trend with 95% confidence intervals
    opar1 <- par(mar = c(5, 5, 0, 0))
    plot(MLS ~ LATITUDE, data = rademaker, ann = F, axes = F, type = "n")
    points(MLS ~ LATITUDE, data = rademaker, col = "grey", pch = 16)
    xs <- with(rademaker, seq(min(LATITUDE), max(LATITUDE), l = 1000))
    ys <- predict(rademaker.lm, data.frame(LATITUDE = xs), interval = "confidence")
    points(ys[, "fit"] ~ xs, col = "black", type = "l")
    points(ys[, "lwr"] ~ xs, col = "black", type = "l", lty = 2)
    points(ys[, "upr"] ~ xs, col = "black", type = "l", lty = 2)
    axis(1)
    mtext("Lattitude", 1, line = 3, cex = 1.2)
    axis(2, las = 1)
    mtext(expression(paste("Mean litter size (", phantom() %+-% 95, "% CI)")), 2, line = 3, cex = 1.2)
    box(bty = "l")
    par(opar1)
    
    # second plot trend with 95% prediction intervals
    opar1 <- par(mar = c(5, 5, 0, 0))
    plot(MLS ~ LATITUDE, data = rademaker, ann = F, axes = F, type = "n")
    points(MLS ~ LATITUDE, data = rademaker, col = "grey", pch = 16)
    xs <- with(rademaker, seq(min(LATITUDE), max(LATITUDE), l = 1000))
    ys <- predict(rademaker.lm, data.frame(LATITUDE = xs), interval = "prediction")
    points(ys[, "fit"] ~ xs, col = "black", type = "l")
    points(ys[, "lwr"] ~ xs, col = "black", type = "l", lty = 2)
    points(ys[, "upr"] ~ xs, col = "black", type = "l", lty = 2)
    axis(1)
    mtext("Lattitude", 1, line = 3, cex = 1.2)
    axis(2, las = 1)
    mtext(expression(paste("Mean litter size (", phantom() %+-% 95, "% PI)")), 2, line = 3, cex = 1.2)
    box(bty = "l")
    par(opar1)
    
    # third plot trend with standard error
    opar1 <- par(mar = c(5, 5, 0, 0))
    plot(MLS ~ LATITUDE, data = rademaker, ann = F, axes = F, type = "n")
    points(MLS ~ LATITUDE, data = rademaker, col = "grey", pch = 16)
    xs <- with(rademaker, seq(min(LATITUDE), max(LATITUDE), l = 1000))
    ys <- predict(rademaker.lm, data.frame(LATITUDE = xs), se.fit = T)
    points(ys$fit ~ xs, col = "black", type = "l")
    points(ys$fit - ys$se.fit ~ xs, col = "black", type = "l", lty = 2)
    points(ys$fit + ys$se.fit ~ xs, col = "black", type = "l", lty = 2)
    axis(1)
    mtext("Lattitude", 1, line = 3, cex = 1.2)
    axis(2, las = 1)
    mtext(expression(paste("Mean litter size (", phantom() %+-% 1, "SE)")), 2, line = 3, cex = 1.2)
    box(bty = "l")
    par(opar1)
    
    # fourth plot trend with 2 standard errors
    opar1 <- par(mar = c(5, 5, 0, 0))
    plot(MLS ~ LATITUDE, data = rademaker, ann = F, axes = F, type = "n")
    points(MLS ~ LATITUDE, data = rademaker, col = "grey", pch = 16)
    xs <- with(rademaker, seq(min(LATITUDE), max(LATITUDE), l = 1000))
    ys <- predict(rademaker.lm, data.frame(LATITUDE = xs), se.fit = T)
    points(ys$fit ~ xs, col = "black", type = "l")
    points(ys$fit - 2 * ys$se.fit ~ xs, col = "black", type = "l", lty = 2)
    points(ys$fit + 2 * ys$se.fit ~ xs, col = "black", type = "l", lty = 2)
    axis(1)
    mtext("Lattitude", 1, line = 3, cex = 1.2)
    axis(2, las = 1)
    mtext(expression(paste("Mean litter size (", phantom() %+-% 2, "SE)")), 2, line = 3, cex = 1.2)
    box(bty = "l")
    
    plot of chunk Q3-6
    par(opar1)
    par(opar)
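
    The same scatterplot with a quadratic trend and 95% confidence band can also be produced with ggplot2 (loaded via tidyverse in the preliminaries). This is an optional sketch, not part of the original answer:

    library(ggplot2)
    ggplot(rademaker, aes(y = MLS, x = LATITUDE)) +
        geom_point(colour = "grey") +
        geom_smooth(method = "lm", formula = y ~ x + I(x^2), colour = "black") +
        labs(y = "Mean litter size", x = "Latitude") +
        theme_classic()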
    

  Frequentist pooled variances t-test

> t.test(dv~factor, data=data, var.equal=TRUE)
# OR, equivalently, as a linear model
> data.lm <- lm(dv~factor, data=data)
> summary(data.lm)

where dv is the name of the numeric vector (dependent variable), factor is the name of the two-level factor (grouping variable) and data is the name of the data frame (data set).

End of instructions

  Reading data into R

Ensure that the working directory is pointing to the path containing the file to be imported before proceeding.
To import data into R, we read the contents of a file into a data frame. The general format of the command for reading data into a data frame is

> name <- read.table('filename.csv', header=T, sep=',', row.names=column, strip.white=T)

where name is a name you wish the data frame to be referred to as, filename.csv is the name of the csv file that you created in Excel and column is the number of the column that had row labels (if there were any). The argument header=T indicates that the variable (vector) names will be created from the names supplied in the first row of the file. The argument sep=',' indicates that entries in the file are separated by a comma (hence a comma delimited file). If the data file does not contain row labels, or you are not sure whether it does, it is best to omit the row.names=column argument. The strip.white=T argument ensures that no leading or trailing spaces are left in character names (these can be particularly nasty in categorical variables!).

As an example
phasmid <- read.table('phasmid.csv', header=T, sep=',', row.names=1, strip.white=T)

End of instructions

  Scatterplot matrix (SPLOM)

# make sure the car package is loaded
> library(car)
> scatterplotMatrix(~var1+var2+var3, data=data, diagonal='boxplot')

where var1, var2 etc are the names of numeric vectors in the data data frame (data set)

End of instructions

  (Multi)collinearity

Predictor variables must not be correlated to the combination of other predictor variables in the fitted model. Multicollinearity has major detrimental effects on model fitting, most notably inflated standard errors and unstable (unreliable) estimates of the partial regression slopes. Multicollinearity can be diagnosed with tolerance and variance inflation factors (see the following section), and can be dealt with by omitting or combining offending (highly inter-correlated) predictor variables. Interaction terms in multiplicative models are likely to be correlated with their constituent individual predictors, and thus the partial slopes of these individual predictors are likely to be unstable. However, this problem can be reduced by first centering (subtracting the mean from the predictor values) the individual predictor variables.
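
A minimal sketch of the centering approach (y, x1 and x2 are placeholder names, not variables from the tutorial data):

> data$cx1 <- data$x1 - mean(data$x1)      # centre x1 (subtract its mean)
> data$cx2 <- data$x2 - mean(data$x2)      # centre x2
> data.lm <- lm(y ~ cx1 * cx2, data=data)  # the interaction term is now far less correlated with the main effects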

End of instructions

  Tolerance and Variance Inflation

The tolerance value for a predictor variable is 1 - r2 from the regression of that predictor variable against the other predictor variables included in the linear model. Low tolerances (less than 0.2, approaching 0.1) signify that the predictor variable is correlated with the set of other predictor variables (and likely to cause issues).

Variance inflation factor (VIF) is the inverse of tolerance and is just another way of expressing the degree of correlation between a predictor variable and other predictor variables in a linear model. VIF values greater than 5 (approaching 10) indicate strong collinearity and are therefore of concern.

R has a function (vif) in the car package for calculating VIFs.
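
For example, a minimal sketch using the generic template names from these boxes (data.lm, dv, iv1 etc are placeholders, not variables from the tutorial data):

> data.lm <- lm(dv ~ iv1 + iv2 + iv3, data=data)
> library(car)
> vif(data.lm)     # variance inflation factors; values greater than 5 (approaching 10) are of concern
> 1/vif(data.lm)   # tolerances; values less than 0.2 (approaching 0.1) are of concern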

End of instructions

  Multiple linear model fitting

> name.lm <- lm(dv~iv1+iv2, data=data)

where name.lm is a name you provide for the output, dv is the name of the dependent variable and iv1, iv2 etc are the independent variables. data is the name of the data frame (data set).

End of instructions

  Best regression model

Clearly, a regression model that includes all of the predictor variables will explain variation in the response variable better than models containing only a subset of the predictor variables. But how much better? Models with fewer predictor variables are easier and more precise to use for predictive purposes. Moreover, we are often primarily interested in determining which predictors are most important in explaining variation in the response variable.

Therefore, the best regression model should be a model that only includes the most important predictor variables. It should be the most efficient model - that is the model that explains the greatest amount of variation in the response variable with the least number of predictors.

There are a number of ways of determining which model is 'best'. These are based on:
  1. Adjusted r2 - this is the r2 that takes into account the number of predictor variables. It is calculated as 1 - MSresidual/MStotal (rather than 1 - SSresidual/SStotal), and thus penalizes models with more predictors. Larger values indicate better, more parsimonious models.
  2. AIC - Akaike Information Criterion - based on (log)likelihoods, this is a relative measure of goodness of fit of a model given the set of data. Lower values indicate better, more parsimonious models. The AIC comes in a number of forms;
    • a parametric model only version (extractAIC(model))
    • a generic version (AIC(model))
    • AICc a second order correction (for small sample sizes) version (library(MuMIn);AICc(model))
    • QAIC a quasi-AIC corrected for overdispersion (library(MuMIn);QAIC(model))
  3. BIC - Schwarz Bayesian Information Criterion - is similar to AIC, but it penalizes more heavily against models with more predictors. Lower values indicate better, more parsimonious models. There are two variants on the BIC;
    • a parametric model only version (extractAIC(model,k=log(nrow(dataset))))
    • a generic version (AIC(model,k=log(nrow(dataset))), or equivalently BIC(model))
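
As a rough sketch of how each of these criteria might be extracted from a fitted model (data.lm and data are placeholder names):

> summary(data.lm)$adj.r.squared      # adjusted r2
> extractAIC(data.lm)                 # parametric AIC (returns degrees of freedom and AIC)
> AIC(data.lm)                        # generic AIC
> library(MuMIn)
> AICc(data.lm)                       # second order (small sample) corrected AIC
> AIC(data.lm, k=log(nrow(data)))     # BIC (generic form)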

End of instructions

  Model selection (dredging) and averaging

Model selection is generally the process of selecting the most parsimonious model from a large collection of possible models - that is, the model that explains the most with the fewest predictor variables. For an additive model with five predictor variables, there is a large number (2^5 = 32) of possible combinations of predictor variables to assess in order to determine which is the most efficient. There are various techniques that have been used in the past to attempt to find the most efficient models with limited computing power. These techniques (stepwise regression) essentially involve starting with the full saturated model and incrementally removing the most non-significant term (each time re-running the model) until the model either contains only significant terms or the goodness of fit measure (adjusted r2) plateaus.

Whilst these techniques were computationally efficient, not all possible combinations of predictor variables were explored. Therefore it is possible that more parsimonious models were overlooked. Recent advances in computing power now permit exploration of all possible model combinations (a process called 'dredging'). Dredging can be performed with one of a number of selection criteria, although AIC, AICc and QAIC are the most popular.

Full exploration (fitting and coefficient estimation) of all possible combinations of predictor terms also allows us to get an estimate of the general effect of any given term across all possible model combinations. This process (called 'model averaging') calculates the average (and confidence intervals) of each regression coefficient across all possible model combinations. Arguably, not only does this provide more accurate estimates, it allows us to estimate the relative contributions (importance) of each predictor term.

Model dredging and averaging in R is a multi-step process - although it can all be done in a single line of code

# make sure the MuMIn package is loaded
> model.avg(get.models(dredge(model, rank = "AIC"), subset = delta <= 4))

where model is the fitted model. Working from the inside out (starting with the dredge function), all possible combinations of predictors from the original model are explored and the goodness of fit of each is assessed via AIC (the generic version). Next the get.models function is used to determine which of the models to use in the model averaging process. By default, only those models with a delta (difference in AIC between each model and the model with the lowest AIC) less than or equal to 4 are retained (although this can of course be altered). Finally, this selection of models is used by the model.avg function to calculate the mean regression coefficients (and confidence intervals) and estimate the relative importance of each term.
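
As a fuller (hypothetical) sketch using the generic template names from the boxes above - note that dredge requires the global model to be fitted with na.action=na.fail (or on data with no missing values):

> library(MuMIn)
> data.lm <- lm(dv ~ iv1 + iv2 + iv3, data=data, na.action=na.fail)
> data.dredge <- dredge(data.lm, rank="AICc")                          # fit all possible subsets of predictors
> data.avg <- model.avg(get.models(data.dredge, subset = delta <= 4))  # average models within 4 AICc units of the best
> summary(data.avg)                                                    # averaged coefficients and relative importance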

End of instructions

  Hierarchical Partitioning

Hierarchical partitioning determines the individual and joint contributions of each predictor variable by partitioning the r2 of the full model (containing all predictors). The individual contribution of a predictor variable is determined by calculating the improvement in fit (in r2) across a series of models (a hierarchy of models) with and without that predictor variable. Joint contributions are calculated from the differences between the partial r2 for a series of models and the independent contributions.

End of instructions

  Hierarchical Partitioning

> loyn2 <- data.frame(logAREA=log10(loyn$AREA), logDIST=log10(loyn$DIST), logLDIST=log10(loyn$LDIST), ALT=loyn$ALT,GRAZE=loyn$GRAZE, YEARS=loyn$YR.ISOL)
> library(hier.part)
> loyn.hier<-hier.part(loyn$ABUND, loyn2, gof="Rsqu")
> loyn.hier

The first step is to create a data frame that contains only the predictor variables. Then we load a package called hier.part by Chris Walsh and Ralph MacNally (Monash University). This contains the functions necessary to perform hierarchical partitioning in R.
Finally, we use the hier.part() function to perform the hierarchical partitioning and print out the results. The gof= argument specifies the goodness-of-fit measure to use - in this case r2 values.

End of instructions

  Scatterplots

# load the 'car' library
> library(car)
# generate a scatterplot
> scatterplot(dv~iv,data=data)

where dv is the dependent variable, iv is the independent variable and data is the data frame (data set).

End of instructions

  Linear model fitting

> name.lm <- lm(dv~iv, data=data)

where name.lm is a name you provide for the output, dv is the name of the dependent variable, iv is the name of the independent variable and data is the name of the data frame (data set).

End of instructions

  Linear regression diagnostics

> plot(name.lm)

where name.lm is the name of a fitted linear model. You will be prompted to hit return to cycle through a series of four (or six) diagnostic plots.
  1. The first (and most important) of these is the Residual Plot. The residual plot graphs the residuals (differences between observed and expected values) against the fitted (expected) values. Residual plots assist in identifying issues with heterogeneity of variance as well as non-linearity
  2. The normal Q-Q plot compares the standardized residuals to normal quantiles. This plot helps identify issues with normality
  3. The scale-location (standardized/normalized residuals) plot, is similar to the residual plot described above except that it is on a standard scale that can permit automatic outlier detection
  4. The Residuals vs Leverage plot. Whilst the residuals are a measure of deviation in y-space, leverage is a measure of deviation in x-space (and thus each is a measure of outlierness). Combining these on one graph, along with contours of Cook's D (a measure of outlierness in both x- and y-space), allows us to identify overly influential observations. Observations approaching the Cook's D contour of 1 are considered highly influential

> resid(name.lm)
> influence.measures(name.lm)

Collectively, these two functions provide information about the relative influence of each observation on the final model fit. Ideally, all observations should have an equal influence on the fit. Outliers are observations that have an over-representative impact on fitted models by pulling trends towards themselves to a greater degree than other observations (and thus biasing outcomes).

The first function lists the residuals corresponding to each of the observations in the data set (from which the model was fit). A residual measures the difference between an observation's observed value and the value that would be expected from the trendline (fitted model). Therefore, the residual value of any observation is a measure of how much of an outlier the observation is in the y-axis direction.

The second function produces a matrix whose rows correspond to each of the observations in the data set (from which the model was fit). The two important columns are the ones labelled 'hat' (the leverage values) and 'cook.d' (Cook's D values). Leverage measures how influential each observation was in the model fit in terms of its deviation in x-space (how unusual is the observation with respect to the other values of the predictor variable?). As with residual values, any leverage values that are much larger than the general set of leverage values should be further explored as they indicate potential outliers. Cook's D combines residual and leverage values into a single value that measures how influential each observation is in both x-space and y-space. As a rule of thumb, Cook's D values approaching 1 should be examined as they are having a larger influence on the model fitting.
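
For example (a minimal sketch using the generic name.lm placeholder from above):

> infl <- influence.measures(name.lm)
> summary(infl)                        # lists the observations flagged as potentially influential
> which(cooks.distance(name.lm) > 1)   # observations with Cook's D exceeding the rule-of-thumb value of 1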

End of instructions

  Residual plot

There is a residual for each point. The residuals are the differences between the observed Y-values and the predicted Y-values (points on the least squares regression line) for each X-value. A residual plot plots the residuals (Y-axis) against the predicted Y-values (X-axis). Plot a) shows a uniform spread of data around the least squares regression line and thus there is an even scatter of points in the residual plot (Plot b). In Plot c) the deviations of points from the predicted Y-values become greater with increasing X-values and thus there is a wedge-shaped pattern in the residuals. Such a wedge suggests a violation of the homogeneity of variance assumption, and thus the regression may be unreliable.



End of instructions

  Linear relationship vs linear model

Linear models are those statistical models in which the parameters (the coefficients associated with the predictor terms) are arranged as a linear combination. That is, within the model, no parameter appears as a multiplier, divisor or exponent to any other parameter. Importantly, the term 'linear' in this context does not pertain to the nature of the relationship between the response variable and the predictor variable(s), and thus linear models are not restricted to 'linear' (straight-line) relationships. The following examples should help make the distinction.

A linear model representing a linear relationship
A linear model representing a curvilinear relationship
A non-linear model representing a curvilinear relationship
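
The equations that originally illustrated these three cases are not reproduced here; the following are representative forms (the exponential curve is just one of many possible non-linear examples), written in the notation used elsewhere in this tutorial:

$$y_i=\beta_0+\beta_1x_i+\epsilon_i \quad \text{(linear model, linear relationship)}$$
$$y_i=\beta_0+\beta_1x_i+\beta_2x_i^2+\epsilon_i \quad \text{(linear model, curvilinear relationship)}$$
$$y_i=\beta_0e^{\beta_1x_i}+\epsilon_i \quad \text{(non-linear model, curvilinear relationship)}$$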

End of instructions

  Fitting Polynomial regression models


> name.lm <- lm(dv~iv+I(iv^2), data=data)

where name.lm is a name you provide for the output, dv is the name of the dependent variable, iv is the name of the independent variable and data is the name of the data frame (data set). The function I() essentially preserves the meaning of the object in brackets so that it is treated as a separate term in the model. The above template defines a second order polynomial (quadratic). To define higher order polynomials, add additional instances of the iv raised to the appropriate power, each enclosed within an I() function.
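
For example, a third order (cubic) polynomial could be specified as follows (same placeholder names as above; the poly() alternative with raw=TRUE is equivalent):

> name.lm <- lm(dv~iv+I(iv^2)+I(iv^3), data=data)
# or, equivalently
> name.lm <- lm(dv~poly(iv, 3, raw=TRUE), data=data)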

End of instructions



  Testing the null hypothesis that β1=β2=...=0

Testing the null hypothesis that each of the slopes is equal to each other and to zero (β1=β2=...=0) is done by comparing the fit of the full model (Y=c+β1X1+β2X2) to the fit of a reduced model in which β1 and β2 both equal zero (i.e. Y=c).
This specific null hypothesis is summarized when you summarize the linear model fit. It is the F-ratio (with its degrees of freedom and P-value) given at the end of the summary output.
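
A minimal sketch of this test in its model-comparison form (null.lm and full.lm are placeholder names for a previously fitted model and its intercept-only counterpart):

> null.lm <- lm(dv~1, data=data)   # intercept-only (reduced) model
> anova(null.lm, full.lm)          # yields the same F-ratio as reported at the end of summary(full.lm)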

End of instructions

  Comparing the fit of two models

> anova(reduced.lm, full.lm)

where full.lm is the name of the full model and reduced.lm is the name of the reduced model (although it doesn't matter which model is put first - this only affects the sign of some of the resulting values, and you only take absolute values anyway).
A modified ANOVA table will be presented. The bottom line summarizes the results of comparing the two models. This line indicates the F-ratio, degrees of freedom and P-value for the comparison. Note, the F-ratio and P-value obtained from this model comparison are equivalent to the regression coefficient t-value (F = t2) and P-value given in the full model summary.

End of instructions

  Factorial boxplots

> boxplot(dv~factor,data=data)

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable) and data is the name of the data frame (data set). The '~' is used in formulas to indicate that the left hand side is modelled as a function of the right hand side.

End of instructions