Preparations

Load the necessary libraries

library(tidyverse)
## Loading tidyverse: ggplot2
## Loading tidyverse: tibble
## Loading tidyverse: tidyr
## Loading tidyverse: readr
## Loading tidyverse: purrr
## Loading tidyverse: dplyr
## Conflicts with tidy packages ----------------------------------------------
## filter(): dplyr, stats
## lag():    dplyr, stats

Scenario

A plant pathologist wanted to examine the effects of two different strengths of tobacco virus on the number of lesions on tobacco leaves. She knew from pilot studies that leaves were inherently very variable in response to the virus. In an attempt to account for this leaf to leaf variability, both treatments were applied to each leaf. Eight individual leaves were divided in half, with half of each leaf inoculated with weak strength virus and the other half inoculated with strong virus. So the leaves were blocks and each treatment was represented once in each block. A completely randomised design would have had 16 leaves, with 8 whole leaves randomly allocated to each treatment.

Tobacco plant

Tobacco plant

Format of tobacco.csv data files

LEAF TREAT NUMBER
1 Strong 35.898
1 Week 25.02
2 Strong 34.118
2 Week 23.167
3 Strong 35.702
3 Week 24.122
... ... ...
LEAF The blocking factor - Factor B
TREAT Categorical representation of the strength of the tobacco virus - main factor of interest Factor A
NUMBER Number of lesions on that part of the tobacco leaf - response variable

Read in the data

tobacco = read_csv('data/tobacco.csv', trim_ws=TRUE)
## Parsed with column specification:
## cols(
##   LEAF = col_character(),
##   TREATMENT = col_character(),
##   NUMBER = col_double()
## )
glimpse(tobacco)
## Observations: 16
## Variables: 3
## $ LEAF      <chr> "L1", "L1", "L2", "L2", "L3", "L3", "L4", "L4", "L5"...
## $ TREATMENT <chr> "Strong", "Weak", "Strong", "Weak", "Strong", "Weak"...
## $ NUMBER    <dbl> 35.89776, 25.01984, 34.11786, 23.16740, 35.70215, 24...

Exploratory data analysis

Model formula: \[ y_i \sim{} \mathcal{N}(\mu_i, \sigma^2)\\ \mu_i =\boldsymbol{\beta} \bf{X_i} + \boldsymbol{\gamma} \bf{Z_i} \]

where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are vectors of the fixed and random effects parameters respectively and \(\bf{X}\) is the model matrix representing the overall intercept and effects of the treatment on the number of lesions. \(\bf{Z}\) represents a cell means model matrix for the random intercepts associated with leaves.

Fit the model

Model validation

Model investigation / hypothesis testing

Predictions

Summary figures

References