Jump to main navigation


Workshop 9.4a - Analysis of Variance

9 June 2011

Basic ANOVA references

  • Logan (2010) - Chpt 10-11
  • Quinn & Keough (2002) - Chpt 8-9
Very basic overview of ANOVA

ANOVA and Tukey's test

Here is a modified example from Quinn and Keough (2002). Day and Quinn (1989) described an experiment that examined how rock surface type affected the recruitment of barnacles to a rocky shore. The experiment had a single factor, surface type, with 4 treatments or levels: algal species 1 (ALG1), algal species 2 (ALG2), naturally bare surfaces (NB) and artificially scraped bare surfaces (S). There were 5 replicate plots for each surface type and the response (dependent) variable was the number of newly recruited barnacles on each plot after 4 weeks.

Download Day data set

Format of day.csv data files
TREATBARNACLE
ALG127
....
ALG224
....
NB9
....
S12
....
TREATCategorical listing of surface types. ALG1 = algal species 1, ALG2 = algal species 2, NB = naturally bare surface, S = scraped bare surface.
BARNACLEThe number of newly recruited barnacles on each plot after 4 weeks.
Six-plated barnacle

Open
the day data file.
Show code
> day <- read.table("../downloads/data/day.csv", header = T, sep = ",",
+     strip.white = T)
> head(day)
  TREAT BARNACLE
1  ALG1       27
2  ALG1       19
3  ALG1       18
4  ALG1       23
5  ALG1       25
6  ALG2       24

Note that as with independent t-tests, variables are in columns with levels of the categorical variable listed repeatedly. Day and Quinn (1989) were interested in whether substrate type influenced barnacle recruitment. This is a biological question. To address this question statistically, it is first necessary to re-express the question from a statistical perspective.

Q1-1. From a classical hypothesis testing point of view, what is the statistical question they are investigating? That is, what is their statistical H0?

Q1-2.The appropriate statistical test for comparing the means of more than two groups, is an ANOVA. In the table below, list the assumptions of ANOVA
along with how violations of each assumption are diagnosed and/or the risks of violations are minimized.
AssumptionDiagnostic/Risk Minimization
I.
II.
III.

Using boxplots
(HINT), examine the assumptions of normality and homogeneity of variance. Note that when sample sizes are small (as is the case with this data set), these ANOVA assumptions cannot reliably be checked using boxplots since boxplots require at least 5 replicates (and preferably more), from which to calculate the median and quartiles. As with regression analysis, it is the assumption of homogeneity of variances (and in particular, whether there is a relationship between the mean and variance) that is of most concern for ANOVA.
Show code
> boxplot(BARNACLE ~ TREAT, data = day)

Q1-3. Check the assumption of homogeneity of variances by plotting the sample (group) means against sample variances.
A strong relationship (positive or negative) between mean and variance suggests that the assumption of homogeneity of variances is likely to be violated.
Show code
> means <- with(day, tapply(BARNACLE, TREAT, mean))
> vars <- with(day, tapply(BARNACLE, TREAT, var))
> plot(means, vars)
  1. Any evidence of non-homogeneity? (Y or N)
Q1-4. Test the null hypothesis that the population group means are equal by fitting a linear model (ANOVA) single factor ANOVA
HINT. As with regression analysis, it is also a good habit to examine the resulting diagnostics
(note that Leverage and thus Cook's D have no useful meaning for categorical X variables) and residual plot. If there are no obvious problems, then the analysis is likely to be reliable. Examine the ANOVA table
HINT.
Show code
> contrasts(day$TREAT)
     ALG2 NB S
ALG1    0  0 0
ALG2    1  0 0
NB      0  1 0
S       0  0 1
> day.lm <- lm(BARNACLE ~ TREAT, data = day)
Show code - diagnostics
> day.aov <- lm(BARNACLE ~ TREAT, data = day)
> #OR
> day.aov <- aov(BARNACLE ~ TREAT, data = day)
> # setup a 2x6 plotting device with small margins
> par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
> plot(day.aov, ask = F, which = 1:6)
Show code - ANOVA table
> anova(day.aov)
Analysis of Variance Table

Response: BARNACLE
          Df Sum Sq Mean Sq F value  Pr(>F)    
TREAT      3    737   245.5    13.2 0.00013 ***
Residuals 16    297    18.6                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Q1-5. Identify
the important items from the ANOVA output and fill out the following ANOVA table
Source of VariationSSdfMSF-ratio
Between groups
Residual (within groups)  

Q1-6. What is the probability that the observed samples (and the degree of differences in barnacle recruitment between them) or ones more extreme, could be collected from populations in which there are no differences in barnacle recruitment. That is, what is the probability of having the above F-ratio or one more extreme when the null hypothesis is true?

Q1-7. What statistical conclusion would you draw?

Q1-8. Write the results out as though you were writing a research paper/thesis. For example (select the phrase that applies and fill in gaps with your results): 
The mean number of barnacles recruiting was (choose the correct option)
(F = , df = ,, P = )
different from the mean metabolic rate of female fulmars. To see how these results could be incorporated into a report, copy and paste the ANOVA table from the R Console into Word and add an appropriate table caption .

Q1-9.Such a result would normally be accompanied by a graph to illustrate the mean (and variability or precision thereof) barnacle recruitment on each substrate type. Construct such a bar graph
showing the mean barnacle recruitment on each substrate type and an indication of the precision of the means with error bars. To see how these results could be incorporated into a report, copy and paste the graph (as a metafile) into Word.
Show code - bargraph (dynamite plot)
> # calculate the means and standard deviations for each group separately
> library(plyr)
> dat <- ddply(day, "TREAT", function(x) return(c(means = mean(x$BARNACLE),
+     ses = sd(x$BARNACLE)/sqrt(length(x$BARNACLE)))))
> ## configure the margins for the figure
> par(mar = c(6, 10, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + dat$ses)))
> #put on the error bars
> arrows(xs, dat$means - dat$ses, xs, dat$means + dat$ses, ang = 90,
+     code = 3, length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(day$TREAT))
> mtext("Substrate type", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Mean number of newly\nrecruited barnacles", 2, line = 4, cex = 2)
> box(bty = "l")
Show code - ggplot boxplot
> # calculate the means and standard deviations for each group separately
> dat <- ddply(day,"TREAT", function(x)
+            return(c(means=mean(x$BARNACLE),ses=sd(x$BARNACLE)/sqrt(length(x$BARNACLE))))
+            )
> library(ggplot2)
> library(grid)
> plot.day<-ggplot(dat,aes(x=TREAT,y=means))+                  #plot the base plot
+                    geom_bar()+                                       #add the bars
+                    ylab("Mean number of newly recruited barnacles")+ #y-axis label
+                    xlab("Substrate type")+                           #x-axis label
+                    geom_linerange(aes(ymax=means+ses, ymin=means), position="dodge")+  #error bars
+                    theme_bw()+
+                    coord_cartesian(ylim = c(0,35))+
+                    opts(plot.margin=unit(c(0,0,2,2),"lines"),        #plotting margins (top,right,bottom,left)
+                          panel.grid.major = theme_blank(),               # switch off major gridlines
+              panel.grid.minor = theme_blank(),               # switch off minor gridlines,
+                      axis.title.x=theme_text(vjust=-3, size=15),      #x-axis title format
+                      axis.title.y=theme_text(vjust=-0.5,size=15,angle = 90))  #y-axis title format
> print(plot.day)
Show code - ggplot boxplot
> library(ggplot2)
> plot.day<-ggplot(day,aes(x=factor(TREAT),y=BARNACLE))+                  #plot the base plot
+                    geom_boxplot(color="grey90")+                                          #add the bars
+                    geom_point()+
+                    ylab("Mean number of newly recruited barnacles")+ #y-axis label
+                    xlab("Substrate type")+                           #x-axis label
+                    theme_bw()+
+                    coord_cartesian(ylim = c(0,35))+
+                    opts(plot.margin=unit(c(0,0,2,2),"lines"),        #plotting margins (top,right,bottom,left)
+                          panel.grid.major = theme_blank(),               # switch off major gridlines
+              panel.grid.minor = theme_blank(),               # switch off minor gridlines,
+                      axis.title.x=theme_text(vjust=-3, size=15),      #x-axis title format
+                      axis.title.y=theme_text(vjust=-0.5,size=15,angle = 90))  #y-axis title format
> print(plot.day)

Q1-10. Although we have now established that there is a statistical difference between the group means, we do not yet know which group(s) are different from which other(s). For this data a Tukey's multiple comparison test
(to determine which surface type groups are different from each other, in terms of number of barnacles) is appropriate. Complete the following table for Tukey's pairwise comparison of group means:
(HINT) include differences between group means and Tukey's adjusted P-values (in brackets) for each pairwise comparison.
Show code
> library(multcomp)
> summary(glht(day.aov, linfct = mcp(TREAT = "Tukey")))
	 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: aov(formula = BARNACLE ~ TREAT, data = day)

Linear Hypotheses:
                 Estimate Std. Error t value Pr(>|t|)    
ALG2 - ALG1 == 0     6.00       2.73    2.20    0.165    
NB - ALG1 == 0      -7.40       2.73   -2.71    0.066 .  
S - ALG1 == 0       -9.20       2.73   -3.38    0.018 *  
NB - ALG2 == 0     -13.40       2.73   -4.92   <0.001 ***
S - ALG2 == 0      -15.20       2.73   -5.58   <0.001 ***
S - NB == 0         -1.80       2.73   -0.66    0.910    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
(Adjusted p values reported -- single-step method)

 ALG1ALG2NBs
ALG1 0.000 (1.00)   
ALG2 6.000 (0.165)0.000 (1.000)  
NB () ()0.000 (1.000) 
S () () ()0.000 (1.000)

Q1-11. What are your conclusions from the Tukey's tests?

Q1-12.A way of representing/summarizing the results of multiple comparison tests is to incorporate symbols
into the bargraph such that similarities and differences in the symbols associated with bars reflect statistical outcomes. Modify the bargraph (HINT) so as to incorporate the Tukey's test results. Finally, generate an appropriate caption to accompany this figure.
Show code - bargraph with alphabet soup
> # calculate the means and standard deviations for each group separately
> library(plyr)
> dat <- ddply(day, "TREAT", function(x) return(c(means = mean(x$BARNACLE),
+     ses = sd(x$BARNACLE)/sqrt(length(x$BARNACLE)))))
> ## configure the margins for the figure
> par(mar = c(6, 10, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + 2 * dat$ses)))
> #put on the error bars
> arrows(xs, dat$means - dat$ses, xs, dat$means + dat$ses, ang = 90,
+     code = 3, length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(day$TREAT))
> mtext("Substrate type", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Mean number of newly\nrecruited barnacles", 2, line = 4, cex = 2)
> #add spaghetti to the graph
> text(xs, dat$means + dat$ses, lab = c("A", "A", "AC", "BC"), pos = 3)
> box(bty = "l")

ANOVA and Tukey's test

Here is a modified example from Quinn and Keough (2002). Medley & Clements (1998) studied the response of diatom communities to heavy metals, especially zinc, in streams in the Rocky Mountain region of Colorado, U.S.A.. As part of their study, they sampled a number of stations (between four and seven) on six streams known to be polluted by heavy metals. At each station, they recorded a range of physiochemical variables (pH, dissolved oxygen etc.), zinc concentration, and variables describing the diatom community (species richness, species diversity H and proportion of diatom cells that were the early-successional species, Achanthes minutissima). One of their analyses was to ignore streams and partition the 34 stations into four zinc-level categories: background (< 20µg.l-1, 8 stations), low (21-50µg.l-1, 8 stations), medium (51-200µg.l-1, 9 stations), and high (> 200µg.l-1, 9 stations) and test null hypotheses that there we no differences in diatom species diversity between zinc-level groups, using stations as replicates. We will also use these data to test the null hypotheses that there are no differences in diatom species diversity between streams, again using stations as replicates.

Download Medley data set

Format of medley.csv data files
STATIONZINCDIVERSITY
ER1BACK2.27
.........
ER2HIGH1.25
.........
EF1LOW1.4
.........
ER4MEDIUM1.62
.........
STATIONUniquely identifies the sampling station from which the data were collected.
ZINCZinc level concentration categories.
DIVERSITYShannon-Weiner species diversity of diatoms
A stream in the Rocky Mountains

Open the medley data file.

Show code
> medley <- read.table("../downloads/data/medley.csv", header = T,
+     sep = ",", strip.white = T)
> head(medley)
  STATION   ZINC DIVERSITY
1     ER1   BACK      2.27
2     ER2   HIGH      1.25
3     ER3   HIGH      1.15
4     ER4 MEDIUM      1.62
5     FC1   BACK      1.70
6     FC2   HIGH      0.63

Most statistical packages automatically order the levels of categorical variables alphabetically. Therefore, the levels of the ZINC categorical variable will automatically be ordered as (BACK, HIGH, LOW, MEDIUM). For some data sets the ordering of factors is not important. However, in the medley data set, it would make more sense if the factors were in the following order (BACK, LOW, MEDIUM, HIGH) as this would more correctly represent the relationships between the levels. Note that the ordering of a factor has no bearing on any analyses, it just makes the arrangement of data summaries within some graphs and tables more logical. It is therefore recommended that whenever a data set includes categorical variables, reorder the levels of these variables
into a logical order. HINT
Show code
> levels(medley$ZINC)
[1] "BACK"   "HIGH"   "LOW"    "MEDIUM"
> medley$ZINC <- factor(medley$ZINC, levels = c("HIGH", "MEDIUM", "LOW",
+     "BACK"))
> levels(medley$ZINC)
[1] "HIGH"   "MEDIUM" "LOW"    "BACK"  

Q2-1. Check the ANOVA assumptions using a boxplots
HINT. Any evidence of skewness? Outliers? Does the spread of data look homogeneous between the different Zinc levels?
Show code
> boxplot(DIVERSITY ~ ZINC, data = medley)

If the assumptions seem reasonable, fit the linear model
(HINT), check the residuals
(HINT) and if still there is no clear indication of problems, examine the ANOVA table
(HINT).
Show code
> medley.lm <- lm(DIVERSITY ~ ZINC, data = medley)
> # setup a 2x6 plotting device with small margins
> par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
> plot(medley.lm, ask = F, which = 1:6)
> anova(medley.lm)
Analysis of Variance Table

Response: DIVERSITY
          Df Sum Sq Mean Sq F value Pr(>F)  
ZINC       3   2.57   0.856    3.94  0.018 *
Residuals 30   6.52   0.217                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Q2-2. Write the results out as though you were writing a research paper/thesis. For example (select the phrase that applies and fill in gaps with your results): 
The mean diatom diversity was (choose the correct option)
(F = , df = ,, P = )
different between the four zinc level groups.
This can be abbreviated to FdfGroups,dfResidual=fratio, P=pvalue.To see how the full anova table might be included in a report/thesis, Copy and paste the ANOVA table into Word and add an appropriate table caption .

Q2-3.Such a result would normally be accompanied by a graph to illustrate the means as well as an indication of the precision of the means. Copy and paste the graph (as a metafile) into Word
.
Show code - bargraph (dynamite plot)
> # calculate the means and standard deviations for each group separately
> library(plyr)
> dat <- ddply(medley, "ZINC", function(x) return(c(means = mean(x$DIVERSITY),
+     ses = sd(x$DIVERSITY)/sqrt(length(x$DIVERSITY)))))
> ## configure the margins for the figure
> par(mar = c(6, 10, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + dat$ses)))
> #put on the error bars
> arrows(xs, dat$means - dat$ses, xs, dat$means + dat$ses, ang = 90,
+     code = 3, length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(medley$ZINC))
> mtext("Zinc level", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Diatom diversity", 2, line = 4, cex = 2)
> box(bty = "l")

Q2-4.Now determine which zinc level groups were different from each other, in terms of diatom species diversity, using a Tukey's multiple comparison test
(HINT).
Show code
> library(multcomp)
> summary(glht(medley.lm, linfct = mcp(ZINC = "Tukey")))
	 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: lm(formula = DIVERSITY ~ ZINC, data = medley)

Linear Hypotheses:
                   Estimate Std. Error t value Pr(>|t|)  
MEDIUM - HIGH == 0   0.4400     0.2197    2.00    0.209  
LOW - HIGH == 0      0.7547     0.2265    3.33    0.012 *
BACK - HIGH == 0     0.5197     0.2265    2.29    0.122  
LOW - MEDIUM == 0    0.3147     0.2265    1.39    0.515  
BACK - MEDIUM == 0   0.0797     0.2265    0.35    0.985  
BACK - LOW == 0     -0.2350     0.2330   -1.01    0.746  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
(Adjusted p values reported -- single-step method)

ANOVA and planned comparisons

Here is a modified example from Quinn and Keough (2002). Partridge and Farquhar (1981) set up an experiment to examine the effect of reproductive activity on longevity (response variable) of male fruitflies (Drosophila sp.). A total of 125 male fruitflies were individually caged and randomly assigned to one of five treatment groups. Two of the groups were used to to investigate the effects of the number of partners (potential matings) on male longevity, and thus differed in the number of female partners in the cages (8 vs 1). There were two corresponding control groups containing eight and one newly pregnant female partners (with which the male flies cannot mate), which served as controls for any potential effects of competition between males and females for food or space on male longevity. The final group had no partners, and was used as an overall control to determine the longevity of un-partnered male fruitflies.

Download Partridge data set

Format of partridge.csv data files
GROUPLONGEVITY
PREG835
....
NON040
....
PREG146
....
VIRGIN121
....
VIRGIN816
....
GROUPCategorical listing of female partner type.
PREG1 = 1 pregnant partner, NONE0 = no female partners, PREG8 = 8 pregnant partners, VIRGIN1 = 1 virgin partner, VIRGIN8 = 8 virgin partners.
Groups 1,2,3 - Control groups
Groups 4,5 - Experimental groups.
LONGEVITYLongevity of male fruitflies (days)
Male fruitfly

Open the partridge data file.

Show code
> partridge <- read.table("../downloads/data/partridge.csv", header = T,
+     sep = ",", strip.white = T)
> head(partridge)
  GROUP LONGEVITY
1 PREG8        35
2 PREG8        37
3 PREG8        49
4 PREG8        46
5 PREG8        63
6 PREG8        39

Q3-1. When comparing the mean male longevity of each group, what is the null hypothesis?

Note, normally we might like to adjust the ordering of the levels of the categorical variable (GROUP), however, in this case, the alphabetical ordering also results in the most logical ordering of levels.

Q3-2. Before performing the ANOVA, check the assumptions (Boxplots, scatterplot of Mean vs Variance) using the variable GROUP as the grouping (IV) variable for the X-axes. Is there any evidence that the ANOVA assumptions have been violated (Y or N)?

In addition to the global ANOVA in which the overall effect of the factor on male fruit fly longevity is examined, a number of other comparisons can be performed to identify differences between specific groups. As with the previous question, we could perform Tukey's post-hoc pairwise comparisons to examine the differences between each group. Technically, it is only statistically legal to perform n-1 pairwise comparisons, where n is the number of groups. This is because if each individual comparison excepts a 5% (&alpha=0.05) probability of falsely rejecting the H0, then the greater the number of tests performed the greater the risk eventually making a statistical error. Post-hoc tests protect against such an outcome by adjusting the &alpha values for each individual comparison down. Consequently, the power of each comparison is greatly reduced.

This particular study was designed with particular comparisons in mind, while other pairwise comparisons would have very little biological meaning or importance. For example, in the context of the project aims, comparing group 1 with group 2 would not yield anything meaningful. As we have five groups (df=4), we can do four planned comparisons.

Q3-3. In addition to the global ANOVA, we will address two specific questions by planned comparisons.
  1. "Is longevity affected by the presence of a large number of potential mates (8 virgin females compared to 1 virgin females)?" (contrast coefficients: 0, 0, 0, -1, 1)
  2. "Is longevity affected by the presence of any number of potential mates compared with either no partners or pregnant partners?" (contrast coefficients: -2, -2, -2, 3, 3)

These specific (planned) comparisons are performed by altering the contrast coefficients used in the model fitting. By default, when coding your factorial variable into numeric dummy variables, the contrast coefficients are defined as;

  1. the intercept is the mean of the first group
  2. the first contrast is the difference between the means of the first and second group
  3. the second contrast is the difference between the means and the third group
  4. ...
That is, each of the groups are compared to the first group. If we have alternative specific comparisons that we would like to perform (such as the ones listed above), we can define the contrasts to represent different combinations of comparisons. Clearly, we cannot define more comparisons that allowable contrasts (groups minus 1) and importantly, these contrasts must be orthogonal (independent) of one another). In practice, it can be difficult to get all the contrasts to be orthogonal and incorporate a balanced representation from all groups. Consequently, it is usually recomended that you define no more than (groups minus two) and let R calculate the rest.

Show code
> contrasts(partridge$GROUP) <- cbind(c(0, 0, 0, 1, -1), c(-2, -2,
+     -2, 3, 3))
> #Check that the defined contrasts are orthogonal
> round(crossprod(contrasts(partridge$GROUP)), 2)
     [,1] [,2] [,3] [,4]
[1,]    2    0    0    0
[2,]    0   30    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
> #A contrast is considered orthogonal to others if the value corresponding
> #   to it and another in the lower (or upper) triangle of the cross
> #   product matrix is 0. Non-zero values indicate non-orthogonality.
> #In order to incorporate planned comparisons in the anova table, it is
> #   necessary to convert linear model fitted via lm into aov objects
> partridge.lm <- lm(LONGEVITY ~ GROUP, data = partridge)
> summary(aov(partridge.lm), split = list(GROUP = list(`8virg vs 1virg` = 1,
+     `Partners vs Controls` = 2)))
                               Df Sum Sq Mean Sq F value  Pr(>F)    
GROUP                           4  11939    2985    13.6 3.5e-09 ***
  GROUP: 8virg vs 1virg         1   4068    4068    18.6 3.4e-05 ***
  GROUP: Partners vs Controls   1   7841    7841    35.8 2.4e-08 ***
Residuals                     120  26314     219                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> #It is also possible to make similar conclusions via the t-tests<br>
> #The above contrast F-tests correspond to the first two t-tests (GROUP1
> #   and GROUP2)<br>
> #Note that F=t<sup>2</sup><br>
> summary(partridge.lm)
Call:
lm(formula = LONGEVITY ~ GROUP, data = partridge)

Residuals:
   Min     1Q Median     3Q    Max 
-35.76  -8.76   0.20  11.20  32.44 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   57.440      1.324   43.37  < 2e-16 ***
GROUP1         9.020      2.094    4.31  3.4e-05 ***
GROUP2        -3.233      0.541   -5.98  2.4e-08 ***
GROUP3        -0.895      2.962   -0.30     0.76    
GROUP4         0.645      2.962    0.22     0.83    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 14.8 on 120 degrees of freedom
Multiple R-squared: 0.312,	Adjusted R-squared: 0.289 
F-statistic: 13.6 on 4 and 120 DF,  p-value: 3.52e-09 

Q3-6. Present the results of the planned comparisons as part of the following ANOVA table:
Source of VariationSSdfMSF-ratioPvalue
Between groups
  8 virgin vs 1 virgin
  Partners vs Controls
Residual (within groups)   

Note that the Residual (within groups) term is common to each planned comparison as well as the original global ANOVA.

Q3-5. Summarize the conclusions (statistical and biological) from the analyses.

Q3-7. Summarize the conclusions (statistical and biological) from the analyses.
  1. Global null hypothesis (H0: population group means all equal)
  2. Planned comparison 1 (H0: population mean of 8PREG group is equal to that of 1PREG)
  3. Planned comparison 2 (H0: population mean of average of 1PREG and 8PREG groups are equal to the population mean of average of CONTROL, 1VIRGIN and 8VIRGIN groups)

Q3-8. List any other specific comparisons that may have been of interest to this study. Remember that the total number of comparisons should not exceed the global degrees of freedom (4 in this case) and each outcome of each comparison should be independent of all other comparisons.

Q3-9. Attempt to redefine the set of contrasts including any additional ones you thought of from Q3-8 above. Assess whether or not orthogonality is still met. If it is, refit the linear model.

Q3-10.Finally, construct an appropriate bargraph or other graphic to accompany the above analyses.
Show code - bargraph (dynamite plot)
> # calculate the means and standard deviations for each group separately
> library(plyr)
> dat <- ddply(partridge, "GROUP", function(x) return(c(means = mean(x$LONGEVITY),
+     ses = sd(x$LONGEVITY)/sqrt(length(x$LONGEVITY)))))
> ## configure the margins for the figure
> par(mar = c(6, 10, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + dat$ses)))
> #put on the error bars
> arrows(xs, dat$means - dat$ses, xs, dat$means + dat$ses, ang = 90,
+     code = 3, length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(partridge$GROUP))
> mtext("Mating group", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Male fruitfly longevity", 2, line = 4, cex = 2)
> box(bty = "l")

ANOVA and planned comparisons

Snodgrass et al. (2000) were interested in how the assemblages of larval amphibians varied between depression wetlands in South Carolina, USA, with different hydrological regimes. A secondary question was whether the presence of fish, which only occurred in wetlands with long hydroperiods, also affected the assemblages of larval amphibians. They sampled 22 wetlands in 1997 (they originally had 25 but three dried out early in the study) and recorded the species richness and total abundance of larval amphibians as well as the abundance of individual taxa. Wetlands were also classified into three hydroperiods: short (6 wetlands), medium (5) and long (11) - the latter being split into those with fish (5) and those without (6). The short and medium hydroperiod wetlands did not have fish.

The overall question of interest is whether species richness differed between the four groups of wetlands. However, there are also specific questions related separately to hydroperiod and fish. Is there a difference in species richness between long hydroperiod wetlands with fish and those without? Is there a difference between the hydroperiods for wetlands without fish? We can address these questions with a single factor fixed effects ANOVA and planned contrasts using species richness of larval amphibians as the response variable and hydroperiod/fish category as the predictor (grouping variable).

Download Snodgrass data set

Format of snodgrass.csv data files
HYDROPERIODRICHNESS
Short3
....
Medium9
....
Longnofish7
....
Longfish12
....
HYDROPERIODCategorical listing of the four hydroperiod/fish wetlands (short, medium and longnofish represent the hydroperiods of wetlands without fish; longfish represents wetlands with long hydroperiods that contain fish).
RICHNESSSpecies richness of larval amphibians
Male fruitfly

Open the snodgrass data file.

Show code
> snodgrass <- read.table("../downloads/data/snodgrass.csv", header = T,
+     sep = ",", strip.white = T)
> head(snodgrass)
  HYDROPERIOD RICHNESS
1       Short        3
2       Short        7
3       Short        2
4       Short        3
5       Short        3
6       Short        5

Reorder the factor levels
of HYDROPERIOD into a more logical order (e.g. Short, Medium, Longnofish, Longfish) HINT
Show code
> snodgrass$HYDROPERIOD <- factor(snodgrass$HYDROPERIOD, levels = c("Short",
+     "Medium", "Longnofish", "Longfish"))

Q4-1. Examine the group means and variances
and boxplots
for species richness across the wetland categories HINT.

Show code
> means <- with(snodgrass, tapply(RICHNESS, HYDROPERIOD, mean))
> vars <- with(snodgrass, tapply(RICHNESS, HYDROPERIOD, var))
> plot(vars ~ means)
> boxplot(RICHNESS ~ HYDROPERIOD, data = snodgrass)
Is there any evidence that any of the assumptions have been violated? ('Y' or 'N')

Q4-2. Now fit a single factor ANOVA model
(HINT) and examine the residuals
(HINT).
Show code
> snodgrass.lm <- lm(RICHNESS ~ HYDROPERIOD, data = snodgrass)
> # setup a 2x6 plotting device with small margins
> par(mfrow = c(2, 3), oma = c(0, 0, 2, 0))
> plot(snodgrass.lm, ask = F, which = 1:6)
Any evidence of skewness or unequal variances? Any outliers? Any evidence of violations? ('Y' or 'N')

Q4-3. As well as the overall analysis, Snodgrass et al. (2000) were particularly interested in two specific comparisons a) whether there was a difference in species richness between the long hydroperiod wetlands with and without fish, and b) whether there was a difference in species richness between permanent wetlands (long hydroperiods) and temporary wetlands (short and medium hydroperiods). What specific null hypotheses are being tested;


Q4-4. Define the appropriate contrast coefficients (and thus comparisons)
(HINT), check orthogonality (HINT), define the contrast labels (HINT) and refit the linear model incorporating these contrasts.
Show code
> #define contrasts
> contrasts(snodgrass$HYDROPERIOD) <- cbind(c(0, 0, -1, 1), c(1, 1,
+     -1, -1))
> round(crossprod(contrasts(snodgrass$HYDROPERIOD)), 2)
     [,1] [,2] [,3]
[1,]    2    0    0
[2,]    0    4    0
[3,]    0    0    1
> #define contrast labels
> snodgrass.list <- list(HYDROPERIOD = list(`Long with vs Long without` = 1,
+     `Permanent vs Temporary` = 2))
> #refit model
> snodgrass.lm <- lm(RICHNESS ~ HYDROPERIOD, data = snodgrass)

Q4-5. Perform the ANOVA
with specific comparisons
using the appropriate contrasts (HINT). Fill in the following table (HINT):
Show code
> summary(aov(snodgrass.lm), split = snodgrass.list)
                                         Df Sum Sq Mean Sq F value Pr(>F)   
HYDROPERIOD                               3  108.8    36.3    7.29 0.0021 **
  HYDROPERIOD: Long with vs Long without  1    4.8     4.8    0.97 0.3376   
  HYDROPERIOD: Permanent vs Temporary     1   19.5    19.5    3.92 0.0633 . 
Residuals                                18   89.5     5.0                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
Source of VariationSSdfMSF-ratioPvalue
Between groups
  Long with vs nofish
  Permanent vs Temporary
Residual (within groups)   

Q4-6. What statistical conclusions would you draw from the overall ANOVA and the two specific contrasts and what do they mean biologically?

Q4-7.Finally, construct an appropriate graphic
to accompany the above analyses.
Show code - bargraph (dynamite plot)
> # calculate the means and standard deviations for each group separately
> library(plyr)
> dat <- ddply(snodgrass, "HYDROPERIOD", function(x) return(c(means = mean(x$RICHNESS),
+     ses = sd(x$RICHNESS)/sqrt(length(x$RICHNESS)))))
> ## configure the margins for the figure
> par(mar = c(6, 10, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + dat$ses)))
> #put on the error bars
> arrows(xs, dat$means - dat$ses, xs, dat$means + dat$ses, ang = 90,
+     code = 3, length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(snodgrass$HYDROPERIOD))
> mtext("Hydroperiod", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Amphibian larval richness", 2, line = 4, cex = 2)
> box(bty = "l")

ANOVA and power analysis

Laughing kookaburras (Dacelo novaguineae) live in coorperatively breeding groups of between two and eight individuals in which a dominant pair is assisted by their offspring from previous seasons. Their helpers are involved in all aspects of clutch, nestling and fledgling care. An ornithologist was interested in investigating whether there was an effect of group size on fledgling success. Previous research on semi captive pairs (without helpers) yielded a mean fledgling success rate of 1.65 (fledged young per year) with a standard deviation of 0.51. The ornithologist is planning to measure the success of breeding groups containing either two, four, six or eight individuals.

Cooperative breading in kookaburras
Laughing Kookaburras

Q5-1. When designing an experiment or sampling regime around an ANOVA, it is necessary to consider the nature in which groups could differ from each other and the overall mean. Typically, following on from a significant ANOVA, the patterns are further explored via either multiple pairwise comparisons, or if possible, more preferably via planned contrasts (see Exercises 1-4 above).

The ability (power) to detect an effect of a factor depends on sample size as well as levels of within an between group variation. While ANOVA assumes that all groups are equally varied, levels of between group variability are dependent on the nature of the trends that we wish to detect.

  • Do we wish to detect the situation where just one group differs from all the others?
  • Do we wish to be able to detect the situation where two of the groups differ from others?
  • Do we wish to be able to detect e situation where the groups differ according to some polynomial trend, or some other combination?

For the purpose of example, we are going to consider the design implications of a number of potential scenarios. In each case we will try to accommodate an effect size of 50% (we wish to be able to detect a change of at least 50% in response variable - mean fledgling success rate), a power of 0.8 and a significance criteria of 0.05. Calculate the expected sample sizes necessary to detect the following:
  1. The mean fledgling success rate of pairs in groups of two individuals is 50% less than the mean fledgling success rate of individuals in groups with larger numbers of individuals - that is, µTWOFOURSIXEIGHT HINT
    Show code
    > mean1 <- 1.65
    > sd1 <- 0.51
    > es <- 0.5
    > mean2 <- mean1 + (mean1 * es)
    > b.var <- var(c(mean1, mean2, mean2, mean2))
    > power.anova.test(group = 4, power = 0.8, between.var = b.var, within.var = sqrt(sd1))
    
         Balanced one-way analysis of variance power calculation 
    
             groups = 4
                  n = 16.26
        between.var = 0.1702
         within.var = 0.7141
          sig.level = 0.05
              power = 0.8
    
     NOTE: n is number in each group 
    
    
  2. The mean fledgling success rate of pairs in small groups (two and four) are equal and 50% less than the mean fledgling success rate of individuals in larger groups (six and eight) - that is, µTWOFOURSIXEIGHT
    Show code
    > mean1 <- 1.65
    > sd1 <- 0.51
    > es <- 0.5
    > delta <- mean1 + (mean1 * es)
    > b.var <- var(c(mean1, mean1, mean2, mean2))
    > power.anova.test(group = 4, power = 0.8, between.var = b.var, within.var = sqrt(sd1))
    
         Balanced one-way analysis of variance power calculation 
    
             groups = 4
                  n = 12.46
        between.var = 0.2269
         within.var = 0.7141
          sig.level = 0.05
              power = 0.8
    
     NOTE: n is number in each group 
    
    
  3. The mean fledgling success rate increases linearly by 50% with increasing group size.
    Show code
    > mean1 <- 1.65
    > sd1 <- 0.51
    > es <- 0.5
    > delta <- mean1 + (mean1 * es)
    > b.var <- var(seq(mean1, mean2, l = 4))
    > power.anova.test(group = 4, power = 0.8, between.var = b.var, within.var = sqrt(sd1))
    
         Balanced one-way analysis of variance power calculation 
    
             groups = 4
                  n = 21.59
        between.var = 0.126
         within.var = 0.7141
          sig.level = 0.05
              power = 0.8
    
     NOTE: n is number in each group 
    
    

Note that we would not normally be contemplating accommodating both polynomial as well as non-polynomial contrasts or pairwise contrasts. We did so in Exercise 5-1 above purely for illustrative purposes!

Q5-2. Often when designing an experiment or survey it is useful to be able to estimate the relationship between power and sample size for a range of sample sizes and a range of outcomes so as to make more informed choices. This can be done by plotting a power envelope
. Plot such an power envelope.
Show code
> x <- seq(3, 30, l = 100)
> b.var <- var(c(mean1, mean1, mean2, mean2))
> plot(x, power.anova.test(group = 4, n = x, between.var = b.var, within.var = sqrt(sd1))$power,
+     ylim = c(0, 1), ylab = "power", xlab = "n", type = "l")
> abline(0.8, 0, lty = 2)
> m <- mean(c(mean1, mean2))
> b.var <- var(c(mean1, m, m, mean2))
> points(x, power.anova.test(group = 4, n = x, between.var = b.var,
+     within.var = sqrt(sd1))$power, type = "l", lty = 2)
> b.var <- var(c(mean1, mean2, mean2, mean2))
> points(x, power.anova.test(group = 4, n = x, between.var = b.var,
+     within.var = sqrt(sd1))$power, type = "l", lty = 3)
> b.var <- var(seq(mean1, mean2, l = 4))
> points(x, power.anova.test(group = 4, n = x, between.var = b.var,
+     within.var = sqrt(sd1))$power, type = "l", lty = 4)
> 
> bw <- c(var(c(mean1, mean1, mean2, mean2)), var(c(mean1, m, m, mean2)))
> points(x, power.anova.test(group = 4, n = x, between.var = bw, within.var = sqrt(sd1))$power,
+     type = "l", col = "gray")


  ANOVA

Analysis of variance, or ANOVA, partitions the variation in the response (DV) variable into that explained and that unexplained by one of more categorical predictor variables, called factors. The ratio of this partitioning can then be used to test the null hypothesis (H0) that population group or treatment means are equal. A single factor ANOVA investigates the effect of a single factor, with 2 or more groups (levels), on the response variable. Single factor ANOVA's are completely randomized (CR) designs. That is, there is no restriction on the random allocation of experimental or sampling units to factor levels.
Single factor ANOVA tests the H0 that there is no difference between the population group means.
H0: = µ1 = µ2 = .... = µi = µ
If the effect of the ith group is:
(&alphai = µi - µ) the this can be written as
H0: = &alpha1 = &alpha2 = ... = &alphai = 0
If one or more of the I are different from zero, the hull hypothesis will be rejected.

Keough and Raymondi (1995) investigated the degree to which biofilms (films of diatoms, algal spores, bacteria, and other organic material that develop on hard surfaces) influence the settlement of polychaete worms. They had four categories (levels) of the biofilm treatment: sterile substrata, lab developed biofilms without larvae, lab developed biofilms with larvae (any larvae), field developed biofilms without larvae. Biofilm plates where placed in the field in a completely randomized array. After a week the number of polychaete species settled on each plate was then recorded. The diagram illustrates an example of the spatial layout of a single factor with four treatments (four levels of the treatment factor, each with a different pattern fill) and four experimental units (replicates) for each treatment.

End of instructions

  Factorial boxplots

> boxplot(dv~factor,data=data)

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable) and data is the name of the data frame (data set). The '~' is used in formulas to represent that the left hand side is proportional to the right hand side.

End of instructions

  Plotting mean vs variance

# first calculate the means and variances
> means <- tapply(data$dv, data$factor, mean)
> vars <- tapply(data$dv, data$factor, var)
# then plot the mean against variance
> plot(means,vars)

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable) and data is the name of the data frame (data set).

End of instructions

  ANOVA

> name.aov <- lm(dv~factor,data=data)
# OR
> day.aov <- aov(dv~factor,data=data)

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable and data is the name of the data frame (data set)). The '~' is used in formulas to represent that the left hand side is proportional to the right hand side.

End of instructions

  Linear regression diagnostics

> plot(name.lm)

where name.lm is the name of a fitted linear model. You will be prompted to hit return to cycle through a series of four diagnostic plots. The first (and most important of these) is the Residual Plot.

End of instructions

  ANOVA output

> anova(day.aov)

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable and data is the name of the data frame (data set). The '~' is used in formulas to represent that the left hand side is proportional to the right hand side.

End of instructions

  ANOVA table

#print a summary of the ANOVA table
> anova(name.aov)

where name.aov is the name of a fitted linear model.

End of instructions

  Copy and Paste text

  1. Highlight the text (table) you want to copy
  2. Select the Edit.. menu..
  3. Select the Copy.. submenu..
  4. Switch focus to the program you want to paste the text into (e.g. Microsoft Word)
  5. Select the Edit.. menu. (in Word)
  6. Select the Paste.. submenu. (in Word)
  7. To format the table in Word
    1. Add the word 'Source' to the lop left hand corder of the first row of the ANOVA table to ensure that each table column has a heading and remove the stars


    2. Add commas after each entry in the table (except the last entry in each row) and remove any spaces between entries. This step defines the columns of the table, so it is important that they are placed in the correct positions. Note that 'Sums' and 'Sq' are not separate entries, the entry is 'Sums Sq'.

      1. Highlight the table in Word
      2. Select the Table menu. (in Word)
      3. Select the Convert submenu. (in Word)
      4. Select the Text to Table submenu. (in Word)
      5. The Convert Text to Table dialog box will appear

      6. Ensure that Commas is selected as the text separator
      7. Click the button
      8. The Table AutoFormat dialog box will appear

      9. Select the Table Classic 1 Table style from the Tables styles list
      10. Unselect the First column, Last row and Last column checkboxes
      11. Click the button

End of instructions

  Bar graphs

In a bargraph, the height of the bar represents the mean and the error bars (or whiskers) represent standard error (or standard deviation). Before offering suggestions on how to produce a bargraph in R, I need to present a quick disclaimer. Bargraphs are not officially supported in R (or many other statistical packages for that matter). The reason for this is that despite their popularity in some disciplines (i.e. ecology), they are considered by many to be completely inappropriate.

Opponents of these graphics call them 'Dynamite plots' - a reference to both their resemblance to sticks of dynamite and the danger they present as a summary graphic. The main arguments of opponents are; Consider the following artificial situation.
> TREAT & lt - factor(c(rep("A", 15), rep("B", 2), rep("C", 10), rep("D",
+     10)))
Error: object 'TREAT' not found
> m <- 300
> s <- 100
> a <- rnorm(15, 300, 100)
> aa <- (a - mean(a))/(sd(a) * 1/s) + m
> x <- rep(1, 15)
> b <- rnorm(2, 300, 100)
> bb <- (b - mean(b))/(sd(b) * 1/s) + m
> x <- c(x, rep(2, 2))
> m1 <- 200
> s1 <- 50
> c <- rnorm(3, m1, s)
> c <- c(c, rnorm(7, quantile(c)[2], 1))
> cc <- (c - mean(c))/(sd(c) * 1/s1) + m1
> x <- c(x, rep(3, 3), seq(2.8, 3.2, l = 7))
> d <- rnorm(2, m1, s1)
> d <- c(rnorm(6, quantile(d)[5]), rnorm(4, quantile(d)[1]))
> dd <- (d - mean(d))/(sd(d) * 1/s1) + m1
> x <- c(x, seq(3.8, 4.2, l = 6), seq(3.8, 4.2, l = 4))
> RESP <- c(aa, bb, cc, dd)
> ds <- data.frame(TREAT, x, RESP)
Error: object 'TREAT' not found
> #plot(RESP~x)
> #make the bargraph
> dat <- ddply(ds, "TREAT", function(x) return(c(means = mean(x$RESP),
+     ses = sd(x$RESP))))
Error: object 'ds' not found
> ## configure the margins for the figure
> par(mar = c(6, 8, 1, 1))
> ## plot the bars
> xs <- barplot(dat$means, beside = T, axes = F, ann = F, ylim = c(0,
+     max(dat$means + dat$ses)))
> #put on the error bars
> arrows(xs, dat$means, xs, dat$means + dat$ses, ang = 90, code = 2,
+     length = 0.1)
> #put on the axes
> axis(1, at = xs, lab = levels(ds$TREAT))
Error: object 'ds' not found
> mtext("Treatments", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Response", 2, line = 4, cex = 2)
> box(bty = "l")

The following graphic suggests the following;

If we now plot the raw data from which the means and standard deviations were calculated and reassess the above questions, our conclusions might be somewhat different.

> plot(RESP ~ x, las = 1, pch = 16, xlim = c(0, 5), ylim = c(0, 550),
+     axes = F, ann = F)
> axis(1, at = xs, lab = levels(ds$TREAT))
Error: object 'ds' not found
> mtext("Treatments", 1, line = 4, cex = 2)
> axis(2, las = 1)
> mtext("Response", 2, line = 4, cex = 2)
> box(bty = "l")
> boxplot(RESP ~ TREAT, ds, add = TRUE, border = "grey80", ann = F,
+     axes = F)
Error: object 'ds' not found
> points(RESP ~ x, pch = 16)

This artificial example illustrates how misleading a bargraph can be. That said, when the assumptions of parametric ANOVA are satisfied (and this is normally when a bargraph is used to summarize the patterns), then a bargraph is not necessarily misleading.

# calculate the means
> means <- tapply(dv, factor, mean)
# calculate the standard deviation
> sds <- tapply(dv, factor, sd)
# calculate the lengths
> ns <- tapply(dv, factor, length)
# calculate the standard errors
> ses <- sds/sqrt(ns)
# plot the bars
> xs <- barplot(means,beside=T)
# load package containing error bar function
> library(Hmisc)
# plot the error bars
> errbar(xs, means, means+ses, means-ses, add=T)

End of instructions

  Copy and Paste graphs

  1. Graph window, click the right mouse button
  2. Select either the copy as metafile menu (if intending to modify/edit the graph after it is pasted into another program) or copy as bitmap (if don't intend to modify the graph after it is pasted into another program)
  3. Switch control (focus) to the other program (such as Microsoft Word) using either Alt-tab or the Windows navigation buttons
  4. Select the Edit.. menu. (in Word)
  5. Select the Paste.. submenu. (in Word)
  6. Switch focus back to RGui once you have finished and have produced a caption.

End of instructions

  Specific comparisons of means

Following a significant ANOVA result, it is often desirable to specifically compare group means to determine which groups are significantly different. However, multiple comparisons lead to two statistical problems. Firstly, multiple significance tests increase the Type I errors (&alpha, the probability of falsely rejecting H0). E.g., testing for differences between 5 groups requires ten pairwise comparisons. If the &alpha for each test is 0.05 (5%), then the probability of at least one Type I error for the family of 10 tests is 0.4 (40%). Secondly, the outcome of each test needs to be independent (orthogonality). E.g. if A>B and B>C then we already know the result of A vs. C.

Post-hoc unplanned pairwise comparisons (e.g. Tukey's test) compare all possible pairs of group means and are useful in an exploratory fashion to reveal major differences. Tukey's test control the family-wise Type I error rate to no more that 0.05. However, this reduces the power of each of the pairwise comparisons, and only very large differences are detected (a consequence that exacerbates with an increasing number of groups).

Planned comparisons are specific comparisons that are usually planned during the design stage of the experiment. No more than (p-1, where p is the number of groups) comparisons can be made, however, each comparison (provided it is non-orthogonal) can be tested at &alpha = 0.05. Amongst all possible pairwise comparisons, specific comparisons are selected, while other meaningless (within the biological context of the investigation) are ignored.

Planned comparisons are defined using what are known as contrasts coefficients. These coefficients are a set of numbers that indicate which groups are to be compared, and what the relative contribution of each group is to the comparison. For example, if a factor has four groups (A, B, C and D) and you wish to specifically compare groups A and B, the contrast coefficients for this comparison would be 1, -1, 0,and 0 respectively. This indicates that groups A and B are to be compared (since their signs oppose) and that groups C and D are omitted from the specific comparison (their values are 0).

It is also possible to compare the average of two or more groups with the average of other groups. For example, to compare the average of groups A and B with the average of group C, the contrast coefficients would be 1, 1, -2, and 0 respectively. Note that 0.5, 0.5, 1 and 0 would be equivalent.

The following points relate to the specification of contrast coefficients:

  1. The sum of the coefficients should equal 0
  2. Groups to be compared should have opposing signs

End of instructions

  Tukey's Test

# load the 'multcomp' package
> library(multcomp)
# perform Tukey's test
> summary(simtest(dv~factor, data=data, type="Tukey"))
# OR the more up to date method
> summary(glht(data.aov, linfct=mcp(factor="Tukey")))

where dv is the name of the numeric vector (dependent variable), factor is the name of the factor (categorical or factor variable, data.aov is the name of the fitted ANOVA model and data is the name of the data frame (data set). The '~' is used in formulas to represent that the left hand side is proportional to the right hand side. The argument type="Tukey" indicates that a Tukeys p-value adjustments should be made.


The coefficients list the specific comparisons made. Each line represents a specific hypothesis test including difference between groups, t-value, raw p value and Tukey's adjusted p value. Note, the p-value calculations involve a randomization procedure (making them more robust to unequal sample sizes) and thus they will differ slightly each time they calculated.

End of instructions

  Symbols on bargraphs

Add symbols to the bars such that common symbols signify non-significant comparisons and different symbols signify significant differences. As these graphs can end up with lots of different symbols, they are referred to by some (opponents) as 'alphabet soup'.

> Mbargraph(day$BARNACLE, day$TREAT, symbols=c('AB','A','BC','C'))

where xs is a list of the x-coordinates of the bar graph bars (as generated by the barplot() function), means+5 is a list of y-coordinates of where the labels should be positioned and labels is a list of the labels to add to the graph. Note, there should be as many labels as there are bars.

In the following graph the mean of group a was found to be significantly different from the means of both groups b and c whilst the means of groups b and c were not found to be significantly different from one another.


End of instructions

  Reordering factor levels

> dv <- factor(dv, levels=c(ordered levels))

where dv is the name of the numeric vector (dependent variable) and ordered levels is a list of factor levels listed in the preferred order
Note that the data will not appear to have changed at all, it is only during the presentation of results and graphs that reordering the factor levels makes a difference.

End of instructions

  ANOVA output with planned contrasts

> summary(name.aov, split=list)

where name.aov is the name of the fitted linear model and list is the list of labels associated with the defined contrasts. For example:

> summary(partridge.aov, split=partridge.list)

End of instructions

  Power and ANOVA in R

Start off why defining some of the values, such as the usual (background or control) population mean and standard deviation.

> mean1 <- 1.65
> sd1 <- 0.51

Then define the effect size and expected between group variation. This reflects the of the amount of change (usually as a percentage) as well as the nature of the differences between groups. The following illustration defines between group variation based on three of the groups having equal population means that are 50% greater than another group.

> es<- .50
> mean2 <- mean1+(mean1*es)
> b.var <- var(c(mean1, mean2, mean2, mean2))

An alternative definition of expected between group variation might represent two of the groups having equal population means that are 50% greater than another groups.

> es<- .50
> mean2 <- mean1+(mean1*es)
> b.var <- var(c(mean1, mean1, mean2, mean2))

Alternatively, we could define the expected between group variation in terms of detecting a 50% increase in means according to some polynomial trend. The expected means are then calculated as a regular sequence.

> es<- .50
> mean2 <- mean1+(mean1*es)
> b.var <- var(seq(mean1, mean2, length=4))

Having defined the expected between group variation, power analysis can be performed as follows:

> power.anova.test(group=4, power=.8, between.var=b.var, within.var=sd1^2)

These instructions are altered by customising the groups size, power etc.

End of instructions

  Power envelope for ANOVA

A power envelope is the area between maximum and minimum power curves. When the design incorporates four groups (A, B, C, and D), the easiest pattern to detect (maximum power relationship) is A=B<C=D (or equivelently, A=B>C=D), and the most difficult pattern (minimum power relationship) is A<B=C<D (or A>B=C>D), where B and C are equidistant from both A and D. Similarly, if there are five groups (A, B, C, D and E), the easiest pattern to detect (maximum power relationship) is A=B<C=D=E (or equivelently, A=B>C=D=E, A=B=C>D=E, A=B=C<D=E,), and the most difficult pattern (minimum power relationship) is A<B=C=D<E (or A>B=C=D>E), where B and C and D are equidistant from both A and E.

> x=seq(3,30,l=100)
# generate maximum power curve
> b.var <- var(c(mean1, mean1, mean2, mean2))
> plot(x, power.anova.test(group=4, n=x, between.var=b.var, within.var=sqrt(sd1))$power, ylim=c(0,1), ylab="power", xlab="n", type="l")
# put in a line at power=0.8
> abline(0.8, 0, lty=2)
# generate the minimum power curve
> m<-mean(c(mean1,mean2))
> b.var <- var(c(mean1,m,m,mean2))
> points(x,power.anova.test(group=4,n=x,between.var=b.var, within.var=sqrt(sd1))$power, type="l",lty=4)

End of instructions