Top: Index Previous: Overexpression Up: Microarray Practical Next: Heatmaps

CSC8309 -- Gene Expression and Proteomics

Limma

Limma is a commonly used package for microarray analysis in BioConductor. It can be used for analysing all sorts of microarray data, not just expression arrays.

The name is derived from 'Linear Models for MicroArray'. Limma is especially useful for determining significant changes where there are a number of variables to test (for instance if you had an experiment that had a mixture of genotypes, drug responses and timepoints) - a situation that is less well catered for in ANOVA analysis (a common microarray analyis technique for identifying differentially regulated genes for 1 or 2 variables).

Limma is used here to provide a 'moderated t-statistic' as part of the test for significantly changed genes. This moderated t-statistic 'borrrows' variability information from other genes on the chip in order to calculate the t-statistic, rather than considering each gene in isolation. This moderated t statistic and the empirical Bayes method used to derive it are only available in the limma package.

The procedure for producing the volcano plot is the same as the previous section, only using the more accurate limma derived numbers in order to derive a final list of changed genes (those with a large fold change on the lod scale, that also have small p-values from the moderated t-statistic).

act

Evaluate the following code

library(limma)

# Construct population.groups for model design
population.groups <- factor(c(rep('Taiwan/Susceptible',3), rep('Hawaii/Resistant',3)))
design <- model.matrix( ~ population.groups )

# Display design
design

# Fit linear model
fit <- lmFit(eset, design)

# Check the dimension of fit$coeff; should be 37744 x 2
dim(fit$coeff)

# Execute eBayes to obtain attenuated t statistics
fit.eBayes <- eBayes(fit)

# Check dimension of fit.eBayes$t; should be 37744 x 2
dim(fit.eBayes$t)

# Compute statistics necessary for Volcano Plot
# using attenuated t statistics
lodd <- -log10(fit.eBayes$p.value[,2])
oo2 <- order(abs(fit.eBayes$t[,2]), decreasing = TRUE)[1:50]
oo <- union(o1, oo2)
ii <- intersect(o1, oo2)

# Display ii
ii

# Construct Volcano plot using attenuated t statistics
plot(Difference[-oo], lodd[-oo], cex = .25, xlim = c(-3,3), ylim = range(lodd), xlab = 'Average (log) Fold-change', ylab = 'LOD score - Negative log10 of P-value')
points(Difference[o1], lodd[o1], pch = 18, col = 'blue', cex = 1.5)
points(Difference[oo2], lodd[oo2], pch = 1, col ='red',cex = 2,lwd = 2)
abline(h = 3)
title('Volcano Plot with moderated t statistics')
text(-2, 3.2, 'p < 0.001')
text(1, 4, 'Nine intersects')

(Complete File)(Rout)

The results are here

quest
  1. What documentation exists for the limma package?
  2. Find a published paper (other than the one this practical is based around) which uses limma.
  3. How would you analyse the genes of interest?

Top: Index Previous: Overexpression Up: Microarray Practical Next: Heatmaps