Top: Index Previous: Reading the Data Up: Microarray Practical Next: Basic Eye Balling

CSC8309 -- Gene Expression and Proteomics

Subsetting the data set

The Affymetrix chip in question contains probes from a number of species besides soybean. We don't want these in the analysis: any signal from them will be cross-hybridisation or noise. So we need to select just the probes that we want.

This section mostly consists of some fairly obscure R manipulations.

act

For this section you need two files: SpeciesAffyID.txt and SoybeanCutObjects.RData. Download these first and put them in your working directory.

Now, evaluate the following R.

##
## We are trying to do some subsetting because not all of the probes on the
## chip are from soy


## read in another data frame called Species.Affy.ID.
## this links species names to affy ids.
Species.Affy.ID <- read.table('SpeciesAffyID.txt', header = TRUE, sep = "")
dim(Species.Affy.ID)



load( 'SoybeanCutObjects.RData' )

tv.for.glycine.max <- Species.Affy.ID$species == 'Glycine max'
table( tv.for.glycine.max )
listOutProbeSets <- Species.Affy.ID$affyID[ !tv.for.glycine.max ]

length( listOutProbeSets )
is.factor( listOutProbeSets )

## Convert listOutProbeSets to a character vector,
## overwriting the original object
listOutProbeSets <- as.character(listOutProbeSets)

## Confirm that listOutProbeSets is a character vector
is.character(listOutProbeSets)

## check object
soy.ab


## this is the bit which actually removes the stuff we are not interested in
RemoveProbes(listOutProbes=NULL, listOutProbeSets, cdfpackagename, probepackagename)

## Check that the object has fewer IDs now. There should be 37444.
soy.ab
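
The selection step above can be tried in isolation on a toy table. This is only a sketch: the data frame `toy`, its probe IDs and the names `is.soy` and `drop.ids` are made up for illustration and are not part of the practical's files.

```r
# Hypothetical stand-in for Species.Affy.ID; the IDs are invented
toy <- data.frame(
  affyID  = c("probe_1_at", "probe_2_at", "probe_3_at", "probe_4_at"),
  species = c("Glycine max", "Phytophthora sojae",
              "Glycine max", "Heterodera glycines"),
  stringsAsFactors = TRUE
)

# TRUE for soybean rows, FALSE for everything else
is.soy <- toy$species == "Glycine max"
table(is.soy)

# Probe sets to discard: the rows where is.soy is FALSE
drop.ids <- toy$affyID[!is.soy]
is.factor(drop.ids)   # TRUE here, because the column was built as a factor

# as.character() returns the factor labels, not the integer codes
drop.ids <- as.character(drop.ids)
drop.ids
```

From R 4.0.0 onwards, read.table no longer converts strings to factors by default, so is.factor may return FALSE on the real data; the as.character call is harmless in that case.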

quest
  1. What does the string Glycine max represent?
  2. Why would a single chip contain probes from several species?
  3. What functions are defined in the .RData file?

Change some names

We now want to change some of the metadata associated with the array, again to make the pictures prettier.

act

Evaluate the following R.

# Start preparation for phenoData slot in AffyBatch object
pd <- data.frame(population = c(1,1,1,2,2,2), replicate = c(1,2,3,1,2,3))

# Display contents of pd
pd

# Assign the sampleNames(soy.ab) to the rownames of pd
rownames(pd) <- sampleNames(soy.ab)

# Display contents of pd again, notice change in rownames
pd

## Continue preparation for phenoData slot
metaData <- data.frame(labelDescription = c( 'population', 'replicate' ))

## Establish new phenoData slot
phenoData(soy.ab) <- new( 'AnnotatedDataFrame', data = pd, varMetadata = metaData)

## Display pData(soy.ab)
pData(soy.ab)

## Display phenoData(soy.ab)
phenoData(soy.ab)
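
The same sample table can be built and checked without the AffyBatch object. This is a sketch in base R only: `sample.names` is a hypothetical stand-in for sampleNames(soy.ab), and naming the rows of metaData after the columns of pd is an optional extra that makes the pairing explicit.

```r
# Hypothetical sample names standing in for sampleNames(soy.ab)
sample.names <- paste0("chip", 1:6, ".CEL")

# rep() builds the same design as c(1,1,1,2,2,2) and c(1,2,3,1,2,3)
pd <- data.frame(population = rep(1:2, each = 3),
                 replicate  = rep(1:3, times = 2))
rownames(pd) <- sample.names

# One description per column of pd, in the same order as the columns
metaData <- data.frame(labelDescription = c("population", "replicate"),
                       row.names = colnames(pd))

# Sanity checks: one row per sample, one description per column
stopifnot(nrow(pd) == length(sample.names),
          identical(rownames(metaData), colnames(pd)))
pd
```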

quest
  1. Can you find documentation for the phenoData class?
  2. What other methods operate on this class? Can you find any more information from these?

Image set up

The next set of code does the exciting task of setting up some colour palettes. Remember, black and white is boring.

act
palette.gray <- rep(gray(0:10/10), times = seq(1, 41, by = 4))

library('RColorBrewer')
brewer.cols <- brewer.pal(6, 'Set1')
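
To see what the gray palette contains, it can be taken apart as below. This is a sketch using base R only; the names `shades` and `reps` are introduced here for illustration.

```r
# Eleven gray levels from black (0) to white (1)
shades <- gray(0:10/10)

# How many times each level is repeated: 1, 5, 9, ..., 41
reps <- seq(1, 41, by = 4)

palette.gray <- rep(shades, times = reps)

length(palette.gray)      # 231, the sum of reps
palette.gray[c(1, 231)]   # "#000000" "#FFFFFF"

# Optional: view the ramp as a colour strip in an interactive session
if (interactive()) {
  image(matrix(seq_along(palette.gray), ncol = 1),
        col = palette.gray, axes = FALSE)
}
```

Because lighter shades are repeated more often than darker ones, the palette is not a linear ramp: most of its entries are light grays.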

