CSC8309 -- Gene Expression and Proteomics
Biological Introduction
Now that we have a list of proteins, in a useful form (UniProt accession
numbers), we can perform some functional analysis and get to know a bit more
about them, how they might be related, what processes they take part in etc.
Introduction to Cytoscape
For this analysis we are going to use the visualisation and analysis platform
Cytoscape. We start with a tutorial on the use of Cytoscape, and later apply
what we have learnt to the data set from the previous section.
|
- Find the website and download the Cytoscape installer.
- Install Cytoscape (double-click the installer and follow the on-screen
instructions).
- Download the example data set.
|
The data you have downloaded is a small subset of a large human
protein-protein interaction network. It is centred around the interactions of
TP53.
|
- Load the downloaded data into Cytoscape (File -> Import -> Network
(Multiple file types)...)
|
The network should load into the main panel of the Cytoscape window, and in
CytoPanel 1 (on the left hand side), you should see some basic statistics
about the network (number of nodes and edges). Cytoscape should currently look
like this.
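The node and edge counts that Cytoscape reports can be checked independently of the program. The following is a minimal sketch in Python, assuming the network is stored in Cytoscape's simple interaction format (SIF), where each line reads "source interaction-type target [target ...]". The function name network_stats and the toy lines are illustrative only and are not taken from the example data set.

```python
def network_stats(sif_lines):
    """Count distinct nodes and distinct undirected edges in SIF-style lines."""
    nodes = set()
    edges = set()
    for line in sif_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        source = fields[0]
        nodes.add(source)
        # A SIF line may list several targets after the interaction type.
        for target in fields[2:]:
            nodes.add(target)
            edges.add(frozenset((source, target)))
    return len(nodes), len(edges)


# Toy input (assumption): four interactions keyed by Entrez ID.
example = [
    "7157 pp 4193",
    "7157 pp 1026",
    "4193 pp 1026",
    "7157 pp 672",
]
n_nodes, n_edges = network_stats(example)
print(n_nodes, n_edges)
```

Note that this sketch counts each unordered pair once, so its edge count may be lower than the one Cytoscape shows if the file lists the same interaction more than once.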
Network Layout and Navigation
Cytoscape has many different network layout algorithms included in the build
(see the layout menu). Different layouts are useful in different situations,
depending on the shape of the network and what you are trying to show.
|
- The default grid layout of the network is not very good for seeing the
connections between nodes in the protein-protein interaction network you
have loaded, so change it using the Layout menu (Layout -> Cytoscape
Layouts -> Spring Embedded). Cytoscape should now look like this.
- Another good layout is yFiles -> Organic.
|
Beware, some layouts can take a long time to calculate, or can cause Cytoscape
to hang altogether.
|
- On the main canvas, you can select individual nodes by clicking on them
(they will turn yellow). Groups of nodes can be selected by dragging out a
box around them (all nodes and edges touched by the box will be selected).
The selected nodes can be moved around the canvas by holding the left mouse
button and dragging the mouse around. Try it out.
- You can zoom into the selected node(s) by clicking on the corresponding
toolbar icon; the 1:1 icon returns you to a view of the whole network.
- You can zoom in and out of the current view by holding the right mouse
button and moving the mouse, and you can pan around the network by holding
the middle button (scroll wheel) down and moving the mouse around.
|
Node Attributes
Notice that when you select a node, or group of nodes, information about them
appears in the Node Attribute Browser, at the bottom of the Cytoscape Window.
You can change the information that is displayed by clicking on the attribute
selection icon and checking the appropriate boxes in the drop-down list that
appears. Currently there are no extra attributes to display, because the input
file that contained the network was very sparse (the nodes just have an ID
number, which is actually the numerical Entrez ID of the gene).
|
- Download the Node Attribute File.
- Go to File -> Import -> Node Attributes, and load the file you just
downloaded into Cytoscape.
- Close the dialog box and have a look at the node attributes available in
the attribute browser now. You can now view the official HUGO symbol for
all the nodes.
|
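Importing node attributes amounts to joining a table of values onto the network by node ID. The sketch below shows the idea in Python. It assumes a layout in which the first line names the attribute and each later line reads "node ID = value"; the actual layout of the course file may differ, and the attribute name HUGO_symbol and the function parse_node_attributes are hypothetical.

```python
def parse_node_attributes(lines):
    """Return (attribute name, {node ID: value}) from 'ID = value' lines."""
    rows = [line.strip() for line in lines if line.strip()]
    name = rows[0]  # first non-blank line names the attribute
    values = {}
    for row in rows[1:]:
        node_id, _, value = row.partition("=")
        values[node_id.strip()] = value.strip()
    return name, values


# Toy attribute file (assumption): Entrez ID mapped to gene symbol.
attribute_lines = [
    "HUGO_symbol",
    "7157 = TP53",
    "4193 = MDM2",
]
attr_name, symbols = parse_node_attributes(attribute_lines)
print(attr_name, symbols["7157"])
```

Nodes that have no line in the attribute file simply keep an empty value for that attribute, which is why the browser can show blanks for some nodes.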
Searching a Network
|
- Use the search box in the toolbar. The network must be indexed on the node
attribute you wish to search (use the settings icon next to the search box).
Alternatively, hit Ctrl-F to get a dialog box (this only searches by node ID).
- Find and select TP53 (7157).
- Create a smaller network by Select -> Nodes -> First neighbours of selected
nodes, then File -> New -> Network -> From selected nodes, all edges.
- Improve the layout, and use the 1:1 button to zoom in.
|
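The two menu steps above (select the first neighbours, then build a new network from the selected nodes with all edges) can be sketched in Python as follows. This is an illustration of the idea rather than Cytoscape's own code: the function name first_neighbour_subnetwork and the toy edge list are assumptions, and edges are treated as undirected pairs of Entrez IDs.

```python
def first_neighbour_subnetwork(edges, seed):
    """Return the seed plus its direct neighbours, and every edge among them."""
    selected = {seed}
    for a, b in edges:
        if a == seed:
            selected.add(b)
        elif b == seed:
            selected.add(a)
    # "All edges": keep any edge whose two ends are both selected,
    # including edges between neighbours that do not touch the seed.
    sub_edges = [(a, b) for a, b in edges if a in selected and b in selected]
    return selected, sub_edges


# Toy network (assumption), with TP53 (7157) as the seed.
toy_edges = [
    ("7157", "4193"),
    ("7157", "1026"),
    ("4193", "1026"),
    ("1026", "595"),
    ("595", "1019"),
]
sub_nodes, sub_edges = first_neighbour_subnetwork(toy_edges, "7157")
print(sorted(sub_nodes), sub_edges)
```

The edge between 4193 and 1026 is kept even though neither end is the seed, which is the difference between choosing "all edges" and keeping only the edges that touch TP53.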
Analysis
Cytoscape alone has a number of useful features, but these can be extended by
adding plugins to the program. Plugins have to be installed before they can be
used, and there are two ways of doing this. You can drop the appropriate .jar
file into the Cytoscape plugins folder, or you can use the built-in plugin
installer (which installs them to a .cytoscape directory in your home
directory). Some plugins install only one way and some work with both; it
varies from plugin to plugin.
|
- Install the MCODE plugin. Try the Cytoscape plugin installer first; if it
doesn't work, you'll have to get the .jar file yourself (a web search will
find it).
- Install the BiNGO plugin, code here if necessary.
- If you have had to restart Cytoscape (which you will if the Cytoscape plugin installer doesn't work), reload the example data. Otherwise destroy the small network you made in the "Searching a Network" section, and go back to the original network.
- Start the MCODE plugin (Plugins -> MCODE -> Start MCODE).
- A new panel appears in CytoPanel 1 (on the left of the window). Click the 'Analyze' button.
|
|
- What is the function of the MCODE plugin?
- What are the resulting networks likely to be (when working with protein-protein interactions)?
|
|
- Select the highest scoring network produced by MCODE. See what happens in the main network, which has changed its visual style as a result of using MCODE (read the documentation to find out what the node shapes and colours mean).
- You may have to select 'View -> Show Results Panel' if it does not appear automatically.
- Click the 'Create Sub-Network' button to copy the nodes and edges out to a fresh network.
- Click Select -> Nodes -> Select all nodes.
- You may need to install the BiNGO plugin!
- Start the BiNGO plugin (Plugins -> BiNGO).
|
BiNGO assesses the over-representation of Gene Ontology terms in a given
network or list of genes/proteins. If the network found by MCODE is a genuine
protein complex, then a particular molecular function or biological process
should be over-represented in the proteins concerned.
|
- Explain the use of the Hypergeometric Distribution (used in BiNGO and in GOstats, a Bioconductor package).
|
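As a starting point for thinking about the question above, here is a minimal Python sketch of a one-sided hypergeometric test for over-representation. It is not BiNGO's or GOstats' code. The symbols are assumptions introduced for illustration: N is the number of annotated genes in the reference set, K the number of those carrying a given GO term, n the number of genes in the cluster, and k the number of cluster genes carrying the term. The p-value is the probability of seeing k or more annotated genes when n genes are drawn at random without replacement.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): the over-representation p-value."""
    total = comb(N, n)  # number of ways to draw n genes from N
    upper = min(n, K)   # cannot observe more hits than draws or annotated genes
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)  # i annotated genes, n - i unannotated
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 10 cluster genes, 6 of them carry a term that
# annotates 200 of the 15000 genes in the reference set.
p = hypergeom_pvalue(k=6, n=10, K=200, N=15000)
print(p)
```

A small p-value means that the term appears in the cluster more often than random sampling from the reference set would readily explain.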
We will use BiNGO to assess the over-representation of Biological Process GO
terms in the small subnetwork you created using MCODE:
|
In the BiNGO Settings dialog box, fill in the following:
- A network name of your choice,
- Leave the 'Get cluster from network' box checked,
- Select the Hypergeometric statistical test with the FDR multiple testing correction,
- Select a cutoff p-value of 0.05. Why? A higher cutoff value would give us more results than we can review in detail,
- Select the GO categories over-represented after correction for visualisation,
- Under 'Select Reference Set', select 'Use whole annotation as reference set',
- Select an ontology of 'GO_Biological_Process' and the organism 'Homo sapiens'.
- The dialog box should now look like this.
- Hit 'Start BiNGO'.
|
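Because one test is carried out per GO term, the raw p-values need correcting for multiple testing before the 0.05 cutoff is applied. The sketch below shows the Benjamini and Hochberg procedure, a widely used FDR correction, in plain Python. It illustrates the idea only and is not BiNGO's implementation; the function name and the example p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that the adjusted values
    # never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in corrected]
print(corrected, significant)
```

A term is reported as over-represented when its corrected p-value falls at or below the chosen cutoff.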
|
- What is the function of the top MCODE scoring complex?
|