
CSC8309 -- Gene Expression and Proteomics

Biological Introduction

Now that we have a list of proteins in a useful form (UniProt accession numbers), we can perform some functional analysis to learn more about them: how they might be related, what processes they take part in, and so on.

Introduction to Cytoscape

For this analysis we are going to use the visualisation and analysis platform Cytoscape. We start with a tutorial on the use of Cytoscape, and will later apply what we have learnt to the data set from the previous section.

act
  • Find the website and download the Cytoscape installer.
  • Install Cytoscape (double-click the installer and follow the on-screen instructions).
  • Download the example data set.

The data you have downloaded is a small subset of a large human protein-protein interaction network. It is centred around the interactions of TP53.

act
  • Load the downloaded data into Cytoscape (File -> Import -> Network (Multiple file types)...)

The network should load into the main panel of the Cytoscape window, and in CytoPanel 1 (on the left-hand side) you should see some basic statistics about the network (the number of nodes and edges). Cytoscape should currently look like this.
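The node and edge counts reported in the panel can be checked independently of Cytoscape. The following is a minimal sketch, assuming the network is stored as one interaction per line with the two node IDs as the first and last whitespace-separated fields; the format of the actual example file may differ. Apart from 7157 (TP53), the node names in the toy data are made up for illustration.

```python
def network_stats(edge_lines):
    """Return (number of nodes, number of edges) for an undirected edge list.

    Assumes each line holds the two node IDs as its first and last fields,
    for example '7157 pp n1'. Duplicate and reversed pairs are counted once,
    which may differ from what Cytoscape reports for the same file.
    """
    nodes = set()
    edges = set()
    for line in edge_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, target = fields[0], fields[-1]
        nodes.update((source, target))
        edges.add(frozenset((source, target)))
    return len(nodes), len(edges)


# Toy data: only 7157 (TP53) is a real ID; n1 and n2 are placeholders.
toy_lines = ["7157 pp n1", "7157 pp n2", "n1 pp n2", "n2 pp 7157"]
n_nodes, n_edges = network_stats(toy_lines)
print(n_nodes, n_edges)
```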

Network Layout and Navigation

Cytoscape has many different network layout algorithms included in the build (see the layout menu). Different layouts are useful in different situations, depending on the shape of the network and what you are trying to show.

act
  • The default grid layout makes it hard to see the connections between nodes in the protein-protein interaction network you have loaded, so change it using the Layout menu (Layout -> Cytoscape Layouts -> Spring Embedded). Cytoscape should now look like this.
  • Another good layout is yFiles -> Organic.

Beware, some layouts can take a long time to calculate, or can cause Cytoscape to hang altogether.

act
  • On the main canvas, you can select individual nodes by clicking on them (they will turn yellow). Groups of nodes can be selected by dragging out a box around them (all nodes and edges touched by the box will be selected). The selected nodes can be moved around the canvas by holding the left mouse button and dragging the mouse around. Try it out.
  • You can zoom into the selected node(s) by clicking on the zoom-to-selection icon in the toolbar; the 1:1 icon returns you to a view of the whole network.
  • You can zoom in and out of the current view by holding the right mouse button and moving the mouse, and you can pan around the network by holding the middle button (scroll wheel) down and moving the mouse around.

Node Attributes

Notice that when you select a node, or group of nodes, information about them appears in the Node Attribute Browser at the bottom of the Cytoscape window. You can change the information that is displayed by clicking on the attribute selection icon in the browser and checking the appropriate boxes in the drop-down list that appears. Currently there are no extra attributes to display, because the input file that contained the network was very sparse: each node just has an ID number, which is actually the numerical Entrez ID of the gene.

act
  • Download the Node Attribute File.
  • Go to File -> Import -> Node Attributes, and load the file you just downloaded into Cytoscape.
  • Close the dialog box and have a look at the node attributes available in the attribute browser now. You can now view the official HUGO symbol for all the nodes.
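Conceptually, the attribute file is just a lookup table from node ID to a value. The following is a minimal sketch, assuming a layout with the attribute name on the first line followed by `ID = value` lines; the downloaded file may be laid out differently, and the attribute name `HUGO_Symbol` is a hypothetical label chosen for this example.

```python
def parse_node_attributes(lines):
    """Parse an attribute listing into (attribute_name, {node_id: value}).

    Assumed layout (not confirmed for the course file):
        line 1:      the attribute name
        other lines: '<node id> = <value>'
    """
    cleaned = [line.strip() for line in lines if line.strip()]
    if not cleaned:
        return None, {}
    attribute_name = cleaned[0]
    values = {}
    for line in cleaned[1:]:
        node_id, separator, value = line.partition("=")
        if not separator:
            continue  # skip lines without an '=' sign
        values[node_id.strip()] = value.strip()
    return attribute_name, values


# Illustrative content: Entrez ID 7157 is TP53 and 4193 is MDM2.
example_lines = ["HUGO_Symbol", "7157 = TP53", "4193 = MDM2"]
name, symbols = parse_node_attributes(example_lines)
print(name, symbols["7157"])
```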

Searching a Network

act
  • Use the search box in the toolbar. The network must first be indexed on the node attribute you wish to search (use the settings icon next to the search box). Alternatively, hit Ctrl-F to get a dialog box, which only searches by node ID.
  • Find and select TP53 (7157).
  • Create a smaller network by Select -> Nodes -> First neighbours of selected nodes, then File -> New -> Network -> From selected nodes, all edges.
  • Improve the layout, and use the 1:1 button to zoom in.
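The two menu steps above (expand the selection to first neighbours, then build a network from the selected nodes with all edges between them) can be sketched in a few lines. This is an illustration of the idea rather than Cytoscape's own implementation; the function names are hypothetical, and apart from 7157 (TP53) the node names are placeholders.

```python
def first_neighbours(edges, selected):
    """Return the selected nodes plus every node sharing an edge with one."""
    selected = set(selected)
    expanded = set(selected)
    for a, b in edges:
        if a in selected:
            expanded.add(b)
        if b in selected:
            expanded.add(a)
    return expanded


def induced_subnetwork(edges, nodes):
    """Keep only the edges whose two endpoints are both in `nodes`."""
    return [(a, b) for a, b in edges if a in nodes and b in nodes]


# Toy network: n3 and n4 are not directly connected to 7157.
toy_edges = [("7157", "n1"), ("7157", "n2"), ("n1", "n2"),
             ("n2", "n3"), ("n3", "n4")]
sub_nodes = first_neighbours(toy_edges, {"7157"})
sub_edges = induced_subnetwork(toy_edges, sub_nodes)
print(sorted(sub_nodes), sub_edges)
```

Note that the edge between n1 and n2 is kept even though neither endpoint is TP53; this is what "all edges" means in the menu entry.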

Analysis

Cytoscape alone has a number of useful features, but these can be extended and enhanced by adding plugins. Plugins have to be installed before they can be used, and there are two ways of doing this: you can drop the appropriate .jar into the Cytoscape plugins folder, or you can use the built-in plugin installer (which installs them to a .cytoscape directory in your home directory). Some plugins install only one way and some work with both, so check each plugin individually.

act
  • Install the MCODE plugin. Try the Cytoscape plugin installer first, but if it doesn't work you'll have to get the .jar. Google will provide.
  • Install the BiNGO plugin, code here if necessary.
  • If you have had to restart Cytoscape (which you will if the Cytoscape plugin installer doesn't work), reload the example data. Otherwise, destroy the small network you made in the "Searching a Network" section and go back to the original network.
  • Start the MCODE plugin (Plugins -> MCODE -> Start MCODE).
  • A new panel appears in CytoPanel 1 (on the left of the window). Click the 'Analyze' button.
quest
  1. What is the function of the MCODE plugin?
  2. What are the resulting networks likely to be (when working with protein-protein interactions)?
act
  • Select the highest scoring network produced by MCODE and see what happens in the main network. Its visual style has changed as a result of using MCODE; read the documentation to find out what the node shapes and colours mean.
  • You may have to select 'View -> Show Results Panel' if it does not appear automatically.
  • Click the 'Create Sub-Network' button to copy the nodes and edges out to a fresh network.
  • Click Select -> Nodes -> Select all nodes.
  • If you have not already installed the BiNGO plugin, do so now.
  • Start the BiNGO plugin (Plugins -> BiNGO).

BiNGO assesses the over-representation of Gene Ontology terms in a given network or list of genes/proteins. If the network found by MCODE is a genuine protein complex, then a particular molecular function or biological process should be over-represented in the proteins concerned.
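Over-representation of a single GO term can be scored as a hypergeometric tail probability: the chance of seeing at least the observed number of annotated proteins in the cluster if the cluster were drawn at random from the reference set. The following is a minimal sketch of that calculation, not BiNGO's own code; the function and parameter names are hypothetical and the numbers in the usage example are invented.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Return P(X >= k) for a hypergeometric random variable X.

    N: proteins in the reference set
    K: proteins in the reference set annotated with the GO term
    n: proteins in the cluster being tested
    k: proteins in the cluster annotated with the GO term
    """
    upper = min(n, K)  # X cannot exceed the cluster size or K
    total = comb(N, n)  # number of ways to draw the cluster
    # comb(a, b) is 0 when b > a, so impossible outcomes add nothing.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Invented example: 3 of 10 reference proteins carry the term,
# and both proteins in a cluster of size 2 carry it.
p = hypergeom_pvalue(k=2, n=2, K=3, N=10)
print(p)
```

A small p-value means the term appears in the cluster more often than random sampling would be expected to produce.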

quest
  1. Explain the use of the Hypergeometric Distribution (used in BiNGO and in GOstats, a Bioconductor package).

We will use BiNGO to assess the over-representation of Biological Process GO terms in the small subnetwork you created using MCODE:

act
In the BiNGO Settings dialog box, fill in the following:
  • A network name of your choice,
  • Leave the 'Get cluster from network' box checked,
  • Select the Hypergeometric statistics test with the FDR multiple testing correction,
  • Select a cutoff p-value (significance level) of 0.05. Why? A higher cutoff value would give us more data than we can review in detail,
  • Select the GO categories over-represented after correction for visualisation,
  • Under 'Select Reference Set', select 'Use whole annotation as reference set',
  • Select an ontology of 'GO_Biological_Process' and the organism 'Homo sapiens'.
  • The dialog box should now look like this.
  • Hit 'Start BiNGO'.
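Because one test is run per GO term, the raw p-values need correcting before they are compared with the 0.05 cutoff. The following is a minimal sketch of such a correction, assuming the FDR option selected above refers to the Benjamini-Hochberg procedure; the function name is hypothetical and the p-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Invented raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(corrected) if q <= 0.05]
print(corrected, significant)
```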
quest
  1. What is the function of the top MCODE scoring complex?
