
Cytoscape 3 and GNNs

USING CYTOSCAPE 3.X FOR EFI GNN NETWORKS

 

The following tutorial has been adapted from the Cytoscape Wiki to specifically address working with Genome Neighborhood Networks (GNNs) created by the Enzyme Function Initiative (EFI) tool EFI-GNT. To view Cytoscape 3’s extensive tutorial pages, please go here.

 

CONTENTS OF THIS TUTORIAL

 

1) Download and Adjust Cytoscape

2) Importing Both SSN and GNN

3) Changing Visual Aspects

4) Filtering by SSN Cluster Number

5) Eliminating Low Co-occurrence Neighbors 

6) Eliminating Spoke-less Hubs 

7) Saving Sessions 

8) Opening a Session File

 

1. DOWNLOAD AND ADJUST CYTOSCAPE

Go to http://www.cytoscape.org/download.html and download a copy of Cytoscape. To ensure that Cytoscape installs successfully, check that your machine is running an up-to-date version of the Java Runtime Environment with the correct bitness (32-bit or 64-bit, matching your Windows OS).

 

ADJUST VIEW THRESHOLD – Cytoscape renders and colors a network automatically only when the network is below a size limit. This limit is set by the “viewThreshold” property, which needs to be raised in the following manner so that your colorized SSN is displayed automatically:

  • In any version 3.X, go to Edit → Preferences → Properties...
  • With “cytoscape 3” selected in the pull-down menu at the top, scroll to the bottom of the Property list and select “viewThreshold”
  • Click “Modify” and append five zeros to the end of the displayed number
  • Click “OK”
  • Restart Cytoscape
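
Appending five zeros is the same as multiplying the displayed number by 100,000. The following is a minimal sketch of that arithmetic only; the starting value is a made-up example, and the function name `append_zeros` is hypothetical rather than part of Cytoscape.

```python
def append_zeros(view_threshold: int, zeros: int = 5) -> int:
    """Return the threshold with `zeros` zeros appended (multiplied by 10**zeros)."""
    if view_threshold <= 0:
        raise ValueError("expected a positive threshold")
    return view_threshold * 10 ** zeros


# Hypothetical starting value; use whatever number the Properties dialog displays.
displayed = 10000
print(append_zeros(displayed))  # prints 1000000000
```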

ADJUST VISUAL STYLE DEFAULT – in some Cytoscape versions, the default visual style assigns a label font color of white, which is not optimal for viewing Genome Neighborhood Networks. Changing this default style to "BioPAX" loads a better visual style automatically:

  • In any version 3.X, go to Edit → Preferences → Properties...
  • With “cytoscape 3” selected in the pull-down menu at the top, select “defaultVisualStyle”
  • Click “Modify” and type “BioPAX”
  • Click “OK”

 

2. IMPORTING BOTH SSN AND GNN

Before starting, you will need a dataset in the form of an XGMML (.xgmml) file. Run EFI-GNT with a previously generated SSN as input (or run the Test Cases at the bottom of the tool start page) and download the results. Browsers may add a file extension (.txt or .xml) to the downloaded files; if so, rename the files so that they end in .xgmml. Launch Cytoscape. You should see a window that looks like this:

 

                                        

 

 

Load your GNN network file into Cytoscape by selecting File → Import → Network → File... and navigating to the location of your .xgmml file. Once loaded, click OK on the Import Network popup window. 
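
If many downloads picked up a browser-added extension, the renaming step described earlier can be scripted. The sketch below is one possible approach, not part of the EFI tools: the function names are hypothetical, and `rename_downloads` assumes the folder holds only EFI downloads, because it renames every .txt or .xml file it finds there.

```python
from pathlib import Path
from typing import List

# Extensions a browser may append to a downloaded network file.
BROWSER_SUFFIXES = {".txt", ".xml"}


def xgmml_name(filename: str) -> str:
    """Return a file name ending in .xgmml, dropping a trailing .txt or .xml."""
    path = Path(filename)
    if path.suffix.lower() in BROWSER_SUFFIXES:
        path = path.with_suffix("")  # strip the browser-added extension
    if path.suffix.lower() != ".xgmml":
        path = path.with_name(path.name + ".xgmml")
    return path.name


def rename_downloads(folder: str) -> List[str]:
    """Rename every .txt/.xml file in `folder`; return the new file names."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in BROWSER_SUFFIXES:
            target = path.with_name(xgmml_name(path.name))
            if not target.exists():  # never overwrite an existing network file
                path.rename(target)
                renamed.append(target.name)
    return renamed


print(xgmml_name("my_gnn.xgmml.txt"))  # prints my_gnn.xgmml
```

Calling `rename_downloads` on a copy of the download folder first is a cautious way to check the result before touching the original files.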

 

 

                 

 

 

The initial view is not informative, but this is normal. Under the Layout menu, select yFiles Layouts → Organic. After a brief calculation, your screen should look like this:

 

 

                   

                

 

Now import the colorized SSN output in the same way, but be sure to select "Create new network collection" from the first pull-down menu of the Import Network pop-up prompt. After importing, apply the Organic layout to the colorized SSN as well.

 

                   

                   

 

 

3. CHANGING VISUAL ASPECTS

Depending on your Cytoscape preferences, networks may load with sub-optimal visual aspects. These are easy to change in the Style tab of the Control Panel. Two examples:

 

  • Label color: if label color is white, change to black for easier visualization.

                  

 

  • SSN node shape: if node shape is rectangular, click the "Lock node width and height" box for more uniform node appearance.

                 

 

 

4. FILTERING BY SSN CLUSTER NUMBER

Generally, the full ±10-neighbor GNN presents an overwhelming amount of information. We have found it substantially more useful to filter a GNN by some criterion, such as SSN cluster number, before further analysis. Filterable criteria are limited to the GNN node and edge attributes, described here.

In the colorized SSN, the node attribute Supercluster is an arbitrary number assigned to each SSN cluster during GNN generation. You can find the Supercluster number of an SSN cluster of interest in the Supercluster column of the Data Panel. For example, for a cluster containing a single protein with SwissProt reviewed status, you may want to check whether all sequences in that cluster share a genome context, which would allow confident transfer of the SwissProt annotation to all connected sequences. You can then filter the GNN by this Supercluster number to inspect only the Pfam neighbors in the neighborhood of that SSN cluster.
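
To make the idea of filtering by cluster number concrete, here is a minimal sketch that does the same thing outside Cytoscape by reading node attributes from XGMML text. The sample network is made up, and the attribute name "Cluster Number" is an assumption; check the column names in your own file. The function `nodes_in_cluster` is hypothetical.

```python
import xml.etree.ElementTree as ET

# XGMML elements live in this XML namespace.
XGMML_NS = {"x": "http://www.cs.rpi.edu/XGMML"}

# Tiny made-up network; ids, labels and the attribute name are illustrative only.
SAMPLE = """<graph label="demo" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="n1" label="node 1"><att name="Cluster Number" type="integer" value="3"/></node>
  <node id="n2" label="node 2"><att name="Cluster Number" type="integer" value="7"/></node>
  <node id="n3" label="node 3"><att name="Cluster Number" type="integer" value="3"/></node>
</graph>"""


def nodes_in_cluster(xgmml_text, cluster, column="Cluster Number"):
    """Return the ids of nodes whose `column` attribute equals `cluster`."""
    root = ET.fromstring(xgmml_text)
    matches = []
    for node in root.findall("x:node", XGMML_NS):
        for att in node.findall("x:att", XGMML_NS):
            if att.get("name") == column and att.get("value") == str(cluster):
                matches.append(node.get("id"))
                break  # one matching attribute per node is enough
    return matches


print(nodes_in_cluster(SAMPLE, 3))  # prints ['n1', 'n3']
```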

 

                           

 

You can filter the GNN using the Select tab at the top of the Control Panel. Click the "plus sign" button and select "Column Filter". From the "Choose column..." dropdown menu, select "Cluster Number (Supercluster)". Uncheck the "Apply Automatically" box below. Then enter the Supercluster number as both the lower and the upper limit. Click Apply. The selected nodes will be highlighted yellow (second image below).

 

 

                  

                  

 

Pressing Command+6 (Control+6 on Windows) twice extends the selection: the first press adds the nearest neighbors of your selected nodes (the hub nodes), and the second press adds the next-nearest neighbors (all other spoke nodes connected to those hub nodes).
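
The effect of the two key presses can be illustrated on a toy graph. This is a sketch of the general neighbor-expansion idea, assuming each press adds all directly connected nodes to the current selection; the node names and the function `expand_selection` are made up for illustration.

```python
def expand_selection(edges, selected, rounds=2):
    """Grow `selected` by adding directly connected nodes, once per round."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    current = set(selected)
    for _ in range(rounds):
        added = set()
        for node in current:
            added |= neighbors.get(node, set())
        current |= added
    return current


# Made-up GNN fragment: spokes s1..s3 attached to hubs h1 and h2.
EDGES = [("s1", "h1"), ("s2", "h1"), ("s3", "h2")]

print(sorted(expand_selection(EDGES, {"s1"}, rounds=1)))  # prints ['h1', 's1']
print(sorted(expand_selection(EDGES, {"s1"}, rounds=2)))  # prints ['h1', 's1', 's2']
```

Starting from spoke s1, the first round reaches hub h1 and the second round reaches the other spoke on that hub; h2 and s3 stay unselected because they are not connected to the starting node.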

 

                   

 

Create a daughter network from the selected nodes using the toolbar button at the top of the Cytoscape window (its icon shows a network connected to a file by an arrow).

 

                   

 

Re-layout this network with the Organic Layout.

 

 

5. ELIMINATING LOW CO-OCCURRENCE NEIGHBORS

You may wish to apply a more stringent co-occurrence threshold than the one chosen during network generation and remove the neighbors that fall below it. You can select nodes within a given co-occurrence range using the Select tab at the top of the Control Panel. Click the "plus sign" button and select "Column Filter". From the "Choose column..." dropdown menu, select "ClusterFraction". With the "Apply Automatically" box checked, move the right arrow of the slider to raise the ClusterFraction threshold. Alternatively, with the "Apply Automatically" box unchecked, enter the lower and upper bounds manually, for example 0.2 and 0.3, which corresponds to a co-occurrence of 20-30%.

 

ClusterFraction = Co-occurrence / 100
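
As a worked example of this conversion and of the 0.2 to 0.3 selection, consider the sketch below. The neighbor names and percentages are invented, the helper names are hypothetical, and the range is assumed to include both bounds.

```python
def cluster_fraction(co_occurrence_percent):
    """Convert a co-occurrence percentage to a ClusterFraction."""
    return co_occurrence_percent / 100


def select_in_range(fractions, lower, upper):
    """Return the names whose ClusterFraction lies within [lower, upper]."""
    return [name for name, value in fractions.items() if lower <= value <= upper]


# Invented neighbor nodes with their co-occurrence in percent.
CO_OCCURRENCE = {"neighbor_a": 25, "neighbor_b": 45, "neighbor_c": 20, "neighbor_d": 80}

fractions = {name: cluster_fraction(pct) for name, pct in CO_OCCURRENCE.items()}
to_delete = select_in_range(fractions, 0.2, 0.3)
kept = [name for name in fractions if name not in to_delete]

print(to_delete)  # prints ['neighbor_a', 'neighbor_c']
print(kept)       # prints ['neighbor_b', 'neighbor_d']
```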

 

Delete the selected nodes by selecting Edit → Delete selected nodes and edges...

 

                 

 

Following the example above, every neighbor remaining in the network now occurs in the neighborhood of at least 30% of the sequences in a given SSN cluster.

 

 

6. ELIMINATING SPOKE-LESS HUBS

Deleting spoke nodes from a GNN sometimes leaves hub nodes with no spoke nodes connected. Eliminate these spoke-less hub nodes using a Topology Filter. Using the Select tab at the top of the Control Panel, click the "plus sign" button and select "Topology Filter". Specify nodes with at least 1 neighbour within distance 1. This selects every node that still has a connection.

 

                  

Next, select the spoke-less hubs by inverting the selection: in the Select menu at the top of the Cytoscape window, click Select → Nodes → Invert Node Selection.

                   

                   

 

Delete the spoke-less hubs using Edit → Delete selected nodes and edges...
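
The logic of this section (find the nodes that still have a neighbor, invert the selection, delete what is left) can be sketched on a toy network. This is an illustration of the idea only, with made-up node names and hypothetical function names; it does not call Cytoscape.

```python
def isolated_nodes(nodes, edges):
    """Return the nodes that appear in no edge, i.e. have no neighbor at distance 1."""
    connected = set()
    for a, b in edges:
        connected.add(a)
        connected.add(b)
    return [node for node in nodes if node not in connected]


def drop_isolated(nodes, edges):
    """Return the node list with the isolated nodes removed."""
    isolated = set(isolated_nodes(nodes, edges))
    return [node for node in nodes if node not in isolated]


# Made-up network after spoke deletion: hub h2 has lost all of its spokes.
NODES = ["h1", "h2", "s1", "s2"]
EDGES = [("h1", "s1"), ("h1", "s2")]

print(isolated_nodes(NODES, EDGES))  # prints ['h2']
print(drop_isolated(NODES, EDGES))   # prints ['h1', 's1', 's2']
```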

 

                  

 

You can now re-layout the network to clean up the appearance.

 

 

7. SAVING SESSIONS

 

Cytoscape can save the entire workspace state, including networks, attributes, visual styles, properties, and window sizes, into a session file (.cys). To save a session, click the Save Session "floppy disk" icon on the toolbar; the session is written to a .cys file.

 

 

8. OPENING A SESSION FILE

 

To open a session file, click the Open Session "file folder" icon on the toolbar. A warning pop-up window will be shown. Click OK and select a session file. Everything is then restored automatically from the file.