The authors have developed CytoKavosh: a Cytoscape Plugin for Finding Network Motifs in Large Biological Networks. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
Conceived and designed the experiments: AMN MA. Performed the experiments: MA ASY. Analyzed the data: MA SK ZRMK. Contributed reagents/materials/analysis tools: MA SK. Wrote the paper: MA AMN. Checked and validated the results: SK ASY.
Network motifs are small connected subgraphs that have recently attracted much attention as a means of uncovering the structural behavior of large and complex networks. Finding motifs of any size is one of the most important problems in the analysis of such networks, and it requires fast and reliable algorithms and tools. CytoKavosh is one of the best choices for finding motifs of any given size in any complex network. It relies on a fast algorithm, Kavosh, which makes it faster than other existing tools. The Kavosh algorithm combines several well-known algorithmic techniques in a way that makes it efficient in this field. CytoKavosh is a Cytoscape plugin that finds motifs of a given size in a directed or undirected network previously loaded into the Cytoscape workspace. Its high performance is achieved by dynamically linking the highly optimized C++ functions of Kavosh to the Cytoscape Java program, which makes the plugin suitable for analyzing large biological networks. Significant attributes of CytoKavosh are its efficiency in time and memory usage and the absence of any implementation-related limit on motif size. CytoKavosh is implemented in the visual environment of Cytoscape, which makes it convenient for users to interact with the results and to create visual options for analyzing the structural behavior of a network. The plugin works on any given network, is simple to use, and generates graphical results for the discovered motifs with any required details. No other Cytoscape plugin is dedicated to finding network motifs based on the original concept. We therefore introduce CytoKavosh as the first such plugin, and we hope that it can be extended with further options to make it the best motif-analyzing tool.
The network concept is widely used to analyze and predict the dynamics of a complex system
The main attributes of biological networks are their complexity and the vast amount of data they contain, so extracting meaningful information from them requires powerful and accurate methods. Motifs are the building blocks of complex networks, providing a bridge between local vertex-related properties and global functional properties of networks. Motif analysis is notably important because motifs may reflect functional properties
A motif is a small connected graph expressed as G. The size of a motif is represented by the number of vertices (nodes)
Here, we introduce CytoKavosh as the first network motif finder plugin for Cytoscape. It uses all the features of Kavosh and strengthens studies of network motif finding based on Milo
CytoKavosh is implemented in Java and uses the Cytoscape API. It is built on the Kavosh algorithm, which makes it faster than similar programs based on other algorithms. CytoKavosh can detect network motifs in both directed and undirected networks. The main idea of the enumeration is based on Kavosh
For counting the subgraphs of size
The protocol for extracting subgraphs makes use of the composition operation of an integer. For the extraction of subgraphs of size
To clarify the algorithm, it is necessary to mention that for a particular level
(a) Trees built according to the (1,1,1) pattern. According to this pattern, after vertex 1 is selected as the root, one of its neighbors must be selected, so the second selected vertex is vertex 2. The process continues by selecting one of the neighbors of vertex 2 (vertex 6), and then vertex 4. All chosen vertices are marked with circles in these figures. (b) Trees built according to the (1,2) pattern. (c) Trees built according to the (2,1) pattern. (d) Tree built according to the (3) pattern.
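The four patterns named in this caption, (1,1,1), (1,2), (2,1) and (3), are exactly the compositions of the integer 3, that is, the ordered ways of writing 3 as a sum of positive integers. The following Python sketch lists the compositions of an integer. It is an illustration only, not the Kavosh code: the function name `compositions` is hypothetical, and reading each entry as the number of vertices chosen at one level below the root is an interpretation based on the caption.

```python
from typing import Iterator, Tuple


def compositions(n: int) -> Iterator[Tuple[int, ...]]:
    """Yield every ordered tuple of positive integers that sums to n."""
    if n == 0:
        # The empty tuple is the only composition of zero.
        yield ()
        return
    for first in range(1, n + 1):
        # Fix the first part, then compose whatever remains.
        for rest in compositions(n - first):
            yield (first,) + rest


if __name__ == "__main__":
    # Subgraphs of size 4: the root accounts for one vertex,
    # so the remaining 3 vertices are distributed over the levels below it.
    for pattern in compositions(3):
        print(pattern)
```

An integer n has 2^(n-1) compositions, so the number of patterns to explore for each root grows quickly with the motif size.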
Once a subgraph has been found as a match in the target network, its isomorphism class must be identified so that the size of each class in the target network can be evaluated. For this purpose, Kavosh employs the Nauty algorithm
Random networks are generated in CytoKavosh and Kavosh in a way similar to the switching operations of Milo's random model. Generating random networks is an essential step in any motif discovery algorithm, because the concept of a network motif is meaningful only when the frequency of each subgraph in the given network is compared with the frequency we expect in random networks. Switching operations produce more restricted random networks. The operation is applied to the edges of the input network, at randomly chosen nodes, repeatedly until the network is well randomized. Repeating this procedure on the input network yields an ensemble of random networks.
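A switching operation of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CytoKavosh implementation: the function name `switch_randomize`, the choice of picking two random edges a→b and c→d and rewiring them to a→d and c→b, the rejection of self-loops and duplicate edges, and the `attempts` parameter are all assumptions made for the example.

```python
import random
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]


def switch_randomize(edges: List[Edge], attempts: int,
                     seed: Optional[int] = None) -> List[Edge]:
    """Randomize a directed edge list by repeated edge switching.

    Each accepted switch turns a->b, c->d into a->d, c->b, so every
    vertex keeps its original in-degree and out-degree.
    """
    rng = random.Random(seed)
    current = list(edges)
    if len(current) < 2:
        return current  # nothing to switch
    present: Set[Edge] = set(current)
    for _ in range(attempts):
        i, j = rng.sample(range(len(current)), 2)
        (a, b), (c, d) = current[i], current[j]
        new1, new2 = (a, d), (c, b)
        # Assumption: reject switches that create self-loops or duplicates.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {new1, new2}
        current[i], current[j] = new1, new2
    return current


if __name__ == "__main__":
    network = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(switch_randomize(network, attempts=100, seed=1))
```

Calling the function several times with different seeds on the same input gives an ensemble of random networks that share the degree sequence of the input network.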
CytoKavosh improves CPU time and memory usage in comparison with other algorithms. It can also find motifs of size greater than eight, whereas most other algorithms are restricted to motifs of size eight or smaller. In addition, CytoKavosh performs better than other algorithms on large networks.
The time complexity of the algorithm is
The upper bound of the memory required by CytoKavosh can be measured by formula
As with other algorithms, it is the nature of the given network that determines the processing and storage capacity required to find motifs of a specific size in that network. What makes Kavosh different is its enumeration method, which costs less for each subgraph found in the network. In other words,
CytoKavosh uses Java to execute the native Kavosh program, which is written in C++. CytoKavosh can also group isomorphic subgraphs by computing their canonical labeling. The native C++ code uses the Nauty API to compute the canonical labeling, which makes the native code very fast. We therefore decided to import this native C++ code so that the plugin is as fast as the original C++ program.
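The idea of grouping isomorphic subgraphs through a canonical labeling can be illustrated with a small Python sketch. It does not use Nauty: as a stand-in, it tries every vertex ordering and keeps the smallest adjacency bit string, which is only practical for very small subgraphs. The names `canonical_label` and `group_by_class` are hypothetical.

```python
from itertools import permutations
from typing import Dict, Iterable, List, Optional, Set, Tuple

Edge = Tuple[int, int]


def canonical_label(vertices: List[int], edges: Set[Edge]) -> str:
    """Return the smallest row-major adjacency bit string over all
    vertex orderings. Brute force: k! orderings for k vertices."""
    best: Optional[str] = None
    for order in permutations(vertices):
        bits = "".join(
            "1" if (u, v) in edges else "0"
            for u in order
            for v in order
        )
        if best is None or bits < best:
            best = bits
    return best if best is not None else ""


def group_by_class(
    subgraphs: Iterable[Tuple[List[int], Set[Edge]]]
) -> Dict[str, int]:
    """Count how many subgraphs fall into each isomorphism class."""
    counts: Dict[str, int] = {}
    for vertices, edges in subgraphs:
        label = canonical_label(vertices, edges)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    path_a = ([1, 2, 3], {(1, 2), (2, 3)})             # 1 -> 2 -> 3
    path_b = ([7, 8, 9], {(9, 7), (7, 8)})             # 9 -> 7 -> 8
    feed_forward = ([1, 2, 3], {(1, 2), (2, 3), (1, 3)})
    print(group_by_class([path_a, path_b, feed_forward]))
```

Two subgraphs receive the same label exactly when they are isomorphic, so the label can serve as a dictionary key. A dedicated library such as Nauty reaches the same goal without enumerating all orderings.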
Network    4-size motif    5-size motif    6-size motif    7-size motif
           2               16              149             1407
           0.45            2.37            11.8            73.11
           0.3             1.89            11.49           70
Rows indicate the running time (seconds) of the studied tool for each motif size.
Furthermore, processing time is more important than memory in Kavosh and can limit the size of discoverable motifs. It is difficult to specify the exact relationship between the size of motif and the processing time, but roughly speaking based on our experiments, in Kavosh it is closer to an exponential relationship than other algorithms. This is illustrated in
Illustrated results are for (a) transcriptional network of
The first step in running the plugin is to load a network into Cytoscape through its loading dialogs. The input graph (directed or undirected) is built from the loaded network. The plugin is started by choosing the “CytoKavosh” submenu from the “Plugins” menu. The next step is to specify the input parameters required by the program. These parameters are located in a separate control tab named “CytoKavosh”.
The right side of the figure shows the ‘results’ table panel after running CytoKavosh with the given input parameters. Each run of the plugin adds a table in a separate tab of the ‘result’ panel. These tabs keep the results until the plugin is closed. For larger motif sizes, the number of detected motifs increases exponentially, so the ‘results’ table can be explored page by page. The lower panel shows the graphical representation of the motif selected in the table.
The size of the motifs to be discovered is given by the user. In the current version of the plugin this size can vary from 3 to 9, although the main algorithm supports any motif size.
The significance of a subgraph is evaluated by measures such as its frequency, Z-score and P-value, which are described in detail later. The accuracy of these measures increases as the program generates more random networks.
There are two minimum thresholds, one for frequency and one for Z-score, which users can set to filter the results according to their purpose. The significance of each subgraph in the network (such as its Z-score and P-value with respect to the generated random networks) is calculated from its frequency. CytoKavosh computes and analyzes the network to find motifs in the following steps:
CytoKavosh traverses the input network and finds all subgraphs of a given size, which exist in the input network. Each subgraph should be tested for its isomorphism class.
During step 1, a binary tree is constructed. This tree holds non-isomorphic subgraphs in different paths of the tree. Traversing this tree gives the adjacency matrix of each motif, and each path from the root to a leaf corresponds to one isomorphism class.
The number stored in each leaf of the tree is the number of matches of the corresponding subgraph in the input network. As mentioned before, there are three measures for evaluating an isomorphism class of subgraphs. Each is described in detail below.
The frequency of each subgraph is calculated from the following formula:
Statistical measures such as Z-score and P-value are very important for comparing subgraphs. CytoKavosh computes the Z-score by generating random networks. One flexible feature of this tool is that the number of random networks is set by the user and can vary from 1 to 100. The Z-score for a discovered subgraph
Where
The P-value measure indicates the number of random networks in which a subgraph
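As a worked illustration of these two measures, the Python sketch below computes them from one count in the input network and a list of counts in the random networks. It assumes the definitions commonly used in motif analysis, which may differ in detail from those of CytoKavosh: the Z-score is taken as the difference between the real count and the mean random count, divided by the standard deviation of the random counts, and the P-value is taken as the fraction of random networks in which the subgraph occurs at least as often as in the input network. The function names, the use of the population standard deviation, and the handling of a zero standard deviation are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def z_score(real_count: float, random_counts: Sequence[float]) -> float:
    """(real - mean of random) / standard deviation of random."""
    mu = mean(random_counts)
    sigma = pstdev(random_counts)  # assumption: population std. deviation
    if sigma == 0:
        # Assumption: with no variation the score is unbounded or zero.
        if real_count == mu:
            return 0.0
        return float("inf") if real_count > mu else float("-inf")
    return (real_count - mu) / sigma


def p_value(real_count: float, random_counts: Sequence[float]) -> float:
    """Fraction of random networks with a count >= the real count."""
    hits = sum(1 for count in random_counts if count >= real_count)
    return hits / len(random_counts)


if __name__ == "__main__":
    # Hypothetical counts of one subgraph in 8 random networks.
    random_counts = [2, 4, 4, 4, 5, 5, 7, 9]
    print(z_score(10, random_counts), p_value(10, random_counts))
```

With only a few random networks the P-value can take very few distinct values, which is one reason why generating more random networks makes these measures more accurate.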
For each execution of the program for given input parameters, there is a separate tab in the ‘
The attributes of a motif in the resulting table are its frequency, its Z-score and the number of occurrences of that motif in the network. Sorting by these attributes helps users extract knowledge from the results using the graphical facilities of the program. One of these facilities is the graphical representation of the subgraph corresponding to the selected motif.
All 590 motifs are listed in this picture on separate pages, with only 10 motifs per page. Clicking on a row (a motif) displays a graphical image of that motif in the lower panel. To export this view, right-click on the row and export the motif to a separate network. Each run of CytoKavosh opens a new tab in the ‘result’ panel, and the number shown in <> is the number of motifs found. In this sample, there are 590 motifs of size 5 for the transcriptional network of E. coli. The subgraph count is the number of subgraphs of this motif type.
User can select desired motif in the
By right-clicking on a motif in the list, the user can export the image of that motif to a separate network view in Cytoscape. As the size of the motif increases, the number of motifs found also increases, especially in large networks.
To better demonstrate the advantages of the CytoKavosh plugin, we chose the FANMOD software for comparison in motif detection, because FANMOD is the most time-efficient of the existing tools such as MAVisto, Mfinder and MODA. The comparison with other tools is therefore limited to FANMOD.
The main advantage of the CytoKavosh plugin over FANMOD is that CytoKavosh is based on the Kavosh algorithm, which operates on motifs of any given size (including sizes greater than 8), whereas FANMOD cannot detect motifs with more than 8 vertices.
Motif size    3       4       5        6         7          8          9
CytoKavosh    0.3     1.84    14.51    141.98    1374       13173      121110
FANMOD        0.81    2.53    15.71    132.24    1205.97    92566.1    -
Rows indicate the running time (seconds) of the studied tool for each motif size. 10 random networks are generated for this comparison.
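The near-exponential growth of running time with motif size can be checked directly from the table above. The Python sketch below takes the row that reports a time for every motif size from 3 to 9 and computes the factor by which the time grows each time one vertex is added. The function names are hypothetical and the calculation is only a rough reading of the reported figures, not a complexity analysis.

```python
import math
from typing import List, Sequence


def growth_ratios(times: Sequence[float]) -> List[float]:
    """Ratio of each running time to the one before it."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


def mean_growth_factor(times: Sequence[float]) -> float:
    """Geometric mean of the step-to-step ratios."""
    steps = len(times) - 1
    return math.exp((math.log(times[-1]) - math.log(times[0])) / steps)


# Running times in seconds for motif sizes 3..9, copied from the table row
# that has an entry for every size.
SECONDS = [0.3, 1.84, 14.51, 141.98, 1374, 13173, 121110]

if __name__ == "__main__":
    for size, ratio in zip(range(4, 10), growth_ratios(SECONDS)):
        print(f"size {size - 1} -> {size}: x{ratio:.2f}")
    print(f"mean factor per added vertex: {mean_growth_factor(SECONDS):.2f}")
```

For these figures every step multiplies the running time by a factor between about 6 and 10. A roughly constant factor per added vertex is the signature of exponential growth.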
Both tools can operate on directed and undirected graphs, and both allow users to filter their results. Another advantage of CytoKavosh is its graphical options for displaying the results. FANMOD can export the table of detected motifs to a dummy file, which does not allow any modification or additional operation to be carried out on it. CytoKavosh, on the other hand, lets users get an overview of the occurrences of motifs within the whole network by highlighting them in the main network view.
There is no other Cytoscape plugin dedicated to finding network motifs in the original sense of the term introduced by Milo
In Molecular Biology, there is a need to study interactions between biological elements like DNA, RNA, proteins or genes. To model these interactions, we adopt graph theory concepts: the biological elements that interact with others are represented by vertices and their interactions are represented by edges or arcs, depending on the type of interactions. We, hence, obtain a large graph called biological network. To extract pertinent knowledge from a biological network, we adopt several techniques, like network clustering
CytoKavosh is a flexible and extensible tool for analyzing motifs in biological networks. We believe that embedding CytoKavosh in Cytoscape will further contribute to the establishment of Cytoscape as an integrated suite of tools for the analysis of biological networks. The high performance of CytoKavosh is achieved by dynamically linking the highly optimized C++ functions of the Kavosh program to the Cytoscape Java program, which makes CytoKavosh well suited to the analysis of large biological networks. As Cytoscape continues to evolve, CytoKavosh evolves alongside it. The original algorithm is highly efficient and allows further extension. In particular, future versions can extend CytoKavosh towards the evaluation of larger motifs and the parallelization of the algorithm for analyzing huge networks. Details of the main part of the CytoKavosh algorithm are explained in Appendix 1.
The tutorial, full package and associated examples are available at the following website:
We would like to thank Dr. Ameneh Javid for editing the English language of the manuscript.