Network analysis
Network analysis tools

The MCL software includes several programs for network analysis. The clustering program is unsurprisingly called mcl. The mcx program implements network-related functionality and the clm program implements clustering-related functionality; each has a number of modes, one dedicated to each task.

mcl

This clustering method accepts a single parameter, called inflation, that controls the granularity of the resulting clustering. Low inflation leads to coarser clusterings; high inflation leads to finer-grained clusterings. It is advisable to try a few values, for example 1.4, 2, and 6.
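To give a feel for why higher inflation produces finer clusterings, here is a minimal Python sketch of the inflation step as it is usually described for the MCL algorithm: each entry of a stochastic column is raised to the power r and the column is rescaled to sum to 1. The function name inflate and the example column are illustrative assumptions; this is not the code of the mcl program.

```python
def inflate(column, r):
    """Raise each entry to the power r, then rescale so the entries sum to 1."""
    powered = [x ** r for x in column]
    total = sum(powered)
    return [x / total for x in powered]


# A made-up stochastic column (entries sum to 1), for illustration only.
column = [0.5, 0.3, 0.2]

# Larger r concentrates more of the mass on the largest entry.
for r in (1.4, 2.0, 6.0):
    print(r, [round(x, 3) for x in inflate(column, r)])
```

The larger the exponent, the more strongly the dominant entry is favoured, which is the mechanism behind the finer granularity seen at high inflation.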

 
mcxarray

The program mcxarray constructs networks from tabular data such as that provided by gene expression arrays. Either Pearson or Spearman correlation can be used. The program can handle missing data in the form of empty columns, NA values (not available/applicable), or NaN (not a number). It is efficient and parallelized, and can handle large data sets.
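As an illustration of correlation in the presence of missing data, the Python sketch below computes a Pearson correlation using only the positions where both rows have a value. Treating missing data this way (pairwise-complete observations) is an assumption made for the example; the text above does not say which strategy mcxarray uses, and the function names are hypothetical.

```python
import math


def _present(x):
    """True if x is a usable number (not None and not NaN)."""
    return x is not None and not (isinstance(x, float) and math.isnan(x))


def pearson_pairwise(xs, ys):
    """Pearson correlation over positions where both values are present.

    Returns None when fewer than two complete pairs remain or when
    either row is constant over the complete pairs.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if _present(x) and _present(y)]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


# The third position is skipped because one of the two values is missing.
print(pearson_pairwise([1.0, 2.0, None, 4.0], [2.0, 4.0, 100.0, 8.0]))
```

An edge between two rows would then be weighted by this correlation, with low values removed by a cutoff.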

 
mcx query

The mcx program in mode query. Its main use is to vary a cutoff below which edges are removed and to emit statistics on each resulting thresholded graph, such as the number of components, the number of singletons, the average and median node degrees, and the average and median edge weights. It additionally produces a graph plotting, at each threshold, the R-squared value for log(k) versus log(#nodes of degree >= k), which is high for scale-free-ish networks. This program can be used, for example, to find a good correlation cutoff for networks created using mcxarray.
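The kind of per-threshold summary described above can be sketched in a few lines of Python. The function name threshold_stats, the edge-list representation, and the choice to count singletons as components are assumptions made for this example; the output is not meant to match the format produced by mcx query.

```python
from statistics import mean, median


def threshold_stats(n_nodes, edges, cutoff):
    """Summarise the graph that keeps only edges with weight >= cutoff.

    edges is a list of (u, v, weight) with nodes labelled 0..n_nodes-1;
    the graph is assumed undirected and free of self-loops.
    """
    kept = [(u, v, w) for u, v, w in edges if w >= cutoff]
    degree = [0] * n_nodes
    parent = list(range(n_nodes))  # union-find forest for components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in kept:
        degree[u] += 1
        degree[v] += 1
        parent[find(u)] = find(v)

    weights = [w for _, _, w in kept]
    return {
        "cutoff": cutoff,
        "components": len({find(x) for x in range(n_nodes)}),
        "singletons": sum(1 for d in degree if d == 0),
        "degree_avg": mean(degree),
        "degree_median": median(degree),
        "weight_avg": mean(weights) if weights else None,
        "weight_median": median(weights) if weights else None,
    }


# Made-up weighted graph on six nodes; node 5 has no edges at all.
edges = [(0, 1, 0.9), (1, 2, 0.8), (2, 3, 0.5), (3, 4, 0.3)]
for cutoff in (0.2, 0.4, 0.6):
    print(threshold_stats(6, edges, cutoff))
```

Scanning such rows for the cutoff at which the number of singletons starts to climb sharply is one way to use these statistics when choosing a threshold.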

 
mcx ctty

The mcx program in mode ctty. It computes betweenness centrality for all nodes in a network, a computationally intensive task. The program uses the efficient update algorithm by Ulrik Brandes, which can be parallelized node-wise. This mode can run on multiple machines, each machine running multiple threads, and hence can make effective use of available resources.
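For readers unfamiliar with the Brandes algorithm, the following Python sketch shows its textbook form for an unweighted, undirected graph. It is a single-threaded illustration under those assumptions, with a hypothetical function name and adjacency-dict input; it is not the mcx implementation.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality for an undirected, unweighted graph.

    adj maps each node to an iterable of its neighbours.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording predecessors on those paths.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(betweenness(path))
```

Each iteration of the outer loop depends only on its own source node, so the per-source contributions can be computed independently and summed afterwards. That independence is what makes node-wise parallelism across threads and machines possible.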

 
mcx diameter

The mcx program in mode diameter. It computes the diameter of a graph as well as the eccentricity of each node. This is also a computationally intensive task, and this mode can also run on multiple machines, each machine running multiple threads.
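The two quantities can be illustrated with a breadth-first search from every node, as in the Python sketch below. It assumes an unweighted, undirected, connected graph given as an adjacency dict; in a disconnected graph this sketch would report eccentricities within each component only. The function names are hypothetical and the code is not taken from mcx.

```python
from collections import deque


def eccentricities(adj):
    """Eccentricity of each node: the largest hop distance to any reachable node."""
    ecc = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        ecc[s] = max(dist.values())
    return ecc


def diameter(adj):
    """Diameter: the largest eccentricity over all nodes."""
    return max(eccentricities(adj).values())


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(eccentricities(path), diameter(path))
```

One search per node is what makes the task expensive on large graphs, and since the searches are independent of each other they lend themselves to being spread over threads and machines.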

 
mcx clcf

The mcx program in mode clcf. It computes the clustering coefficient for each node in a network. This is not a computationally intensive operation, and hence parallelism is not required.
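As an illustration, the standard local clustering coefficient of a node with k neighbours is the number of edges among those neighbours divided by k(k-1)/2. The Python sketch below computes it for an undirected graph without self-loops; assigning 0 to nodes with fewer than two neighbours is a convention chosen for the example, and the function name is hypothetical.

```python
def clustering_coefficients(adj):
    """Local clustering coefficient per node.

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    """
    result = {}
    for v, neighbours in adj.items():
        nbrs = list(neighbours)
        k = len(nbrs)
        if k < 2:
            result[v] = 0.0  # convention: undefined cases are reported as 0
            continue
        # Count edges between pairs of neighbours of v.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in adj[nbrs[i]]
        )
        result[v] = 2.0 * links / (k * (k - 1))
    return result


# A triangle 0-1-2 with a pendant node 3 attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficients(graph))
```

Only the immediate neighbourhood of each node is inspected, which is why the computation is cheap compared with the all-pairs searches needed for centrality or diameter.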

 
mcx erdos

The mcx program in mode erdos. It computes ensembles of shortest simple (unweighted) paths between two nodes. It was written with a focus on speed.
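One way to picture an ensemble of shortest paths is sketched below in Python: a breadth-first search from the source records, for every node, all predecessors that lie on some shortest path, and the paths are then read off by walking back from the target. The function name and the adjacency-dict input are assumptions for the example, which favours clarity over the speed that mcx erdos is written for.

```python
from collections import deque


def all_shortest_paths(adj, source, target):
    """Return every shortest (fewest-hops) path from source to target."""
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                preds[w] = [v]
                queue.append(w)
            elif dist[w] == dist[v] + 1:
                preds[w].append(v)  # another equally short way to reach w
    if target not in dist:
        return []

    def walk(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in walk(p)]

    return walk(target)


square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(all_shortest_paths(square, 0, 3))
```

Note that the number of shortest paths can grow very quickly with graph size, so enumerating them explicitly as done here is only practical for small examples.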

 
clm order

The clm program in mode order. Given a set of input clusterings, this program creates a reconciled, fully nested set of output clusterings. Additionally, clusters are reordered at all levels such that larger clusters precede smaller clusters. It can output a tree structure that can be converted to Newick format with mcxdump.

 
clm dist

The clm program in mode dist. It computes distances between clusterings according to one of three metrics: split/join, variation of information, or Mirkin.
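The three metrics can be written down compactly from the sizes of the overlaps between clusters. The Python sketch below uses their common textbook definitions for two clusterings of the same node set, each given as a list of disjoint sets; the second metric is written here under its usual name, variation of information. The function names are hypothetical, and any scaling or normalisation applied by clm dist may differ from what is shown here.

```python
import math


def split_join(a, b):
    """Split/join distance: nodes to move so each clustering refines the other."""
    n = sum(len(c) for c in a)
    a_to_b = n - sum(max(len(c & d) for d in b) for c in a)
    b_to_a = n - sum(max(len(c & d) for c in a) for d in b)
    return a_to_b + b_to_a


def variation_of_information(a, b):
    """H(A|B) + H(B|A), using natural logarithms."""
    n = sum(len(c) for c in a)
    vi = 0.0
    for c in a:
        for d in b:
            r = len(c & d)
            if r:
                vi -= (r / n) * (math.log(r / len(c)) + math.log(r / len(d)))
    return vi


def mirkin(a, b):
    """Mirkin metric: twice the number of node pairs the clusterings disagree on."""
    return (
        sum(len(c) ** 2 for c in a)
        + sum(len(d) ** 2 for d in b)
        - 2 * sum(len(c & d) ** 2 for c in a for d in b)
    )


one_cluster = [{1, 2, 3, 4}]
two_clusters = [{1, 2}, {3, 4}]
print(split_join(one_cluster, two_clusters))
print(variation_of_information(one_cluster, two_clusters))
print(mirkin(one_cluster, two_clusters))
```

All three are zero exactly when the two clusterings are identical, and all three grow as the clusterings diverge, though they weigh large and small clusters differently.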

 
clm info

The clm program in mode info. It outputs a simple numerical performance criterion for a clustering. It rewards clusterings both for being granular and for capturing many edges in the input graph. The criterion lies in the range [0, 1] and achieves 1 only for the canonical clustering of a graph that consists of pairwise disjoint, internally completely connected subparts. In addition, it is affected by differentiation among the edge weights. It is not intended as an optimization criterion, but can be used to detect trends and optionally to spot bad clusterings.

 
mcxrand

This program can generate random graphs using a uniform edge generation model. It can also shuffle an existing graph while preserving the node degree distribution.
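A common way to shuffle a graph while keeping every node's degree fixed is the double edge swap: pick two edges (a, b) and (c, d) and rewire them to (a, d) and (c, b), rejecting swaps that would create a self-loop or a duplicate edge. The Python sketch below illustrates that technique for a simple undirected graph. Whether mcxrand uses this particular procedure is not stated above, so treat the function, its parameters, and the approach as assumptions.

```python
import random


def degree_preserving_shuffle(edges, n_swaps, seed=None):
    """Attempt n_swaps double edge swaps on a simple undirected edge list."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    if len(edges) < 2:
        return edges
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue  # the swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


original = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
print(degree_preserving_shuffle(original, n_swaps=100, seed=1))
```

Each accepted swap removes one edge end from each of the four nodes involved and gives one back, so every degree is unchanged. Comparing a statistic on the real graph with its distribution over many such shuffles is a typical use of this kind of randomisation.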