Network properties of SGI and other published datasets. (a) A plot of the percentage of targets (y-axis) that interact with a given number of query genes (x-axis), illustrating that the SGI network has properties similar to those of scale-free networks. (b) A plot of the percentage of targets that yield a catastrophic phenotype when targeted by RNAi in a wild-type background (y-axis) as a function of how many query genes they interact with (degree, x-axis). (c) The precision and recall of interaction networks calculated with respect to GoProcess1000 (see Materials and methods). Significance values (in brackets) were calculated using the hypergeometric distribution. The source of the networks is presented in the text, except for the SuperNet (superimposed network, see Materials and methods). The orange dashed line indicates the precision of the fine genetic interactions extracted from WormBase. The lower dashed line indicates the precision of the interolog network (see Materials and methods). The recall of these two datasets cannot be calculated, as the number of genes that were tested cannot be ascertained. (d) An independent test of the likelihood of true interactions among the Lehner and SGI genetic-interaction datasets using the algorithm of Zhong and Sternberg, which predicts a confidence level for a genetic interaction between any given gene pair in C. elegans. The 656 interactions of the 'high-confidence' SGI variant, along with the 229 interactions of the highest interaction strength within the SGI network, are also analyzed. Each experimentally derived interacting gene pair is binned according to the confidence level predicted by Zhong and Sternberg (x-axis): low-, moderate- and high-confidence predictions have interaction probabilities of 0–0.6, 0.6–0.9, and 0.9–1.0, respectively. The results are plotted as a ratio of the number of experimentally identified interacting gene pairs to the number of gene pairs expected to be in that bin by chance (y-axis).
Expected counts were determined by assuming a uniform distribution across all bins for all tested gene pairs. Values within each bar show the number of observed gene pairs over the number expected by chance. The key indicates the data source. Error bars indicate one standard error of the mean.
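The precision, recall, and hypergeometric significance reported in panel (c) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy gene pairs, and the choice of the population size (taken here as the number of gene pairs tested) are assumptions made for illustration, and the actual GoProcess1000 benchmark is defined in Materials and methods.

```python
from math import comb


def precision_recall(predicted, gold):
    """Precision and recall of predicted gene pairs against a gold standard.

    Pairs are treated as unordered, so ("a", "b") equals ("b", "a").
    """
    predicted = {frozenset(p) for p in predicted}
    gold = {frozenset(p) for p in gold}
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    Assumed mapping (hypothetical): population = gene pairs tested,
    successes = tested pairs in the gold standard, draws = pairs the
    network reports, k = reported pairs that are in the gold standard.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# Toy data, not taken from the paper.
network = [("a", "b"), ("c", "a"), ("b", "d")]
benchmark = [("b", "a"), ("d", "b"), ("c", "d"), ("a", "d")]
print(precision_recall(network, benchmark))
print(hypergeom_upper_tail(2, 10, 4, 3))
```

A small upper-tail probability means that the overlap with the benchmark is unlikely to arise from picking the same number of pairs at random from those tested.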
Byrne et al. Journal of Biology 2007 6:8 doi:10.1186/jbiol58