http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1000590
The paper describes a computational method, called expression screening, for extracting novel information from large collections of experimental data sets. Specifically, the authors identify a series of new regulator genes for the human oxidative phosphorylation (OxPhos) system. A total of 1427 non-redundant microarray data sets, drawn from the NCBI Gene Expression Omnibus, served as the information pool. Within each microarray data set, the method measures how well the expression pattern of every gene correlates with that of a defined query gene set. “Genes that consistently co-express with the query gene set in many independent data sets likely have a functional role in the query pathway”. Each data set is weighted, and all the data are then integrated to give each gene a probability of coexpression with the query set.
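The weighting and integration idea can be illustrated with a small sketch. This is not the authors' statistical model: it assumes a plain Pearson correlation against the mean profile of the query genes, a data set weight equal to how coherently the query genes behave in that data set, and a weighted average in place of the paper's probability estimate. The function name `coexpression_scores` and its arguments are hypothetical.

```python
import numpy as np

def coexpression_scores(datasets, query_idx):
    """Score each gene by its weighted mean correlation with a query gene set.

    datasets  : list of 2-D arrays (genes x samples), same gene order in each
    query_idx : list of row indices of the query genes (hypothetical input)
    """
    per_dataset = []
    weights = []
    for expr in datasets:
        expr = np.asarray(expr, dtype=float)
        # Standardise every gene across the samples of this data set.
        centered = expr - expr.mean(axis=1, keepdims=True)
        std = centered.std(axis=1, keepdims=True)
        std[std == 0] = 1.0  # constant genes get correlation 0
        z = centered / std

        # Mean standardised profile of the query genes.
        profile = z[query_idx].mean(axis=0)
        profile = profile - profile.mean()
        spread = profile.std()
        if spread == 0:
            corr = np.zeros(expr.shape[0])
        else:
            # Pearson correlation of each gene with the query profile.
            corr = z @ (profile / spread) / expr.shape[1]
        per_dataset.append(corr)

        # Assumed weight: mean correlation of the query genes with their
        # own profile, so data sets where the pathway is incoherent count less.
        weights.append(max(corr[query_idx].mean(), 0.0))

    weights = np.array(weights)
    if weights.sum() == 0:
        return np.zeros(per_dataset[0].shape[0])
    # Integrate across data sets with a weighted average.
    return np.average(np.vstack(per_dataset), axis=0, weights=weights)
```

Genes with the highest scores would be the candidates for membership in the query pathway. Note that in this simplified version the query genes contribute to their own profile, so their scores are inflated; a careful implementation would leave each query gene out when scoring it.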
The method was tested successfully on the genes of the cholesterol biosynthesis pathway. When it was applied to the OxPhos pathway, a series of new genes emerged as being necessary for OxPhos function. Five of these novel OxPhos genes (C14orf2, USMG5, CHCHD2, SLIRP and PARK7) were tested experimentally using knockout and qPCR assays. Of these, SLIRP emerged as a gene encoding an RNA-binding protein that regulates mitochondrial RNA (mtRNA), specifically affecting the mtRNA transcripts. Studies also show that mtRNA abundance in turn affects the stability of the SLIRP protein. CHCHD2 was also shown to be essential to OxPhos.