Genome-wide survey of microRNA-transcription factor regulatory circuits in human

Presented by: Angela Re

A basic notion of modern systems biology is that biological functions are performed by groups of genes that act in an interdependent and synergistic way. This is particularly true of regulatory processes, for which a "network" point of view is by now indispensable.

Among the consequences of this approach, the notion of "network motif" plays a prominent role. The idea is that a complex network (say, a regulatory network) can be divided into simpler, distinct regulatory patterns called network motifs, typically composed of three or four interacting components that can perform elementary signal-processing functions. Network motifs can be thought of as the smallest functional modules of the network; by suitably combining them, the whole complexity of the original network can be recovered.

In particular, in transcriptional regulatory networks an important role is played by feed-forward loop motifs. Several theoretical and experimental works dealing with the kinetic properties of these network motifs and the associated temporal programming of gene expression are now available in the literature.

The major focus of our work was to include post-transcriptional regulatory interactions in this class of network motifs.

Indeed, in the last few years it has become increasingly evident that post-transcriptional processes play a much more important role in the regulation of gene expression than previously expected.

Among the various mechanisms of post-transcriptional regulation, a prominent role is played by a class of small RNAs called microRNAs. MicroRNAs (miRNAs) are a family of non-coding RNAs, about 22 nucleotides long, that negatively regulate gene expression at the post-transcriptional level in a wide range of organisms.

Mature miRNAs are produced from longer precursors, which in some cases cluster together in so-called miRNA "transcriptional units" (TUs), whose expression is regulated by the same molecular mechanisms that control the transcription of standard protein-coding genes.

Even though the precise mechanism of action of the miRNAs is not very well understood, the current paradigm is that in animals miRNAs are able to repress the translation of target genes by recognizing specific target sites in the 3'-UTR of the regulated genes. It has been shown that miRNA binding sites are usually overrepresented in the 3'-UTR sequence of the target genes.

All these findings, together with a large body of work on the discovery of transcription factor binding sites, suggest that regulatory interactions at both the transcriptional and the post-transcriptional level could be predicted in silico by looking for overrepresented short nucleotide sequences (in promoters or in 3'-UTRs) and then filtering the results by imposing suitable evolutionary or functional constraints.
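As a rough illustration of what "overrepresented" can mean here, the sketch below scores one oligonucleotide by asking how often it occurs in a set of sequences of interest compared with a background set. The hypergeometric test, the function names and the toy sequences are all assumptions made for this example; the abstract does not specify which statistic was used, and the evolutionary and functional filtering step is left out.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n sequences are drawn from N, of which K contain the oligo."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def oligo_enrichment(oligo, foreground, background):
    """Count foreground sequences containing `oligo` and return (count, p-value).

    Assumption: `foreground` is a subset of `background`, e.g. the 3'-UTRs of a
    candidate gene set versus all 3'-UTRs in the genome.
    """
    N = len(background)
    K = sum(oligo in seq for seq in background)
    n = len(foreground)
    k = sum(oligo in seq for seq in foreground)
    return k, hypergeom_upper_tail(k, K, n, N)


# Toy data (hypothetical sequences, not real UTRs).
foreground = ["AAGCACTTAA", "TTGCACTTGG", "GCACTTCCCC"]
others = ["AAAAAAAAAA", "CCCCGGGGTT", "TTTTACGTAC"]
background = foreground + others

count, p_value = oligo_enrichment("GCACTT", foreground, background)
```

In a genome-wide setting one would repeat this for every oligonucleotide of a given length and correct for multiple testing before applying any conservation filter.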

    
In particular, the aim of this work is to use a computational tool of this type to obtain a list of feed-forward circuits in which a master transcription factor (TF) regulates a miRNA and, together with it, a set of target genes.

To do so, we first constructed a transcriptional and, separately, a post-transcriptional regulatory network in human, based on a genome-wide screen for conserved overrepresented motifs (with a putative regulatory role as transcription factor binding sites or microRNA binding sites) in human and mouse promoters and 3'-UTRs. We then integrated the two networks, looking for all cases in which a given transcription factor A regulates a protein-coding gene B and a miRNA B' while, at the same time, the miRNA B' also regulates the gene B.
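The integration step can be sketched as a simple set intersection. The code below is a minimal illustration, not the pipeline used in this study: it assumes each network is available as a mapping from regulator to the set of its predicted targets, and the function name and toy identifiers are hypothetical.

```python
def find_mixed_feedforward_circuits(tf_targets, mirna_targets, mirna_ids):
    """Return (TF, miRNA, gene) triples forming a mixed feed-forward circuit.

    tf_targets:    dict mapping each TF to the set of genes and miRNAs it regulates
    mirna_targets: dict mapping each miRNA to the set of genes it regulates
    mirna_ids:     set of identifiers that denote miRNAs rather than genes
    """
    circuits = []
    for tf, regulated in tf_targets.items():
        mirnas = regulated & mirna_ids   # miRNAs B' regulated by this TF
        genes = regulated - mirna_ids    # protein-coding genes B regulated by this TF
        for mirna in sorted(mirnas):
            # Genes targeted both by the TF and by the miRNA it controls.
            shared = genes & mirna_targets.get(mirna, set())
            for gene in sorted(shared):
                circuits.append((tf, mirna, gene))
    return circuits


# Toy networks (hypothetical identifiers).
tf_targets = {
    "TF_A": {"miR_X", "GENE_1", "GENE_2"},
    "TF_B": {"miR_Y", "GENE_2"},
}
mirna_targets = {
    "miR_X": {"GENE_1", "GENE_3"},
    "miR_Y": {"GENE_4"},
}
mirna_ids = {"miR_X", "miR_Y"}

circuits = find_mixed_feedforward_circuits(tf_targets, mirna_targets, mirna_ids)
```

In this toy example only TF_A, miR_X and GENE_1 close a circuit: TF_B regulates miR_Y, but the two share no target gene.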

We obtained a catalogue of a few hundred such regulatory circuits in human.

An appealing feature of our approach is that the process of deriving the predictions in this study was not biased by previous experimentally or computationally derived knowledge. It is clear, however, that this kind of computational approach may produce false positives. To reduce their number, we compared our results with several independent sources of information, both experimental and computational, and developed a bioinformatics scoring scheme that allowed us to rank our results and select the regulatory circuits most likely to be biologically relevant. In a few cases, some (or all) of the regulatory interactions composing the feed-forward loop were already known in the literature, but their combination into a closed regulatory circuit had not been noticed. The predictions confirmed by the literature represent, in our opinion, an important validation of our approach. Most loops, however, are new and provide a basis for generating future working hypotheses.

In addition to identifying regulatory circuits, we used our results to explore the regulatory interactions that affect the development or activity of the hematopoietic system. We also discussed in particular detail specific examples that seem to be important in oncogenesis.

The approach used in this study has several useful outcomes. First, the criterion for finding regulatory circuits should ensure a lower rate of false positives in defining the genes controlled by miRNAs. Second, the catalogue of circuits presented here may be important for annotating the regulatory role of transcription factors affecting miRNA expression. Finally, this approach contributes to the analysis of the complex architecture of the gene expression regulatory network.

References: Shalgi R, Lieber D, Oren M, Pilpel Y (2007) "Global and local architecture of the mammalian microRNA-transcription factor regulatory network". PLoS Comput Biol 3:e131.

Chan C, Elemento O, Tavazoie S (2005) "Revealing posttranscriptional regulatory elements through network-level conservation". PLoS Comput Biol 1:e69.

Cora D, Di Cunto F, Caselle M, Provero P (2007) "Identification of candidate regulatory sequences in mammalian 3' UTRs by statistical analysis of oligonucleotide distributions". BMC Bioinformatics 8:174.
