A large-scale view of RNA biology


RNA binding proteins (RBPs) bind to RNA molecules to define the mature spliced sequence, direct nuclear export and subcellular localization, control translation initiation and efficiency, and regulate stability and ultimate decay. RNA processing is central to the regulation of early development and the differentiation of stem cells into mature lineages, and mis-regulation of RNA processing plays critical roles in cancer, in neurodegenerative and other heritable diseases, and in both viral replication and anti-viral immune responses. The emergence of genomics techniques has rapidly advanced our ability to identify genetic and transcriptomic causes of disease, and there is now an ever-growing list of mutations in RBPs and RNA processing events causally linked to human diseases. However, it remains challenging to rapidly convert this genetic knowledge into the mechanistic understanding of physiologically relevant mis-regulation that is required to develop therapeutic interventions, and to understand the complex regulatory networks controlled by the more than 1500 RBPs encoded in the human genome. We use a mix of experimental and computational approaches to map RBP interaction networks and the regulatory roles of RBPs, in order to develop a global understanding of how RNA processing regulatory networks drive human physiology.

Methods to explore RNA processing

The first step to understanding an RBP's roles is to identify the RNAs it interacts with. The development of the HITS-CLIP and iCLIP methods revolutionized our ability to map RBP binding sites in vivo by enabling pulldown of an RBP of interest, followed by high-throughput sequencing of the crosslinked RNA fragments. However, the low efficiency of converting RNA into sequencing libraries limited the scalability of these methods for building large-scale networks that incorporate many RBPs.

To address this limitation, we and others recently described enhanced CLIP methodologies that dramatically improve the efficiency of converting immunoprecipitated RNA into high-throughput sequencing libraries, decreasing experimental failures and costs. The improved efficiency of our eCLIP approach enabled quantitative normalization against paired inputs to distinguish true binding events from false-positive artifacts, and made it possible to generate 223 eCLIP datasets profiling 150 RBPs in K562 and HepG2 cells.
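The idea of normalizing against a paired input can be illustrated with a minimal Python sketch. This is not the eCLIP analysis pipeline: the function names, the pseudocount, the toy read counts, and the 8-fold cutoff are all assumptions chosen for illustration. The sketch compares the fraction of immunoprecipitation (IP) reads falling in a candidate peak with the fraction of input reads in the same region, so that regions that are simply abundant in the input are not mistaken for binding sites.

```python
import math


def log2_fold_enrichment(ip_reads, input_reads, ip_total, input_total,
                         pseudocount=1.0):
    """Return log2(IP fraction / input fraction) for one candidate peak.

    The pseudocount (an illustrative assumption) avoids division by zero
    when a region has no input reads.
    """
    ip_fraction = (ip_reads + pseudocount) / ip_total
    input_fraction = (input_reads + pseudocount) / input_total
    return math.log2(ip_fraction / input_fraction)


def is_enriched(ip_reads, input_reads, ip_total, input_total,
                min_log2_fc=3.0):
    """Flag a peak whose IP signal exceeds input by the chosen cutoff.

    min_log2_fc=3.0 (8-fold) is a hypothetical threshold for this sketch.
    """
    fc = log2_fold_enrichment(ip_reads, input_reads, ip_total, input_total)
    return fc >= min_log2_fc


if __name__ == "__main__":
    # Toy counts (hypothetical): (reads in IP, reads in paired input).
    peaks = {"peak_A": (80, 4), "peak_B": (50, 45)}
    ip_total, input_total = 1_000_000, 1_000_000
    for name, (ip, inp) in peaks.items():
        fc = log2_fold_enrichment(ip, inp, ip_total, input_total)
        print(name, round(fc, 2), is_enriched(ip, inp, ip_total, input_total))
```

In this toy example, peak_B has many IP reads but nearly as many input reads, so it is not flagged, whereas peak_A is. A real analysis would also attach a statistical test to each peak rather than rely on a fold-change cutoff alone.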

We are continuing to build upon our eCLIP work to enable profiling of additional RBPs (including those lacking suitable antibodies), as well as profiling in low-input or complex tissue samples. We are also using the eCLIP framework to develop improved targeted methods that explore individual aspects of RNA processing in depth, including experimental mapping of microRNA targets and simplified approaches to quantify translation efficiency transcriptome-wide.

Large-scale maps of RNA regulatory networks

Deep characterization of individual RBPs can give important insights into how an RBP drives molecular and cellular phenotypes. However, layering dozens or hundreds of RBPs together can provide further insights by revealing similar and distinct binding modalities and properties among RBPs, identifying co-interacting and co-regulating factors, and identifying critical, highly interacting RNA elements.
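One simple way to picture how layering RBP maps can surface co-interacting factors is to compare the sets of RNAs each RBP binds. The sketch below is only an illustration under stated assumptions: the RBP and gene names are hypothetical placeholders, and the Jaccard index is a generic set-overlap measure chosen for this example, not a method described above.

```python
from itertools import combinations


def jaccard(targets_a, targets_b):
    """Jaccard index: shared targets divided by all targets of either RBP."""
    a, b = set(targets_a), set(targets_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(targets_by_rbp):
    """Compute the Jaccard index for every pair of RBPs.

    targets_by_rbp maps an RBP name to an iterable of bound RNA identifiers.
    """
    return {
        (x, y): jaccard(targets_by_rbp[x], targets_by_rbp[y])
        for x, y in combinations(sorted(targets_by_rbp), 2)
    }


if __name__ == "__main__":
    # Hypothetical placeholder data, not real binding results.
    targets = {
        "RBP_A": {"gene1", "gene2", "gene3"},
        "RBP_B": {"gene2", "gene3", "gene4"},
        "RBP_C": {"gene9"},
    }
    ranked = sorted(pairwise_overlap(targets).items(),
                    key=lambda item: item[1], reverse=True)
    for pair, score in ranked:
        print(pair, round(score, 2))
```

Pairs with high overlap are candidates for co-binding or co-regulation and would need follow-up with positional binding data, since sharing a target RNA does not by itself imply that two RBPs bind the same site.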

Our ENCODE work profiling targets for 150 RBPs revealed insights into basic RNA processing mechanisms in K562 and HepG2 cell lines. In the Van Nostrand lab, we are expanding this effort into RNA viruses (SARS-CoV-2 and Dengue) to build a global regulatory map of host RBP interactions with these viral RNAs, in order to better understand the essential roles host RBPs play in enabling viral replication and infection.

RNA translation in breast cancer

RNA processing has emerged as a critical regulatory step in the initiation and progression of cancer, with alterations in RNA transcription, splicing, and translation identified in nearly all cancer types. In particular, altered translation of mRNAs into proteins is a complex yet commonly dysregulated aspect of RNA processing in tumorigenesis, as cancer cells show alterations in both global translation rates and the differential translation of specific mRNA subsets that drive tumorigenesis, progression, and metastasis. However, our understanding of how RBPs dysregulate RNA translation and contribute to tumorigenesis and progression remains fragmented because of technological limitations that impede systematic RBP characterization.

To create the first global map of the RBPs that drive aberrant translation during the emergence and metastasis of breast cancer, we will develop novel integrative approaches to map RNA processing regulatory networks and predict key functional nodes that drive altered translation and tumor phenotypes. In addition to providing insights into potential therapeutic targets in breast cancer, our development of new techniques and coupled bioinformatic tools will provide a framework for future research into other types of cancer, which often share similar aberrations in RNA processing.