Understanding Transcriptional Regulation Using De-novo Sequence Motif Discovery, Network Inference and Interactome Data
Arvind Rao, Alfred O. Hero, David J. States, James Douglas Engel

TL;DR
This paper explores computational methods to identify gene regulatory elements by integrating sequence, expression, and interactome data, aiming to improve annotation of uncharacterized genomic regions.
Contribution
It introduces a multi-modal data integration approach using statistical learning to predict regulatory enhancers, exemplified by the Gata2 gene case study.
Findings
High-throughput assays provide valuable features for enhancer prediction.
Integrated data modalities improve the accuracy of regulatory element identification.
Specific features from genomic and expression data are most discriminatory.
Abstract
Gene regulation is a complex process in which several genomic elements work in concert to drive spatio-temporal expression. The experimental characterization of gene regulatory elements is complex and resource-intensive. One of the major goals in computational biology is the in silico annotation of previously uncharacterized elements using results from the subset of known, previously annotated regulatory elements. The recent results of the ENCODE project (http://encode.nih.gov) presented an in-depth analysis of such functional (regulatory) non-coding elements for 1% of the human genome. It is hoped that the results obtained on this subset can be scaled to the rest of the genome. This is an extremely important effort which will enable faster dissection of other functional elements in key biological processes such as disease progression…
Taxonomy
Topics: Genomics and Chromatin Dynamics · Gene expression and cancer classification · Genomics and Phylogenetic Studies
