Ab initio identification of putative human transcription factor binding sites by comparative genomics
Davide Cora', Carl Herrmann, Christoph Dieterich, Ferdinando Di Cunto, Paolo Provero, Michele Caselle

TL;DR
This paper presents a comparative genomics approach combining human-mouse sequence analysis, statistical filtering, and coregulation evidence to identify potential transcription factor binding sites across the human genome.
Contribution
It introduces an ab initio method that integrates sequence conservation, functional annotation, and expression data to discover regulatory motifs.
Findings
Identified known transcription factor binding motifs.
Discovered new candidate regulatory sites.
Validated coregulation through gene ontology and expression analysis.
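As an illustration of what a Gene Ontology check on a candidate gene set might look like, here is a minimal sketch based on a one-sided hypergeometric test. The summary does not say which statistic the authors used, so the test, the cutoff, and every name below (`hypergeom_tail`, `enriched_terms`, `annotations`, the term labels) are assumptions made for this example only.

```python
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enriched_terms(gene_set, annotations, p_cutoff=0.01):
    """Return {term: p_value} for terms overrepresented in gene_set.

    annotations maps every gene in the background to a set of term labels.
    The cutoff is an illustrative assumption; no multiple-testing
    correction is applied here.
    """
    universe = set(annotations)
    genes = set(gene_set) & universe
    N, n = len(universe), len(genes)
    if n == 0:
        return {}
    all_terms = set().union(*annotations.values())
    result = {}
    for term in all_terms:
        K = sum(1 for g in universe if term in annotations[g])
        k = sum(1 for g in genes if term in annotations[g])
        if k == 0:
            continue
        p = hypergeom_tail(N, K, n, k)
        if p < p_cutoff:
            result[term] = p
    return result


# Toy background of 10 genes with hypothetical term labels.
annotations = {f"g{i}": {"term_B"} for i in range(10)}
for g in ("g0", "g1", "g2"):
    annotations[g] = {"term_A", "term_B"}

result = enriched_terms({"g0", "g1", "g2"}, annotations)
```

In this toy case only `term_A` passes the cutoff, because `term_B` is attached to every background gene. A genome-wide scan over many motifs and terms would also need a correction for multiple testing, which this sketch leaves out.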
Abstract
We discuss a simple and powerful approach for the ab initio identification of cis-regulatory motifs involved in transcriptional regulation. The method we present integrates several elements: human-mouse comparison, statistical analysis of genomic sequences and the concept of coregulation. We apply it to a complete scan of the human genome. By using the catalogue of conserved upstream sequences collected in the CORG database we construct sets of genes sharing the same overrepresented motif (short DNA sequence) in their upstream regions both in human and in mouse. We perform this construction for all possible motifs from 5 to 8 nucleotides in length and then filter the resulting sets looking for two types of evidence of coregulation: first, we analyze the Gene Ontology annotation of the genes in the set, searching for statistically significant common annotations; second, we analyze the…
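The set construction described in the abstract can be sketched roughly as follows: enumerate every word of 5 to 8 nucleotides, find the genes whose upstream region contains it more often than expected, and keep the genes for which this holds in both human and mouse. This is a minimal sketch, not the authors' implementation. The binomial test, the uniform background frequency, the cutoff, and all names (`conserved_set`, `orthologs`, the toy sequences) are assumptions; the abstract does not specify them, and access to the CORG database is not modelled.

```python
from itertools import product
from math import comb


def all_motifs(min_len=5, max_len=8, alphabet="ACGT"):
    """Yield every DNA word with length between min_len and max_len."""
    for k in range(min_len, max_len + 1):
        for letters in product(alphabet, repeat=k):
            yield "".join(letters)


def count_occurrences(motif, seq):
    """Count occurrences of motif in seq, allowing overlaps."""
    count, start = 0, seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresented_genes(motif, upstream, p_cutoff=0.01):
    """Genes whose upstream sequence holds motif more often than expected.

    Assumption: a uniform background, so each position matches with
    probability 0.25 ** len(motif).
    """
    freq = 0.25 ** len(motif)
    selected = set()
    for gene, seq in upstream.items():
        positions = len(seq) - len(motif) + 1
        if positions <= 0:
            continue
        k = count_occurrences(motif, seq)
        if k > 0 and binomial_tail(positions, k, freq) < p_cutoff:
            selected.add(gene)
    return selected


def conserved_set(motif, human_upstream, mouse_upstream, orthologs):
    """Human genes where motif is overrepresented in human and in mouse.

    orthologs maps a human gene identifier to its mouse counterpart.
    """
    human_hits = overrepresented_genes(motif, human_upstream)
    mouse_hits = overrepresented_genes(motif, mouse_upstream)
    return {g for g in human_hits if orthologs.get(g) in mouse_hits}


# Toy sequences and identifiers, invented for illustration.
human_upstream = {"G1": "ACGTAACGTAACGTA", "G2": "TTTTTTTTTTTTTTT"}
mouse_upstream = {"M1": "GGACGTAACGTAGG", "M2": "CCCCCCCCCC"}
orthologs = {"G1": "M1", "G2": "M2"}

shared = conserved_set("ACGTA", human_upstream, mouse_upstream, orthologs)
n_motifs = sum(1 for _ in all_motifs())
```

Looping `conserved_set` over `all_motifs()` yields one candidate gene set per word. Those sets are what the abstract then filters for evidence of coregulation.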
Taxonomy
Topics: Genomics and Chromatin Dynamics · RNA modifications and cancer · RNA Research and Splicing
