Estimating binding properties of transcription factors from genome-wide binding profiles
Nicolae Radu Zabet, Boris Adryan

TL;DR
This paper presents an analytical model to predict transcription factor occupancy on DNA using genomic data, and applies it to infer binding characteristics of five Drosophila TFs during development.
Contribution
It introduces a new model integrating sequence preferences, DNA accessibility, and TF copy number to estimate binding occupancy from ChIP-seq data.
Findings
Each of the five TFs is estimated to have thousands of molecules specifically bound to DNA
Bicoid and Caudal show higher specificity than the other TFs studied
Many TF molecules are not bound specifically to DNA
Abstract
The binding of transcription factors (TFs) is essential for gene expression. One important characteristic is the actual occupancy of a putative binding site in the genome. In this study, we propose an analytical model to predict genomic occupancy that incorporates the preferred target sequence of a TF in the form of a position weight matrix (PWM), DNA accessibility data (in the case of eukaryotes), the number of TF molecules expected to be bound specifically to the DNA and a parameter that modulates the specificity of the TF. Given actual occupancy data in the form of ChIP-seq profiles, we inferred copy number and specificity for five Drosophila TFs during early embryonic development: Bicoid, Caudal, Giant, Hunchback and Kruppel. Our results suggest that these TFs display thousands of molecules that are specifically bound to the DNA and that, while Bicoid and Caudal display a higher…
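To make the four ingredients named in the abstract concrete, here is a minimal Python sketch of how a PWM score, DNA accessibility, the number of specifically bound molecules and a specificity parameter might combine into a per-site occupancy, and how the last two could be inferred from an observed profile. The functional form (a Boltzmann-style weight normalised by competition from all other sites), the grid search, and every name used (`pwm_score`, `occupancy_profile`, `fit_parameters`, `n_bound`, `lam`) are assumptions made for illustration; the abstract does not give the paper's equations or fitting procedure, and the numbers are toy values, not data from the study.

```python
import math


def pwm_score(site, pwm):
    """Sum the per-position PWM weights for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def occupancy_profile(sequence, pwm, accessibility, n_bound, lam):
    """Assumed occupancy model (illustrative, not the paper's equation).

    Each site j gets a weight a_j * exp(score_j / lam); its occupancy is
    n_bound * weight_j / (n_bound * weight_j + sum of all weights).
    A larger lam flattens score differences, i.e. lowers specificity.
    """
    width = len(pwm)
    n_sites = len(sequence) - width + 1
    weights = [
        accessibility[j] * math.exp(pwm_score(sequence[j:j + width], pwm) / lam)
        for j in range(n_sites)
    ]
    background = sum(weights)  # competition from every candidate site
    return [n_bound * w / (n_bound * w + background) for w in weights]


def fit_parameters(observed, sequence, pwm, accessibility, n_grid, lam_grid):
    """Hypothetical inference step: grid search minimising squared error."""
    def mse(predicted):
        return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

    return min(
        ((n, lam) for n in n_grid for lam in lam_grid),
        key=lambda params: mse(
            occupancy_profile(sequence, pwm, accessibility, *params)
        ),
    )


# Toy two-column PWM that prefers the motif "AC" (made-up weights).
pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
]
seq = "ACGTAC"
acc = [1.0, 1.0, 1.0, 1.0, 0.0]  # the last site lies in inaccessible DNA

occ = occupancy_profile(seq, pwm, acc, n_bound=10, lam=1.0)

# Recover the generating parameters from the synthetic profile.
best = fit_parameters(occ, seq, pwm, acc, n_grid=[1, 10, 100], lam_grid=[0.5, 1.0, 2.0])
```

In this toy setup the accessible "AC" site receives the highest occupancy, the inaccessible copy receives none, and the grid search returns the parameters that generated the profile. A real ChIP-seq signal would first need to be normalised to a comparable scale, a step this sketch leaves out.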
