Mapping transcription factor binding sites by learning UV damage fingerprints
Hannah E Wilson, Scott Stevison, Levi Lamprey, John J Wyrick

TL;DR
This paper introduces a method that maps transcription factor binding sites by applying machine learning to UV-induced DNA damage patterns, recovering sites missed by earlier ChIP-based experiments.
Contribution
The novel approach uses UV damage fingerprints and machine learning to identify transcription factor binding sites with single-nucleotide resolution.
Findings
CPD fingerprints from UV damage can be used to identify TF binding sites with machine learning.
In yeast, the method identified 75 binding sites for the Hap2/Hap3/Hap5 complex, including ∼25 missed by previous ChIP-based experiments, and 63 binding sites for Gcr1.
The method successfully identified new TFBS in human cells for the Nuclear Factor-Y complex.
Abstract
Deciphering transcriptional networks requires methods to accurately map binding sites of sequence-specific transcription factors (ssTFs) across the genome. Here, we show that ssTF binding induces distinct patterns of UV-induced cyclobutane pyrimidine dimers (CPDs), and that these CPD ‘fingerprints’ can be exploited by machine learning methods to identify ssTF binding sites (TFBS). As a proof of principle, we analyzed CPD-seq data from yeast cells using the Random Forest algorithm to identify 75 TFBS bound by the Hap2/Hap3/Hap5 ssTF complex, including ∼25 new sites missed by previous chromatin immunoprecipitation (ChIP)-based experiments. Parallel analysis of the Gcr1 ssTF using a neural network trained on CPD-seq data including only 6 known binding sites identified 63 Gcr1 TFBS across the genome. Our analysis indicates that the newly identified TFBS are associated with many genes that…
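The idea of a CPD "fingerprint" can be illustrated with a minimal sketch of how per-position damage counts around candidate sites might be turned into feature vectors for a classifier. The function names, the dictionary input format, and the window size (`flank`) are all assumptions made for illustration; the summary above does not specify how the authors encoded CPD-seq data.

```python
# Hypothetical sketch: encode CPD counts around candidate sites as feature vectors.
# Input format, names, and window size are illustrative assumptions, not the paper's.

def site_features(cpd_counts, center, flank=5):
    """Return CPD counts at offsets -flank..+flank around one candidate site.

    cpd_counts maps a genomic position (int) to the number of CPD reads there;
    positions with no reads are treated as zero.
    """
    return [cpd_counts.get(center + offset, 0) for offset in range(-flank, flank + 1)]


def build_feature_matrix(cpd_counts, centers, flank=5):
    """Stack one fingerprint vector per candidate site (one row per site)."""
    return [site_features(cpd_counts, c, flank) for c in centers]


if __name__ == "__main__":
    # Toy damage map: position -> CPD read count (made-up values).
    toy_counts = {98: 2, 100: 3, 102: 1, 205: 4}
    for row in build_feature_matrix(toy_counts, [100, 200], flank=2):
        print(row)
```

Each row could then be passed, together with bound/unbound labels for known sites, to an off-the-shelf classifier such as a random forest. Whether raw counts, normalized counts, or strand-specific counts work best is left open here.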
Taxonomy
Topics: Genomics and Chromatin Dynamics · RNA Research and Splicing · RNA and protein synthesis mechanisms
