TRAFICA: an open chromatin language model to improve transcription factor binding affinity prediction
Yu Xu, Chonghao Wang, Ke Xu, Yi Ding, Aiping Lyu, Lu Zhang

TL;DR
TRAFICA is a new model that improves predictions of how transcription factors bind to DNA by considering open chromatin regions.
Contribution
TRAFICA integrates open chromatin data with in vitro binding profiles to enhance TF–DNA binding affinity prediction.
Findings
TRAFICA outperforms existing tools in predicting TF–DNA binding affinity.
Incorporating open chromatin regions improves prediction accuracy.
TRAFICA achieves state-of-the-art performance in both in vitro and in vivo settings.
Abstract
In silico prediction of transcription factor–DNA (TF–DNA) binding affinity plays a vital role in examining TF binding preferences and understanding gene regulation. Existing tools predict TF–DNA binding affinity from TF–DNA binding profiles generated by in vitro high-throughput technologies. However, TFs tend to bind to sequences in open chromatin regions in vivo, and this binding preference is seldom considered by existing tools. In this study, we developed TRAFICA, an open chromatin language model that predicts TF–DNA binding affinity by integrating sequence characteristics of open chromatin regions from ATAC-seq experiments with in vitro TF–DNA binding profiles from high-throughput technologies. We pretrained TRAFICA on over 2.8 million nucleotide sequences in open chromatin regions derived from 197 ATAC-seq experiments (115 cell lines) to learn in vivo TF binding preferences.…
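The abstract describes pretraining a language model on nucleotide sequences from open chromatin regions. This excerpt does not state how sequences are tokenized or which pretraining objective is used, so the following is only a minimal sketch of one common setup for DNA language models: overlapping k-mer tokens with random masking for a masked-token objective. The function names `kmer_tokenize` and `mask_tokens`, the choice of k = 4, and the 15% masking rate are illustrative assumptions, not details taken from the paper.

```python
import random


def kmer_tokenize(seq, k=4):
    """Split a DNA sequence into overlapping k-mers with stride 1.

    k=4 is an illustrative assumption, not a value stated in the abstract.
    Sequences shorter than k yield an empty list.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Randomly replace tokens with "[MASK]" for masked-token pretraining.

    Returns (masked_tokens, labels). A label holds the original token at
    masked positions and None elsewhere. The 15% rate is an assumption.
    """
    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Hypothetical 10-nucleotide sequence standing in for an ATAC-seq peak region.
tokens = kmer_tokenize("ACGTAGCTAG", k=4)
masked, labels = mask_tokens(tokens)
```

In a setup like this, a model would be trained to recover the original k-mer at each masked position, which is one way sequence characteristics of open chromatin could be learned before fine-tuning on in vitro binding profiles.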
Taxonomy
Topics: Genomics and Chromatin Dynamics · Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks
