Predicting Transcription Factor Binding Sites using Transformer based Capsule Network
Nimisha Ghosh, Daniele Santoni, Indrajit Saha, Giovanni Felici

TL;DR
This paper introduces DNABERT-Cap, a transformer-based capsule network that predicts transcription factor binding sites from ChIP-seq data, achieving an average AUC above 0.91 across five cell lines and outperforming existing deep learning predictors such as DeepARC and DeepBind.
Contribution
The work presents a novel transformer-based capsule network model for TF binding site prediction, integrating bidirectional encoding with capsule layers for improved accuracy.
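To make the capsule part of this contribution concrete, here is a minimal sketch of the generic capsule formulation (a squash nonlinearity plus routing-by-agreement) in plain Python. It is not the authors' implementation: this summary gives no layer sizes or routing settings for DNABERT-Cap, so the function names, the two-capsule output (read here as "bound" vs. "not bound"), and the three routing iterations are all illustrative assumptions.

```python
import math


def squash(v):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0] * len(v)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement over prediction vectors.

    u_hat[i][j] is the vector that input capsule i predicts for output
    capsule j. In a full model these come from learned weights; here they
    are passed in directly (an assumption made to keep the sketch short).
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    logits = [[0.0] * n_out for _ in range(n_in)]
    outputs = []
    for _ in range(num_iters):
        # Each input capsule distributes its vote across the output capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            s = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            outputs.append(squash(s))
        # Raise the logit wherever a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(u_hat[i][j][d] * outputs[j][d]
                                    for d in range(dim))
    return outputs


def capsule_lengths(capsules):
    """Length of each output capsule, usable as a class score in [0, 1)."""
    return [math.sqrt(sum(x * x for x in cap)) for cap in capsules]


# Toy input: three input capsules, two output capsules, 2-D vectors.
toy_u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [-0.1, 0.0]],
]
toy_scores = capsule_lengths(dynamic_routing(toy_u_hat))
```

In the toy input the three input capsules agree on the first output capsule, so its length ends up well above the second one. In a predictor of this kind, the longer capsule would be read as the predicted class.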
Findings
Achieves average AUC > 0.91 across five cell lines.
Outperforms existing deep learning predictors like DeepARC and DeepBind.
Demonstrates the effectiveness of combining transformer and capsule network architectures.
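The headline figure above is an AUC averaged over cell lines. As a reminder of what that metric measures, the sketch below computes ROC AUC from ranks (the probability that a randomly chosen bound sequence is scored above a randomly chosen unbound one, with ties counted as half) and then takes an unweighted mean over datasets. The function names, the dataset keys, and the toy labels and scores are hypothetical and are not values from the paper; how the authors aggregate AUC is not stated in this summary.

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = binding site).
    scores: iterable of predicted scores, higher = more likely bound.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def mean_auc(per_dataset):
    """Unweighted mean AUC over a dict of {name: (labels, scores)}."""
    aucs = [roc_auc(labels, scores) for labels, scores in per_dataset.values()]
    return sum(aucs) / len(aucs)


# Toy data only; keys are placeholders, not the cell lines used in the paper.
toy = {
    "cell_line_A": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "cell_line_B": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
toy_mean = mean_auc(toy)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale against which an average above 0.91 should be read.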
Abstract
Predicting the binding sites of transcription factors is important for understanding how they regulate gene expression and how this regulation can be modulated for therapeutic purposes. Although significant work has addressed this problem in the past few years, there is still room for improvement. In this regard, a transformer-based capsule network, viz. DNABERT-Cap, is proposed in this work to predict transcription factor binding sites by mining ChIP-seq datasets. DNABERT-Cap is a bidirectional encoder pre-trained on a large number of genomic DNA sequences and equipped with a capsule layer responsible for the final prediction. The proposed model builds a predictor for transcription factor binding sites through the joint optimisation of features from both the bidirectional encoder and the capsule layer, along with convolutional and bidirectional long short-term memory layers. To evaluate the…
Taxonomy
Topics: Machine Learning in Bioinformatics · Genomics and Chromatin Dynamics · RNA and protein synthesis mechanisms
Methods: Capsule Network
