# QTFPred: robust high-performance quantum machine learning modeling that predicts main and cooperative transcription factor bindings with base resolution

**Authors:** Taichi Matsubara, Shuto Machida, Samuel Papa Kwesi Owusu, Akihiro Asakura, Hiroki Hashimoto, Masanori Matsuoka, Masao Nagasaki

PMC · DOI: 10.1093/bib/bbaf604 · 2025-11-26

## TL;DR

QTFPred is a quantum-classical hybrid machine learning model that predicts transcription factor (TF) binding at base resolution and remains accurate when training data are limited.

## Contribution

Introduces QTFPred, a quantum-classical hybrid framework that integrates quantum convolutional layers into neural networks for TF binding prediction and stays robust in data-sparse scenarios.

## Key findings

- QTFPred achieved state-of-the-art accuracy in 92% of binary prediction tasks and 96% of signal prediction tasks across 49 ENCODE ChIP-seq datasets.
- The model outperformed conventional models in precision and stability, especially in data-sparse conditions.
- QTFPred reveals TF motif representations and provides insights into cooperative binding mechanisms.
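
As a quick sanity check on the percentages above, the short sketch below lists which whole-number dataset counts out of 49 would round to 92% and 96%. It assumes, hypothetically, that each task type was scored once per dataset and that the percentages were rounded to the nearest integer; the resulting counts are inferred here, not reported in this summary.

```python
def counts_matching(pct, total):
    """Return every integer count k whose share of `total` rounds to `pct` percent."""
    return [k for k in range(total + 1) if round(100 * k / total) == pct]

N_DATASETS = 49  # number of ChIP-seq datasets stated in the summary

# Assumption: one binary task and one signal task per dataset.
binary_counts = counts_matching(92, N_DATASETS)
signal_counts = counts_matching(96, N_DATASETS)

print(binary_counts, signal_counts)  # [45] [47]
```

Under those assumptions, the figures are consistent with 45 of 49 datasets for binary prediction and 47 of 49 for signal prediction.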

## Abstract

Deep learning has become an essential tool for identifying transcription factor (TF) binding sites, yet conventional approaches often struggle with limited training data for specific TFs. Here, we introduce QTFPred (Quantum-based TF Predictor), a quantum-classical hybrid framework that integrates quantum convolutional layers within neural networks to predict TF binding at base resolution. By leveraging the exponential feature space offered by quantum circuits and training from scratch via GPU simulation, QTFPred achieves robust performance even in data-sparse scenarios. In benchmarks on 49 Encyclopedia of DNA Elements (ENCODE) ChIP-seq datasets, QTFPred delivered state-of-the-art accuracy in 92% of binary prediction and 96% of signal prediction tasks, outperforming conventional models in precision and stability. Moreover, the method reveals underlying TF motif representations, offering insights into cooperative binding mechanisms. These results highlight the potential of quantum machine learning to overcome the limitations of traditional deep learning in genomics modeling.
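
To make the idea of a quantum convolutional layer concrete, here is a minimal sketch of a simulated quantum filter sliding over one-hot encoded DNA. It is not the QTFPred architecture: this summary does not describe the circuit, encoding, or qubit count, so the angle encoding, the single RY layer, the CNOT ring, the window width of 2 bases, and all function names (`one_hot`, `quantum_filter`, `quantum_conv1d`) are assumptions chosen for illustration. The simulation uses plain NumPy on the CPU rather than a GPU simulator.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix (columns A, C, G, T)."""
    return np.eye(4)[[BASES.index(b) for b in seq]]

def ry(angle):
    """2x2 matrix of a single-qubit RY rotation."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, q):
    """Apply a single-qubit gate to qubit q of a state stored as a (2,)*n tensor."""
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q)

def apply_cnot(state, control, target):
    """Apply CNOT: flip the target qubit wherever the control qubit is 1."""
    state = state.copy()
    index = [slice(None)] * state.ndim
    index[control] = 1
    # Indexing with control=1 drops that axis, so later axes shift down by one.
    axis = target - 1 if target > control else target
    state[tuple(index)] = np.flip(state[tuple(index)], axis=axis).copy()
    return state

def quantum_filter(x, theta):
    """Map n inputs in [0, 1] to n Pauli-Z expectation values (hypothetical circuit)."""
    n = len(x)
    state = np.zeros((2,) * n)
    state[(0,) * n] = 1.0                      # start in |0...0>
    for q in range(n):                         # angle encoding of the inputs
        state = apply_1q(state, ry(np.pi * x[q]), q)
    for q in range(n):                         # trainable rotations
        state = apply_1q(state, ry(theta[q]), q)
    for q in range(n):                         # ring of entangling CNOTs
        state = apply_cnot(state, q, (q + 1) % n)
    probs = state ** 2                         # amplitudes stay real with RY and CNOT only
    z = []
    for q in range(n):
        other = tuple(a for a in range(n) if a != q)
        p = probs.sum(axis=other)              # marginal distribution of qubit q
        z.append(p[0] - p[1])                  # <Z> = P(0) - P(1)
    return np.array(z)

def quantum_conv1d(onehot, theta, width=2):
    """Slide the filter over the sequence; one qubit per (position, base) entry."""
    n_windows = onehot.shape[0] - width + 1
    return np.stack([
        quantum_filter(onehot[i:i + width].ravel(), theta)
        for i in range(n_windows)
    ])

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, size=8)      # 2 bases x 4 channels = 8 qubits
features = quantum_conv1d(one_hot("ACGTGCATGA"), theta, width=2)
print(features.shape)  # (9, 8)
```

In a hybrid model of this kind, the per-window expectation values would feed classical layers that produce per-base predictions, and `theta` would be trained jointly with those layers. The statevector grows as 2^n with the number of qubits n, which is why exact simulation is usually limited to small filters.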

## Linked entities

- **Proteins:** SEP2 (K-box region and MADS-box transcription factor family protein)

## Full-text entities

- **Genes:** YY1 (YY1 transcription factor) [NCBI Gene 7528] {aka DELTA, GADEVS, INO80S, NF-E1, UCRBP, YIN-YANG-1}, CTCF (CCCTC-binding factor) [NCBI Gene 10664] {aka CFAP108, FAP108, MRD21}, JUND (JunD proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 3727] {aka AP-1}, ELK1 (ETS transcription factor ELK1) [NCBI Gene 2002], EBF1 (EBF transcription factor 1) [NCBI Gene 1879] {aka COE1, EBF, O/E-1, OLF1}, MYC (MYC proto-oncogene, bHLH transcription factor) [NCBI Gene 4609] {aka MRTL, MYCC, bHLHe39, c-Myc}, BHLHE40 (basic helix-loop-helix family member e40) [NCBI Gene 8553] {aka BHLHB2, Clast5, DEC1, HLHB2, SHARP-2, SHARP2}, MAFF (MAF bZIP transcription factor F) [NCBI Gene 23764] {aka U-MAF, hMafF}, REST (RE1 silencing transcription factor) [NCBI Gene 5978] {aka DFNA27, GINGF5, HGF5, NRSF, WT6, XBR}, TBP (TATA-box binding protein) [NCBI Gene 6908] {aka GTF2D, GTF2D1, HDL4, SCA17, TBP1, TFIID}, E2F1 (E2F transcription factor 1) [NCBI Gene 1869] {aka E2F-1, RBAP1, RBBP3, RBP3}, MAZ (MYC associated zinc finger protein) [NCBI Gene 4150] {aka PUR1, Pur-1, SAF-1, SAF-2, SAF-3, ZF87}, FOXA2 (forkhead box A2) [NCBI Gene 3170] {aka HNF-3-beta, HNF3B, TCF3B}, F3 (coagulation factor III, tissue factor) [NCBI Gene 2152] {aka CD142, TF, TFA}, TCF12 (transcription factor 12) [NCBI Gene 6938] {aka CRS3, HEB, HH26, HTF4, HsT17266, TCF-12}, RFX5 (regulatory factor X5) [NCBI Gene 5993] {aka MHC2D3, MHC2D5}, ELK4 (ETS transcription factor ELK4) [NCBI Gene 2005] {aka SAP1}, TAF1 (TATA-box binding protein associated factor 1) [NCBI Gene 6872] {aka BA2R, CCG1, CCGS, DYT3, DYT3/TAF1, KAT4}, ZBTB33 (zinc finger and BTB domain containing 33) [NCBI Gene 10009] {aka ZNF-kaiso, ZNF348}, E2F6 (E2F transcription factor 6) [NCBI Gene 1876] {aka E2F-6}
- **Diseases:** cervical carcinoma (MESH:D002583), cervical carcinoma cell line (MESH:D002575), chronic myelogenous leukemia (MESH:D015464)
- **Chemicals:** nucleotide (MESH:D009711)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** MCF7: Homo sapiens (Human), Invasive breast carcinoma of no special type, Cancer cell line (CVCL_0031); A549: Homo sapiens (Human), Lung adenocarcinoma, Cancer cell line (CVCL_0023); K562: Homo sapiens (Human), Blast phase chronic myelogenous leukemia, BCR-ABL1 positive, Cancer cell line (CVCL_0004); HeLa-S3: Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_0058); S2: Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232); HeLa: Homo sapiens (Human), Human papillomavirus-related endocervical adenocarcinoma, Cancer cell line (CVCL_0030); GM12878: Homo sapiens (Human), Transformed cell line (CVCL_7526)

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12648403/full.md

---
Source: https://tomesphere.com/paper/PMC12648403