# Enhancing anticancer peptide discovery: A fusion-centric framework with conditional diffusion for prediction and generation

**Authors:** Binyu Li, Xin Zhang, Zhihua Huang, Prayag Tiwari, Quan Zou, Yijie Ding, Xiaoyi Guo

PMC · DOI: 10.1371/journal.pcbi.1014098 · PLOS Computational Biology · 2026-03-26

## TL;DR

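The paper introduces UACD-ACPs, a unified framework that pairs a diffusion-inspired noise-conditioned classifier for anticancer peptide (ACP) prediction with a diffusion-based peptide generator. It reports higher accuracy, F1-score, and AUC-ROC than existing methods, including under imbalanced multi-class settings.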

## Contribution

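UACD-ACPs is a fusion-driven framework with two parts. The prediction module fuses ProtBERT-based semantic embeddings with physicochemical descriptors via the Multiscale Embedding Compression Strategy (MECS) and a noise-conditioned encoder. The generation module is a denoising diffusion model augmented by the Bitemporal Fusion Module (BFM) and the Temporal Feature Attention Module (TFAM).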

## Key findings

- UACD-ACPs outperforms existing methods in accuracy, F1-score, and AUC-ROC for ACP prediction.
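- Generated peptides show favorable physicochemical properties, diverse secondary structures, and structural stability, as validated by molecular dynamics simulations and membrane-binding analyses.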
- The framework enables targeted peptide generation organized by cancer type for downstream screening.
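
The first finding reports F1-score alongside accuracy, and the difference between the two matters when classes are imbalanced. The sketch below is a minimal illustration, not the paper's evaluation code: the labels are hypothetical toy data, and macro-averaging is an assumption, since this summary does not state which F1 averaging the authors use.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 8 "breast" peptides, 2 "lung" peptides,
# and a degenerate classifier that always predicts the majority class.
y_true = ["breast"] * 8 + ["lung"] * 2
y_pred = ["breast"] * 10

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred)
```

On this toy data the majority-class predictor looks acceptable by accuracy but scores zero F1 on the minority class, which pulls the macro average down. That is why a gain in F1 under imbalance is a stronger claim than a gain in accuracy alone.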

## Abstract

Anticancer peptides (ACPs) are short bioactive sequences that selectively target tumor cells with minimal toxicity, positioning them as promising candidates for next-generation cancer therapies. However, existing computational models face limitations in sequence representation and class imbalance. To address these challenges, we propose UACD-ACPs, a unified fusion-driven framework that integrates a diffusion-inspired noise-conditioned classifier for ACP prediction and a diffusion-based peptide generation module with cancer-type-aware organization for targeted downstream screening. The classification module integrates ProtBERT-based semantic embeddings with physicochemical descriptors via the Multiscale Embedding Compression Strategy (MECS) and a diffusion-inspired noise-conditioned encoder, substantially enhancing predictive robustness and accuracy, particularly under challenging imbalanced multi-class settings. In the generative pipeline, we introduce a denoising diffusion-based generative framework augmented by two novel fusion modules: the Bitemporal Fusion Module (BFM) and the Temporal Feature Attention Module (TFAM). These modules perform multi-scale temporal and semantic fusion to promote the generation of structurally coherent and functionally relevant peptide candidates. Experimental results demonstrate that UACD-ACPs outperforms state-of-the-art methods in terms of accuracy, F1-score, and AUC-ROC. The generated peptides exhibit favorable physicochemical properties, diverse secondary structures, and strong structural stability, as validated by molecular dynamics simulations and membrane-binding analyses. Overall, this study highlights the potential of fusion-driven diffusion-based frameworks for alleviating class imbalance and data heterogeneity in anticancer peptide modeling, paving the way for scalable and biologically grounded ACP discovery.

Anticancer peptides (ACPs) are short protein sequences that can selectively target tumor cells while causing minimal harm to healthy tissues, making them promising candidates for cancer therapy. However, the computational discovery of ACPs remains challenging because peptide sequences are highly diverse and data across different cancer types are often severely imbalanced. In this study, we propose UACD-ACPs, a unified computational framework designed to support both the prediction and generation of anticancer peptides. By integrating complementary sequence representations with physicochemical information, the framework improves the reliability of ACP identification under challenging imbalanced data conditions. In addition, the model enables the generation of novel peptide candidates organized by cancer type, facilitating targeted downstream screening. Experimental results demonstrate that UACD-ACPs achieves improved predictive performance compared with existing methods. The generated peptides also exhibit favorable physicochemical properties and structural stability, suggesting their potential biological relevance. Overall, this work provides an integrated and scalable computational strategy to support anticancer peptide discovery and guide future experimental studies.
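
The two ideas at the core of the prediction module, fusing a semantic embedding with physicochemical descriptors and conditioning on a noise level, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the random vector stands in for a ProtBERT embedding, plain concatenation stands in for MECS, the four descriptors are illustrative choices, and the noising step is the standard DDPM forward process with a linear schedule. All function names are hypothetical.

```python
import math
import random


def simple_descriptors(seq):
    """Illustrative descriptors: length, K/R fraction, D/E fraction, crude net charge."""
    n = len(seq)
    pos = sum(seq.count(a) for a in "KR")
    neg = sum(seq.count(a) for a in "DE")
    return [float(n), pos / n, neg / n, float(pos - neg)]


def fuse(semantic, descriptors):
    """Assumed fusion: plain concatenation (the paper uses MECS instead)."""
    return list(semantic) + list(descriptors)


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule; requires num_steps >= 2."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_s), one value per timestep."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Standard DDPM forward step: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    a = abar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


rng = random.Random(0)
semantic = [rng.gauss(0.0, 1.0) for _ in range(8)]  # placeholder for a ProtBERT embedding
descriptors = simple_descriptors("KLAKLAKKLAKLAK")   # toy lysine-rich sequence
x0 = fuse(semantic, descriptors)

abar = alpha_bars(linear_betas(1000))
x_early = q_sample(x0, 10, abar, rng)   # lightly noised features
x_late = q_sample(x0, 999, abar, rng)   # almost pure Gaussian noise
```

A noise-conditioned classifier in this spirit would receive both the noised features and the timestep `t`, so it learns to classify across noise levels rather than only on clean inputs. How UACD-ACPs actually injects the noise level is described in the full paper, not in this summary.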

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** KDR (kinase insert domain receptor) [NCBI Gene 3791] {aka CD309, FLK1, VEGFR, VEGFR2}, MDM2 (MDM2 proto-oncogene) [NCBI Gene 4193] {aka ACTFS, HDMX, LSKB, hdm2}, AASDHPPT (aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase) [NCBI Gene 60496] {aka AASD-PPT, ACPS, CGI-80, LYS2, LYS5}, CPAT1 (cerebral palsy, ataxic 1) [NCBI Gene 60502] {aka ACP}, GOSR1 (golgi SNAP receptor complex member 1) [NCBI Gene 9527] {aka GOLIM2, GOS-28, GOS28, GOS28/P28, GS28, P28}, ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}, HLA-A (major histocompatibility complex, class I, A) [NCBI Gene 3105] {aka HLAA}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, MDM4 (MDM4 regulator of p53) [NCBI Gene 4194] {aka BMFS6, HDMX, MDMX, MRP1}
- **Diseases:** ACP (MESH:C562856), Cancer (MESH:D009369), glioblastoma (MESH:D005909), breast cancer (MESH:D001943), hematologic malignancies (MESH:D019337), MECS (MESH:D009408), lung cancer (MESH:D008175), deaths (MESH:D003643), gastric cancer (MESH:D013274), breast, gastric, and non-small cell lung cancers (MESH:D002289), toxicities (MESH:D064420)
- **Chemicals:** amino acid (MESH:D000596), DPM (MESH:C064754), water (MESH:D014867), ALRN-6924 (-), dipeptide (MESH:D004151), m6A (MESH:C005955), DOPC (MESH:C017251), lipid (MESH:D008055)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13021173/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13021173/full.md

## References

75 references — full list in the complete paper: https://tomesphere.com/paper/PMC13021173/full.md

---
Source: https://tomesphere.com/paper/PMC13021173