# Enhancing cancer classification accuracy with a self-attention network using panel capture sequencing data

**Authors:** Yi Jia, Chan Zhang, Han Zhang, Kang Dong, Yuruo Hu, Yinan Wang, Zicheng Zhao

PMC · DOI: 10.1093/bib/bbag120 · Briefings in Bioinformatics · 2026-03-23

## TL;DR

A self-attention-based Conv1D model classifies cancer types from panel capture sequencing data with over 90% overall accuracy, reaching 100% precision for cervical and gastric cancers.

## Contribution

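A self-attention-based Conv1D network is introduced for cancer classification from panel capture sequencing data, the format most commonly used in clinical settings. Trained on combined clinical capture sequencing data and The Cancer Genome Atlas data, it achieves over 90% overall classification accuracy.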
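
To make the architecture idea concrete, here is a minimal, dependency-free sketch of a Conv1D layer followed by self-attention, mean pooling, and a softmax classifier. It is not the authors' implementation: this summary does not give the input encoding, layer sizes, or training setup, so everything here is an assumption. That includes the per-position mutation-count input, the identity query/key/value projections, the random weights, and all function names (`conv1d`, `self_attention`, `classify`). A real model would use a deep learning framework with learned projections.

```python
import math
import random


def conv1d(x, kernels, bias):
    """Valid 1D convolution with ReLU.

    x: [length][in_ch]; kernels: [out_ch][k][in_ch]; bias: [out_ch].
    Returns [length - k + 1][out_ch].
    """
    k = len(kernels[0])
    out = []
    for t in range(len(x) - k + 1):
        row = []
        for w, b in zip(kernels, bias):
            s = b
            for j in range(k):
                for c in range(len(x[0])):
                    s += w[j][c] * x[t + j][c]
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in v]
    z = sum(e)
    return [a / z for a in e]


def self_attention(h):
    """Scaled dot-product self-attention.

    Simplification (assumption): Q = K = V = h, with no learned projections.
    """
    d = len(h[0])
    out = []
    for q in h:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in h]
        w = softmax(scores)
        out.append([sum(w[i] * h[i][c] for i in range(len(h))) for c in range(d)])
    return out


def classify(x, kernels, bias, w_out, b_out):
    """Conv1D -> self-attention -> mean pooling -> linear layer -> softmax."""
    h = self_attention(conv1d(x, kernels, bias))
    pooled = [sum(col) / len(h) for col in zip(*h)]  # mean over positions
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)


# Toy input: 12 hypothetical panel positions, one channel (a mutation count).
random.seed(0)
n_pos, in_ch, n_filters, k, n_classes = 12, 1, 4, 3, 3
x = [[float(random.randint(0, 3))] for _ in range(n_pos)]
kernels = [[[random.uniform(-1, 1) for _ in range(in_ch)] for _ in range(k)]
           for _ in range(n_filters)]
bias = [0.0] * n_filters
w_out = [[random.uniform(-1, 1) for _ in range(n_filters)]
         for _ in range(n_classes)]
b_out = [0.0] * n_classes

probs = classify(x, kernels, bias, w_out, b_out)
print(probs)  # one probability per hypothetical cancer class
```

The convolution summarizes local patterns along the input, while the attention step lets every position weigh every other position before pooling, which is one plausible reason to combine the two for sparse mutation profiles.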

## Key findings

- The model achieved over 90% overall classification accuracy with 100% precision for cervical and gastric cancers.
- The model identified key genes, including C3orf36, JHY, and TASP1, whose mutation counts differ significantly across cancer types.

## Abstract

Cancer classification is pivotal for precision oncology, yet traditional methods struggle with the molecular heterogeneity of tumors. Our study introduces a self-attention based Conv1D machine learning network designed for panel capture sequencing data, which is more commonly used in clinical settings. Combining clinical capture sequencing data and The Cancer Genome Atlas data, we achieved an overall classification accuracy of over 90%, with precision rates reaching 100% for cervical and gastric cancers. Additionally, recall rates were highest at 95.79% for gastric cancer and lowest at 77.46% for cervical cancer, demonstrating robust performance across various cancer types. The model identified key genes such as C3orf36, JHY, and TASP1, showing significant differences in mutation counts across cancers. High-impact gene enrichment analysis highlighted critical pathways like acute myeloid leukemia and adipocytokine signaling. This approach not only significantly improves the precision of cancer classification, demonstrating the potential for clinical application, but also enhances our understanding of cancer biology.
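
The abstract reports 100% precision but 77.46% recall for cervical cancer, which can look contradictory at first. The sketch below shows how per-class precision and recall are computed and why the two can diverge. The labels are made-up toy data, not results from the paper, and `per_class_precision_recall` is a hypothetical helper name.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for a multi-class prediction."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # a true sample of class t was missed
    results = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else float("nan")
        recall = tp[label] / actual if actual else float("nan")
        results[label] = (precision, recall)
    return results


# Toy labels (illustrative only): one cervical sample is misclassified as lung.
y_true = ["cervical"] * 4 + ["gastric"] * 4 + ["lung"] * 4
y_pred = (["cervical"] * 3 + ["lung"]) + ["gastric"] * 4 + ["lung"] * 4

results = per_class_precision_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for label, (precision, recall) in results.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
print(f"overall accuracy={accuracy:.2f}")
```

In this toy case every sample predicted as cervical really is cervical, so precision is 1.0, yet one true cervical sample is missed, so recall drops to 0.75. The same pattern explains how a class can pair perfect precision with lower recall.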

## Linked entities

- **Genes:** C3orf36 (chromosome 3 putative open reading frame 36) [NCBI Gene 80111], JHY (junctional cadherin complex regulator) [NCBI Gene 79864], TASP1 (taspase 1) [NCBI Gene 55617]
- **Diseases:** cervical cancer (MONDO:0002974), gastric cancer (MONDO:0001056), acute myeloid leukemia (MONDO:0015667)

## Full-text entities

- **Genes:** IFNK (interferon kappa) [NCBI Gene 56832] {aka IFNT1, INFE1}, TNF (tumor necrosis factor) [NCBI Gene 7124] {aka DIF, IMD127, TNF-alpha, TNFA, TNFSF2, TNLG1F}, SEMA6C (semaphorin 6C) [NCBI Gene 10500] {aka SEMAY, Sema-Y, m-SemaY, m-SemaY2}, TRIM53AP (tripartite motif containing 53A, pseudogene) [NCBI Gene 642569] {aka TRIM53, TRIM53P}, PTPRC (protein tyrosine phosphatase receptor type C) [NCBI Gene 5788] {aka B220, CD45, CD45R, GP180, IMD105, L-CA}, PRL (prolactin) [NCBI Gene 5617] {aka GHA1, pPRL}, NFKB1 (nuclear factor kappa B subunit 1) [NCBI Gene 4790] {aka CVID12, EBP-1, KBF1, NF-kB, NF-kB1, NF-kappa-B1}, GH1 (growth hormone 1) [NCBI Gene 2688] {aka GH, GH-N, GHB5, GHN, IGHD1A, IGHD1B}, C3orf36 (chromosome 3 putative open reading frame 36) [NCBI Gene 80111], TIRAP (TIR domain containing adaptor protein) [NCBI Gene 114609] {aka BACTS1, Mal, MyD88-2, wyatt}, TOP2B (DNA topoisomerase II beta) [NCBI Gene 7155] {aka BILU, TOPIIB, top2beta}, TSPYL2 (TSPY like 2) [NCBI Gene 64061] {aka CDA1, CINAP, CTCL, DENTT, HRIHFB2216, NP79}, TENM1 (teneurin transmembrane protein 1) [NCBI Gene 10178] {aka ODZ1, ODZ3, TEN-M1, TEN1, TNM, TNM1}, PTPRJ (protein tyrosine phosphatase receptor type J) [NCBI Gene 5795] {aka CD148, DEP1, HPTP eta, HPTPeta, R-PTP-ETA, R-PTP-J}, IL2 (interleukin 2) [NCBI Gene 3558] {aka IL-2, TCGF, lymphokine}, TASP1 (taspase 1) [NCBI Gene 55617] {aka C20orf13, SULEHS, dJ585I14.2}, PIK3CG (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit gamma) [NCBI Gene 5294] {aka IMD97, PI3CG, PI3K, PI3Kgamma, PIK3, p110gamma}, EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, PLS3 (plastin 3) [NCBI Gene 5358] {aka BMND18, DIH5, T-plastin}, SAP30L (SAP30 like) [NCBI Gene 79685] {aka NS4ATP2}, SMO (smoothened, frizzled class receptor) [NCBI Gene 6608] {aka CRJS, FZD11, Gx, PHLS, SMOH}, LGR5 (leucine rich repeat containing G protein-coupled receptor 5) [NCBI Gene 8549] {aka FEX, GPR49, GPR67, 
GRP49, HG38}, TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, CD274 (CD274 molecule) [NCBI Gene 29126] {aka ADMIO5, B7-H, B7H1, PD-L1, PDCD1L1, PDCD1LG1}, STAT3 (signal transducer and activator of transcription 3) [NCBI Gene 6774] {aka ADMIO, ADMIO1, APRF, HIES}
- **Diseases:** chronic (MESH:D002908), metastasis (MESH:D009362), Acute myeloid leukemia (MESH:D015470), carcinogenesis (MESH:D063646), lung cancer (MESH:D008175), Mycobacterium tuberculosis infection (MESH:D014376), Cancer (MESH:D009369), CHOL (MESH:D018281), endocervical adenocarcinoma (MESH:D000230), LUAD (MESH:D000077192), viral infections (MESH:D014777), Spinocerebellar ataxia (MESH:D020754), epithelial carcinoma (MESH:D009375), thyroid cancers (MESH:D013964), GBM (MESH:D005910), human papillomavirus infection (MESH:D030361), Small cell lung cancer (MESH:D055752), B (MESH:D006509), epithelial tumors (MESH:D002277), colon cancer (MESH:D015179), ovarian and colorectal cancers (MESH:D010051), Cervical squamous cell carcinoma (MESH:D002294), pancreatic and endometrial cancer (MESH:D010190), chronic inflammation (MESH:D007249), Epstein-Barr virus infection (MESH:D020031), breast cancers (MESH:D001943), COAD (MESH:D003110), breast, pancreatic, gastric, ovarian, colon cancers (MESH:D010195), non-small cell lung cancer (MESH:D002289), immune dysregulation (OMIM:614878), STAD (MESH:D013274), cervical cancer (MESH:D002583), Ovarian serous cystadenocarcinoma (MESH:D010049), HCC (MESH:D006528)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13006975/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13006975/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC13006975/full.md

---
Source: https://tomesphere.com/paper/PMC13006975