# CNAttention: an attention-based deep multiple-instance method for uncovering copy number aberration signatures across cancers

**Authors:** Ziying Yang, Michael Baudis

PMC · DOI: 10.1093/bib/bbaf696 · 2026-01-15

## TL;DR

CNAttention is a new deep learning method that identifies copy number aberration patterns across 30 cancer types, improving cancer classification and uncovering hidden genomic relationships.

## Contribution

CNAttention introduces an attention-based deep multiple instance learning framework to extract cancer-specific CNA signatures with high accuracy and stability.

## Key findings

- CNAttention generates CNA signatures for 30 cancer types using attention mechanisms, capturing unique genomic features.
- The method reveals common CNA patterns not only among physiologically related cancer types but also among clinico-pathologically distant ones, such as different cancers originating from neural crest-derived cells.
- CNA signatures uncover genomic heterogeneity within individual cancer types, for instance in brain lower grade glioma.
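One way to look for shared patterns between per-type signatures is to compare them pairwise with a similarity measure. The sketch below uses cosine similarity on toy vectors. It is an assumption for illustration, not the procedure the paper uses to build its classification trees: the function names, the placeholder labels `type_A` to `type_C`, and the numbers are all hypothetical and carry no results from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_pair(signatures):
    """Return (name_a, name_b, similarity) for the closest pair of signatures."""
    names = sorted(signatures)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine_similarity(signatures[a], signatures[b])
            if best is None or sim > best[2]:
                best = (a, b, sim)
    return best


# Toy signatures: one value per gene, e.g. +1 gain, -1 loss, 0 neutral (made up).
toy_signatures = {
    "type_A": [1.0, -1.0, 0.0, 1.0],
    "type_B": [1.0, -1.0, 0.0, 0.0],
    "type_C": [-1.0, 0.0, 1.0, 0.0],
}
closest = most_similar_pair(toy_signatures)
```

A full pairwise similarity matrix built this way could serve as input to a hierarchical clustering step if a tree-like grouping of cancer types is wanted.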

## Abstract

Somatic copy number aberrations (CNAs) represent a distinct class of genomic mutations associated with oncogenetic effects. Over the past three decades, significant volumes of CNA data have been generated through molecular-cytogenetic and genome sequencing-based techniques. These data have been pivotal in identifying cancer-related genes and advancing research on the relationship between CNAs and histopathologically defined cancer types. However, comprehensive studies of CNA landscapes and disease parameters are challenging due to the vast diagnostic and genomic heterogeneity encountered in "pan-cancer" approaches. In this study, we introduce CNAttention, an attention-based deep multiple instance learning method designed to comprehensively analyze CNAs across different cancers and uncover specific CNA patterns within integrated gene-level CNA profiles of 30 cancer types. CNAttention effectively learns CNA features unique to each cancer type and generates CNA signatures for 30 cancer types using attention mechanisms, highlighting the distinctiveness of their CNA landscapes. CNAttention demonstrates high accuracy and exhibits stable performance even with the incorporation of external datasets or parameter adjustments, underscoring its effectiveness in tumor identification. Expanding these signatures to cancer classification trees reveals common patterns not only among physiologically related cancer types but also among clinico-pathologically distant types, such as different cancers originating from neural crest-derived cells. The detected signatures also uncover genomic heterogeneity in individual cancer types, for instance in brain lower grade glioma. Additional experiments with classification models underscore the efficacy of these signatures in representing various cancer types and their potential utility in clinical diagnosis.

## Full-text entities

- **Diseases:** cancer (MESH:D009369), glioma (MESH:D005910)

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12805253/full.md

---
Source: https://tomesphere.com/paper/PMC12805253