# Peripheral Blood TCR Clonotype Diversity as a Biomarker for Colorectal Cancer

**Authors:** Gaochen Zhu, Tao Chen, Chen Ma, Kai Liu, Bihui Huang, Guan Yang

PMC · DOI: 10.3390/bioengineering12111215 · 2025-11-07

## TL;DR

This study explores T cell receptor (TCR) repertoire diversity in peripheral blood as a non-invasive biomarker for colorectal cancer (CRC) diagnosis.

## Contribution

The paper introduces a TCR repertoire-based machine learning model for CRC diagnosis, together with a diagnostic panel of 50 TCR CDR3 sequences.

## Key findings

- The Transformer model achieved an AUC of 0.973 in predicting CRC status in the internal test set.
- An independent test set showed robust predictions with an AUC of 0.814.
- A panel of 50 TCR CDR3 sequences achieved a diagnostic AUC of 0.869.
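All three findings are reported as AUC values, so it may help to recall what that number measures: the probability that a randomly chosen CRC sample receives a higher score than a randomly chosen control. The sketch below computes it directly from that definition. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up example: 1 = CRC, 0 = control (NOT values from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for large cohorts.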

## Abstract

There is an urgent need to improve colorectal cancer (CRC) diagnosis due to limitations in current diagnostic approaches. Systematic characterization of the human T cell receptor (TCR) repertoire, coupled with advanced computational methods, provides a promising opportunity to develop more accurate and less invasive diagnostic strategies for this major malignancy. The main objective of this work is to establish a TCR repertoire-based diagnostic model for CRC using machine learning algorithms and to identify the features that contribute most to accurate diagnosis. In a comparative analysis of several machine learning algorithms, the Transformer model performed best. The trained model achieved an area under the receiver operating characteristic curve (AUC) of 0.973 in predicting disease status in the internal test set. Furthermore, TCR repertoire analysis of the independent test set demonstrated robust predictions with an AUC of 0.814. Notably, we identified a panel of 50 TCR CDR3 sequences that achieved a diagnostic AUC of 0.869. Together, this TCR repertoire-based disease model demonstrates significant potential for clinical applications in CRC diagnosis and treatment response monitoring. Similar diagnostic models could also be established for other immune-related diseases based on disease-specific TCR repertoire data.
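The abstract does not say how a sample's repertoire is turned into model inputs for the 50-sequence panel. One plausible encoding, shown here purely as an assumption, is the relative frequency of each panel CDR3 sequence within the sample. The helper `panel_features`, the three-sequence panel, and the clonotype counts are all hypothetical placeholders, not sequences or values from the study.

```python
from collections import Counter


def panel_features(clonotype_counts, panel):
    """Return the relative frequency of each panel CDR3 sequence in one sample.

    clonotype_counts maps CDR3 amino-acid sequence -> read (or clone) count.
    Sequences absent from the sample contribute a frequency of 0.0.
    """
    total = sum(clonotype_counts.values())
    if total == 0:
        return [0.0] * len(panel)
    return [clonotype_counts.get(seq, 0) / total for seq in panel]


# Hypothetical panel and sample (placeholders, NOT the paper's 50 sequences).
panel = ["CASSLGQETQYF", "CASSPGTGELFF", "CASSQDRGNTEAFF"]
sample = Counter({"CASSLGQETQYF": 6, "CASSQDRGNTEAFF": 2, "CASSYSTDTQYF": 2})

features = panel_features(sample, panel)
print(features)
```

A fixed-length vector like this could feed any standard classifier; whether the authors used frequencies, presence/absence, or another representation is not stated in the abstract.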

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575), CRC (MONDO:0005575)

## Full-text entities

- **Genes:** TRBV20OR9-2 (T cell receptor beta variable 20/OR9-2 (non-functional)) [NCBI Gene 6962] {aka CDR3, TCRBV20S2, TCRBV2O, TCRBV2S2O}
- **Diseases:** CRC (MESH:D015179), malignancy (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12650616/full.md

---
Source: https://tomesphere.com/paper/PMC12650616