Mathematical Characterization of Private and Public Immune Repertoire Sequences
Lucas Böttcher, Sascha Wald, Tom Chou

TL;DR
This paper develops a mathematical framework to quantify the sharing and diversity of immune receptor sequences across individuals, aiding understanding of immune repertoire publicness and privacy.
Contribution
It introduces probabilistic and information-theoretic models to analyze clone sharing, providing explicit formulas for repertoire overlap and diversity measures.
Findings
Derived formulas for mean and variance of clone richness and overlap.
Validated models with synthetic and empirical TCR sequence data.
Compared simulated repertoire sharing with analytical predictions.
Abstract
Diverse T and B cell repertoires play an important role in mounting effective immune responses against a wide range of pathogens and malignant cells. The number of unique T and B cell clones is characterized by T and B cell receptors (TCRs and BCRs), respectively. Although receptor sequences are generated probabilistically by recombination processes, clinical studies found a high degree of sharing of TCRs and BCRs among different individuals. In this work, we use a general probabilistic model for T/B cell receptor clone abundances to define "publicness" or "privateness" and information-theoretic measures for comparing the frequency of sampled sequences observed across different individuals. We derive mathematical formulae to quantify the mean and the variances of clone richness and overlap. Our results can be used to evaluate the effect of different sampling protocols on abundances of…
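As a rough illustration of what a mean richness and overlap formula can look like, the sketch below uses a textbook sampling model and is not taken from the paper. It assumes that two individuals draw cells independently, with replacement, from the same hypothetical clone frequency vector `probs`. Under that assumption, a clone with frequency p appears in a sample of n cells with probability 1 - (1 - p)^n, so summing this over clones gives the expected richness, and summing the product of the two individuals' detection probabilities gives the expected overlap. The function names, the Zipf-like frequencies, and the sample sizes are all illustrative choices.

```python
import random


def expected_richness(probs, n):
    """Expected number of distinct clones seen in n independent draws."""
    return sum(1.0 - (1.0 - p) ** n for p in probs)


def expected_overlap(probs, n_a, n_b):
    """Expected number of clones seen in both of two independent samples."""
    return sum(
        (1.0 - (1.0 - p) ** n_a) * (1.0 - (1.0 - p) ** n_b) for p in probs
    )


def simulate(probs, n_a, n_b, trials=2000, seed=0):
    """Monte Carlo estimate of mean richness (sample A) and mean overlap."""
    rng = random.Random(seed)
    clones = range(len(probs))
    richness_sum = 0
    overlap_sum = 0
    for _ in range(trials):
        # Each individual samples cells with replacement from the same
        # (assumed) clone frequency distribution.
        a = set(rng.choices(clones, weights=probs, k=n_a))
        b = set(rng.choices(clones, weights=probs, k=n_b))
        richness_sum += len(a)
        overlap_sum += len(a & b)
    return richness_sum / trials, overlap_sum / trials


# Hypothetical Zipf-like clone frequencies over 50 clones (assumption).
weights = [1.0 / (i + 1) for i in range(50)]
total = sum(weights)
probs = [w / total for w in weights]

if __name__ == "__main__":
    sim_richness, sim_overlap = simulate(probs, 30, 30)
    print("richness: analytic", expected_richness(probs, 30), "simulated", sim_richness)
    print("overlap:  analytic", expected_overlap(probs, 30, 30), "simulated", sim_overlap)
```

Comparing the analytic values with the simulated averages mirrors, in miniature, the kind of check between simulated sharing and analytical predictions that the summary mentions. Individual-specific frequencies or other sampling protocols would require different formulas.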
