Learning the statistics and landscape of somatic mutation-induced insertions and deletions in antibodies
Cosimo Lupo, Natanael Spisak, Aleksandra M. Walczak, Thierry Mora

TL;DR
This paper introduces a probabilistic model to analyze the statistical features of insertions and deletions in antibody hypermutation, revealing universal patterns and hotspots in human immunoglobulin heavy chains.
Contribution
A novel inference tool that accurately characterizes indel statistics in antibody repertoires, overcoming biases of traditional annotation methods.
Findings
Indels follow a geometric length distribution.
Identification of universal indel hotspots.
Distinct insertion and deletion hotspots in heavy chains.
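To make the first finding concrete, here is a minimal sketch of what a geometric length distribution means for indel lengths. It is not the paper's inference tool: it ignores alignment, maturation ages, and hotspots. The parameterization P(L = l) = p (1 - p)^(l - 1) on lengths l = 1, 2, 3, ..., the value p = 0.3, and all function names are assumptions chosen for illustration only.

```python
import random


def geometric_pmf(length, p):
    """P(L = length) for a geometric distribution on lengths 1, 2, 3, ...

    Assumed parameterization: p is the probability that the indel stops
    growing after each added nucleotide.
    """
    if length < 1:
        return 0.0
    return p * (1.0 - p) ** (length - 1)


def sample_indel_lengths(n, p, seed=0):
    """Draw n synthetic indel lengths by extending until a 'stop' event."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n):
        length = 1
        # Continue extending with probability 1 - p, stop with probability p.
        while rng.random() >= p:
            length += 1
        lengths.append(length)
    return lengths


def fit_geometric(lengths):
    """Maximum-likelihood estimate of p: the inverse of the mean length."""
    return len(lengths) / sum(lengths)


if __name__ == "__main__":
    true_p = 0.3  # hypothetical value, not taken from the paper
    lengths = sample_indel_lengths(20000, true_p)
    print("estimated p:", fit_geometric(lengths))
    print("mean length:", sum(lengths) / len(lengths))
```

Under this parameterization the mean length is 1/p, so a single number summarizes the whole length profile, and short indels are always the most likely.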
Abstract
Affinity maturation is crucial for improving the binding affinity of antibodies to antigens. This process is mainly driven by point substitutions caused by somatic hypermutations of the immunoglobulin gene. It also includes deletions and insertions of genomic material known as indels. While the landscape of point substitutions has been extensively studied, a detailed statistical description of indels is still lacking. Here we present a probabilistic inference tool to learn the statistics of indels from repertoire sequencing data, which overcomes the pitfalls and biases of standard annotation methods. The model includes antibody-specific maturation ages to account for variable mutational loads in the repertoire. After validation on synthetic data, we applied our tool to a large dataset of human immunoglobulin heavy chains. The inferred model allows us to identify universal statistical…
