TL;DR
OLGA is a fast, dynamic programming-based algorithm that accurately computes the generation probabilities of T- and B-cell receptor amino acid sequences, aiding immune repertoire analysis and vaccine design.
Contribution
This paper introduces OLGA, an efficient algorithm for calculating the probability that V(D)J recombination generates a given amino acid sequence. It avoids the computationally intractable brute-force summation over every nucleotide sequence that translates to that amino acid sequence.
Findings
OLGA accurately predicts T-cell receptor generation probabilities.
The model aligns well with published immune repertoire data.
OLGA can inform vaccine development by identifying likely receptor sequences.
Abstract
Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem. Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for…
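To illustrate why dynamic programming helps here, the sketch below compares brute-force summation with a dynamic-programming sum on a toy problem. It is not OLGA's V(D)J recombination model: the nucleotide model is an assumed first-order Markov chain with made-up parameters (`INIT`, `TRANS`), and all function names are hypothetical. Only the idea is shared: sum the probability of every nucleotide sequence that translates to a given amino acid sequence without enumerating them one by one.

```python
import math
from itertools import product

# Standard genetic code in the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

# ASSUMPTION: toy first-order Markov model over nucleotides (made-up numbers).
INIT = {b: 0.25 for b in BASES}
TRANS = {
    "T": {"T": 0.1, "C": 0.2, "A": 0.3, "G": 0.4},
    "C": {"T": 0.25, "C": 0.25, "A": 0.25, "G": 0.25},
    "A": {"T": 0.4, "C": 0.3, "A": 0.2, "G": 0.1},
    "G": {"T": 0.3, "C": 0.3, "A": 0.2, "G": 0.2},
}


def nt_prob(seq):
    """Probability of one non-empty nucleotide sequence under the toy model."""
    p = INIT[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= TRANS[prev][cur]
    return p


def brute_force(aa_seq):
    """Enumerate every codon combination; cost grows exponentially with length."""
    choices = (CODONS_FOR[a] for a in aa_seq)
    return sum(nt_prob("".join(codons)) for codons in product(*choices))


def dp(aa_seq):
    """Same sum, computed position by position; cost grows linearly with length."""
    # weights[last] = total probability of all nucleotide prefixes that
    # translate to the amino acids seen so far and end in nucleotide `last`.
    weights = {None: 1.0}
    for a in aa_seq:
        new_weights = {}
        for codon in CODONS_FOR[a]:
            inner = TRANS[codon[0]][codon[1]] * TRANS[codon[1]][codon[2]]
            for last, w in weights.items():
                step = INIT[codon[0]] if last is None else TRANS[last][codon[0]]
                new_weights[codon[2]] = new_weights.get(codon[2], 0.0) + w * step * inner
        weights = new_weights
    return sum(weights.values())


# The two methods should agree up to floating-point rounding.
print(math.isclose(brute_force("CASSL"), dp("CASSL")))
```

For the five-residue example, brute force already visits 2 × 4 × 6 × 6 × 6 = 1728 nucleotide sequences, while the dynamic-programming version does a small, fixed amount of work per residue because it only tracks the last nucleotide of each prefix.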
