Leveraging large language models to predict antibiotic resistance in Mycobacterium tuberculosis
Conrad Testagrose, Sakshi Pandey, Mohammadali Serajian, Simone Marini, Mattia Prosperi, Christina Boucher

TL;DR
This paper introduces LLMTB, a method that uses large language models to predict antibiotic resistance in Mycobacterium tuberculosis from genomic data, with the aim of informing treatment strategies and limiting the spread of resistant strains.
Contribution
LLMTB is a transformer-based large language model approach to antibiotic resistance prediction in Mycobacterium tuberculosis that can be adapted to new drugs with minimal data through fine-tuning and few-shot learning.
Findings
Trained and evaluated on genomic data from 12 185 CRyPTIC isolates, the model is reported to achieve high performance in predicting resistance to various antibiotics.
LLMTB identifies critical genes and novel resistance mechanisms through its analysis.
The method reduces reliance on extensive data curation by using fine-tuning and few-shot learning.
Abstract
Antibiotic resistance in Mycobacterium tuberculosis (MTB) poses a significant challenge to global public health. Rapid and accurate prediction of antibiotic resistance can inform treatment strategies and mitigate the spread of resistant strains. In this study, we present a novel approach leveraging large language models (LLMs) to predict antibiotic resistance in MTB (LLMTB). Our model is trained and evaluated on genomic data from 12 185 CRyPTIC isolates and their associated resistance profiles, utilizing natural language processing techniques to capture patterns and mutations linked to resistance. The model’s architecture integrates state-of-the-art transformer-based LLMs, enabling the analysis of complex genomic sequences and the extraction of critical features relevant to antibiotic resistance. We evaluate our model’s performance using a comprehensive dataset of MTB strains,…
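To make the idea of applying natural language processing to genomic data more concrete, the following is a minimal sketch of one common way to turn a DNA sequence into "words" for a language model: overlapping k-mer tokenization. This is an illustrative assumption, not the tokenization scheme the authors describe; the function names (`kmer_tokenize`, `build_vocab`, `encode`), the choice of `k`, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_tokenize(sequence: str, k: int = 6, stride: int = 1) -> list[str]:
    """Split a DNA sequence into overlapping k-mers (hypothetical tokenizer)."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(token_lists: list[list[str]]) -> dict[str, int]:
    """Map each observed k-mer to an integer id; id 0 is reserved for unknowns."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"[UNK]": 0}
    # Most frequent k-mers get the lowest ids; ties are broken alphabetically.
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        vocab[tok] = len(vocab)
    return vocab


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Convert k-mer tokens to integer ids, falling back to [UNK]."""
    return [vocab.get(tok, vocab["[UNK]"]) for tok in tokens]


# Toy sequences (made up for illustration): one substitution at 0-based position 6.
reference = "ATGCGTACGT"
variant = "ATGCGTTCGT"

ref_tokens = kmer_tokenize(reference, k=4)
var_tokens = kmer_tokenize(variant, k=4)
vocab = build_vocab([ref_tokens, var_tokens])

# Indices of the k-mer tokens that differ between reference and variant.
changed = [i for i, (a, b) in enumerate(zip(ref_tokens, var_tokens)) if a != b]
print(changed)
```

With `k=4` and `stride=1`, a single-base substitution alters every k-mer that overlaps the mutated position, which is one reason token-level models can pick up on point mutations. A transformer would consume the integer ids produced by `encode` rather than the raw strings.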
Taxonomy
Topics: Tuberculosis Research and Epidemiology · RNA and protein synthesis mechanisms · Vaccines and immunoinformatics approaches
