# Machine learning identifies novel signatures of antifungal drug resistance in Saccharomycotina yeasts

**Authors:** Marie-Claire Harrison, David C. Rinker, Abigail L. LaBella, Dana A. Opulente, John F. Wolters, Xiaofan Zhou, Xing-Xing Shen, Marizeth Groenewald, Chris Todd Hittinger, Antonis Rokas

PMC · DOI: 10.1371/journal.pgen.1012091 · PLOS Genetics · 2026-03-17

## TL;DR

This study uses machine learning to uncover new genetic markers of antifungal drug resistance across nearly all known species of the yeast subphylum Saccharomycotina.

## Contribution

The study identifies novel, naturally occurring genetic variants in non-clinical yeast species that contribute to fluconazole resistance.

## Key findings

- Fluconazole resistance is widespread and can be predicted with 75.2% accuracy using genomic data.
- Key residues in the Erg11 protein associated with fluconazole resistance differ from those in clinical isolates.
- Natural variants in Erg11, though not previously linked to resistance, can directly contribute to drug resistance.

## Abstract

Antifungal drug resistance is a major challenge in fungal infection management. Numerous genomic changes are known to contribute to acquired drug resistance in clinical isolates of specific pathogens, but whether they broadly explain natural resistance across entire lineages is unknown. We leveraged genomic, ecological, and phenotypic trait data from naturally sampled strains from nearly all known species in subphylum Saccharomycotina to examine the evolution of resistance to eight antifungal drugs. The phylogenetic distribution of drug resistance varied by drug; fluconazole resistance was widespread, while 5-fluorocytosine resistance was rare, except in Lipomycetales. A random forest algorithm trained on genomic data predicted drug-resistant yeasts with 54–75% accuracy. Fluconazole resistance was consistently predicted with the highest accuracy (75.2%). Furthermore, fluconazole resistance prediction accuracy was similar between models trained on genome-wide variation in the presence and number of InterPro protein annotations across Saccharomycotina (75.2%) and those trained on amino acid sequence alignment data of Erg11, a protein known to be involved in fluconazole resistance (74.3–74.9%). Interestingly, the top Erg11 residues for predicting fluconazole resistance across Saccharomycotina do not overlap with, are not spatially close to, and are less conserved than those previously linked to resistance in clinical isolates of Candida albicans. In silico deep mutational scanning of the C. albicans Erg11 protein reveals that amino acid variants implicated in clinical cases of resistance are almost universally destabilizing while variants in our most informative residues are energetically more neutral, explaining why the latter are much more common than the former in natural populations. Importantly, previous experimental analyses of C. albicans Erg11 have shown that amino acid variation in our most informative residues, despite having never been directly implicated in clinical cases, can directly contribute to resistance. Our results suggest that studies of natural resistance in yeast species never encountered in the clinic will yield a fuller understanding of antifungal drug resistance.

Resistance to drugs is a major challenge in the treatment of fungal infections. Many fungi are naturally resistant to antifungal drugs, but the genetic variants involved are poorly characterized. We employed machine learning, structural, and evolutionary approaches to identify genetic variants associated with drug resistance across an ancient yeast lineage. By focusing on a protein known to be involved in resistance to the antifungal drug fluconazole, we identified several novel variants that were significantly associated with drug resistance, but whose evolutionary and biophysical properties differ from previously characterized clinical variants in specific human pathogens. Furthermore, previous in vitro experimental analyses have shown that several of these natural variants can directly contribute to resistance. We suggest that studies on the genetic basis of drug resistance across entire fungal lineages can complement studies of human pathogenic fungi, leading to a fuller understanding of the drug resistance challenge.
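
The abstract mentions models trained on amino acid sequence alignment data of Erg11. This summary does not say how the alignment was turned into model features, so the sketch below is only one plausible preparation step, not the authors' pipeline. It assumes a one-hot encoding of each variable alignment column. The function name `one_hot_alignment`, the species labels, and the five-residue toy sequences are all hypothetical and are not real Erg11 data.

```python
from typing import Dict, List, Tuple

Feature = Tuple[int, str]  # (1-based alignment column, residue or gap character)


def one_hot_alignment(
    aligned: Dict[str, str],
) -> Tuple[List[str], List[Feature], List[List[int]]]:
    """One-hot encode the variable columns of an aligned set of sequences.

    Assumption: every sequence is already aligned to the same length,
    with '-' marking gaps. Invariant columns are skipped because they
    cannot separate resistant from susceptible species.
    """
    names = sorted(aligned)
    lengths = {len(aligned[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()

    features: List[Feature] = []
    for col in range(n_cols):
        states = sorted({aligned[name][col] for name in names})
        if len(states) < 2:
            continue  # invariant column: no information
        for state in states:
            features.append((col + 1, state))

    # One row per species, one 0/1 entry per (column, residue) feature.
    matrix = [
        [1 if aligned[name][pos - 1] == state else 0 for pos, state in features]
        for name in names
    ]
    return names, features, matrix


# Hypothetical toy alignment (not real Erg11 sequences).
toy = {
    "species_a": "MAY-K",
    "species_b": "MAF-K",
    "species_c": "MSYGK",
}
names, features, matrix = one_hot_alignment(toy)
```

Each row of `matrix` could then be paired with a resistant or susceptible label and passed to a classifier such as a random forest. Keeping the `(column, residue)` pairs in `features` makes it possible to map any feature ranking back to specific alignment positions.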

## Linked entities

- **Proteins:** ERG11 (sterol 14-demethylase)
- **Chemicals:** fluconazole (PubChem CID 3365), 5-fluorocytosine (PubChem CID 3366)
- **Species:** Saccharomycotina (taxon 147537), Candida albicans (taxon 5476), Lipomycetales (taxon 3243773)

## Full-text entities

- **Genes:** ERG11 (sterol 14-demethylase) [NCBI Gene 856398] {aka CYP51}, FKS1 (1,3-beta-D-glucan synthase) [NCBI Gene 851055] {aka CND1, CWH53, ETG1, GSC1, PBR1}, ERG1 (squalene monooxygenase) [NCBI Gene 853086], FLO10 (Flo10p) [NCBI Gene 853977], UPC2 (Upc2p) [NCBI Gene 851799] {aka MOX4}, FLO9 (flocculin FLO9) [NCBI Gene 851236], FLO5 (flocculin FLO5) [NCBI Gene 856618], GSC2 (1,3-beta-glucan synthase GSC2) [NCBI Gene 852920] {aka FKS2}
- **Diseases:** fungal (MESH:D009181), infection (MESH:D007239)
- **Chemicals:** salt (MESH:D012492), Echinocandins (MESH:D054714), ergosterol (MESH:D004875), sterol (MESH:D013261), nucleoside (MESH:D009705), 5-fluorouridylic acid (MESH:C016462), voriconazole (MESH:D065819), beta-glucans (MESH:D047071), nitrogen (MESH:D009584), urea (MESH:D014508), heme (MESH:D006418), Fluconazole (MESH:D015725), micafungin (MESH:D000077551), raffinose (MESH:D011887), 5-fluorocytosine (MESH:D005437), cellobiose (MESH:D002475), carbon (MESH:D002244), acid (MESH:D000143), pyrimidine (MESH:C030986), Terbinafine (MESH:D000077291), 5v5z (-), azole (MESH:D001393), glucosamine (MESH:D005944), Amphotericin B (MESH:D000666), posaconazole (MESH:C101425), caspofungin (MESH:D000077336), galactose (MESH:D005690), itraconazole (MESH:D017964), Polyenes (MESH:D011090), allylamine (MESH:D000499), salicin (MESH:C005696)
- **Species:** Candidozyma auris (species) [taxon 498019], Saccharomyces cerevisiae (baker's yeast, species) [taxon 4932], Hibiscus heterophyllus (native rosella, species) [taxon 183244], Candida albicans (species) [taxon 5476], Saccharomycotina (budding yeasts & allies, subphylum) [taxon 147537], Homo sapiens (human, species) [taxon 9606], Nakaseomyces glabratus (species) [taxon 5478]
- **Mutations:** Y477, S506, Y132H, R467K, F145L, Y477F, A313L, A313S, V404, G464S, L321F, A313, S506Q, V404T

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13012505/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13012505/full.md

## References

84 references — full list in the complete paper: https://tomesphere.com/paper/PMC13012505/full.md

---
Source: https://tomesphere.com/paper/PMC13012505