EPheClass: ensemble-based phenotype classifier from 16S rRNA gene sequences
Lara Vázquez-González, Carlos Peña-Reyes, Alba Regueira-Iglesias, Carlos Balsa-Castro, Inmaculada Tomás, María J. Carreira

TL;DR
This paper introduces EPheClass, a machine learning pipeline for binary phenotype classification from 16S rRNA gene amplicon data of microbiome samples. It reports strong results for periodontal disease and inflammatory bowel disease, and competitive results for antibiotic exposure.
Contribution
The novel contribution is an ensemble-based classification pipeline for 16S rRNA data that generalizes well across different phenotypes and sample types.
Findings
EPheClass achieved an F1 score of 0.913 in diagnosing periodontal disease using only 13 features.
For inflammatory bowel disease, the method outperformed existing approaches evaluated on the same dataset.
EPheClass showed competitive results in detecting antibiotic exposure, highlighting its generalizability.
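For readers unfamiliar with the metric cited above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation; the function name `f1_score` and the confusion counts are hypothetical and chosen purely for illustration. They are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are true-positive, false-positive and false-negative counts.
    """
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not from the paper).
print(round(f1_score(tp=8, fp=2, fn=2), 3))
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when the two classes are unevenly represented.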
Abstract
One area of bioinformatics that is currently attracting particular interest is the classification of polymicrobial diseases using machine learning (ML), with data obtained from high-throughput amplicon sequencing of the 16S rRNA gene in human microbiome samples. The microbial dysbiosis underlying these types of diseases is particularly challenging to classify, as the data is highly dimensional, with potentially hundreds or even thousands of predictive features. In addition, the imbalance in the composition of the microbial community is highly heterogeneous across samples. In this paper, we propose a curated pipeline for binary phenotype classification based on a count table of 16S rRNA gene amplicons, which can be applied to any microbiome. To evaluate our proposal, raw 16S rRNA gene sequences from samples of healthy and periodontally affected oral microbiomes that met certain quality…
Taxonomy
Topics: Oral microbiology and periodontitis research · Oral Health Pathology and Treatment · Oral and gingival health research
