Interpretable Classification of Bacterial Raman Spectra with Knockoff Wavelets

Charmaine Chia; Matteo Sesia; Chi-Sing Ho; Stefanie S. Jeffrey; Jennifer Dionne; Emmanuel J. Candès; Roger T. Howe

arXiv:2006.04937·eess.SP·July 12, 2021·IEEE J. Biomed. Health Informatics

TL;DR

This paper presents an interpretable logistic regression approach using wavelet features and knockoff variable selection for classifying bacterial Raman spectra, achieving accuracy comparable to neural networks.

Contribution

It introduces a simple, transparent model with chemically interpretable features and controlled feature selection, improving interpretability in biomedical signal classification.

Findings

01. Logistic regression with wavelet features matches neural network accuracy.

02. Knockoff-based feature selection ensures relevant, non-redundant predictors.

03. Approach is broadly applicable to other signal data requiring interpretability.

Abstract

Deep neural networks and other sophisticated machine learning models are widely applied to biomedical signal data because they can detect complex patterns and compute accurate predictions. However, the difficulty of interpreting such models is a limitation, especially for applications involving high-stakes decisions, including the identification of bacterial infections. In this paper, we consider fast Raman spectroscopy data and demonstrate that a logistic regression model with carefully selected features achieves accuracy comparable to that of neural networks, while being much simpler and more transparent. Our analysis leverages wavelet features with intuitive chemical interpretations, and performs controlled variable selection with knockoffs to ensure the predictors are relevant and non-redundant. Although we focus on a particular data set, the proposed approach is broadly applicable…
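The two ingredients named in the abstract, wavelet features and knockoff-based selection, can be illustrated with a minimal sketch. It is not the authors' implementation. The Haar wavelet is an assumption chosen for brevity (the abstract does not say which wavelet family is used), and the selection step is the standard knockoff+ threshold from the knockoff filter literature, applied to a vector `W` of feature statistics that is taken as given. In practice `W[j]` would typically compare the fitted importance of feature `j` with that of its knockoff copy. The function names are hypothetical.

```python
import numpy as np

def haar_dwt(signal):
    """Multi-level orthonormal Haar transform of a length-2^k signal.

    Returns coefficients ordered coarse to fine:
    [approximation, coarsest detail, ..., finest detail].
    Haar is an illustrative choice, not necessarily the paper's wavelet.
    """
    x = np.asarray(signal, dtype=float)
    coeffs = []
    while x.size > 1:
        approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # local averages
        detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # local differences
        coeffs.append(detail)
        x = approx
    coeffs.append(x)
    return np.concatenate(coeffs[::-1])

def knockoff_plus_threshold(W, q):
    """Smallest t in {|W_j| > 0} such that
    (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q; inf if none exists."""
    W = np.asarray(W, dtype=float)
    for t in np.sort(np.abs(W[W != 0])):
        fdp_estimate = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_estimate <= q:
            return t
    return np.inf

def knockoff_select(W, q=0.1):
    """Indices of features whose statistic clears the knockoff+ threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))
```

A large positive `W[j]` means the original feature looks more important than its knockoff, so it is a candidate for selection; negative values are used to estimate how many selections are likely false. The selected wavelet coefficients would then serve as the predictors of the logistic regression model.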

Code & Models

Repositories

chicanagram/raman-knockoffs (Official)

Taxonomy

Methods: Logistic Regression