RamanBench: A Large-Scale Benchmark for Machine Learning on Raman Spectroscopy
Mario Koddenbrock, Christoph Lange, Robin Legner, Martin Jäger, Martin Kögler, Mariano N. Cruz Bournazou, Peter Neubauer, Felix Biessmann, Erik Rodner

TL;DR
RamanBench is a comprehensive, reproducible benchmark that unifies diverse Raman spectroscopy datasets, evaluates multiple models, and aims to accelerate ML progress in molecular analysis applications.
Contribution
It introduces the first large-scale, standardized Raman spectroscopy benchmark with 74 datasets, evaluation protocols, and a live leaderboard to foster community-driven advancements.
Findings
TFMs outperform classical and Raman-specific methods
Time-series models are competitive with specialized approaches
No current method generalizes well across all datasets
Abstract
Machine Learning (ML) has transformed many scientific fields, yet key applications still lack standardized benchmarks. Raman spectroscopy, a widely used technique for non-invasive molecular analysis, is one such field where progress is limited by fragmented datasets, inconsistent evaluation, and models that fail to capture the structure of spectral data. We introduce RamanBench, the first large-scale, fully reproducible benchmark for ML on Raman spectroscopy, consisting of streamlined data access, evaluation protocols and code, as well as a live leaderboard. It unifies 74 datasets (including 16 first released with this benchmark) across four domains, comprising 325,668 spectra and spanning classification and regression tasks under diverse experimental conditions. We benchmark 28 models under a standardized protocol, including classical methods (e.g., PLS), Raman-specific (e.g.,…
