TL;DR
This paper introduces a robust variable selection method for classification tasks involving high-dimensional spectroscopic data, effectively handling label noise and outliers to improve model interpretability and accuracy in agri-food applications.
Contribution
It presents a novel model-based approach that simultaneously performs variable selection, outlier detection, and label noise correction in spectroscopic data classification.
Findings
Effective in identifying anomalous spectra
Reduces problem complexity and enhances robustness
Achieves high predictive accuracy with few wavelengths
Abstract
Classification of high-dimensional spectroscopic data is a common task in analytical chemistry. Well-established procedures like support vector machines (SVMs) and partial least squares discriminant analysis (PLS-DA) are the most common methods for tackling this supervised learning problem. Nonetheless, these models are sometimes difficult to interpret, and solutions based on feature selection are often adopted because they automatically identify the most informative wavelengths. Unfortunately, in some delicate applications like food authenticity, mislabeled and adulterated spectra occur in the calibration and/or validation sets, with dramatic effects on model development and on prediction accuracy and robustness. Motivated by these issues, the present paper proposes a robust model-based method that simultaneously performs variable selection, outliers and…
