Robust variable selection in the framework of classification with label noise and outliers: applications to spectroscopic data in agri-food

Andrea Cappozzo; Ludovic Duponchel; Francesca Greselin; Thomas Brendan Murphy

arXiv:2010.10415·stat.AP·January 29, 2021

TL;DR

This paper introduces a robust variable selection method for classification tasks involving high-dimensional spectroscopic data, effectively handling label noise and outliers to improve model interpretability and accuracy in agri-food applications.

Contribution

It presents a novel model-based approach that simultaneously performs variable selection, outlier detection, and label noise correction in spectroscopic data classification.

Findings

1. Effective in identifying anomalous spectra
2. Reduces problem complexity and enhances robustness
3. Achieves high predictive accuracy with few wavelengths

Abstract

Classification of high-dimensional spectroscopic data is a common task in analytical chemistry. Well-established procedures like support vector machines (SVMs) and partial least squares discriminant analysis (PLS-DA) are the most common methods for tackling this supervised learning problem. Nonetheless, interpretation of these models sometimes remains difficult, and solutions based on feature selection are often adopted as they lead to the automatic identification of the most informative wavelengths. Unfortunately, for some delicate applications like food authenticity, mislabeled and adulterated spectra can occur in the calibration and/or validation sets, with dramatic effects on the model development, its prediction accuracy and robustness. Motivated by these issues, the present paper proposes a robust model-based method that simultaneously performs variable selection, outliers and…

Code & Models

Repositories

AndreaCappozzo/varselTBIC (Official)