Reframing preprocessing selection as model-internal calibration in near-infrared spectroscopy: A large-scale benchmark of operator-adaptive PLS and Ridge models
Gregory Beurier, Robin Reiter, Camille Noûs, Lauriane Rouan, Denis Cornet

TL;DR
This paper introduces a model-internal calibration approach for preprocessing in near-infrared spectroscopy, significantly reducing computation time while maintaining prediction accuracy, validated on a large benchmark cohort.
Contribution
The paper demonstrates that linear operator-adaptive calibration can replace extensive preprocessing screening, streamlining NIR spectroscopy workflows with comparable results.
Findings
Median RMSEP ratios are close to 1 relative to traditional preprocessing screening, indicating comparable prediction error.
Selected models achieve higher accuracy and much faster runtimes.
Improved balanced accuracy in classification tasks.
Abstract
Preprocessing screening is often the most expensive part of a near-infrared spectroscopy calibration workflow. It works because smoothing, derivatives, detrending and related filters change the spectral directions seen by PLS or Ridge regression, but a full external search repeatedly refits nearly the same linear model. This paper studies the case where that search can be collapsed into one calibration step. For strict linear preprocessing operators, the transformed PLS cross-covariance satisfies (X A^T)^T Y = A X^T Y, and Ridge regression depends on the operator-induced kernel X A^T A X^T. These identities allow a finite operator bank to be screened inside the model while retaining original-wavelength coefficients. Sample-adaptive or fitted corrections such as SNV, MSC, EMSC and ASLS remain fold-local branches, not absorbed into the algebra. The study uses the AOM benchmark cohort:…
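The two identities quoted in the abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: the random data, the centered first-difference operator `A`, the helper name `first_difference_operator`, and the penalty `lam` are all illustrative assumptions, and no mean-centering is applied. It verifies the PLS cross-covariance identity and the Ridge kernel identity, then maps dual Ridge coefficients back to the original wavelength axis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 50                      # samples, wavelengths (arbitrary sizes)
X = rng.normal(size=(n, p))        # stand-in for raw spectra
Y = rng.normal(size=(n, 1))        # stand-in for the reference values


def first_difference_operator(p):
    """Hypothetical strict linear operator: centered first difference
    along the wavelength axis (edge rows left at zero)."""
    A = np.zeros((p, p))
    idx = np.arange(1, p - 1)
    A[idx, idx + 1] = 0.5
    A[idx, idx - 1] = -0.5
    return A


A = first_difference_operator(p)
X_t = X @ A.T                      # preprocessed spectra: row i is A @ x_i

# PLS identity: (X A^T)^T Y = A X^T Y
pls_ok = np.allclose(X_t.T @ Y, A @ (X.T @ Y))

# Ridge identity: the kernel of the transformed data is X A^T A X^T
K_direct = X_t @ X_t.T
K_operator = X @ (A.T @ A) @ X.T
kernel_ok = np.allclose(K_direct, K_operator)

# Dual Ridge solve on the operator-induced kernel (lam is an assumed penalty)
lam = 1.0
alpha = np.linalg.solve(K_operator + lam * np.eye(n), Y)
beta_t = X_t.T @ alpha             # coefficients in the transformed space
beta_orig = A.T @ beta_t           # coefficients on the original wavelengths

# Both coefficient vectors give identical fitted values
pred_ok = np.allclose(X_t @ beta_t, X @ beta_orig)

print(pls_ok, kernel_ok, pred_ok)
```

Because every quantity depends on `A` only through `A @ (X.T @ Y)` or `A.T @ A`, a finite bank of such operators could be scored by looping over candidate matrices without recomputing the preprocessed spectra for each one. Sample-adaptive corrections such as SNV or MSC are not fixed matrices of this kind, so this sketch does not cover them.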
