Data Augmentation of Spectral Data for Convolutional Neural Network (CNN) Based Deep Chemometrics
Esben Jannik Bjerrum, Mads Glahder, Thomas Skov

TL;DR
This paper explores convolutional neural networks (CNNs) combined with spectral data augmentation and EMSC preprocessing for predicting drug content in tablets from near-infrared (NIR) spectra. The CNN models perform on par with or better than partial least squares (PLS) models.
Contribution
It introduces a novel spectral data augmentation method and demonstrates its effectiveness when combined with EMSC preprocessing in CNN-based chemometric analysis.
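To make the idea of spectral data augmentation concrete, here is a minimal Python sketch. It is not the authors' implementation: the summary does not describe the perturbations, so the choice of a random offset, a random gain and a random linear slope, along with the function name `augment_spectra` and every default value, is an assumption made for illustration only.

```python
import numpy as np


def augment_spectra(spectra, n_copies=10, offset_scale=0.10,
                    gain_scale=0.10, slope_scale=0.05, seed=None):
    """Return n_copies randomly perturbed versions of every spectrum.

    spectra: array of shape (n_samples, n_wavelengths).
    All scales are hypothetical defaults, expressed relative to the
    overall standard deviation of the input spectra.
    """
    rng = np.random.default_rng(seed)
    spectra = np.asarray(spectra, dtype=float)
    n_points = spectra.shape[1]
    std = spectra.std()

    # Each original spectrum is repeated n_copies times, row by row.
    repeated = np.repeat(spectra, n_copies, axis=0)
    n_total = repeated.shape[0]

    # One random offset, gain and slope per augmented spectrum.
    offset = rng.uniform(-offset_scale, offset_scale, (n_total, 1)) * std
    gain = 1.0 + rng.uniform(-gain_scale, gain_scale, (n_total, 1)) * std
    slope = rng.uniform(-slope_scale, slope_scale, (n_total, 1)) * std

    # Linear ramp over the wavelength axis, centred on zero.
    ramp = np.linspace(-0.5, 0.5, n_points)[None, :]

    return gain * repeated + offset + slope * ramp
```

Because rows are repeated in order, the matching targets for the augmented set can be built with `np.repeat(y, n_copies)`.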
Findings
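Data augmentation followed by EMSC gave the best model performance.
CNNs outperform PLS models in an extrapolation challenge that uses data from a second instrument and an analyte concentration not covered by the training data.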
Kernel activations in CNNs resemble traditional spectral processing methods.
Abstract
Deep learning methods are used on spectroscopic data to predict drug content in tablets from near infrared (NIR) spectra. Using convolutional neural networks (CNNs), features are extracted from the spectroscopic data. Extended multiplicative scatter correction (EMSC) and a novel spectral data augmentation method are benchmarked as preprocessing steps. The learned models perform better than or on par with hypothetical optimal partial least squares (PLS) models for all combinations of preprocessing. Data augmentation with subsequent EMSC in combination gave the best results. The deep learning CNN models also outperform the PLS models in an extrapolation challenge created using data from a second instrument and from an analyte concentration not covered by the training data. Qualitative investigations of the CNNs' kernel activations show their resemblance to well-known data processing methods…
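The EMSC step named in the abstract can be sketched in a few lines of Python. This is a basic textbook form of EMSC, not necessarily the variant used in the paper: each spectrum is modelled as a scaled reference spectrum plus a low-order polynomial baseline, the coefficients are found by least squares, and the baseline and scale are then removed. The function name `emsc`, the mean spectrum as default reference and the polynomial order of 2 are all assumptions.

```python
import numpy as np


def emsc(spectra, reference=None, poly_order=2):
    """Basic EMSC: fit x ~ a * reference + polynomial baseline,
    then return (x - baseline) / a for every spectrum.

    spectra: array of shape (n_samples, n_wavelengths).
    reference: spectrum of length n_wavelengths; defaults to the mean
    spectrum (an assumption, not taken from the paper).
    """
    spectra = np.asarray(spectra, dtype=float)
    n_points = spectra.shape[1]
    if reference is None:
        reference = spectra.mean(axis=0)

    # Polynomial baseline terms 1, w, w^2, ... on a scaled axis in [-1, 1].
    grid = np.linspace(-1.0, 1.0, n_points)
    baseline_terms = np.vander(grid, poly_order + 1, increasing=True)

    # Design matrix: reference spectrum first, then the baseline terms.
    design = np.column_stack([reference, baseline_terms])

    # Least-squares fit of all spectra at once; one coefficient column each.
    coefs, *_ = np.linalg.lstsq(design, spectra.T, rcond=None)
    scale = coefs[0]
    fitted_baseline = baseline_terms @ coefs[1:]

    corrected = (spectra.T - fitted_baseline) / scale
    return corrected.T
```

Following the order given in the abstract (augmentation with subsequent EMSC), such a correction would be applied to the spectra after they have been augmented, with a reference taken from the training set.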
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Spectroscopy Techniques in Biomedical and Chemical Research · Computational Drug Discovery Methods
