Noisy, sparse, nonlinear: Navigating the Bermuda Triangle of physical inference with deep filtering
Carl Poelking, Yehia Amar, Alexei Lapkin, Lucy Colwell

TL;DR
This paper introduces a deep filtering approach that improves the robustness and interpretability of machine-learning models for physical inference from noisy, sparse data governed by nonlinear relationships, with applications in catalysis and drug synergy.
Contribution
The authors develop a deep filtering method that enhances model robustness and transparency in challenging data regimes, advancing data-driven physical inference.
Findings
Deep filtering improves model robustness over standard architectures.
Sparse models reveal physicochemical reaction pharmacophores.
The method helps identify experimental bias and reaction mechanisms.
Abstract
Capturing the microscopic interactions that determine molecular reactivity poses a challenge across the physical sciences. Even a basic understanding of the underlying reaction mechanisms can substantially accelerate materials and compound design, including the development of new catalysts or drugs. Given the difficulties routinely faced by both experimental and theoretical investigations that aim to improve our mechanistic understanding of a reaction, recent advances have focused on data-driven routes to derive structure-property relationships directly from high-throughput screens. However, even these high-quality, high-volume data are noisy, sparse and biased, placing them in a regime where machine learning is extremely challenging. Here we show that a statistical approach based on deep filtering of nonlinear feature networks results in physicochemical models that are more robust,…
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Data Analysis with R
