Chemical Structure Elucidation from Mass Spectrometry by Matching Substructures
Jing Lim, Joshua Wong, Minn Xuan Wong, Lee Han Eric Tan, Hai Leong Chieu, Davin Choo, Neng Kai Nigel Neo

TL;DR
This paper presents a neural network-based method for chemical structure elucidation from mass spectrometry data. It ranks candidate structures for an unknown chemical threat, a task that can otherwise take well-trained chemists several days.
Contribution
The authors introduce a data-driven approach using neural networks to predict substructures from mass spectra, improving candidate ranking accuracy in chemical structure elucidation.
Findings
Substructure classifiers achieve over 90% micro F1-score.
Correct structure found among the top 20 candidates in 88% and 71% of test cases for two compound classes.
Method accelerates chemical threat identification process.
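For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of micro-averaged F1 for multi-label predictions. It pools true positives, false positives, and false negatives over every (spectrum, substructure) pair before computing a single F1 value. The function name and the toy label matrices are illustrative assumptions and do not come from the paper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (sample, label) pairs of 0/1 indicators."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1  # substructure present and predicted
            elif p and not t:
                fp += 1  # predicted but absent
            elif t and not p:
                fn += 1  # present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 2 spectra, 3 substructure labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
score = micro_f1(y_true, y_pred)
```

Because the counts are pooled across labels, frequent substructures weigh more heavily in this score than rare ones.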
Abstract
Chemical structure elucidation is a serious bottleneck in analytical chemistry today. We address the problem of identifying an unknown chemical threat given its mass spectrum and its chemical formula, a task which might take well-trained chemists several days to complete. Given a chemical formula, there could be over a million possible candidate structures. We take a data-driven approach to rank these structures by using neural networks to predict the presence of substructures given the mass spectrum, and matching these substructures to the candidate structures. Empirically, we evaluate our approach on a data set of chemical agents built for unknown chemical threat identification. We show that our substructure classifiers can attain over 90% micro F1-score, and we can find the correct structure among the top 20 candidates in 88% and 71% of test cases for two compound classes.
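The ranking idea described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring function, so the sketch assumes a simple log-likelihood score that rewards candidates whose substructures agree with the classifier's predicted probabilities. The substructure names, candidate names, and probabilities are all hypothetical, and in practice the substructure sets for each candidate would be derived with a cheminformatics toolkit.

```python
import math


def score_candidate(candidate_substructures, predicted_probs, eps=1e-6):
    """Assumed score: log-likelihood of the candidate's substructure
    indicators under independently predicted probabilities."""
    score = 0.0
    for name, p in predicted_probs.items():
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        if name in candidate_substructures:
            score += math.log(p)
        else:
            score += math.log(1 - p)
    return score


def rank_candidates(candidates, predicted_probs):
    """Return candidate names sorted from best to worst score."""
    return sorted(
        candidates,
        key=lambda name: score_candidate(candidates[name], predicted_probs),
        reverse=True,
    )


def in_top_k(ranking, correct, k=20):
    """True if the correct structure is among the first k ranked candidates."""
    return correct in ranking[:k]


# Hypothetical classifier output for one spectrum.
predicted_probs = {"P=O": 0.95, "P-F": 0.90, "C-S": 0.05}

# Hypothetical candidates sharing one chemical formula, each described
# by the set of substructures it contains.
candidates = {
    "candidate_A": {"P=O", "P-F"},
    "candidate_B": {"P=O", "C-S"},
    "candidate_C": {"C-S"},
}

ranking = rank_candidates(candidates, predicted_probs)
```

The `in_top_k` helper mirrors the top-20 evaluation reported in the abstract: averaging its result over a test set gives the fraction of cases where the correct structure is ranked within the first k candidates.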
Taxonomy
Topics: Computational Drug Discovery Methods · Metabolomics and Mass Spectrometry Studies · Advanced Chemical Sensor Technologies
