Comparative Analysis of Formula and Structure Prediction from Tandem Mass Spectra
Xujun Che, Xiuxia Du, Depeng Xu

TL;DR
This paper systematically evaluates current computational methods for predicting chemical formulas and structures from tandem mass spectrometry data, highlighting performance benchmarks and areas needing improvement.
Contribution
The paper provides a comprehensive assessment of state-of-the-art algorithms for formula and structure prediction, establishing realistic performance baselines to guide future improvements.
Findings
Identified key bottlenecks in prediction accuracy
Established performance benchmarks for formula and structure prediction
Provided guidance for improving MS-based compound prediction methods
Abstract
Liquid chromatography mass spectrometry (LC-MS)-based metabolomics and exposomics aim to measure detectable small molecules in biological samples. The results facilitate hypothesis-generating discovery of metabolic changes and disease mechanisms and provide information about environmental exposures and their effects on human health. Metabolomics and exposomics are made possible by the high resolving power of LC and high mass measurement accuracy of MS. However, a majority of the signals from such studies still cannot be identified or annotated using conventional library searching because existing spectral libraries are far from covering the vast chemical space captured by LC-MS/MS. To address this challenge and unleash the full potential of metabolomics and exposomics, a number of computational approaches have been developed to predict compounds based on tandem mass spectra. Published…
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Computational Drug Discovery Methods · Cell Image Analysis Techniques
