Resolution-Adaptive Binning Enhances Machine Learning Modeling by Interbatch and Multiplatform Orbitrap-Based Shotgun Mass Spectrometry Data Integration
Hiu-Lok Ngan, Jialing Zhang, Kenneth Kin-Leung Kwan, Jacinth Wing-Sum Cheu, Li Zhong, Yike Guo, Xian Yang, Carmen Chak-Lui Wong, Hong Yan, Zongwei Cai

TL;DR
A new binning method improves machine learning models using mass spectrometry data from different batches and platforms, enhancing disease detection accuracy.
Contribution
A resolution-adaptive binning strategy is introduced for integrating Orbitrap-based shotgun MS data across batches and platforms.
Findings
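The method recovers 88–99% of ground truth features in the low mass region (70–434 m/z) from 49 mixed standard solutions at 250, 500, and 1000 ppb.
Compared to conventional methods, it achieves stable binning and integration across low (100–450 m/z), mid (450–900 m/z), and high (900–1500 m/z) mass regions, leading to better predictive models.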
A mouse hepatocellular carcinoma model identified 10 generic metabolites useful for disease detection across various sample methods.
Abstract
Machine learning (ML) modeling on mass spectrometry (MS)-based shotgun data facilitates feature selection and disease modeling. However, batch-specific models often struggle with limited transferability and generalizability, necessitating data integration from multiple batches and platforms. Traditional binning methods can either disintegrate or aggregate m/z features, making data combination unreliable. In this study, we introduce a mass resolution-adaptive binning and integration strategy to overcome these challenges. This approach recovers 88–99% of ground truth features in a low mass region (70–434 m/z) from 49 mixed standard solutions at 250, 500, and 1000 ppb. Compared to conventional methods, it demonstrates stable binning and integration across low (100–450 m/z), mid (450–900 m/z), and high (900–1500 m/z) mass regions, resulting in superior predictive models. Using a mouse model…
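As a rough illustration of what resolution-adaptive binning can look like, the sketch below sizes each bin to the expected Orbitrap peak width, which grows with m/z because Orbitrap resolving power falls roughly as 1/sqrt(m/z). This is not the authors' algorithm, which this summary does not describe. The reference resolution (`R_REF` at `MZ_REF`), the width multiplier `k`, the lower bound `mz_min`, and the function names `peak_width`, `bin_index`, and `integrate` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical instrument setting: resolving power R_REF specified at MZ_REF.
R_REF = 120_000
MZ_REF = 200.0


def peak_width(mz, r_ref=R_REF, mz_ref=MZ_REF):
    """Approximate peak width (FWHM) at mz, assuming R(mz) = r_ref * sqrt(mz_ref / mz)."""
    return mz ** 1.5 / (r_ref * math.sqrt(mz_ref))


def bin_index(mz, mz_min=70.0, r_ref=R_REF, mz_ref=MZ_REF, k=1.0):
    """Map mz to an integer bin whose local width is about k * peak_width(mz).

    The index is the integral of dm / (k * peak_width(m)) from mz_min to mz,
    which has a closed form, so no explicit list of bin edges is needed.
    """
    scale = 2.0 * r_ref * math.sqrt(mz_ref) / k
    return math.floor(scale * (1.0 / math.sqrt(mz_min) - 1.0 / math.sqrt(mz)))


def integrate(batches, **bin_kwargs):
    """Merge peak lists from several batches onto one shared adaptive grid.

    batches maps a batch name to a list of (mz, intensity) pairs.
    Returns {bin index: {batch name: summed intensity}}.
    """
    table = defaultdict(dict)
    for name, peaks in batches.items():
        for mz, intensity in peaks:
            b = bin_index(mz, **bin_kwargs)
            table[b][name] = table[b].get(name, 0.0) + intensity
    return dict(table)


if __name__ == "__main__":
    # Toy peak lists, made up purely to exercise the functions.
    demo = {
        "batch_A": [(150.0, 10.0), (760.5, 4.0)],
        "batch_B": [(150.0, 12.0), (1200.2, 1.5)],
    }
    for b, row in sorted(integrate(demo).items()):
        print(b, row)
```

Because the bin index is a closed-form function of m/z, peak lists from different batches land on the same grid without exchanging bin edges. Under the same assumptions, a fixed-width grid would tend to split peaks at high m/z or merge neighbors at low m/z, which is the failure mode the abstract attributes to traditional binning. When platforms differ in resolving power, one plausible choice (again an assumption) is to set `r_ref` from the lowest-resolution instrument.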
Taxonomy
Topics: Mass Spectrometry Techniques and Applications · Metabolomics and Mass Spectrometry Studies · Advanced Proteomics Techniques and Applications
