GELATO and SAGE: An Integrated Framework for MS Annotation
Khalifeh AlJadda, Rene Ranzinger, Melody Porterfield, Brent Weatherly, Mohammed Korayem, John A. Miller, Khaled Rasheed, Krys J. Kochut, William S. York

TL;DR
This paper introduces GELATO, a semi-automated MSn data interpreter, and SAGE, a machine learning model that enhances glycan annotation accuracy, addressing limitations of existing tools for complex mass spectrometry data.
Contribution
The paper presents a novel integrated framework combining GELATO and SAGE to improve glycan annotation in MSn data analysis, especially for uncurated public databases and higher-order MSn data.
Findings
GELATO extends and automates the functionality of two existing open-source projects, GlycoWorkbench (GWB) and GlycomeDB.
SAGE learns from expert annotations to emulate human interpretation.
The framework improves annotation accuracy for complex MSn datasets.
Abstract
Several algorithms and tools have been developed to (semi-)automate the process of glycan identification by interpreting mass spectrometric data. However, each has limitations when annotating MSn data with thousands of MS spectra using uncurated public databases. Moreover, the existing tools are not designed to manage MSn data where n > 2. We propose a novel software package to automate the annotation of tandem MS data. This software consists of two major components. The first is a free, semi-automated MSn data interpreter called the Glycomic Elucidation and Annotation Tool (GELATO). This tool extends and automates the functionality of existing open-source projects, namely GlycoWorkbench (GWB) and GlycomeDB. The second is a machine learning model called Smart Annotation Enhancement Graph (SAGE), which learns the behavior of glycoanalysts to select annotations generated by GELATO that…
Taxonomy
Topics: Glycosylation and Glycoproteins Research · Genomics and Phylogenetic Studies · Advanced Proteomics Techniques and Applications
