GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning

Minghao Xu; Yunteng Geng; Yihang Zhang; Ling Yang; Jian Tang; Wentao Zhang

arXiv:2405.16206·cs.LG·October 2, 2024·1 cites

TL;DR

GlycanML establishes a comprehensive benchmark for glycan property prediction using diverse tasks, representations, and multi-task learning, advancing machine learning applications in glycan research.

Contribution

This work introduces the first standardized benchmark for glycan property prediction, including diverse tasks, representations, and multi-task learning frameworks.

Findings

01. Multi-relational GNNs outperform other models.

02. Multi-task learning enhances prediction accuracy.

03. Sequence and graph representations are both effective.

Abstract

Glycans are basic biomolecules and perform essential functions within living organisms. The rapid increase of functional glycan data provides a good opportunity for machine learning solutions to glycan understanding. However, there is still no standard machine learning benchmark for glycan property and function prediction. In this work, we fill this gap by building a comprehensive benchmark for Glycan Machine Learning (GlycanML). The GlycanML benchmark consists of diverse types of tasks, including glycan taxonomy prediction, glycan immunogenicity prediction, glycosylation type prediction, and protein-glycan interaction prediction. Glycans can be represented by both sequences and graphs in GlycanML, which enables us to extensively evaluate sequence-based models and graph neural networks (GNNs) on benchmark tasks. Furthermore, by concurrently performing eight glycan taxonomy prediction…
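The two glycan representations mentioned in the abstract can be illustrated with a small sketch. It is not taken from the GlycanML code base: it assumes a simplified, unbranched IUPAC-condensed-style string in which each monosaccharide is followed by its linkage in parentheses, and the function names `to_sequence` and `to_graph` are hypothetical. The sequence view is a flat token list; the graph view keeps monosaccharides as nodes and uses the linkage type as an edge label, which is the kind of typed edge a multi-relational GNN would consume.

```python
import re

# Assumption: unbranched IUPAC-condensed-style input such as
# "Gal(b1-4)GlcNAc(b1-2)Man". Branch brackets "[...]" are NOT handled here.
TOKEN_RE = re.compile(r"\(([^()]+)\)|([^()]+)")


def to_sequence(glycan):
    """Sequence view: flat list of monosaccharide and linkage tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(glycan):
        linkage, mono = match.group(1), match.group(2)
        # Keep the parentheses on linkage tokens so the two kinds stay distinct.
        tokens.append(f"({linkage})" if linkage is not None else mono)
    return tokens


def to_graph(glycan):
    """Graph view: nodes are monosaccharides, edges carry the linkage type."""
    nodes, edges = [], []
    pending_linkage = None
    for token in to_sequence(glycan):
        if token.startswith("("):
            # Remember the linkage until the next monosaccharide appears.
            pending_linkage = token[1:-1]
        else:
            nodes.append(token)
            if pending_linkage is not None:
                # Connect the previous monosaccharide to the current one.
                edges.append((len(nodes) - 2, len(nodes) - 1, pending_linkage))
                pending_linkage = None
    return nodes, edges


example = "Gal(b1-4)GlcNAc(b1-2)Man"  # illustrative string, not benchmark data
sequence_tokens = to_sequence(example)
graph_nodes, graph_edges = to_graph(example)
```

For the illustrative string, the sequence view yields five tokens (three monosaccharides and two linkages), while the graph view yields three nodes and two labeled edges. Real glycans are usually branched, so a full parser would also need to handle bracketed branches.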

Code & Models

Repositories

glycanml/glycanml
PyTorch · Official

Videos

GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning · SlidesLive

Taxonomy

Topics: Glycosylation and Glycoproteins Research · Machine Learning in Bioinformatics · Advanced Proteomics Techniques and Applications