BiScale-GTR: Fragment-Aware Graph Transformers for Multi-Scale Molecular Representation Learning
Yi Yang, Ovidiu Daescu

TL;DR
BiScale-GTR is a novel multi-scale graph transformer framework that enhances molecular property prediction by integrating fragment-aware tokenization with adaptive reasoning across multiple molecular scales.
Contribution
It introduces a chemically grounded fragment tokenization method and a multi-scale architecture that jointly captures local, substructure, and long-range molecular features.
Findings
Achieves state-of-the-art results on MoleculeNet, PharmaBench, and LRGB datasets.
Effectively captures chemically meaningful motifs for interpretability.
Outperforms existing methods in both classification and regression tasks.
Abstract
Graph Transformers have recently attracted attention for molecular property prediction by combining the inductive biases of graph neural networks (GNNs) with the global receptive field of Transformers. However, many existing hybrid architectures remain GNN-dominated, causing the resulting representations to remain heavily shaped by local message passing. Moreover, most existing methods operate at only a single structural granularity, limiting their ability to capture molecular patterns that span multiple molecular scales. We introduce BiScale-GTR, a unified framework for self-supervised molecular representation learning that combines chemically grounded fragment tokenization with adaptive multi-scale reasoning. Our method improves graph Byte Pair Encoding (BPE) tokenization to produce consistent, chemically valid, and high-coverage fragment tokens, which are used as fragment-level…
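To make the idea of graph Byte Pair Encoding concrete, here is a minimal sketch of the generic technique: repeatedly find the most frequent pair of adjacent node labels across a corpus of graphs and contract matching edges into a single fragment token. This is not the authors' improved tokenizer, whose details are cut off in the abstract above. The function names (`count_pairs`, `merge_pair`, `learn_merges`), the string-concatenation token names, and the toy heavy-atom graphs are all assumptions made for illustration. The sketch also ignores bond types, rings, and chemical validity checks.

```python
from collections import Counter


def count_pairs(labels, edges):
    """Count unordered label pairs over the edges of one graph."""
    return Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)


def merge_pair(labels, edges, pair):
    """Contract edges whose endpoint labels match `pair`.

    Edges are visited in sorted order and a node takes part in at most
    one contraction per pass (a greedy, non-overlapping matching).
    """
    merged_into = {}  # removed node -> surviving node
    used = set()
    new_labels = dict(labels)
    for u, v in sorted(edges):
        if u in used or v in used:
            continue
        if tuple(sorted((labels[u], labels[v]))) == pair:
            used.update((u, v))
            merged_into[v] = u
            new_labels[u] = pair[0] + pair[1]  # hypothetical token naming
            del new_labels[v]
    new_edges = set()
    for u, v in edges:
        a, b = merged_into.get(u, u), merged_into.get(v, v)
        if a != b:  # drop the contracted edge itself
            new_edges.add((min(a, b), max(a, b)))
    return new_labels, new_edges


def learn_merges(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of (labels, edges) graphs."""
    merges = []
    for _ in range(num_merges):
        totals = Counter()
        for labels, edges in corpus:
            totals.update(count_pairs(labels, edges))
        if not totals:
            break
        # Most frequent pair; ties are broken by the pair itself for determinism.
        pair, _count = max(totals.items(), key=lambda kv: (kv[1], kv[0]))
        merges.append(pair)
        corpus = [merge_pair(labels, edges, pair) for labels, edges in corpus]
    return merges, corpus


# Toy heavy-atom graphs (hydrogens and bond orders omitted).
ethanol = ({0: "C", 1: "C", 2: "O"}, {(0, 1), (1, 2)})
acetic_acid = ({0: "C", 1: "C", 2: "O", 3: "O"}, {(0, 1), (1, 2), (1, 3)})

merges, tokenized = learn_merges([ethanol, acetic_acid], num_merges=2)
print(merges)     # learned merge rules, in order
print(tokenized)  # each graph re-expressed over fragment tokens
```

A chemically grounded tokenizer of the kind the abstract describes would presumably constrain which merges are allowed; this sketch applies no such constraint and merges purely by frequency.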
