N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules
Shengchao Liu, Mehmet Furkan Demirel, Yingyu Liang

TL;DR
The paper introduces the N-gram graph, a simple unsupervised graph representation for molecules. It is equivalent to a simple graph neural network that needs no training, and when paired with supervised learners it predicts molecular properties better than popular graph neural networks and traditional methods across multiple benchmarks.
Contribution
It proposes a simple unsupervised graph representation built from short walks, and shows that it is equivalent to a simple graph neural network that needs no training.
Findings
Outperforms popular graph neural networks across 60 tasks
Computes graph representations efficiently
Is shown theoretically to have strong representation and prediction power
Abstract
Machine learning techniques have recently been adopted in various applications in medicine, biology, chemistry, and material engineering. An important task is to predict the properties of molecules, which serves as the main subroutine in many downstream applications such as virtual screening and drug design. Despite the increasing interest, the key challenge is to construct proper representations of molecules for learning algorithms. This paper introduces the N-gram graph, a simple unsupervised representation for molecules. The method first embeds the vertices in the molecule graph. It then constructs a compact representation for the graph by assembling the vertex embeddings in short walks in the graph, which we show is equivalent to a simple graph neural network that needs no training. The representations can thus be efficiently computed and then used with supervised learning methods…
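The two steps described in the abstract (embed the vertices, then assemble the embeddings along short walks) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not something stated in the abstract: the function name ngram_graph_embedding, the parameter max_walk_len, the choice to combine embeddings along a walk by element-wise product and to sum over walks, and the hand-written toy embeddings standing in for learned vertex embeddings. Walks are allowed to revisit vertices, and in an undirected graph each walk is counted once per direction.

```python
def ngram_graph_embedding(adjacency, vertex_embeddings, max_walk_len=3):
    """Sketch of a walk-based graph representation (assumed formulation).

    adjacency: n x n list of lists with 0/1 entries.
    vertex_embeddings: n x r list of lists (assumed to be given already).
    Returns a flat list of length max_walk_len * r: one r-dimensional
    block per walk length 1..max_walk_len (counted in vertices).
    """
    n = len(adjacency)
    r = len(vertex_embeddings[0])

    # walk_sums[i][k]: sum, over all walks of the current length that end
    # at vertex i, of the product of the k-th embedding coordinate of
    # every vertex on the walk. For length 1 this is the embedding itself.
    walk_sums = [list(row) for row in vertex_embeddings]
    graph_embedding = [sum(walk_sums[i][k] for i in range(n)) for k in range(r)]

    for _ in range(2, max_walk_len + 1):
        extended = []
        for i in range(n):
            row = []
            for k in range(r):
                # Collect the sums of walks ending at neighbours of i ...
                incoming = sum(adjacency[i][j] * walk_sums[j][k] for j in range(n))
                # ... and extend each of those walks by vertex i.
                row.append(incoming * vertex_embeddings[i][k])
            extended.append(row)
        walk_sums = extended
        graph_embedding += [sum(walk_sums[i][k] for i in range(n)) for k in range(r)]

    return graph_embedding


# Toy usage: a 3-vertex path graph 0 - 1 - 2 with 1-dimensional embeddings.
path_adjacency = [[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]]
toy_embeddings = [[1.0], [2.0], [3.0]]
example = ngram_graph_embedding(path_adjacency, toy_embeddings, max_walk_len=3)
print(example)  # [6.0, 16.0, 48.0]
```

Each pass of the loop only multiplies by the adjacency matrix and by the fixed vertex embeddings, so it behaves like one message-passing layer with no learnable weights. That is one way to read the abstract's remark that the construction is equivalent to a simple graph neural network that needs no training. The resulting fixed-length vector can then be passed to any supervised learner.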
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
