Hyper-Dimensional Fingerprints as Molecular Representations
Jonas Teufel, Luca Torresi, André Eberhard, Pascal Friederich

TL;DR
Hyperdimensional fingerprints (HDF) offer a deterministic, training-free molecular representation that outperforms traditional fingerprints in property prediction and similarity preservation, especially at low dimensions.
Contribution
Introduction of hyperdimensional fingerprints (HDF), a novel algebraic method for molecular representation that maintains structural fidelity without training.
Findings
HDF outperforms conventional fingerprints in most property prediction benchmarks.
HDF preserves molecular similarity with high correlation to graph edit distance.
HDF improves Bayesian molecular optimization sample efficiency.
Abstract
Computational molecular representations underpin virtual screening, property prediction, and materials discovery. Conventional fingerprints are efficient and deterministic but lose structural information through hash-based compression, particularly at low dimensionalities. Learned representations from graph neural networks recover this expressiveness but require task-specific training and substantial computational resources. Here we introduce hyperdimensional fingerprints (HDF), which replace the learned transformations of message-passing neural networks with algebraic operations on high-dimensional vectors, producing deterministic molecular representations without any training. Across diverse property prediction benchmarks, HDF outperforms conventional fingerprints in the majority of tasks while exhibiting greater consistency across datasets and models. Crucially, HDF embeddings…
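The idea of replacing learned message-passing transformations with algebraic operations on high-dimensional vectors can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes bipolar (+1/-1) hypervectors, element-wise multiplication for binding, and element-wise summation for bundling, which are common choices in hyperdimensional computing but are not specified in the text above. The names `random_hv`, `bind`, `bundle`, `encode`, and `cosine`, the dimensionality `DIM`, and the number of layers are all hypothetical, and hydrogens and bond orders are ignored for brevity.

```python
import math
import random

DIM = 1024  # illustrative dimensionality, not a value from the paper


def random_hv(key, dim=DIM):
    """Deterministic bipolar (+1/-1) hypervector derived from a string key."""
    rng = random.Random(key)  # string seeds give the same stream on every run
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bind(a, b):
    """Binding (assumed here): element-wise product of two hypervectors."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling (assumed here): element-wise sum of a non-empty list."""
    return [sum(column) for column in zip(*vectors)]


def encode(atoms, bonds, layers=2, dim=DIM):
    """Encode a graph given element symbols and bonded atom-index pairs."""
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Layer 0: every atom starts from a fixed vector for its element.
    states = [random_hv("atom:" + symbol, dim) for symbol in atoms]
    readout = bundle(states)
    for _ in range(layers):
        # Training-free "message passing": bind each atom's state with the
        # bundle of its neighbors' states; isolated atoms keep their state.
        states = [
            bind(states[i], bundle([states[j] for j in neighbors[i]]))
            if neighbors[i] else states[i]
            for i in range(len(atoms))
        ]
        # Accumulate every layer into one graph-level vector.
        readout = bundle([readout] + states)
    return readout


def cosine(a, b):
    """Cosine similarity between two integer-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))


# Ethanol heavy-atom skeleton (C-C-O) under two different atom orderings.
ethanol = encode(["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_reordered = encode(["O", "C", "C"], [(0, 1), (1, 2)])
```

Because every step is a seeded lookup, a product, or a sum, the sketch needs no training and returns the same vector on every run. Summation is commutative, so the two atom orderings of the same graph produce identical vectors, and `cosine` can then be used to compare the embeddings of different molecules.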
