TL;DR
HSG-12M is a large-scale, open-source dataset of spatial multigraphs derived from the energy spectra of non-Hermitian crystals, enabling advanced graph learning and scientific discovery.
Contribution
The paper introduces Poly2Graph, a pipeline for automating spectral graph extraction, and presents HSG-12M, the first extensive dataset of spatial multigraphs from quantum physics.
Findings
Existing GNNs face challenges learning spatial multi-edges at scale.
Spectral graphs act as universal topological fingerprints for various mathematical objects.
HSG-12M enables new research in geometry-aware graph learning and condensed matter physics.
Abstract
AI is transforming scientific research by revealing new ways to understand complex physical systems, but its impact remains constrained by the lack of large, high-quality domain-specific datasets. A rich, largely untapped resource lies in non-Hermitian quantum physics, where the energy spectra of crystals form intricate geometries on the complex plane -- termed Hamiltonian spectral graphs. Despite their significance as fingerprints for electronic behavior, their systematic study has been intractable due to the reliance on manual extraction. To unlock this potential, we introduce Poly2Graph: a high-performance, open-source pipeline that automates the mapping of 1-D crystal Hamiltonians to spectral graphs. Using this tool, we present HSG-12M: a dataset containing 11.6 million static and 5.1 million dynamic Hamiltonian spectral graphs across 1401 characteristic-polynomial classes,…
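To make the idea of a spectrum tracing a geometry on the complex plane concrete, here is a minimal sketch using plain NumPy. It does not use Poly2Graph or its API. It assumes the Hatano-Nelson chain (a textbook 1-D non-Hermitian model with asymmetric nearest-neighbour hopping) as the example, which the abstract does not name; the function names and the parameters `t_left`, `t_right`, and `n_sites` are hypothetical. Under periodic boundaries the energies trace an ellipse, while a finite open chain collapses onto a real line segment, the simplest possible spectral graph (a single edge).

```python
import numpy as np

def obc_spectrum(t_left, t_right, n_sites):
    """Eigenvalues of a finite open chain with asymmetric hopping (illustrative helper)."""
    h = np.zeros((n_sites, n_sites), dtype=complex)
    idx = np.arange(n_sites - 1)
    h[idx, idx + 1] = t_left    # super-diagonal hopping amplitude
    h[idx + 1, idx] = t_right   # sub-diagonal hopping amplitude
    return np.linalg.eigvals(h)

def pbc_spectrum(t_left, t_right, n_k=400):
    """Bloch energies E(k) = t_left*e^{ik} + t_right*e^{-ik} sampled on [0, 2*pi).

    Which amplitude multiplies e^{ik} depends on convention; the traced
    ellipse is the same set of points either way.
    """
    k = np.linspace(0.0, 2.0 * np.pi, n_k, endpoint=False)
    return t_left * np.exp(1j * k) + t_right * np.exp(-1j * k)

# Assumed example parameters; the chain is kept short because eigenvalues
# of strongly non-normal matrices become numerically unstable at large size.
t_left, t_right, n_sites = 1.0, 0.5, 20

e_obc = obc_spectrum(t_left, t_right, n_sites)
e_pbc = pbc_spectrum(t_left, t_right)

# Closed form for a tridiagonal Toeplitz matrix with zero diagonal:
# E_m = 2*sqrt(t_left*t_right)*cos(m*pi/(n_sites+1)), m = 1..n_sites
m = np.arange(1, n_sites + 1)
analytic = 2.0 * np.sqrt(t_left * t_right) * np.cos(m * np.pi / (n_sites + 1))

print("largest |Im E| under open boundaries:", np.max(np.abs(e_obc.imag)))
print("ellipse semi-axes (Re, Im):", e_pbc.real.max(), e_pbc.imag.max())
```

Richer characteristic polynomials produce branching and looping curves rather than one segment, which is where a graph representation with nodes at junctions and spatially embedded edges becomes useful.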