Space-Filling Curves as a Novel Crystal Structure Representation for Machine Learning Models
Dipti Jasrasaria, Edward O. Pyzer-Knapp, Dmitrij Rappoport, and Alan Aspuru-Guzik

TL;DR
This paper introduces a novel crystal structure representation based on space-filling curves, specifically Morton curves, to enhance machine learning predictions of solid-state properties in organic crystals.
Contribution
The paper proposes the SFC-M feature representations using Morton space-filling curves and employs Latent Semantic Indexing to reduce sparsity, improving crystal structure modeling for ML.
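As a rough illustration of the Morton (Z-order) curve that SFC-M builds on, the sketch below interleaves the bits of three integer grid coordinates into a single one-dimensional index. The function names, the `bits` parameter, and the mapping from fractional coordinates onto a grid are assumptions made for illustration; the paper's actual construction of SFC-M features may differ.

```python
from typing import Sequence, Tuple


def morton_encode_3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the lowest `bits` bits of x, y, z into one Morton index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)        # x occupies bits 0, 3, 6, ...
        code |= ((y >> i) & 1) << (3 * i + 1)    # y occupies bits 1, 4, 7, ...
        code |= ((z >> i) & 1) << (3 * i + 2)    # z occupies bits 2, 5, 8, ...
    return code


def morton_decode_3d(code: int, bits: int = 10) -> Tuple[int, int, int]:
    """Invert morton_encode_3d by de-interleaving the bits."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z


def fractional_to_morton(frac: Sequence[float], bits: int = 4) -> int:
    """Hypothetical helper: map fractional coordinates in [0, 1) to a Morton index.

    The unit cell is divided into a 2**bits grid along each axis (an assumed
    discretization, not taken from the paper).
    """
    n = 1 << bits
    ix, iy, iz = (min(int(c * n), n - 1) for c in frac)
    return morton_encode_3d(ix, iy, iz, bits)
```

For example, `morton_encode_3d(3, 5, 1)` returns 143, and `morton_decode_3d(143)` recovers `(3, 5, 1)`. Grid cells that are close along the curve tend to be close in space, which is the property that makes such an ordering attractive for flattening a three-dimensional structure into a feature vector.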
Findings
SFC-M representations can predict DFT energies of organic crystals.
Initial results show promise for SFC-M in solid-state property prediction.
Further exploration is needed to optimize the effectiveness of SFC-M.
Abstract
A fundamental problem in applying machine learning techniques to chemical problems is to find suitable representations for molecular and crystal structures. While structure representations based on atom connectivities are prevalent for molecules, two-dimensional descriptors are not suitable for describing molecular crystals. In this work, we introduce the SFC-M family of feature representations, which are based on Morton space-filling curves, as an alternative means of representing crystal structures. Latent Semantic Indexing (LSI) was employed in a novel setting to reduce the sparsity of the feature representations. The quality of the SFC-M representations was assessed by using them in combination with artificial neural networks to predict Density Functional Theory (DFT) single-point, Ewald-summed, lattice, and many-body dispersion energies of 839 organic molecular crystal unit cells…
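Latent Semantic Indexing is commonly implemented as a truncated singular value decomposition of a sparse feature matrix. The NumPy sketch below shows that general idea only. The matrix layout (one row per crystal, one column per sparse feature), the function name `lsi_reduce`, and the choice of `k` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def lsi_reduce(X: np.ndarray, k: int) -> np.ndarray:
    """Project the rows of X onto the k leading singular directions.

    X is an (n_samples, n_features) matrix, typically sparse in content.
    Returns an (n_samples, k) dense matrix of reduced coordinates.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Thin SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep the k largest components; each row of the result is a dense embedding.
    return U[:, :k] * s[:k]
```

The reduced rows could then serve as dense inputs to a model such as a neural network. When `k` equals the rank of `X`, inner products between rows are preserved; smaller values of `k` trade fidelity for compactness.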
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Various Chemistry Research Topics
