Persistent Dirac for molecular representation
JunJie Wee, Ginestra Bianconi, Kelin Xia

TL;DR
This paper introduces a novel molecular representation method using persistent Dirac operators, capturing geometric and topological features for effective clustering of complex molecular structures.
Contribution
It develops a rigorous computational framework based on the persistent Dirac operator, analyzing its spectral properties for molecular representation and clustering.
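As a rough illustration of the kind of object involved, the sketch below builds an unweighted discrete Dirac matrix for a single small simplicial complex from its boundary matrices and counts its zero eigenvalues. It assumes the standard block construction in which the boundary matrices sit on the off-diagonal, so that the square of the Dirac matrix is block-diagonal with the Hodge Laplacians and the number of zero modes equals the sum of the Betti numbers. The function names, the toy triangle complexes, and the orientation convention are illustrative choices, not taken from the paper, and the paper's weighting schemes and persistence (filtration) construction are not reproduced here.

```python
import numpy as np

def dirac_matrix(b1, b2=None):
    """Unweighted Dirac matrix of a complex with vertices, edges, triangles.

    b1: vertex-by-edge boundary matrix, b2: edge-by-triangle boundary matrix.
    Illustrative helper (hypothetical name), not the paper's implementation.
    """
    n0, n1 = b1.shape
    if b2 is None:
        b2 = np.zeros((n1, 0))  # no 2-simplices
    n2 = b2.shape[1]
    n = n0 + n1 + n2
    upper = np.zeros((n, n))
    upper[:n0, n0:n0 + n1] = b1          # edges -> vertices
    upper[n0:n0 + n1, n0 + n1:] = b2     # triangles -> edges
    return upper + upper.T               # symmetric: boundary + coboundary

def count_zero_modes(d, tol=1e-9):
    """Number of (numerically) zero eigenvalues of a symmetric matrix."""
    eigenvalues = np.linalg.eigvalsh(d)
    return int(np.sum(np.abs(eigenvalues) < tol))

# Toy complex: triangle on vertices 0, 1, 2 with edges (0,1), (0,2), (1,2),
# each edge oriented from the lower to the higher vertex index.
b1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])
# Boundary of the filled triangle [0,1,2] is (1,2) - (0,2) + (0,1).
b2 = np.array([[1.0], [-1.0], [1.0]])

hollow = dirac_matrix(b1)        # triangle boundary only (one loop)
filled = dirac_matrix(b1, b2)    # loop filled in by a 2-simplex

hollow_zero_modes = count_zero_modes(hollow)
filled_zero_modes = count_zero_modes(filled)
print(hollow_zero_modes, filled_zero_modes)
```

In this toy case the hollow triangle has one connected component and one loop, while filling the triangle removes the loop, which is what the zero-mode counts reflect. The nonzero eigenvalues come in plus/minus pairs, and it is the nonzero part of the spectrum that carries the non-homological (geometric) information the summary refers to.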
Findings
Successfully clusters nine types of organic-inorganic halide perovskites
Demonstrates that the spectral properties encode meaningful molecular information
Shows the effectiveness of Dirac-based fingerprints in molecular analysis
Abstract
Molecular representations are of fundamental importance for the modeling and analysis of molecular systems. Representation models and, more generally, approaches based on topological data analysis (TDA) have demonstrated great success in various steps of drug design and materials discovery. Here we develop a mathematically rigorous computational framework for molecular representation based on the persistent Dirac operator. The properties of the spectrum of the discrete weighted and unweighted Dirac matrices are systematically discussed and used to demonstrate the geometric and topological properties of both non-homology and homology eigenvectors of real molecular structures. This allows us to assess the influence of weighting schemes on the information encoded in the Dirac eigenspectrum. A series of physical persistent attributes, which characterize the spectrum of the Dirac matrices across a…
Taxonomy
Topics: Topological and Geometric Data Analysis · Geochemistry and Geologic Mapping
