A Self-Attention Ansatz for Ab-initio Quantum Chemistry
Ingrid von Glehn, James S. Spencer, David Pfau

TL;DR
This paper introduces the Wavefunction Transformer, a self-attention neural network architecture that improves the accuracy of solving the many-electron Schrödinger equation in quantum chemistry, especially for larger molecules.
Contribution
The Wavefunction Transformer (Psiformer) is a novel self-attention-based neural network that improves the accuracy of first-principles quantum chemistry calculations without external training data, outperforming previous models like FermiNet and PauliNet.
Findings
Significantly improved ground state energy calculations for large molecules
Demonstrated the effectiveness of self-attention in modeling electron correlations
Achieved qualitative leaps in accuracy over previous neural network approaches
Abstract
We present a novel neural network architecture using self-attention, the Wavefunction Transformer (Psiformer), which can be used as an approximation (or Ansatz) for solving the many-electron Schrödinger equation, the fundamental equation for quantum chemistry and material science. This equation can be solved from first principles, requiring no external training data. In recent years, deep neural networks like the FermiNet and PauliNet have been used to significantly improve the accuracy of these first-principles calculations, but they lack an attention-like mechanism for gating interactions between electrons. Here we show that the Psiformer can be used as a drop-in replacement for these other neural networks, often dramatically improving the accuracy of the calculations. On larger molecules especially, the ground state energy can be improved by dozens of kcal/mol, a qualitative leap…
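To make the idea of attention "gating interactions between electrons" concrete, here is a minimal sketch of a single self-attention head applied to per-electron feature vectors. It is an illustration only, not the Psiformer implementation: the function names, the projection matrices `Wq`, `Wk`, `Wv`, the feature sizes, and the random inputs are all assumptions made for this example, and it uses only the Python standard library.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(h, Wq, Wk, Wv):
    """Single-head self-attention over per-electron features h.

    h is a list of n vectors (one per electron). Wq, Wk, Wv are
    hypothetical projection matrices. Returns n output vectors.
    """
    q = [matvec(Wq, x) for x in h]
    k = [matvec(Wk, x) for x in h]
    v = [matvec(Wv, x) for x in h]
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        # Attention weights of this electron over all electrons.
        weights = softmax(
            [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        )
        # Weighted sum of values: the weights gate each interaction.
        out.append(
            [sum(w * vj[d] for w, vj in zip(weights, v))
             for d in range(len(v[0]))]
        )
    return out


def random_matrix(rows, cols, rng):
    """Matrix of standard-normal entries (illustrative inputs only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_electrons, d_in, d_head = 4, 3, 2  # assumed toy sizes
h = random_matrix(n_electrons, d_in, rng)
Wq, Wk, Wv = (random_matrix(d_head, d_in, rng) for _ in range(3))
out = self_attention(h, Wq, Wk, Wv)
```

Because no positional encoding is added here, reordering the electrons in `h` reorders the rows of `out` in the same way. That permutation equivariance is a general property of plain self-attention; how the Psiformer itself builds on such layers is not specified in this summary.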
Taxonomy
Topics: Machine Learning in Materials Science · Spectroscopy and Quantum Chemical Studies · Advanced Chemical Physics Studies
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Adam · Softmax · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings
