Resolution-vs.-Accuracy Dilemma in Machine Learning Modeling of Electronic Excitation Spectra
Prakriti Kayastha, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

TL;DR
This paper investigates the trade-off between resolution and accuracy in machine learning (ML) models that predict electronic excitation spectra. It introduces a new dataset and shows that ML predictions are more accurate than spectra computed with small basis sets.
Contribution
The study presents a new large dataset of molecular spectra and shows that ML models trained on limited data can accurately predict high-resolution spectra, addressing the resolution-accuracy dilemma.
Findings
ML models outperform small basis set predictions in spectral accuracy
A new dataset with 12,880 molecules enables better spectral modeling
High-resolution spectra can be recovered with less than 10% training data
Abstract
In this study, we explore the potential of machine learning for modeling molecular electronic spectral intensities as a continuous function in a given wavelength range. Since presently available chemical space datasets provide excitation energies and corresponding oscillator strengths for only a few valence transitions, here, we present a new dataset, bigQM7ω, with 12,880 molecules containing up to 7 CONF atoms and report ground state and excited state properties. A publicly accessible web-based data-mining platform is presented to facilitate on-the-fly screening of several molecular properties including harmonic vibrational and electronic spectra. We present all singlet electronic transitions from the ground state calculated using the time-dependent density functional theory framework with the ωB97XD exchange-correlation functional and a diffuse-function augmented basis set.…
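To make the idea of a spectrum at a chosen resolution concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each transition is given as an excitation energy in eV with an oscillator strength, and that a coarser or finer spectrum is obtained by summing oscillator strengths in equal-width wavelength bins. The function name `bin_spectrum`, the wavelength window, and the bin counts are hypothetical choices made for illustration.

```python
import numpy as np

# Product of Planck's constant and the speed of light, in eV*nm,
# used to convert an excitation energy (eV) to a wavelength (nm).
HC_EV_NM = 1239.841984


def bin_spectrum(energies_ev, osc_strengths, wl_min=50.0, wl_max=250.0, n_bins=20):
    """Sum oscillator strengths into equal-width wavelength bins.

    The window (wl_min, wl_max) and n_bins are illustrative assumptions.
    Fewer bins give a lower-resolution spectrum; more bins give a
    higher-resolution one. Transitions outside the window are ignored.
    """
    energies_ev = np.asarray(energies_ev, dtype=float)
    osc_strengths = np.asarray(osc_strengths, dtype=float)
    wavelengths_nm = HC_EV_NM / energies_ev
    edges = np.linspace(wl_min, wl_max, n_bins + 1)
    intensities, _ = np.histogram(wavelengths_nm, bins=edges, weights=osc_strengths)
    return edges, intensities


# Two made-up transitions at 10 eV and 6 eV (about 124 nm and 207 nm).
edges_lo, spec_lo = bin_spectrum([10.0, 6.0], [0.1, 0.3], n_bins=1)
edges_hi, spec_hi = bin_spectrum([10.0, 6.0], [0.1, 0.3], n_bins=2)
```

With a single bin, both transitions fall into one intensity value; with two bins they are separated. An ML model predicting the binned vector has fewer, smoother targets at low resolution and more, sparser targets at high resolution, which is one way to picture the trade-off named in the title.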
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Various Chemistry Research Topics
