Automated Analysis of DFT Output Files for Molecular Descriptor Extraction and Reactivity Modeling
Yu-Chien Huang, Dennis Chung-Yang Huang, Yun-Cheng Tsai

TL;DR
This paper introduces DFTDescriptorPipeline, an automated workflow that extracts quantum chemical descriptors from DFT calculations to build interpretable models of molecular reactivity and properties, enhancing data-driven molecular design.
Contribution
The paper presents a novel fully automated pipeline for extracting descriptors from DFT outputs and applying multivariate linear regression to model structure-property and structure-reactivity relationships.
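The regression step can be pictured with a minimal sketch. This is not the authors' implementation: the function name `fit_mlr`, the descriptor matrix, and the target values are all hypothetical, and NumPy's ordinary least-squares solver stands in for whatever fitting routine the pipeline actually uses.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit of y = b0 + sum_i b_i * x_i (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept is fitted with the slopes.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return coef[0], coef[1:], r2

# Made-up data: rows are molecules, columns are two descriptors.
X_demo = [[0, 1], [1, 0], [2, 3], [3, 1], [4, 4]]
# Synthetic target constructed to be exactly linear in the descriptors.
y_demo = [1 + 2 * a - 0.5 * b for a, b in X_demo]
intercept, slopes, r2 = fit_mlr(X_demo, y_demo)
```

Because each fitted coefficient is tied to a single descriptor, its sign and size can be read directly, which is the sense in which such models are called interpretable.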
Findings
Validated across four case studies, including photoswitchable molecules and catalytic reactions
Demonstrated interpretability of the models
Showed broad applicability to different chemical systems
Abstract
Understanding the relationship between molecular structure and chemical reactivity or properties is fundamental to rational molecular design. Linear free energy relationships (LFERs), particularly Hammett analysis, have long served as powerful tools in organic chemistry. Recently, these approaches have been enhanced by incorporating computationally derived parameters, enabling broader applicability across diverse molecules and reactions. To facilitate and scale this process, we present DFTDescriptorPipeline, a fully automated workflow for extracting quantum chemical descriptors from Gaussian log files and constructing structure-property and structure-reactivity relationships using multivariate linear regression (MLR) models. We validate the workflow across four case studies, including photoswitchable molecules and catalytic reactions. In each case, the models provide interpretable…
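As an illustration of the extraction step, the sketch below pulls frontier orbital energies from the text of a Gaussian log file. It is an assumption-laden example rather than the pipeline's own code: the helper name `frontier_orbitals` is hypothetical, the choice of HOMO/LUMO energies as descriptors is ours, and the parser assumes the usual "Alpha occ. eigenvalues --" and "Alpha virt. eigenvalues --" line layout with energies in hartree.

```python
import re

# Assumed layout of Gaussian orbital-energy lines (energies in hartree).
OCC = re.compile(r"Alpha\s+occ\.\s+eigenvalues\s+--(.*)")
VIRT = re.compile(r"Alpha\s+virt\.\s+eigenvalues\s+--(.*)")
# Matching each number separately also copes with fixed-width values that
# run together without a space, such as "-101.12345-10.12345".
NUMBER = re.compile(r"-?\d+\.\d+")

def frontier_orbitals(log_text):
    """Return HOMO, LUMO and gap from the last eigenvalue block (hypothetical helper)."""
    occ, virt = [], []
    for line in log_text.splitlines():
        m_occ = OCC.search(line)
        m_virt = VIRT.search(line)
        if m_occ:
            if virt:
                # An occupied line after virtual lines starts a new block,
                # e.g. the next step of a geometry optimization.
                occ, virt = [], []
            occ.extend(float(v) for v in NUMBER.findall(m_occ.group(1)))
        elif m_virt:
            virt.extend(float(v) for v in NUMBER.findall(m_virt.group(1)))
    if not occ or not virt:
        raise ValueError("no orbital eigenvalues found in log text")
    # Energies are printed in ascending order within each block.
    homo, lumo = occ[-1], virt[0]
    return {"homo": homo, "lumo": lumo, "gap": lumo - homo}

# Made-up excerpt with two eigenvalue blocks; only the last one is kept.
sample_log = """
 Alpha  occ. eigenvalues --   -0.50000
 Alpha virt. eigenvalues --    0.20000
 Alpha  occ. eigenvalues --  -10.18000  -0.70000  -0.40000
 Alpha  occ. eigenvalues --   -0.30000
 Alpha virt. eigenvalues --    0.05000   0.10000
"""
descriptors = frontier_orbitals(sample_log)
```

Keeping only the last block is a design choice for this sketch: in an optimization job, earlier blocks belong to intermediate geometries. Open-shell calculations also print Beta eigenvalues, which this example ignores.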
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Radical Photochemical Reactions
