High-Accuracy Physical Property Prediction for Organics via Molecular Representation Learning: Bridging Data to Discovery
Qi Ou, Hongshuai Wang, Minyang Zhuang, Shangqian Chen, Lele Liu, Ning Wang, and Zhifeng Gao

TL;DR
This paper introduces a 3D transformer-based molecular representation learning model, Org-Mol, trained on 60 million structures, achieving high accuracy in predicting physical properties of organics and enabling efficient discovery of energy-saving materials.
Contribution
The work presents a large-scale pre-trained model for organic molecules that significantly improves property prediction accuracy and accelerates material discovery processes.
Findings
The fine-tuned models achieve R^2 > 0.95 on test data.
Successfully identifies novel coolants through high-throughput screening.
Experimental validation confirms two promising candidates.
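For readers unfamiliar with the metric in the first finding, the coefficient of determination R^2 can be computed as sketched below. This is a minimal illustration, not the authors' evaluation code: the function name `r_squared` and the sample values are hypothetical and do not come from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left unexplained by the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. predicted property values (illustration only).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

score = r_squared(measured, predicted)
print(f"R^2 = {score:.3f}, exceeds 0.95: {score > 0.95}")
```

An R^2 of 1 means the predictions reproduce the measured values exactly, while a value near 0 means the model does no better than predicting the mean.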
Abstract
The ongoing energy crisis has underscored the urgent need for energy-efficient materials with high energy utilization efficiency, prompting a surge in research into organic compounds due to their environmental compatibility, cost-effective processing, and versatile modifiability. To address the high experimental costs and time-consuming nature of traditional trial-and-error methods in the discovery of highly functional organic compounds, we apply the 3D transformer-based molecular representation learning algorithm to construct a pre-trained model using 60 million semi-empirically optimized structures of small organic molecules, namely, Org-Mol, which is then fine-tuned with public experimental data to obtain prediction models for various physical properties. Despite the pre-training process relying solely on single molecular coordinates, the fine-tuned models achieve high accuracy…
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Various Chemistry Research Topics
