The Open Polymers 2026 (OPoly26) Dataset and Evaluations
Daniel S. Levine, Nicholas Liesen, Lauren Chua, James Diffenderfer, Helgi Ingolfsson, Matthew P. Kroonblawd, Nitesh Kumar, Amitesh Maiti, Supun S. Mohottalalage, Muhammed Shuaibi, Brian Van Essen, Brandon M. Wood, C. Lawrence Zitnick, Samuel M. Blau, Evan R. Antoniuk

TL;DR
The paper introduces the OPoly26 dataset, comprising over 6.57 million DFT calculations on polymeric systems, to enhance machine learning models for polymer property prediction and facilitate broader development of atomistic models.
Contribution
The creation and public release of the large-scale OPoly26 dataset, filling a gap in polymer data for machine learning applications and improving predictive performance.
Findings
Augmenting ML training data with OPoly26 improves the accuracy of polymer property predictions.
OPoly26 captures diverse polymer chemistries, including a range of monomer types and chain architectures.
The dataset supports the development of universal atomistic models for polymers.
Abstract
Polymers, macromolecular systems composed of repeating chemical units, constitute the molecular foundation of living organisms, while their synthetic counterparts drive transformative advances across medicine, consumer products, and energy technologies. While machine learning (ML) models have been trained on millions of quantum chemical atomistic simulations for materials and/or small molecular structures to enable efficient, accurate, and transferable predictions of chemical properties, polymers have largely not been included in prior datasets due to the computational expense of high-quality electronic structure calculations on representative polymeric structures. Here, we address this shortcoming with the creation of the Open Polymers 2026 (OPoly26) dataset, which contains more than 6.57 million density functional theory (DFT) calculations on clusters of up to 360 atoms derived from…
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Block Copolymer Self-Assembly
