PolySet: Restoring the Statistical Ensemble Nature of Polymers for Machine Learning
Khalid Ferji

TL;DR
PolySet introduces an ensemble-based polymer representation that captures the stochastic nature of polymers, improving machine learning predictions of tail-sensitive properties by aligning digital models more closely with physical reality.
Contribution
The paper presents PolySet, a novel ensemble-based encoding for polymers that improves ML model accuracy by incorporating the statistical distribution of chain lengths.
Findings
PolySet retains higher-order distributional moments such as Mz and Mz+1.
Models trained on PolySet ensembles learn tail-sensitive properties with improved stability and accuracy.
The framework is compatible with various molecular representations and extends to complex polymer architectures.
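For reference, the moments named above follow the standard polymer-science definitions of the molar-mass averages. The notation here is an assumption, not taken from the paper: N_i is the number of chains with molar mass M_i.

```latex
\begin{align}
M_n     &= \frac{\sum_i N_i M_i}{\sum_i N_i}, &
M_w     &= \frac{\sum_i N_i M_i^{2}}{\sum_i N_i M_i}, \\
M_z     &= \frac{\sum_i N_i M_i^{3}}{\sum_i N_i M_i^{2}}, &
M_{z+1} &= \frac{\sum_i N_i M_i^{4}}{\sum_i N_i M_i^{3}}.
\end{align}
```

Each successive average weights long chains more heavily, which is why Mz and Mz+1 are sensitive to the high-mass tail of the distribution. For a single, perfectly defined chain all four averages coincide, so a single-graph representation carries no tail information.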
Abstract
Machine-learning (ML) models in polymer science typically treat a polymer as a single, perfectly defined molecular graph, even though real materials consist of stochastic ensembles of chains with distributed lengths. This mismatch between physical reality and digital representation limits the ability of current models to capture polymer behaviour. Here we introduce PolySet, a framework that represents a polymer as a finite, weighted ensemble of chains sampled from an assumed molar-mass distribution. This ensemble-based encoding is independent of chemical detail, compatible with any molecular representation and illustrated here in the homopolymer case using a minimal language model. We show that PolySet retains higher-order distributional moments (such as Mz, Mz+1), enabling ML models to learn tail-sensitive properties with greatly improved stability and accuracy. By explicitly…
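The idea of a finite, weighted ensemble can be illustrated with a minimal sketch. It is not the author's implementation. It assumes a Schulz-Zimm (gamma) number distribution as the "assumed molar-mass distribution", equal weights per sampled chain, and a hypothetical repeat-unit mass `m0`. The function names `sample_ensemble` and `moment_averages` are illustrative only.

```python
import numpy as np


def sample_ensemble(mn_target=10_000.0, dispersity=2.0, m0=100.0,
                    n_chains=64, seed=0):
    """Sample a finite ensemble of chain masses (assumed Schulz-Zimm form).

    For a gamma number distribution with shape k, Mw/Mn = (k + 1) / k,
    so the shape is k = 1 / (dispersity - 1). Requires dispersity > 1.
    """
    rng = np.random.default_rng(seed)
    k = 1.0 / (dispersity - 1.0)
    dp_mean = mn_target / m0                      # mean degree of polymerisation
    dp = rng.gamma(shape=k, scale=dp_mean / k, size=n_chains)
    dp = np.maximum(1, np.rint(dp)).astype(int)   # at least one repeat unit
    masses = dp * m0
    weights = np.full(n_chains, 1.0 / n_chains)   # equal number weights (assumption)
    return masses, weights


def moment_averages(masses, weights):
    """Return Mn, Mw, Mz and Mz+1 for a number-weighted ensemble."""
    m = np.asarray(masses, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = [np.sum(w * m**k) for k in range(5)]     # raw moments mu_0 .. mu_4
    return {
        "Mn": mu[1] / mu[0],
        "Mw": mu[2] / mu[1],
        "Mz": mu[3] / mu[2],
        "Mz+1": mu[4] / mu[3],
    }


if __name__ == "__main__":
    masses, weights = sample_ensemble()
    print(moment_averages(masses, weights))
    # A single-chain "ensemble" collapses every average to the same value.
    print(moment_averages([10_000.0], [1.0]))
```

In this sketch each sampled chain could then be encoded with any molecular representation and passed to a model together with its weight. How PolySet itself chooses the ensemble size, the weights, and the aggregation step is not specified in the excerpt above.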
Taxonomy
Topics: Machine Learning in Materials Science · Block Copolymer Self-Assembly · Advanced Polymer Synthesis and Characterization
