Teaching and Evaluating LLMs to Reason About Polymer Design Related Tasks

Dikshya Mohanty; Mohammad Saqib Hasan; Syed Mostofa Monsur; Size Zheng; Benjamin Hsiao; Niranjan Balasubramanian

arXiv:2601.16312 · cs.CL · May 15, 2026

TL;DR

This paper introduces PolyBench, a large-scale benchmark dataset for polymer design tasks, and a knowledge-augmented reasoning distillation method to improve LLMs' reasoning about polymers.

Contribution

The paper contributes PolyBench, a dataset of more than 125K polymer design-related tasks, and a knowledge-augmented reasoning distillation method for training LLMs to reason about polymers.

Findings

01 Small LLMs trained on PolyBench outperform similar-sized models.

02 Models trained on PolyBench perform well on external polymer benchmarks.

03 PolyBench enables better generalization and diagnostic testing for polymer reasoning.

Abstract

Research in AI4Science has shown promise in many science applications, including polymer design. However, current LLMs are ineffective in this problem space because: (i) most models lack polymer-specific knowledge, and (ii) existing aligned models have limited coverage of knowledge and capabilities relevant to polymer design. Addressing this, we introduce PolyBench, a large-scale training and test benchmark dataset of more than 125K polymer design-related tasks, leveraging a knowledge base of more than 13 million data points obtained from experimental and synthetic data sources to ensure broad coverage of polymers and their properties. For effective alignment using PolyBench, we introduce a knowledge-augmented reasoning distillation method that augments this dataset with structured CoT. Furthermore, tasks in PolyBench are organized from simple to complex analytical reasoning problems,…

Code & Models

Repositories

StonyBrookNLP/PolyBench (GitHub)
