# Prediction and Interpretability Study of the Glass Transition Temperature of Polyimide Based on Machine Learning and Molecular Dynamics Simulations

**Authors:** Wenjia Huo, Boyang Liang, Xiang Wu, Zhenchang Zhang, Weichao Zhou, Haihong Wang, Xupeng Ran, Yaoyao Bai, Rongrong Zheng

PMC · DOI: 10.3390/polym17152083 · Polymers · 2025-07-30

## TL;DR

This paper trains machine learning models on a dataset of 1261 polyimides to predict and interpret the glass transition temperature (Tg), and checks the predictions against all-atom molecular dynamics simulations.

## Contribution

An ML framework that combines a 1261-polyimide dataset, SHAP-based interpretation, and MD validation for accurate and interpretable Tg prediction of polyimides.

## Key findings

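- Among nine regression algorithms, Categorical Boosting was the most accurate, with R² = 0.895 on the test set.
- SHapley Additive exPlanations (SHAP) analysis showed that the NumRotatableBonds descriptor has a significantly negative impact on Tg.
- ML predictions were consistent with MD simulations of eight PI structures, with a lowest prediction deviation of approximately 6.75%, at far lower time and resource cost.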
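The two metrics quoted in these findings can be illustrated with a minimal sketch. The function names and the Tg values below are hypothetical and chosen only for illustration; they are not data from the paper, and which value the paper treats as the reference when reporting deviation is an assumption here.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def percent_deviation(predicted, reference):
    """Absolute deviation of a prediction relative to a reference, in percent."""
    return abs(predicted - reference) / abs(reference) * 100.0


# Hypothetical Tg values in kelvin, for illustration only (not from the paper).
tg_reference = [520.0, 560.0, 600.0, 640.0]
tg_predicted = [530.0, 550.0, 610.0, 630.0]

r2 = r_squared(tg_reference, tg_predicted)
deviations = [percent_deviation(p, r) for p, r in zip(tg_predicted, tg_reference)]

print(f"R^2 = {r2:.3f}")
print(f"lowest deviation = {min(deviations):.2f}%")
```

R² summarizes fit over a whole test set, while the percent deviation is a per-structure quantity, so a "lowest deviation" describes the best single case rather than the average agreement.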

## Abstract

The utilization of machine learning (ML) has brought more opportunities for the discovery of high-performance materials with specific properties to replace traditional engineering materials. The glass transition temperature (Tg) is a crucial characteristic of polyimide (PI). However, small datasets can only partially reveal structural information and decrease the ability of the models to learn from the observed data. In this investigation, a dataset comprising 1261 PIs was assembled. A quantitative structure–property relationship targeting Tg was constructed using nine regression algorithms, with Categorical Boosting demonstrating the highest accuracy, achieving a coefficient of determination of 0.895 for the test set. SHapley Additive exPlanations analysis identified that the NumRotatableBonds descriptor had a significantly negative impact on Tg. Finally, all-atom molecular dynamics (MD) simulations were performed on eight PI structures to verify the accuracy of the prediction model. The ML prediction was consistent with the MD simulation, with the lowest prediction deviation of approximately 6.75%, but the time and resource consumption were tremendously reduced. These findings emphasize the significance of utilizing extensive datasets for model training. This available and interpretable ML framework provides impressive acceleration over the MD simulation and serves as a reference for the structural design of PI with the desired Tg in the future.
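To make the attribution idea behind SHapley Additive exPlanations concrete, here is a minimal sketch that computes exact Shapley values by brute-force enumeration for a toy linear Tg model. Everything in it is an assumption for illustration: the weights, intercept, baseline, and sample values are invented, `NumAromaticRings` is a hypothetical second feature, and the paper's own analysis would rely on a trained model and a SHAP implementation rather than this enumeration.

```python
from itertools import combinations
from math import factorial

# Hypothetical linear model of Tg (kelvin); coefficients are NOT from the paper.
INTERCEPT = 550.0
WEIGHTS = {"NumRotatableBonds": -8.0, "NumAromaticRings": 15.0}
BASELINE = {"NumRotatableBonds": 6.0, "NumAromaticRings": 4.0}  # reference input
SAMPLE = {"NumRotatableBonds": 10.0, "NumAromaticRings": 5.0}   # input to explain


def toy_tg(x):
    """Toy prediction: intercept plus a weighted sum of descriptor values."""
    return INTERCEPT + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)


def value_fn(coalition):
    """Prediction with coalition features at sample values, the rest at baseline."""
    x = {n: (SAMPLE[n] if n in coalition else BASELINE[n]) for n in WEIGHTS}
    return toy_tg(x)


def shapley_values(value, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


phi = shapley_values(value_fn, list(WEIGHTS))
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.1f} K")
```

In this toy setup the sample has more rotatable bonds than the baseline and the weight is negative, so the attribution for NumRotatableBonds is negative, which mirrors the direction of the effect reported in the abstract. The attributions also sum to the difference between the sample and baseline predictions, which is the additivity property that gives the method its name.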

## Full-text entities

- **Chemicals:** PI (polyimide)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12349611/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12349611/full.md

## References

73 references — full list in the complete paper: https://tomesphere.com/paper/PMC12349611/full.md

---
Source: https://tomesphere.com/paper/PMC12349611