# Informatics-Based Design of Virtual Libraries of Polymer Nano-Composites

**Authors:** Qinrui Liu, Scott R. Broderick

PMC · DOI: 10.3390/ijms26157344 · 2025-07-30

## TL;DR

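This paper presents an informatics-based method for screening polymer nano-composite designs: a quantitative structure–activity relationship (QSAR) model predicts electrical conductivity from polymer matrix chemistry and nano-additive volume.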

## Contribution

The study introduces a novel QSAR approach for nano-composites that represents polymer matrix chemistry, additive content, and the interactions between components, and it uses UMAP to assess how much variability each data point contributes.

## Key findings

- A QSAR model was developed to predict electrical conductivity as a function of polymer matrix chemistry and nano-additive volume.
- UMAP analysis showed that data variability and information content are more important than data size for model training.
- Multiple training/testing splits were assessed to check that the results are statistically meaningful rather than an artifact of one particular split.
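
The repeated-split idea in the last finding can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, the one-descriptor linear fit of log-conductivity against additive volume is a deliberately simple stand-in for the QSAR model, and every name (`repeated_split_rmse`, `volume`, `log_sigma`) is hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b for a single descriptor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def rmse(model, xs, ys):
    """Root-mean-square error of the fitted line on held-out points."""
    a, b = model
    return (sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def repeated_split_rmse(xs, ys, n_splits=50, test_fraction=0.25, seed=0):
    """Refit on many random train/test splits and collect the test RMSE of each."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, round(test_fraction * len(xs)))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rmse(model, [xs[i] for i in test], [ys[i] for i in test]))
    return scores

# Synthetic, illustrative data only (assumption): additive volume in percent
# and a noisy log-conductivity that happens to be linear in it.
data_rng = random.Random(42)
volume = [0.5 * i for i in range(1, 21)]
log_sigma = [-10.0 + 0.8 * v + data_rng.gauss(0.0, 0.3) for v in volume]

scores = repeated_split_rmse(volume, log_sigma)
print(f"mean RMSE = {statistics.fmean(scores):.3f}, "
      f"stdev = {statistics.stdev(scores):.3f}")
```

A small spread of test errors across splits suggests the result does not hinge on one lucky partition. A large spread is a warning sign, especially with small training sets.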

## Abstract

The purpose of this paper is to use an informatics-based analysis to develop a rational design approach to the accelerated screening of nano-composite materials. Using existing nano-composite data, we develop a quantitative structure–activity relationship (QSAR) as a function of polymer matrix chemistry and nano-additive volume, with the property predicted being electrical conductivity. The development of a QSAR for the electrical conductivity of nano-composites presents challenges in representing the polymer matrix chemistry and backbone structure, the additive content, and the interactions between the components while capturing the non-linearity of electrical conductivity with changing nano-additive volume. An important aspect of this work is designing chemistries with small training data sizes, as the uncertainty in modeling is high and the represented physics may be minimal. In this work, we explore two important components of this aspect. First, Uniform Manifold Approximation and Projection (UMAP) is used to assess the variability provided by new data points and how much information each data point contributes, which is significantly more important than the actual data size. The second component involves assessing multiple training/testing splits to ensure that any results are not due to a specific case but rather that the results are statistically meaningful. This work will accelerate the rational design of polymer nano-composites by fully considering the large array of possible variables while providing a high-speed screening of polymer chemistries.
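
To make the question "how much new information does each data point provide?" concrete, here is a rough sketch that scores candidate points by their distance to the nearest existing point in descriptor space. This is a much simpler proxy, not UMAP and not the authors' procedure. The two-column descriptors and all values are invented for illustration, and real descriptors would need to be scaled to comparable ranges first.

```python
import math

def nearest_distance(point, dataset):
    """Euclidean distance from `point` to its closest neighbour in `dataset`."""
    return min(math.dist(point, other) for other in dataset)

def rank_by_novelty(candidates, dataset):
    """Sort candidates so the one farthest from all existing data comes first."""
    scored = [(nearest_distance(c, dataset), c) for c in candidates]
    return sorted(scored, reverse=True)

# Hypothetical descriptors (assumption): (polymer descriptor, additive volume).
existing = [(0.10, 1.0), (0.12, 1.1), (0.11, 0.9), (0.80, 3.0)]
candidates = [(0.11, 1.0), (0.45, 2.0), (0.82, 3.1)]

ranked = rank_by_novelty(candidates, existing)
for score, point in ranked:
    print(f"{point}: novelty {score:.3f}")
```

A candidate that sits on top of an existing cluster adds little, however many such points are collected. A candidate in an unsampled region adds more. A UMAP embedding (for example via the `umap-learn` package) would replace the raw descriptor space here with a learned low-dimensional one.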

## Full-text entities

- **Chemicals:** Polymer (MESH:D011108)

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12347764/full.md

---
Source: https://tomesphere.com/paper/PMC12347764