# Enhancing the Analysis of Rheological Behavior in Clinker-Aided Cementitious Systems Through Large Language Model-Based Synthetic Data Generation

**Authors:** Murat Eser, Yahya Kaya, Ali Mardani, Metin Bilgin, Mehmet Bozdemir

PMC · DOI: 10.3390/ma18153579 · Materials · 2025-07-30

## TL;DR

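This study tests whether synthetic data generated by large language models improves machine-learning predictions of the rheological behavior (dynamic yield stress and viscosity) of cement pastes made with grinding aids and polycarboxylate ether admixtures.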

## Contribution

This is one of the first studies to use large language models for synthetic data augmentation in cement rheology modeling.

## Key findings

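- Augmenting the experimental dataset with LLM-generated synthetic data significantly improved the predictive accuracy of all three models (CNN, RF, and NCART).
- Cements produced with grinding aids had higher dynamic yield stress and viscosity than the control, likely due to their finer particle size distribution.
- Among the baseline models, NCART performed best for both viscosity (R² = 0.84) and dynamic yield stress (R² = 0.77).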

## Abstract

This study investigates the parameters that influence the compatibility between cement and polycarboxylate ether (PCE) admixtures in cements produced with various types and dosages of grinding aids (GAs). A total of 29 cement types were prepared (one control plus seven GAs at four dosage levels each), and 87 paste mixtures were produced by combining each cement with three PCE dosages. Rheological behavior was evaluated with the Herschel–Bulkley model, focusing on dynamic yield stress (DYS) and viscosity. The data were modeled using CNN, Random Forest (RF), and Neural Classification and Regression Tree (NCART), and each model was also enhanced with synthetic data generated by Large Language Models (LLMs), giving the CNN-LLM, RF-LLM, and NCART-LLM variants. All six models were evaluated using R², Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Logcosh. The study is among the first to use LLMs for synthetic data augmentation in cement rheology modeling, and it analyzes how augmenting the experimental dataset affects the results. Among the baseline methods, NCART achieved the best performance for both viscosity (MAE = 1.04, RMSE = 1.33, R² = 0.84, Logcosh = 0.57) and DYS (MAE = 8.73, RMSE = 11.50, R² = 0.77, Logcosh = 8.09), and LLM augmentation significantly improved the predictive accuracy of all models. Cements produced with GAs also exhibited higher DYS and viscosity than the control, likely due to their finer particle size distribution. Overall, the study highlights the potential of LLM-based synthetic augmentation in modeling cement–admixture compatibility.
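The quantities named in the abstract can be made concrete with a short sketch. It assumes the standard three-parameter Herschel–Bulkley form, τ = τ₀ + K·γ̇ⁿ, and the common textbook definitions of the four metrics; the abstract does not state the exact formulas the authors used (for example, whether Logcosh is a mean or a sum), so the definitions below are assumptions. The function names `herschel_bulkley` and `regression_metrics` and the sample numbers are hypothetical and not taken from the paper.

```python
import numpy as np


def herschel_bulkley(shear_rate, yield_stress, consistency, flow_index):
    """Shear stress from the standard Herschel-Bulkley law.

    tau = tau0 + K * shear_rate**n, where tau0 is the (dynamic) yield
    stress, K the consistency coefficient, and n the flow index.
    """
    shear_rate = np.asarray(shear_rate, dtype=float)
    return yield_stress + consistency * shear_rate ** flow_index


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R2, and mean log-cosh error (assumed definitions)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))

    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # log(cosh(e)) computed stably as logaddexp(e, -e) - log(2),
    # which avoids overflow in cosh for large errors.
    logcosh = np.mean(np.logaddexp(err, -err) - np.log(2.0))

    return {
        "MAE": float(mae),
        "RMSE": float(rmse),
        "R2": float(r2),
        "Logcosh": float(logcosh),
    }


# Illustrative values only; these are not measurements from the study.
measured = [12.0, 15.5, 20.1, 26.3]
predicted = [11.4, 16.0, 19.5, 27.0]
print(regression_metrics(measured, predicted))
```

For large errors, log(cosh(e)) approaches |e| − log 2, so a mean log-cosh error tends to sit slightly below the MAE, while for small errors it behaves like e²/2. This makes it less sensitive to outliers than a squared-error measure.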

## Full-text entities

- **Chemicals:** GA (-)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12348910/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12348910/full.md

## References

61 references — full list in the complete paper: https://tomesphere.com/paper/PMC12348910/full.md

---
Source: https://tomesphere.com/paper/PMC12348910