# A Machine Learning Framework for the Reconstruction of Composite Fatigue and Fracture Properties: A Synthetic Data Study

**Authors:** Saurabh Tiwari, Aman Gupta

PMC · DOI: 10.3390/ma19061131 · 2026-03-14

## TL;DR

This paper introduces a machine learning framework to predict fatigue life and fracture toughness in natural fiber composites using synthetic data.

## Contribution

A novel ML framework is proposed for reconstructing composite properties using synthetic data with noise calibrated to experimental scatter.

## Key findings

- Gradient Boosting achieved R² = 0.93 for fatigue life prediction.
- Stacking Ensemble achieved R² = 0.87 for fracture toughness prediction, 89% of the noise ceiling.
- Engineered composite indicators, stress amplitude, and fiber length were identified as key features.
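
The noise-ceiling percentages can be checked by dividing each achieved score by the theoretical maximum reported in the abstract (0.96 for fatigue life, 0.98 for fracture toughness). The subscript labels below are notation introduced here for illustration, not the paper's own:

$$
\frac{R^2_{\text{achieved}}}{R^2_{\text{ceiling}}}: \qquad \frac{0.93}{0.96} \approx 0.97, \qquad \frac{0.87}{0.98} \approx 0.89
$$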

## Abstract

This study presents a machine learning framework for reconstructing fatigue life and fracture toughness in natural fiber-reinforced composites. It evaluates the predictive accuracy of six regression algorithms (Random Forest, Gradient Boosting, Support Vector Machine, Neural Network, Ridge Regression, and Lasso Regression) on a controlled synthetic dataset of 600 samples generated from established Basquin fatigue and Rule of Mixtures fracture equations. Stochastic noise is calibrated to experimental scatter (CV = 15–50%), with a log-normal noise standard deviation of 0.20 for fatigue life and a Gaussian noise standard deviation of 0.15 for fracture toughness. The dataset encompasses eight natural fiber types (flax, jute, sisal, hemp, bamboo, coconut, banana, and pineapple) and five matrix systems (epoxy, polyester, PLA, vinyl ester, and polyurethane). Models were evaluated using a 70-15-15 train–validation–test split with 5-fold cross-validation and exhaustive grid search hyperparameter optimisation. Gradient Boosting achieved R² = 0.93 for fatigue life and Stacking Ensemble achieved R² = 0.87 for fracture toughness, representing 97% and 89% of their respective noise-ceiling values (theoretical maximum R² of 0.96 and 0.98 given the programmed noise levels). The ML models perform supervised function approximation: they learn to reconstruct the programmed generation equations rather than discover novel physical composite behaviour, and so function as automated surrogates for the governing equations. Feature importance analysis identified engineered composite indicators, stress amplitude, and fiber length as the most influential parameters. The framework provides a reproducible ML evaluation pipeline as a methodological template for future experimental composite studies.
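
To make the idea of a noise ceiling concrete, here is a minimal sketch of how a synthetic fatigue dataset of this kind could be generated and split. It is an illustration under stated assumptions, not the authors' pipeline: the Basquin coefficients `sigma_f` and `b`, the stress range, and the function names are all hypothetical, and the 0.20 noise standard deviation is assumed to apply in natural-log units. Because the ceiling depends on how much the clean signal varies, the value this sketch prints will generally differ from the 0.96 reported in the paper.

```python
import math
import random
import statistics


def basquin_life(stress_amplitude, sigma_f=200.0, b=-0.10):
    """Cycles to failure from Basquin's law, sigma_a = sigma_f * (2N)^b.

    sigma_f and b are illustrative placeholders, not values from the paper.
    """
    return 0.5 * (stress_amplitude / sigma_f) ** (1.0 / b)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def make_dataset(n_samples=600, noise_sd=0.20, seed=0):
    """Return (clean, noisy) log-life values with log-normal scatter."""
    rng = random.Random(seed)
    clean_log, noisy_log = [], []
    for _ in range(n_samples):
        stress = rng.uniform(40.0, 120.0)  # MPa; assumed range
        log_n = math.log(basquin_life(stress))
        clean_log.append(log_n)
        # Gaussian noise on log-life is log-normal noise on life itself.
        noisy_log.append(log_n + rng.gauss(0.0, noise_sd))
    return clean_log, noisy_log


def split_70_15_15(n_samples, seed=0):
    """Shuffle indices and split them into train, validation, and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 70 // 100
    n_val = n_samples * 15 // 100
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


clean_log, noisy_log = make_dataset()
train_idx, val_idx, test_idx = split_70_15_15(len(noisy_log))

# The noise ceiling is the score of the noise-free generator itself:
# no model trained on noisy targets is expected to beat it.
ceiling = r_squared(noisy_log, clean_log)
print(f"train/val/test sizes: {len(train_idx)}/{len(val_idx)}/{len(test_idx)}")
print(f"noise-ceiling R2 on log-life: {ceiling:.3f}")
```

Scoring the noise-free equation against the noisy targets gives the upper bound a regressor can be expected to reach, which is why the abstract reports model scores as a fraction of that ceiling rather than against a perfect score of 1.0.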

## Full-text entities

- **Diseases:** Fatigue (MESH:D005221), Fracture (MESH:D050723)
- **Chemicals:** polyurethane (MESH:D011140), epoxy (MESH:D004853), polyester (MESH:D011091), vinyl ester (-), PLA (MESH:C033616)
- **Species:** Musa acuminata (banana, species) [taxon 4641], Ananas comosus (pineapple, species) [taxon 4615], Cannabis sativa (species) [taxon 3483]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13027414/full.md

---
Source: https://tomesphere.com/paper/PMC13027414