# The Creation of Artificial Data for Training a Neural Network Using the Example of a Conveyor Production Line for Flooring

**Authors:** Alexey Zaripov, Roman Kulshin, Anatoly Sidorov

PMC · DOI: 10.3390/jimaging11050168 · Journal of Imaging · 2025-05-20

## TL;DR

This paper describes a system that generates synthetic images to train neural networks for quality control on a flooring conveyor line. A YOLOv8 model trained on the synthetic data reached an mAP50 of 0.95, versus 0.36 for a model trained on real images.

## Contribution

A generalized model for synthetic data generation, built on the digital-twin concept, and a system implementing it to train computer vision models for quality control of floor coverings on a conveyor line.

## Key findings

- Synthetic data is presented as capable of significantly reducing the time and financial cost of building training datasets, compared with video recording and manual image labeling.
- The YOLOv8 model trained on synthetic data reached an mAP50 of 0.95, versus 0.36 for the model trained on real images.
- The developed system generated photorealistic, diverse images suitable for training neural network models, which matters most when access to real data is limited.
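The mAP50 figure in the findings above is a mean of per-class average precision, where a predicted box counts as correct only if its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that idea for a single class, not the paper's evaluation code. The function names, the box format `(x1, y1, x2, y2)`, and the all-point interpolation are assumptions made for illustration; detection toolkits may interpolate the precision-recall curve differently and so report slightly different values.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(preds: List[Tuple[float, Box]], truths: List[Box]) -> float:
    """AP at IoU >= 0.5 for one class (hypothetical helper).

    preds holds (confidence, box) pairs; truths holds ground-truth boxes.
    """
    if not truths:
        return 0.0
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    matched = set()   # indices of ground-truth boxes already claimed
    tp_flags = []     # True where a prediction is a true positive
    for _, box in ranked:
        best_iou, best_j = 0.0, -1
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            value = iou(box, truth)
            if value > best_iou:
                best_iou, best_j = value, j
        if best_iou >= 0.5:
            matched.add(best_j)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    # Precision and recall after each ranked prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))

    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
    print(round(average_precision_50(preds, truths), 3))
```

Averaging this value over all defect classes gives an mAP50-style score. In this toy case, one false positive ranked between two correct detections lowers the score below 1.0 even though both ground-truth objects are found.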

## Abstract

This work is dedicated to the development of a system for generating artificial data for training neural networks used within a conveyor-based technology framework. It presents an overview of the application areas of computer vision (CV) and establishes that traditional methods of data collection and annotation—such as video recording and manual image labeling—are associated with high time and financial costs, which limits their efficiency. In this context, synthetic data represents an alternative capable of significantly reducing the time and financial expenses involved in forming training datasets. Modern methods for generating synthetic images using various tools—from game engines to generative neural networks—are reviewed. As a tool-platform solution, the concept of digital twins for simulating technological processes was considered, within which synthetic data is utilized. Based on the review findings, a generalized model for synthetic data generation was proposed and tested on the example of quality control for floor coverings on a conveyor line. The developed system provided the generation of photorealistic and diverse images suitable for training neural network models. A comparative analysis showed that the YOLOv8 model trained on synthetic data significantly outperformed the model trained on real images: the mAP50 metric reached 0.95 versus 0.36, respectively. This result demonstrates the high adequacy of the model built on the synthetic dataset and highlights the potential of using synthetic data to improve the quality of computer vision models when access to real data is limited.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12112862/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12112862/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/PMC12112862/full.md

---
Source: https://tomesphere.com/paper/PMC12112862