A Synthetic Dataset for Manometry Recognition in Robotic Applications
Pedro Antonio Rabelo Saraiva, Enzo Ferreira de Souza, Joao Manoel Herrera Pinheiro, Thiago H. Segreto, Ricardo V. Godoy, Marcelo Becker

TL;DR
This paper presents a hybrid synthetic data generation pipeline combining procedural rendering and AI video synthesis to improve object detection in industrial environments, reducing data collection costs and enhancing model robustness.
Contribution
The paper introduces a hybrid data synthesis approach that uses BlenderProc to render photorealistic, domain-randomized images and NVIDIA's Cosmos-Predict2 to generate physically consistent video sequences for training data.
Findings
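A YOLO-based detector trained on combined real and synthetic data outperformed models trained on real images alone.
A 1:1 ratio between real and synthetic samples achieved the highest accuracy.
Synthetic data generation lowers acquisition costs and reduces the need to collect data in hazardous settings such as offshore oil platforms.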
Abstract
This paper addresses the challenges of data scarcity and high acquisition costs in training robust object detection models for complex industrial environments, such as offshore oil platforms. Data collection in these hazardous settings often limits the development of autonomous inspection systems. To mitigate this issue, we propose a hybrid data synthesis pipeline that integrates procedural rendering and AI-driven video generation. The approach uses BlenderProc to produce photorealistic images with domain randomization and NVIDIA's Cosmos-Predict2 to generate physically consistent video sequences with temporal variation. A YOLO-based detector trained on a composite dataset, combining real and synthetic data, outperformed models trained solely on real images. A 1:1 ratio between real and synthetic samples achieved the highest accuracy. The results demonstrate that synthetic data…
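The 1:1 real-to-synthetic composition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `mix_real_and_synthetic`, the `synthetic_per_real` parameter, and the file names are hypothetical, and the summary does not say how the paper actually sampled or shuffled its data. The sketch assumes every real image is kept and synthetic images are subsampled to reach the target ratio.

```python
import random
from typing import List, Sequence


def mix_real_and_synthetic(
    real: Sequence[str],
    synthetic: Sequence[str],
    synthetic_per_real: float = 1.0,  # 1.0 corresponds to a 1:1 ratio
    seed: int = 0,
) -> List[str]:
    """Build a shuffled training list with a fixed real-to-synthetic ratio.

    Assumption: all real samples are kept and synthetic samples are
    drawn without replacement until the requested ratio is reached.
    """
    if synthetic_per_real < 0:
        raise ValueError("synthetic_per_real must be non-negative")

    n_synthetic = round(len(real) * synthetic_per_real)
    if n_synthetic > len(synthetic):
        raise ValueError(
            f"need {n_synthetic} synthetic samples, only {len(synthetic)} available"
        )

    rng = random.Random(seed)  # seeded so the split is reproducible
    chosen = rng.sample(list(synthetic), n_synthetic)
    combined = list(real) + chosen
    rng.shuffle(combined)
    return combined


# Hypothetical file names, for illustration only.
real_images = [f"real_{i:03d}.jpg" for i in range(4)]
synthetic_images = [f"synth_{i:03d}.jpg" for i in range(10)]
train_list = mix_real_and_synthetic(real_images, synthetic_images)
```

Changing `synthetic_per_real` (for example to 0.5 or 2.0) would give other mixtures to compare against the 1:1 setting; the resulting list could then be written to whatever image-list format the detector's training tool expects.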
