Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images

Thomas H. Schmitt; Maximilian Bundscherer; Tobias Bocklet

arXiv:2409.20122·cs.CV·October 1, 2024

TL;DR

This paper extends a computer vision system for tracking returned baked goods, using an expanded dataset of 2432 images plus synthetic images generated with pix2pix and CycleGAN. The best-performing YOLO model reaches an average precision of 90.3% on the test set.

Contribution

It introduces an expanded dataset and synthetic image generation to improve bakery product detection with YOLO models.

Findings

01

Achieved an average precision (AP@0.5) of 90.3% on the test set.

02

Synthetic images effectively enhanced model robustness.

03

Extended previous work with more diverse data.

Abstract

In the food industry, reprocessing returned product is a vital step to increase resource efficiency. [SBB23] presented an AI application that automates the tracking of returned bread buns. We extend their work by creating an expanded dataset comprising 2432 images and a wider range of baked goods. To increase model robustness, we use the generative models pix2pix and CycleGAN to create synthetic images. We train the state-of-the-art object detection models YOLOv9 and YOLOv8 on our detection task. Our overall best-performing model achieved an average precision AP@0.5 of 90.3% on our test set.
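
Average precision for object detection is conventionally computed by first deciding which predicted boxes count as correct, using an intersection-over-union (IoU) threshold. The following is a minimal sketch of that matching step, assuming a threshold of 0.5 and boxes given as (x1, y1, x2, y2) corner coordinates. The function names and the box format are illustrative assumptions, not the authors' evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: a prediction matches if IoU meets the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

For example, two 10 by 10 boxes shifted horizontally by 2 units overlap with an IoU of 80/120 and would count as a match at this threshold, while a shift of 5 units gives 50/150 and would not.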

Taxonomy

Topics: Spectroscopy and Chemometric Analyses · Industrial Vision Systems and Defect Detection

Methods: Residual Connection · Residual Block · Cycle Consistency Loss · Tanh Activation · PatchGAN · Dropout · GAN Least Squares Loss · Sigmoid Activation · Batch Normalization