Learning to Detect Baked Goods with Limited Supervision
Thomas H. Schmitt, Maximilian Bundscherer, Tobias Bocklet

TL;DR
This paper develops a limited supervision object detection approach for baked goods, combining weakly supervised learning and pseudo-labeling to achieve high accuracy with minimal annotation effort.
Contribution
It introduces two training workflows that enable effective detection of baked goods using limited supervision, surpassing fully supervised models in non-ideal conditions.
Findings
Achieves 0.91 mAP with only image-level supervision.
Pseudo-label fine-tuning improves performance by 19.3%.
Outperforms fully supervised models under deployment conditions.
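The pseudo-label fine-tuning mentioned above can be illustrated with a minimal sketch. The summary does not say how the authors select pseudo-labels, so the confidence-threshold filter below is an assumption based on common practice, not the paper's method. The `Detection` type, the `select_pseudo_labels` helper, the 0.8 threshold, and the example class names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted box from a detector (hypothetical container)."""
    label: str
    score: float  # model confidence in [0, 1]
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def select_pseudo_labels(detections, min_score=0.8):
    """Keep only confident predictions to reuse as training targets.

    Assumption: a fixed confidence threshold; the paper may use a
    different selection rule.
    """
    return [d for d in detections if d.score >= min_score]


# Hypothetical predictions from a weakly supervised detector on one image.
preds = [
    Detection("pretzel", 0.95, (10, 10, 50, 50)),
    Detection("roll", 0.40, (60, 20, 90, 55)),
    Detection("croissant", 0.85, (15, 70, 55, 110)),
]

pseudo = select_pseudo_labels(preds)
print([d.label for d in pseudo])
```

In a workflow of this kind, the retained boxes would serve as box-level targets for a further round of detector training, with no manual box annotation.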
Abstract
Monitoring leftover products provides valuable insights that can be used to optimize future production. This is especially important for German bakeries because freshly baked goods have a very short shelf life. Automating this process can reduce labor costs, improve accuracy, and streamline operations. We propose automating this process using an object detection model to identify baked goods from images. However, the large diversity of German baked goods makes fully supervised training prohibitively expensive and limits scalability. Although open-vocabulary detectors (e.g., OWLv2, Grounding DINO) offer flexibility, we demonstrate that they are insufficient for our task. While motivated by bakeries, our work addresses the broader challenges of deploying computer vision in industries where tasks are specialized and annotated datasets are scarce. We compile dataset splits with varying…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
