Temporal-consistent CAMs for Weakly Supervised Video Segmentation in Waste Sorting
Andrea Marelli, Luca Magri, Federica Arrigoni, Giacomo Boracchi

TL;DR
This paper introduces a weakly supervised video segmentation method that leverages temporal coherence to produce accurate masks in industrial waste sorting scenarios, reducing annotation costs and improving segmentation quality.
Contribution
The proposed approach integrates temporal consistency into weakly supervised segmentation training, enhancing mask accuracy in video streams without extensive manual annotations.
Findings
Improved segmentation accuracy in waste sorting videos.
Effective use of temporal coherence during classifier training.
Demonstrated benefits on a real-world industrial dataset.
Abstract
In industrial settings, weakly supervised (WS) methods are usually preferred over their fully supervised (FS) counterparts as they do not require costly manual annotations. Unfortunately, the segmentation masks obtained in the WS regime are typically poor in terms of accuracy. In this work, we present a WS method capable of producing accurate masks for semantic segmentation in the case of video streams. More specifically, we build saliency maps that exploit the temporal coherence between consecutive frames in a video, promoting consistency when objects appear in different frames. We apply our method in a waste-sorting scenario, where we perform weakly supervised video segmentation (WSVS) by training an auxiliary classifier that distinguishes between videos recorded before and after a human operator manually removes specific waste items from a conveyor belt. The saliency maps of this…
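The idea of temporally consistent saliency maps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how consistency is enforced, so the loss below is an assumption. The sketch computes a standard class activation map (CAM) as a weighted sum of feature maps, then penalizes disagreement between the CAMs of two consecutive frames. It further assumes that the conveyor belt moves the scene by a known, whole number of CAM columns between frames. The names `class_activation_map`, `temporal_consistency_loss`, and `shift` are hypothetical.

```python
import numpy as np


def class_activation_map(features, weights):
    """Standard CAM: weighted sum of K feature maps, clipped at 0 and scaled to [0, 1].

    features: array of shape (K, H, W), last-layer feature maps of the classifier.
    weights:  array of shape (K,), classifier weights for the class of interest.
    """
    cam = np.tensordot(weights, features, axes=1)  # contract over K -> (H, W)
    cam = np.maximum(cam, 0.0)                     # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def temporal_consistency_loss(cam_t, cam_next, shift):
    """Mean squared CAM difference on the region visible in both frames.

    Assumption (not from the paper): content at column c in frame t appears
    at column c + shift in frame t+1, with 0 <= shift < width.
    """
    if shift == 0:
        return float(np.mean((cam_t - cam_next) ** 2))
    overlap_t = cam_t[:, :-shift]      # columns that are still visible at t+1
    overlap_next = cam_next[:, shift:]  # the same content, moved by `shift`
    return float(np.mean((overlap_t - overlap_next) ** 2))


# Toy example: one active feature at a single location.
features = np.zeros((2, 3, 4))
features[0, 1, 1] = 2.0
weights = np.array([1.0, 0.5])
cam_t = class_activation_map(features, weights)

# A perfectly consistent next frame is the same map moved one column right.
cam_next = np.roll(cam_t, 1, axis=1)
loss_consistent = temporal_consistency_loss(cam_t, cam_next, shift=1)

# An inconsistent next frame (all zeros) is penalized.
loss_inconsistent = temporal_consistency_loss(cam_t, np.zeros_like(cam_t), shift=1)
```

In a training setup, a term of this kind would be added to the classification loss of the auxiliary classifier, so that an object highlighted in one frame remains highlighted at its displaced position in the next.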
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis · Industrial Vision Systems and Defect Detection
