Strike the Balance: On-the-Fly Uncertainty based User Interactions for Long-Term Video Object Segmentation
Stéphane Vujasinović, Stefan Becker, Sebastian Bullinger, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen

TL;DR
This paper presents ziVOS, an online variant of video object segmentation that balances performance and robustness by using uncertainty estimation to decide when to solicit user feedback during long-term tracking.
Contribution
It introduces ziVOS, an online VOS task that actively involves user feedback based on uncertainty, and proposes Lazy-XMem as a competitive baseline for long-term scenarios.
Findings
ziVOS effectively extends object tracking duration with minimal user corrections.
Uncertainty-based interaction improves segmentation robustness in long-term videos.
The approach outperforms existing methods on the LVOS dataset.
Abstract
In this paper, we introduce a variant of video object segmentation (VOS) that bridges interactive and semi-automatic approaches, termed Lazy Video Object Segmentation (ziVOS). In contrast to both tasks, which handle video object segmentation in an offline manner (i.e., on pre-recorded sequences), ziVOS targets sequences recorded online. Here, we strive to strike a balance between performance and robustness for long-term scenarios by soliciting user feedback on the fly during the segmentation process. Hence, we aim to maximize the tracking duration of an object of interest while requiring minimal user corrections to maintain tracking over an extended period. We propose a competitive baseline, Lazy-XMem, as a reference for future work on ziVOS. Our proposed approach uses an uncertainty estimation of the tracking state to determine whether a user interaction…
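As a rough illustration of the idea in the abstract (estimating the uncertainty of the tracking state and asking the user for help only when it is high), the sketch below scores each frame by the mean Shannon entropy of its per-pixel foreground probabilities and flags frames whose score exceeds a threshold. The entropy measure, the function names, the threshold value, and the toy data are all assumptions made for illustration; this is not the Lazy-XMem implementation.

```python
import math
from typing import List, Sequence


def mean_binary_entropy(probs: Sequence[float]) -> float:
    """Mean Shannon entropy (in bits) of per-pixel foreground probabilities.

    Hypothetical uncertainty score: 0 means fully confident, 1 means every
    pixel is a coin flip between foreground and background.
    """
    if not probs:
        raise ValueError("probs must contain at least one value")
    eps = 1e-12  # clamp to avoid log2(0)
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)
        total += -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return total / len(probs)


def frames_needing_user_help(
    frames: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[int]:
    """Return indices of frames whose uncertainty exceeds the threshold.

    The threshold is an assumed tuning knob: lower values request more user
    corrections, higher values request fewer.
    """
    return [
        i for i, probs in enumerate(frames)
        if mean_binary_entropy(probs) > threshold
    ]


# Toy data: each frame is a flattened mask of foreground probabilities.
frames = [
    [0.99, 0.01, 0.98, 0.02],  # confident prediction
    [0.55, 0.45, 0.60, 0.50],  # ambiguous prediction
    [0.95, 0.05, 0.90, 0.10],  # fairly confident prediction
]

flagged = frames_needing_user_help(frames)
print(flagged)
```

In an online setting the same check would run frame by frame as the stream arrives, so that a correction can be requested at the moment the score crosses the threshold rather than after the sequence has ended.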
Taxonomy
Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Video Surveillance and Tracking Methods
