NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation
Tim Meinhardt, Matt Feiszli, Yuchen Fan, Laura Leal-Taixe, and Rakesh Ranjan

TL;DR
NOVIS introduces a novel end-to-end near-online video instance segmentation method that outperforms existing approaches by directly predicting spatio-temporal masks and tracking instances without handcrafted heuristics.
Contribution
The paper presents NOVIS, the first near-online VIS model that is fully trainable end-to-end and surpasses all existing methods on major benchmarks.
Findings
Outperforms all existing VIS methods by large margins.
Achieves state-of-the-art results on YouTube-VIS and OVIS benchmarks.
Avoids handcrafted tracking heuristics through end-to-end training.
Abstract
Until recently, the Video Instance Segmentation (VIS) community operated under the common belief that offline methods are generally superior to frame-by-frame online processing. However, the recent success of online methods questions this belief, in particular for challenging and long video sequences. We understand this work as a rebuttal of those recent observations and an appeal to the community to focus on dedicated near-online VIS approaches. To support our argument, we present a detailed analysis of different processing paradigms and the new end-to-end trainable NOVIS (Near-Online Video Instance Segmentation) method. Our transformer-based model directly predicts spatio-temporal mask volumes for clips of frames and performs instance tracking between clips via overlap embeddings. NOVIS represents the first near-online VIS approach which avoids any handcrafted tracking heuristics.…
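The three processing paradigms named in the abstract (offline, frame-by-frame online, and near-online on clips of frames) can be pictured as different ways of windowing a video. The sketch below is only an illustration under stated assumptions: the helper names `clip_windows` and `shared_frames`, and the `clip_len` and `stride` parameters, are hypothetical and not taken from the paper, and the abstract does not spell out how overlap embeddings are computed, so the sketch stops at windowing.

```python
from typing import List


def clip_windows(num_frames: int, clip_len: int, stride: int) -> List[List[int]]:
    """Split a video into clips of frame indices (hypothetical helper).

    Consecutive clips share frames whenever stride < clip_len.
    The last clip may be shorter than clip_len.
    """
    if num_frames <= 0 or clip_len <= 0 or stride <= 0:
        raise ValueError("num_frames, clip_len and stride must be positive")
    if stride > clip_len:
        raise ValueError("stride > clip_len would skip frames")

    clips: List[List[int]] = []
    start = 0
    while True:
        end = min(start + clip_len, num_frames)
        clips.append(list(range(start, end)))
        if end == num_frames:  # the final frame has been covered
            break
        start += stride
    return clips


def shared_frames(prev_clip: List[int], next_clip: List[int]) -> List[int]:
    """Frame indices that two clips have in common (hypothetical helper)."""
    return sorted(set(prev_clip) & set(next_clip))


# Assumed settings, chosen only for illustration.
online = clip_windows(num_frames=7, clip_len=1, stride=1)       # one frame per step
near_online = clip_windows(num_frames=7, clip_len=3, stride=2)  # short overlapping clips
offline = clip_windows(num_frames=7, clip_len=7, stride=7)      # whole video at once
```

With these assumed settings, the near-online case yields three clips in which each pair of neighbours shares exactly one frame, while the online and offline cases are the two extremes of the same windowing: seven single-frame clips and one clip covering the whole video.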
Taxonomy
Topics: Visual Attention and Saliency Detection · Image and Video Quality Assessment · Video Analysis and Summarization
Methods: Focus
