Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation
Sun-Hyuk Choi, Hayoung Jo, Seong-Whan Lee

TL;DR
This paper introduces the Multi-context Temporal Consistency Module (MTCM) to improve referring video object segmentation by enhancing query consistency and context understanding, leading to more accurate segmentation results.
Contribution
The paper proposes MTCM, a novel module with an Aligner and Multi-Context Enhancer, to address query inconsistency and limited context consideration in transformer-based models.
Findings
MTCM achieves 47.6 J&F on the MeViS dataset.
Applying MTCM improved performance across four different models.
The module enhances query consistency and context modeling.
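The J&F figure above is, in the usual video object segmentation convention, the average of region similarity J (the Jaccard index, or IoU, between predicted and ground-truth masks) and contour accuracy F (a boundary F-measure). The sketch below shows only that arithmetic. The function names are hypothetical, masks are plain nested lists of 0/1, and F is passed in as a given number instead of being computed from boundaries. It is not the paper's evaluation code, and the score it produces is on a 0 to 1 scale, whereas 47.6 is presumably reported on a 0 to 100 scale.

```python
def region_similarity(pred, gt):
    """Jaccard index (J) between two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_scores, f_scores):
    """Mean of the per-frame average J and the per-frame average F."""
    mean_j = sum(j_scores) / len(j_scores)
    mean_f = sum(f_scores) / len(f_scores)
    return (mean_j + mean_f) / 2


# Toy single-frame example; the masks and the F value are made up.
pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
j = region_similarity(pred, gt)   # 2 overlapping pixels out of 4 in the union
score = j_and_f([j], [0.7])       # 0.7 stands in for a contour accuracy value
```

In practice F is computed by matching predicted and ground-truth mask boundaries, which benchmark toolkits handle; only the final averaging step is illustrated here.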
Abstract
Referring video object segmentation aims to segment objects within a video corresponding to a given text description. Existing transformer-based temporal modeling approaches face challenges related to query inconsistency and the limited consideration of context. Query inconsistency produces unstable masks of different objects in the middle of the video. The limited consideration of context leads to the segmentation of incorrect objects by failing to adequately account for the relationship between the given text and instances. To address these issues, we propose the Multi-context Temporal Consistency Module (MTCM), which consists of an Aligner and a Multi-Context Enhancer (MCE). The Aligner removes noise from queries and aligns them to achieve query consistency. The MCE predicts text-relevant queries by considering multi-context. We applied MTCM to four different models, increasing…
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
