1st Place Solution for CVPR2023 BURST Long Tail and Open World Challenges
Kaer Huang

TL;DR
This paper presents the winning solution for the CVPR2023 BURST challenges, advancing video instance segmentation in long-tailed and open-world scenarios through a novel training strategy and dataset combination.
Contribution
The authors introduce LeTracker, a new method that effectively handles long-tail and open-world VIS tasks by combining multiple datasets and training strategies, achieving top benchmark results.
Findings
Achieved 14.9 HOTA_all on the BURST test set, ranking 1st.
Achieved 61.4 OWTA_all on the open-world challenge, ranking 1st.
Demonstrated effectiveness of dataset combination and training strategies.
Abstract
Currently, Video Instance Segmentation (VIS) aims at segmenting and categorizing objects in videos from a closed set of training categories containing only a few dozen categories, and so lacks the ability to handle the diverse objects in real-world videos. With the release of the TAO and BURST datasets, we have the opportunity to research VIS in long-tailed and open-world scenarios. Traditional VIS methods are evaluated on benchmarks limited to a small number of common classes, but practical applications require trackers that go beyond these common classes, detecting and tracking rare and even never-before-seen objects. Inspired by the latest MOT paper for the long-tail task (Tracking Every Thing in the Wild, Siyuan Li et al.), for the BURST long-tail challenge we train our model on a combination of LVISv0.5 and the COCO dataset using repeat factor sampling. First, train the detector with segmentation and…
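The abstract names repeat factor sampling but does not give its settings. As a rough illustration only, the sketch below follows the formulation commonly associated with the LVIS dataset: each category c gets a repeat factor max(1, sqrt(t / f(c))), where f(c) is the fraction of training images containing c, and each image takes the largest factor among its categories. The function names, the threshold value t, and the toy data are assumptions for illustration; they are not taken from the paper.

```python
import math
import random
from collections import Counter


def repeat_factors(image_categories, threshold=0.001):
    """Per-image repeat factors (illustrative sketch, not the authors' code).

    image_categories: list of sets of category ids, one set per image.
    threshold: frequency below which a category is oversampled
               (hypothetical default; the paper's value is not stated).
    """
    num_images = len(image_categories)
    # Count how many images contain each category.
    counts = Counter(c for cats in image_categories for c in set(cats))
    # Category-level factor: max(1, sqrt(t / f(c))), f(c) = image frequency.
    cat_rf = {
        c: max(1.0, math.sqrt(threshold / (n / num_images)))
        for c, n in counts.items()
    }
    # Image-level factor: largest factor among the image's categories.
    # Images without annotations keep a factor of 1.
    return [
        max((cat_rf[c] for c in cats), default=1.0)
        for cats in image_categories
    ]


def expand_epoch(factors, rng):
    """Turn fractional repeat factors into a list of image indices.

    The fractional part is rounded stochastically, so a factor of 1.5
    yields one copy half of the time and two copies otherwise.
    """
    indices = []
    for idx, rf in enumerate(factors):
        whole = int(rf)
        extra = 1 if rng.random() < (rf - whole) else 0
        indices.extend([idx] * (whole + extra))
    return indices


# Toy data: category 1 appears in every image, category 2 in only one.
toy = [{1}] * 99 + [{1, 2}]
toy_factors = repeat_factors(toy, threshold=0.04)
toy_epoch = expand_epoch(toy_factors, random.Random(0))
```

In this toy setup, images holding only the frequent category keep a factor of 1, while the single image containing the rare category is repeated about twice per epoch. That is the intended effect on a long-tailed label set such as LVIS.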
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
