DAVOS: Semi-Supervised Video Object Segmentation via Adversarial Domain Adaptation
Jinshuo Zhang, Zhicheng Wang, Songyan Zhang, Gang Wei

TL;DR
This paper introduces DAVOS, a semi-supervised video object segmentation method using adversarial domain adaptation to improve performance across different datasets without requiring additional annotations.
Contribution
It presents a novel adversarial domain adaptation approach for VOS, combining appearance and motion features with supervised and unsupervised training to handle domain shift.
Findings
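Achieved 82.6% mean IoU on DAVIS2016 after supervised training on the source domain.
Significantly improved performance on the FBMS59 and Youtube-Object datasets through adversarial domain adaptation.
Domain adaptation required no extra annotations on the target domain.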
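For readers unfamiliar with the metric behind the 82.6% figure, the following is a minimal sketch of intersection-over-union (IoU) averaged over frames. The function names `iou` and `mean_iou` are hypothetical, the masks are plain nested lists of 0/1, and the benchmark's exact averaging protocol is not reproduced here.

```python
def iou(pred, gt):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(pred_masks, gt_masks):
    """Average the per-frame IoU over a list of (prediction, ground truth) pairs."""
    scores = [iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)


# Two toy 2x2 frames: the first overlaps on 1 of 2 foreground pixels,
# the second matches exactly.
preds = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
gts = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
print(mean_iou(preds, gts))
```

A score of 1.0 means the predicted mask coincides with the ground truth on every frame; partial overlap lowers the score in proportion to the mismatched area.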
Abstract
Domain shift has always been one of the primary issues in video object segmentation (VOS), for which models suffer from degeneration when tested on unfamiliar datasets. Recently, many online methods have emerged to narrow the performance gap between training data (source domain) and test data (target domain) by fine-tuning on annotations of test data which are usually in shortage. In this paper, we propose a novel method to tackle domain shift by first introducing adversarial domain adaptation to the VOS task, with supervised training on the source domain and unsupervised training on the target domain. By fusing appearance and motion features with a convolution layer, and by adding supervision onto the motion branch, our model achieves state-of-the-art performance on DAVIS2016 with 82.6% mean IoU score after supervised training. Meanwhile, our adversarial domain adaptation strategy…
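The training setup described in the abstract (supervised on the source domain, unsupervised and adversarial on the target domain) can be illustrated with a rough sketch of the two competing objectives. This is not the paper's formulation: the binary cross-entropy losses, the discriminator labeling (source = 1, target = 0), the `adv_weight` value, and all function names are assumptions borrowed from the common adversarial adaptation pattern, and scalars stand in for per-pixel network outputs.

```python
import math


def bce(prob, label, eps=1e-7):
    """Binary cross-entropy for one probability and one 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def mean_bce(probs, labels):
    return sum(bce(p, y) for p, y in zip(probs, labels)) / len(probs)


def segmenter_loss(src_pred, src_gt, disc_on_tgt, adv_weight=0.01):
    """Supervised loss on labeled source data plus an adversarial term.

    disc_on_tgt holds the discriminator's "this is source" probabilities
    for target-domain predictions. The segmenter is rewarded when the
    discriminator mistakes them for source (label 1), so no target
    annotations are used. adv_weight is an assumed hyperparameter.
    """
    supervised = mean_bce(src_pred, src_gt)
    adversarial = mean_bce(disc_on_tgt, [1.0] * len(disc_on_tgt))
    return supervised + adv_weight * adversarial


def discriminator_loss(disc_on_src, disc_on_tgt):
    """The discriminator learns to tell source (1) from target (0)."""
    return (mean_bce(disc_on_src, [1.0] * len(disc_on_src))
            + mean_bce(disc_on_tgt, [0.0] * len(disc_on_tgt)))


# Toy values: source predictions with labels, and discriminator outputs.
print(segmenter_loss([0.9, 0.2], [1.0, 0.0], disc_on_tgt=[0.3, 0.4]))
print(discriminator_loss(disc_on_src=[0.8, 0.7], disc_on_tgt=[0.3, 0.4]))
```

In a scheme like this the two losses are minimized in alternation: the discriminator sharpens its source-versus-target decision, while the segmenter is pushed to produce target-domain outputs that the discriminator cannot separate from source-domain ones.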
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: VOS · Convolution
