Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

Quang-Trung Truong; Duc Thanh Nguyen; Binh-Son Hua; Sai-Kit Yeung

arXiv:2401.13937 · cs.CV · March 19, 2024 · 1 cites

TL;DR

This paper introduces a lightweight, self-supervised video object segmentation method using deformable attention and distillation learning, achieving state-of-the-art results while being efficient enough for low-powered devices.

Contribution

The paper proposes a novel self-supervised approach with deformable attention and distillation learning, improving adaptability to temporal changes and reducing computational complexity.

Findings

01. Achieves state-of-the-art performance on benchmark datasets.

02. Demonstrates superior memory efficiency compared to existing methods.

03. Effectively adapts to temporal variations in video sequences.

Abstract

Video object segmentation is a fundamental research problem in computer vision. Recent techniques have often applied attention mechanisms to learn object representations from video sequences. However, due to temporal changes in the video data, attention maps may not align well with the objects of interest across video frames, causing errors to accumulate in long-term video processing. In addition, existing techniques rely on complex architectures with high computational complexity, which limits the ability to integrate video object segmentation into low-powered devices. To address these issues, we propose a new method for self-supervised video object segmentation based on distillation learning of deformable attention. Specifically, we devise a lightweight architecture for video object segmentation that is effectively adapted to temporal changes. This is enabled by…
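
The abstract names distillation learning without giving its formulation here, so the following is only a generic sketch of a knowledge distillation loss, not the authors' method. It assumes a teacher and a student that each produce a vector of logits, and it uses the common temperature-softened KL divergence. The function names, the `temperature` parameter, and the example values are all hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-target formulation, used here only
    to illustrate the idea of a student imitating a teacher.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0) when a probability underflows
    kl = sum(
        pt * (math.log(max(pt, eps)) - math.log(max(ps, eps)))
        for pt, ps in zip(p_teacher, p_student)
    )
    return kl * temperature ** 2


# Hypothetical logits for a single pixel over three classes.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In a lightweight-student setting such as the one the abstract describes, a term of this kind would typically be minimised with respect to the student's parameters while the teacher stays fixed.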

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation · ALIGN