Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation

Yingjie Yin; De Xu; Xingang Wang; Lei Zhang

arXiv:2002.06736·cs.CV·February 18, 2020·1 cites

TL;DR

This paper introduces DDEAL, a fast semi-supervised video object segmentation method that replaces online fine-tuning with directional deep embedding and appearance learning. It reports 74.8% J & F on DAVIS 2017 and 71.3% G on YouTube-VOS while running at 25 fps.

Contribution

The paper proposes a novel DDEAL approach with a global directional matching module and a directional appearance model, eliminating the need for online fine-tuning in VOS.

Findings

01. Achieves a 74.8% J & F score on DAVIS 2017

02. Attains a 71.3% G score on YouTube-VOS

03. Runs at 25 fps with high accuracy

Abstract

Most recent semi-supervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module, which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective statistics-based directional appearance model is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the global directional matching module and the directional…
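To make the idea of directional matching concrete, here is a minimal NumPy sketch. It is not the authors' implementation (the official repository is in PyTorch), and the function names, tensor shapes, and the choice of taking the maximum similarity over reference pixels are all assumptions made for illustration. It shows only the general technique the abstract alludes to: embeddings are projected onto the unit sphere, so matching reduces to cosine similarity, and treating every reference pixel as a 1x1 convolution kernel turns the whole comparison into one matrix product.

```python
import numpy as np


def l2_normalize(x, axis, eps=1e-12):
    """Project embedding vectors onto the unit sphere along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def directional_matching(ref_emb, ref_mask, cur_emb):
    """Cosine-similarity matching between a reference and a current frame.

    Hypothetical interface, assumed for this sketch:
      ref_emb, cur_emb: (C, H, W) float pixel embeddings.
      ref_mask: (H, W) boolean target mask for the reference frame; it must
                contain at least one True and one False pixel.
    Returns (fg_score, bg_score), each (H, W): the best cosine similarity of
    every current pixel to any reference foreground / background pixel.
    """
    c, h, w = cur_emb.shape
    ref = l2_normalize(ref_emb.reshape(c, -1), axis=0)  # (C, N_ref)
    cur = l2_normalize(cur_emb.reshape(c, -1), axis=0)  # (C, N_cur)
    mask = ref_mask.reshape(-1).astype(bool)

    # Each reference pixel acts as a 1x1 convolution kernel applied to the
    # current frame; applying all kernels at once is a single matrix product.
    sim = ref.T @ cur  # (N_ref, N_cur), entries are cosines in [-1, 1]

    # Assumption: summarize the matches by the strongest one per pixel.
    fg_score = sim[mask].max(axis=0).reshape(h, w)
    bg_score = sim[~mask].max(axis=0).reshape(h, w)
    return fg_score, bg_score
```

Comparing the two score maps pixel by pixel (for example `fg_score > bg_score`) gives a crude mask estimate. In a full system such maps would more plausibly serve as guidance features for a learned decoder than as the final segmentation.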

Code & Models

Repositories

YingjieYin/Directional-Deep-Embedding-and-Appearance-Learning-for-Fast-Video-Object-Segmentation
PyTorch · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Face recognition and analysis · Advanced Neural Network Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings