DOOBNet: Deep Object Occlusion Boundary Detection from an Image
Guoxia Wang, Xiaohui Liang, Frederick W. B. Li

TL;DR
This paper introduces DOOBNet, a deep multi-task network with a novel loss function for improved object occlusion boundary detection, addressing class imbalance and achieving state-of-the-art results.
Contribution
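The paper presents a unified end-to-end multi-task network that predicts object boundaries and occlusion orientation from shared convolutional features. It pairs an encoder-decoder design that learns multi-scale features with an Attention Loss that counters the boundary/non-boundary class imbalance.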
Findings
Achieved a state-of-the-art ODS F-score of 0.702 on the PIOD dataset.
Reduced detection time to 0.037 seconds per image.
Effectively addressed class imbalance with a novel loss function.
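The ODS F-score quoted in the findings is the best F-score obtained with a single threshold shared across the whole dataset. The sketch below illustrates only that selection step, assuming per-image true-positive, false-positive, and false-negative counts are already available for each candidate threshold. The function name and input layout are hypothetical, and the boundary-matching step that a real benchmark uses to produce those counts is not shown.

```python
def ods_f_score(counts_per_threshold):
    """Pick the threshold with the best dataset-level F-score.

    counts_per_threshold maps a threshold to a list of (tp, fp, fn)
    tuples, one tuple per image. This layout is an assumption made
    for illustration only.
    """
    best_threshold, best_f = None, 0.0
    for threshold, per_image in counts_per_threshold.items():
        # Pool the counts over every image before computing P and R.
        tp = sum(c[0] for c in per_image)
        fp = sum(c[1] for c in per_image)
        fn = sum(c[2] for c in per_image)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f = 2 * precision * recall / denom if denom else 0.0
        if f > best_f:
            best_threshold, best_f = threshold, f
    return best_threshold, best_f


# Made-up counts for two images at two thresholds.
counts = {
    0.3: [(8, 4, 2), (6, 6, 4)],
    0.5: [(7, 1, 3), (5, 1, 5)],
}
best_threshold, best_f = ods_f_score(counts)
```

Pooling counts before computing precision and recall is what distinguishes a dataset-scale score from an average of per-image scores.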
Abstract
Object occlusion boundary detection is a fundamental and crucial research problem in computer vision. It is challenging because training an object occlusion boundary detector encounters an extreme boundary/non-boundary class imbalance. In this paper, we propose to address this class imbalance by up-weighting the loss contribution of false negative and false positive examples with our novel Attention Loss function. We also propose a unified end-to-end multi-task deep object occlusion boundary detection network (DOOBNet) that shares convolutional features to simultaneously predict object boundary and occlusion orientation. DOOBNet adopts an encoder-decoder structure with skip connections in order to automatically learn multi-scale and multi-level features. We significantly surpass the state-of-the-art on the PIOD dataset (ODS F-score of .702) and the BSDS ownership dataset…
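The abstract describes the Attention Loss only as up-weighting false negatives and false positives; it does not give the formula. The sketch below is therefore an assumption, not the authors' definition: a class-balanced cross-entropy multiplied by a factor that grows as the prediction gets worse. The names alpha, beta, and gamma and their default values are illustrative choices.

```python
import math


def balanced_ce(p, y, alpha=0.9, eps=1e-7):
    """Class-balanced cross-entropy for one pixel.

    p is the predicted boundary probability, y is 1 for a boundary
    pixel and 0 otherwise. alpha weights the rare boundary class
    (assumed value).
    """
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if y == 1:
        return -alpha * math.log(p)
    return -(1.0 - alpha) * math.log(1.0 - p)


def attention_loss(p, y, alpha=0.9, beta=4.0, gamma=0.5, eps=1e-7):
    """Balanced cross-entropy with an error-dependent up-weighting.

    The factor beta ** (error ** gamma) is an assumed form: it is 1
    for a perfect prediction and approaches beta for a confident
    mistake.
    """
    p = min(max(p, eps), 1.0 - eps)
    # Error is the distance between the prediction and the label.
    error = (1.0 - p) if y == 1 else p
    return (beta ** (error ** gamma)) * balanced_ce(p, y, alpha, eps)


# A missed boundary pixel (false negative) versus a confident hit.
missed = attention_loss(0.1, 1)
hit = attention_loss(0.9, 1)
```

With these assumed settings the missed boundary pixel is scaled up far more than the correctly detected one, which is the behavior the abstract attributes to the loss.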
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
