TL;DR
This paper introduces MT-ORL, a multi-task learning framework with a novel architecture and occlusion representation that significantly improves occlusion relation detection in images.
Contribution
It proposes OPNet, a new architecture that better exploits shared and task-specific features, and an orthogonal occlusion representation for improved occlusion orientation prediction.
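This summary does not define the orthogonal occlusion representation, so the sketch below is not the paper's formulation. It only illustrates one general reason a raw-angle target can be a poor representation of orientation: two nearly identical directions on either side of the ±π wraparound have very different angle values, while encoding the direction as two orthogonal components (along x and y) keeps them close. All function names here are hypothetical.

```python
import math

def encode_orthogonal(theta):
    """Encode an angle in radians as two orthogonal components (x, y).

    Assumption for illustration only: a unit vector split along the
    horizontal and vertical axes. The paper's OOR may differ.
    """
    return (math.cos(theta), math.sin(theta))

def decode_orthogonal(x, y):
    """Recover the angle in [-pi, pi] from the two components."""
    return math.atan2(y, x)

def raw_angle_gap(a, b):
    """Naive difference between raw angle values, ignoring periodicity."""
    return abs(a - b)

def vector_gap(a, b):
    """Euclidean distance between the two-component encodings."""
    ax, ay = encode_orthogonal(a)
    bx, by = encode_orthogonal(b)
    return math.hypot(ax - bx, ay - by)

# Two almost identical orientations on opposite sides of the wraparound.
a = math.pi - 0.01
b = -math.pi + 0.01

print("raw angle gap:", raw_angle_gap(a, b))   # close to 2*pi
print("vector gap:   ", vector_gap(a, b))      # close to zero
```

Under these assumptions, a regression loss on the raw angle would heavily penalize a prediction that is only 0.02 radians off, while a loss on the two components would not.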
Findings
Surpasses state-of-the-art methods by 6.1%/8.3% in Boundary-AP
Improves Orientation-AP by 6.5%/10%
Effective in capturing occlusion relations in images
Abstract
Retrieving occlusion relations among objects in a single image is challenging due to the sparsity of boundaries in the image. We observe two key issues in existing works: first, the lack of an architecture that can exploit the limited amount of coupling in the decoder stage between the two subtasks, namely occlusion boundary extraction and occlusion orientation prediction; and second, an improper representation of occlusion orientation. In this paper, we propose a novel architecture called Occlusion-shared and Path-separated Network (OPNet), which solves the first issue by exploiting rich occlusion cues in shared high-level features and structured spatial information in task-specific low-level features. We then design a simple but effective orthogonal occlusion representation (OOR) to tackle the second issue. Our method surpasses the state-of-the-art methods by 6.1%/8.3% Boundary-AP and 6.5%/10%…
