FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing

Rishubh Singh, Pranav Gupta, Pradeep Shenoy, Ravikiran Sarvadevabhatla

arXiv:2203.16168 · cs.CV · March 31, 2022 · 1 citation

TL;DR

FLOAT is a factorized label space framework for scalable multi-object multi-part scene parsing. Combined with an inference-time zoom refinement technique, it improves on the state of the art by 2.0% mIOU and 4.8% sqIOU on Pascal-Part-58.

Contribution

The paper proposes FLOAT, a novel factorized label space approach with a zoom refinement method, enhancing multi-object multi-part scene parsing performance and scalability.

Findings

01

Improves mIOU over the state of the art by 2.0% (absolute) on Pascal-Part-58 and 2.1% on Pascal-Part-108.

02

Achieves higher sqIOU than the state of the art: 4.8% on Pascal-Part-58 and 3.9% on Pascal-Part-108.

03

Demonstrates effectiveness on a newly created comprehensive dataset Pascal-Part-201.
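The mIOU figures above refer to mean intersection-over-union. As a minimal sketch of the standard definition (not the paper's evaluation code), the function below averages per-class IoU over flat lists of predicted and ground-truth labels. The name `mean_iou` and the toy labels are illustrative assumptions. Benchmark evaluations typically accumulate intersections and unions over a whole dataset rather than a single image.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both maps agree on class c.
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        # Pixels where either map contains class c.
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(score)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.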

Abstract

Multi-object multi-part scene parsing is a challenging task which requires detecting multiple object classes in a scene and segmenting the semantic parts within each object. In this paper, we propose FLOAT, a factorized label space framework for scalable multi-object multi-part parsing. Our framework involves independent dense prediction of object category and part attributes which increases scalability and reduces task complexity compared to the monolithic label space counterpart. In addition, we propose an inference-time 'zoom' refinement technique which significantly improves segmentation quality, especially for smaller objects/parts. Compared to state of the art, FLOAT obtains an absolute improvement of 2.0% for mean IOU (mIOU) and 4.8% for segmentation quality IOU (sqIOU) on the Pascal-Part-58 dataset. For the larger Pascal-Part-108 dataset, the improvements are 2.1% for mIOU and…
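To illustrate the factorization idea from the abstract, here is a minimal sketch in which object category and part attribute are predicted as two separate per-pixel maps and merged afterwards. The label sets, the `combine` helper, and the toy maps are hypothetical and heavily simplified; this is not the authors' implementation.

```python
# Hypothetical label sets, chosen only for illustration.
OBJECTS = ["background", "cow", "dog", "person"]
PARTS = ["none", "head", "torso", "leg"]


def combine(object_map, part_map):
    """Merge per-pixel object ids and part ids into object-part names."""
    combined = []
    for obj_row, part_row in zip(object_map, part_map):
        row = []
        for obj_id, part_id in zip(obj_row, part_row):
            obj, part = OBJECTS[obj_id], PARTS[part_id]
            if obj == "background" or part == "none":
                row.append(obj)  # no part label to attach
            else:
                row.append(f"{obj}-{part}")
        combined.append(row)
    return combined


# Two independent dense predictions over a 2x3 image (toy values).
object_map = [[0, 1, 1], [3, 3, 0]]
part_map = [[0, 1, 2], [1, 3, 0]]
labels = combine(object_map, part_map)

# A monolithic label space needs one class per object-part pair (plus
# background); the factorized one needs one class per object plus one per part.
monolithic_classes = 1 + (len(OBJECTS) - 1) * (len(PARTS) - 1)
factorized_classes = len(OBJECTS) + len(PARTS)
print(labels, monolithic_classes, factorized_classes)
```

In this toy setup the monolithic space has 10 classes against 8 factorized outputs. The gap widens as objects and parts are added, because the monolithic count grows with their product while the factorized count grows with their sum.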

Code & Models

Repositories

floatseg/floatseg.github.io (official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques