Detect-and-describe: Joint learning framework for detection and description of objects

Addel Zafar; Umar Khalid

arXiv:2204.08828·cs.CV·April 20, 2022

TL;DR

This paper introduces the Detect-and-Describe (DaD) framework, a deep learning model that jointly detects objects and predicts their attributes, enhancing understanding of both known and unseen objects.

Contribution

The paper presents a novel joint detection and attribute prediction framework that extends traditional object detection to include detailed object descriptions.

Findings

01. Achieved 97.0% AUC for attribute prediction on the aPascal test set.

02. Demonstrated effectiveness in describing unseen objects.

03. Extended object detection to include attribute inference.
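
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how an AUC score for attribute prediction could be computed. It treats each attribute as a separate binary problem and uses the pairwise-ranking definition of ROC AUC. The function names `roc_auc` and `mean_attribute_auc`, the input layout, and the choice to average per-attribute AUCs are assumptions made for illustration; this summary does not state how the paper aggregates its reported figure.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive example
    is scored higher than a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_attribute_auc(per_attribute):
    """Average the AUC over attributes (an assumed aggregation).

    per_attribute maps an attribute name to (labels, scores).
    """
    aucs = [roc_auc(labels, scores) for labels, scores in per_attribute.values()]
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    toy = {
        "metallic": ([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3]),
        "has wheels": ([1, 0, 1, 0], [0.9, 0.1, 0.2, 0.3]),
    }
    print(mean_attribute_auc(toy))
```

The quadratic pairwise loop is chosen for readability; a sort-based computation would be preferable for a full test set.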

Abstract

Traditional object detection answers two questions: "what" (what is the object?) and "where" (where is the object?). The "what" part of object detection can be made more fine-grained, e.g. "what type", "what shape", and "what material". This shifts the object detection task toward an object description paradigm. Describing an object provides additional detail that enables us to understand its characteristics and attributes ("plastic boat", not just boat; "glass bottle", not just bottle). This additional information can implicitly be used to gain insight into unseen objects (e.g. an unknown object is "metallic" and "has wheels"), which is not possible in traditional object detection. In this paper, we present a new approach to simultaneously detect objects and infer their attributes, which we call the Detect and Describe (DaD) framework. DaD is a deep learning-based…
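
To make the idea of describing an object concrete, here is a small sketch of how a detection with per-attribute scores might be turned into a description, including the unseen-object case mentioned in the abstract. Everything in it is hypothetical: the `Detection` container, the `describe` function, the 0.5 thresholds, and the "unknown object" fallback are illustrative assumptions and are not taken from the DaD implementation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    box: tuple              # (x_min, y_min, x_max, y_max), the "where"
    class_scores: dict      # class name -> confidence, the "what"
    attribute_scores: dict  # attribute name -> probability


def describe(det, class_threshold=0.5, attribute_threshold=0.5):
    """Build a text description from class and attribute scores.

    Attributes are kept when their score reaches attribute_threshold.
    If no class is confident enough, the object is reported as unknown
    but is still described by its attributes.
    """
    attrs = sorted(
        name for name, p in det.attribute_scores.items()
        if p >= attribute_threshold
    )
    best_class, best_score = max(det.class_scores.items(), key=lambda kv: kv[1])
    label = best_class if best_score >= class_threshold else "unknown object"
    return f"{label}: {', '.join(attrs)}" if attrs else label


if __name__ == "__main__":
    # Toy scores, not model outputs.
    known = Detection((0, 0, 10, 10),
                      {"boat": 0.9, "bottle": 0.05},
                      {"plastic": 0.8, "metallic": 0.1})
    unseen = Detection((5, 5, 20, 20),
                       {"boat": 0.2, "bottle": 0.1},
                       {"metallic": 0.9, "has wheels": 0.7, "plastic": 0.1})
    print(describe(known))
    print(describe(unseen))
```

The point of the sketch is that attribute scores remain informative even when the class scores are not, which is what allows an unfamiliar object to be described at all.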

Code & Models

Repositories

adeelz92/DaD-Framework
