Human-Inspired Topological Representations for Visual Object Recognition in Unseen Environments

Ekta U. Samani; Ashis G. Banerjee

arXiv:2309.08239·cs.CV·September 18, 2023

TL;DR

This paper introduces the TOPS2 descriptor and the THOR2 framework, inspired by the human reasoning mechanism known as object unity, to improve visual object recognition in unseen, cluttered indoor environments for mobile robots. THOR2 achieves higher recognition accuracy than the shape-based THOR framework and outperforms RGB-D ViT on the OCID and UW-IS Occluded datasets.

Contribution

The paper presents the TOPS2 descriptor, which interleaves color embeddings obtained with the Mapper algorithm with the shape-based TOPS descriptor, and the accompanying THOR2 recognition framework, trained on synthetic data, for recognition in unseen and cluttered environments.

Findings

01

THOR2 achieves substantially higher recognition accuracy than the shape-based THOR framework.

02

THOR2 outperforms RGB-D ViT on the OCID and UW-IS Occluded datasets.

03

Synthetic training data effectively enables real-world recognition.

Abstract

Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. Toward this goal, we extend our previous work to propose the TOPS2 descriptor, and an accompanying recognition framework, THOR2, inspired by a human reasoning mechanism known as object unity. We interleave color embeddings obtained using the Mapper algorithm for topological soft clustering with the shape-based TOPS descriptor to obtain the TOPS2 descriptor. THOR2, trained using synthetic data, achieves substantially higher recognition accuracy than the shape-based THOR framework and outperforms RGB-D ViT on two real-world datasets: the benchmark OCID dataset and the UW-IS Occluded dataset. Therefore, THOR2 is a promising step toward achieving robust recognition in low-cost robots.
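
The abstract mentions the Mapper algorithm as a tool for topological soft clustering of color. As a rough illustration of that general idea only, the sketch below runs a minimal Mapper on a handful of color points: it covers the range of a filter function with overlapping intervals, clusters the points inside each interval, and treats every cluster as a node, so a point that falls in an overlap belongs to more than one node. This is not the authors' TOPS2 pipeline. The filter (first color channel), the cover parameters, the single-linkage clustering, and all function names are assumptions made for this example.

```python
import numpy as np

def overlapping_intervals(lo, hi, n_intervals, overlap):
    """Cover [lo, hi] with equal-length intervals; neighbours share `overlap` of their length."""
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]

def single_linkage_components(points, eps):
    """Label points so that two points closer than eps end up in the same cluster."""
    n = len(points)
    labels = -np.ones(n, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            dists = np.linalg.norm(points - points[i], axis=1)
            for j in np.flatnonzero((dists <= eps) & (labels == -1)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

def mapper_soft_clusters(points, filter_values, n_intervals=2, overlap=0.5, eps=0.3):
    """Hypothetical helper: returns Mapper nodes, edges, and a soft membership matrix."""
    lo, hi = filter_values.min(), filter_values.max()
    tol = 1e-9  # guards against floating-point error at interval boundaries
    nodes = []
    for a, b in overlapping_intervals(lo, hi, n_intervals, overlap):
        idx = np.flatnonzero((filter_values >= a - tol) & (filter_values <= b + tol))
        if idx.size == 0:
            continue
        labels = single_linkage_components(points[idx], eps)
        for c in range(labels.max() + 1):
            nodes.append(set(idx[labels == c].tolist()))
    # Two nodes are connected when they share at least one point.
    edges = {(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
             if nodes[i] & nodes[j]}
    # Soft membership: each point spreads unit weight over the nodes containing it.
    membership = np.zeros((len(points), len(nodes)))
    for k, node in enumerate(nodes):
        membership[list(node), k] = 1.0
    membership /= membership.sum(axis=1, keepdims=True)
    return nodes, edges, membership

# Toy "colors" that vary only along the first channel (illustrative data).
colors = np.array([[0.0, 0, 0], [0.25, 0, 0], [0.5, 0, 0], [0.75, 0, 0], [1.0, 0, 0]])
nodes, edges, membership = mapper_soft_clusters(colors, colors[:, 0])
```

In this toy run the middle color lies in both cover intervals, so its weight is split across two nodes, which is what makes the clustering soft. How such memberships would be turned into the color embeddings used by TOPS2 is described in the paper itself and is not reproduced here.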

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Image Retrieval and Classification Techniques