Human-Inspired Topological Representations for Visual Object Recognition in Unseen Environments
Ekta U. Samani, Ashis G. Banerjee

TL;DR
This paper introduces the TOPS2 descriptor and the THOR2 framework, inspired by the human reasoning mechanism of object unity, for visual object recognition by mobile robots in unseen, cluttered indoor environments. THOR2 achieves higher recognition accuracy than the shape-based THOR framework and RGB-D ViT on two real-world datasets.
Contribution
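The paper presents the TOPS2 descriptor, which interleaves color embeddings obtained from Mapper-based topological soft clustering with the shape-based TOPS descriptor, and the accompanying THOR2 recognition framework, which is trained on synthetic data and targets unseen, cluttered indoor environments.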
Findings
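THOR2 achieves substantially higher recognition accuracy than the shape-based THOR framework.
THOR2 outperforms RGB-D ViT on two real-world datasets: the benchmark OCID dataset and the UW-IS Occluded dataset.
THOR2 is trained using synthetic data and still recognizes objects in both real-world datasets.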
Abstract
Visual object recognition in unseen and cluttered indoor environments is a challenging problem for mobile robots. Toward this goal, we extend our previous work to propose the TOPS2 descriptor, and an accompanying recognition framework, THOR2, inspired by a human reasoning mechanism known as object unity. We interleave color embeddings obtained using the Mapper algorithm for topological soft clustering with the shape-based TOPS descriptor to obtain the TOPS2 descriptor. THOR2, trained using synthetic data, achieves substantially higher recognition accuracy than the shape-based THOR framework and outperforms RGB-D ViT on two real-world datasets: the benchmark OCID dataset and the UW-IS Occluded dataset. Therefore, THOR2 is a promising step toward achieving robust recognition in low-cost robots.
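To make the idea of topological soft clustering with Mapper more concrete, the following is a minimal sketch and not the authors' implementation. It runs a toy one-dimensional Mapper over scalar color values (for example, hue in [0, 1]): the lens is the value itself, the cover is a set of overlapping intervals, and clustering inside each interval splits the sorted values wherever a gap exceeds a threshold. Points that fall in the overlap of two intervals can belong to more than one node, which is what makes the clustering soft. The function name `mapper_soft_clusters` and the parameters `n_intervals`, `overlap`, and `gap` are assumptions made for illustration; the paper's actual lens, cover, clusterer, and the way the resulting color embeddings are interleaved with TOPS are not specified in this summary.

```python
import numpy as np

def mapper_soft_clusters(values, n_intervals=4, overlap=0.5, gap=0.05):
    """Toy 1-D Mapper (illustrative only, assumes 0 < overlap < 1).

    Returns a list of nodes (arrays of point indices) and a row-normalized
    membership matrix of shape (n_points, n_nodes).
    """
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    # Interval length such that n_intervals overlapping intervals span [lo, hi].
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)

    nodes = []
    for i in range(n_intervals):
        start = lo + i * step
        # Pin the last interval to hi to avoid floating-point gaps at the edge.
        end = hi if i == n_intervals - 1 else start + length
        idx = np.where((values >= start) & (values <= end))[0]
        if idx.size == 0:
            continue
        # Cluster within the interval: sort, then split at large gaps.
        idx = idx[np.argsort(values[idx])]
        breaks = np.where(np.diff(values[idx]) > gap)[0] + 1
        nodes.extend(np.split(idx, breaks))

    # Soft membership: a point shared by k nodes gets weight 1/k in each.
    membership = np.zeros((values.size, len(nodes)))
    for j, members in enumerate(nodes):
        membership[members, j] = 1.0
    membership /= membership.sum(axis=1, keepdims=True)
    return nodes, membership

# Hypothetical hue samples forming three loose color groups.
hues = [0.02, 0.05, 0.08, 0.50, 0.52, 0.55, 0.90, 0.93]
nodes, membership = mapper_soft_clusters(hues)
```

Each row of `membership` can be read as a soft color embedding for one point: a value that sits in the overlap of two cover intervals splits its weight between two nodes instead of being forced into one cluster. A full Mapper pipeline would also connect nodes that share points into a graph, which this sketch omits.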
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection · Image Retrieval and Classification Techniques
