Multimodal Hierarchical Dirichlet Process-based Active Perception
Tadahiro Taniguchi, Toshiaki Takano, Ryo Yoshino

TL;DR
This paper introduces an active perception method based on the multimodal hierarchical Dirichlet process (MHDP). When time is limited, the robot selects the actions with the highest expected information gain, so that it can recognize objects faster and more accurately.
Contribution
It presents a novel MHDP-based active perception framework with an efficient Monte Carlo approximation for information gain and a theoretical justification for greedy algorithms in action selection.
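To make the idea of a Monte Carlo approximation of information gain concrete, here is a minimal sketch on a toy discrete model. It is not the MHDP estimator from the paper: the categorical prior over object categories, the per-action likelihood table, and the function names are all assumptions made for illustration. It relies only on the standard identity that the information gain about a category z from an observation x equals the expected KL divergence between the posterior p(z | x) and the prior p(z).

```python
import numpy as np


def exact_information_gain(prior, likelihood):
    """Exact mutual information I(z; x) in nats for a small discrete model.

    prior:      shape (K,), p(z) over K hypothetical object categories.
    likelihood: shape (K, V), row k is p(x | z = k) for one hypothetical action.
    """
    joint = prior[:, None] * likelihood            # p(z, x)
    marginal = joint.sum(axis=0)                   # p(x)
    independent = prior[:, None] * marginal[None, :]
    mask = joint > 0                               # skip zero-probability cells
    return float(np.sum(joint[mask] * np.log(joint[mask] / independent[mask])))


def monte_carlo_information_gain(prior, likelihood, n_samples, rng):
    """Estimate I(z; x) by averaging KL(p(z | x) || p(z)) over sampled x."""
    marginal = prior @ likelihood                  # p(x)
    # Draw hypothetical observations the action might produce.
    xs = rng.choice(len(marginal), size=n_samples, p=marginal)
    total = 0.0
    for x in xs:
        posterior = prior * likelihood[:, x] / marginal[x]
        nz = posterior > 0
        total += np.sum(posterior[nz] * np.log(posterior[nz] / prior[nz]))
    return total / n_samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prior = np.array([0.6, 0.3, 0.1])
    # Toy likelihood table for a single action (assumed values).
    likelihood = np.array([[0.7, 0.2, 0.1],
                           [0.1, 0.8, 0.1],
                           [0.3, 0.3, 0.4]])
    print(exact_information_gain(prior, likelihood))
    print(monte_carlo_information_gain(prior, likelihood, 5000, rng))
```

In a toy model this small the exact value is cheap to compute, so the sampled estimate serves only as a check. Sampling becomes useful when the observation space is too large to enumerate.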
Findings
The method enables robots to recognize objects faster and more accurately when only a limited number of actions can be performed.
The information gain function is submodular, allowing effective greedy optimization.
Experimental results validate the theoretical advantages of the proposed approach.
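The link between submodularity and greedy selection can be illustrated with a generic lazy greedy routine. This is a sketch under stated assumptions, not the authors' implementation: the action names, the `coverage` table, and the `coverage_gain` function are hypothetical stand-ins, with a simple set-coverage objective (which is monotone and submodular) playing the role that estimated information gain plays in the paper.

```python
import heapq


def lazy_greedy(candidates, gain, budget):
    """Select up to `budget` items for a monotone submodular objective.

    gain(selected, item) must return the marginal gain of adding `item`
    to the list `selected`. Submodularity means marginal gains can only
    shrink as `selected` grows, so stale heap entries are upper bounds.
    """
    selected = []
    # Max-heap via negated gains; `stamp` records how many items were
    # selected when the gain was computed. The index i breaks ties.
    heap = [(-gain(selected, c), i, c, 0) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_gain, i, c, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is current and beats every other upper bound.
            selected.append(c)
        else:
            # The gain is stale: recompute it and put the item back.
            heapq.heappush(heap, (-gain(selected, c), i, c, len(selected)))
    return selected


# Hypothetical actions and the feature ids each one reveals (assumed data).
coverage = {
    "look": {1, 2, 3},
    "grasp": {3, 4},
    "shake": {5},
    "knock": {1, 2},
}


def coverage_gain(selected, item):
    """Number of new feature ids `item` adds beyond those already covered."""
    covered = set().union(*(coverage[s] for s in selected))
    return len(coverage[item] - covered)


if __name__ == "__main__":
    print(lazy_greedy(list(coverage), coverage_gain, budget=2))
```

The lazy variant returns the same selection as plain greedy but usually evaluates far fewer marginal gains, which matters when each evaluation is itself an expensive estimate.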
Abstract
In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when the time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an MHDP-based active perception method that uses the information gain (IG) maximization criterion and lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that the criterion is equivalent to a minimization of the expected Kullback-Leibler divergence between a final recognition…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
