Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains
Hamidreza Kasaei, Sha Luo, Remo Sasso, Mohammadreza Kasaei

TL;DR
This paper introduces a deep learning system that simultaneously recognizes objects and plans grasps in open-ended environments, capable of learning new categories quickly and handling unseen objects in real-time scenarios.
Contribution
It presents a novel multi-view deep learning architecture with augmented memory for joint object recognition and grasping, enabling rapid learning of new categories without retraining.
Findings
Successfully grasps unseen objects in simulation and real-world.
Rapidly learns new object categories with few examples.
Handles open-ended environments with continual learning.
Abstract
To aid humans in everyday tasks, robots need to know which objects exist in the scene, where they are, and how to grasp and manipulate them in different situations. Therefore, object recognition and grasping are two key functionalities for autonomous robots. Most state-of-the-art approaches treat object recognition and grasping as two separate problems, even though both use visual input. Furthermore, the knowledge of the robot is fixed after the training phase. In such cases, if the robot encounters new object categories, it must be retrained to incorporate new information without catastrophic forgetting. In order to resolve this problem, we propose a deep learning architecture with an augmented memory capacity to handle open-ended object recognition and grasping simultaneously. In particular, our approach takes multi-views of an object as input and jointly estimates pixel-wise grasp…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDomain Adaptation and Few-Shot Learning · Robot Manipulation and Learning · Multimodal Machine Learning Applications
