Kinova Gemini: Interactive Robot Grasping with Visual Reasoning and Conversational AI
Hanxiao Chen, Jiankun Wang, Max Q.-H. Meng

TL;DR
The paper introduces Kinova Gemini, a robotic system combining conversational AI and visual reasoning to assist humans with object retrieval and perception-based tasks using a Kinova Gen3 lite robot.
Contribution
It presents an integrated system that enables natural dialogue, object detection, and visual reasoning for human-robot interaction in object retrieval tasks.
Findings
Successful object detection and recognition with YOLO v3.
Effective natural language interaction for task clarification.
Perception-based pick-and-place tasks performed accurately.
Abstract
To facilitate recent advances in robotics and AI for delicate collaboration between humans and machines, we propose Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning so that the Kinova Gen3 lite robot can help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to the Kinova Gen3 lite, Kinova Gemini is able to fulfill the user's requests in three different applications: (1) It can start a natural dialogue with people to interact and assist humans to retrieve objects and hand them to the user one by one. (2) It detects diverse objects with YOLO v3 and recognizes color attributes of the item to ask people if they want to grasp it via the dialogue or enable the user to choose which specific one is required. (3) It applies YOLO v3 to recognize multiple objects and lets you choose two items…
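Application (2) pairs each detected object with a color attribute and then asks the user which item to grasp. The abstract does not say how the color is computed or how the question is worded, so the following is only a minimal sketch under stated assumptions: the detector output is stubbed as a label plus the mean RGB of the cropped box (no YOLO v3 call is made), the color is named with simple HSV thresholds, and the names Detection, color_name, and clarification_prompt are hypothetical rather than taken from the paper.

```python
import colorsys
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in for one detector result."""
    label: str                      # class name reported by the detector
    mean_rgb: Tuple[int, int, int]  # average color of the cropped box, 0-255


def color_name(rgb: Tuple[int, int, int]) -> str:
    """Map an RGB triple to a coarse color word using HSV thresholds.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if v < 0.2:
        return "black"
    if s < 0.2:
        return "white" if v > 0.8 else "gray"
    hue = h * 360.0
    if hue < 20 or hue >= 340:
        return "red"
    if hue < 45:
        return "orange"
    if hue < 70:
        return "yellow"
    if hue < 170:
        return "green"
    if hue < 260:
        return "blue"
    return "purple"


def clarification_prompt(detections: List[Detection]) -> str:
    """Build the question the robot would ask, given the detected items."""
    names = [f"{color_name(d.mean_rgb)} {d.label}" for d in detections]
    if not names:
        return "I do not see any objects."
    if len(names) == 1:
        return f"I see one object: {names[0]}. Should I grasp it?"
    return "I see: " + ", ".join(names) + ". Which one do you need?"


if __name__ == "__main__":
    scene = [Detection("cup", (200, 30, 30)), Detection("bottle", (30, 60, 200))]
    print(clarification_prompt(scene))
```

In a real pipeline the mean color would come from the pixels inside each bounding box, and the prompt would be passed to the dialogue component instead of being printed.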
Taxonomy
Topics: Robotics and Automated Systems · Social Robot Interaction and HRI · Robotic Path Planning Algorithms
