Proactive Interaction Framework for Intelligent Social Receptionist Robots
Yang Xue, Fan Wang, Hao Tian, Min Zhao, Jiangyong Li, Haiqing Pan, and Yueqiang Dong

TL;DR
This paper introduces TFVT-HRI, an end-to-end transformer-based framework that enhances proactive human-robot interaction for reception robots by accurately interpreting scenarios and selecting from over 1000 diverse actions, improving social acceptability.
Contribution
The paper presents a novel transformer-based visual token approach for proactive HRI, enabling more nuanced and diverse robot behaviors compared to existing rule-based or limited end-to-end models.
Findings
Achieves state-of-the-art performance in action triggering and selection.
Demonstrates increased humanness and intelligence in real-world office environments.
Handles over 1000 diverse proactive behaviors.
Abstract
Proactive human-robot interaction (HRI) allows receptionist robots to actively greet people and offer services based on vision, which has been found to improve acceptability and customer satisfaction. Existing approaches are based either on multi-stage, rule-based decision processes or on end-to-end decision models. However, the rule-based approaches require painstaking expert effort and handle only a minimal set of pre-defined scenarios. On the other hand, existing end-to-end models are limited to very general greetings or a few behavior patterns (typically fewer than 10). To address these challenges, we propose a new end-to-end framework, the TransFormer with Visual Tokens for Human-Robot Interaction (TFVT-HRI). The proposed framework first extracts visual tokens of relevant objects from an RGB camera. To ensure the correct interpretation of the scenario, a transformer decision model is…
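The flow summarized above (visual tokens per object, a transformer-style decision model, then action triggering and selection) can be illustrated with a rough, standard-library-only sketch. This is not the authors' implementation: the token width `DIM`, the exact action count `NUM_ACTIONS`, the single attention layer with identity projections, the mean pooling, the random weights, and the function names are all assumptions made for illustration.

```python
import math
import random

NUM_ACTIONS = 1000  # assumption: the summary only says "over 1000" actions
DIM = 8             # assumption: hypothetical width of each visual token


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Simplification: queries, keys, and values are the tokens themselves
    (identity projections); a real transformer learns these projections.
    """
    scale = math.sqrt(len(tokens[0]))
    mixed = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        mixed.append([
            sum(w * tok[i] for w, tok in zip(weights, tokens))
            for i in range(len(query))
        ])
    return mixed


def decide(tokens, trigger_w, action_w, threshold=0.5):
    """Return (action_id or None, trigger probability).

    Hypothetical two-head decision: a sigmoid head decides whether to act,
    and a linear head scores every candidate action.
    """
    mixed = self_attention(tokens)
    # Mean-pool the attended tokens into one scene vector.
    pooled = [sum(col) / len(mixed) for col in zip(*mixed)]
    trigger_p = 1.0 / (1.0 + math.exp(-dot(trigger_w, pooled)))
    if trigger_p < threshold:
        return None, trigger_p  # stay passive
    scores = [dot(row, pooled) for row in action_w]
    # The argmax of the raw scores equals the argmax of their softmax.
    action_id = max(range(len(scores)), key=scores.__getitem__)
    return action_id, trigger_p


# Random stand-ins for detector output and learned weights (illustration only).
rng = random.Random(0)
tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]
trigger_w = [rng.gauss(0, 1) for _ in range(DIM)]
action_w = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_ACTIONS)]

action_id, trigger_p = decide(tokens, trigger_w, action_w)
print(action_id, round(trigger_p, 3))
```

Separating the trigger decision from action selection mirrors the two quantities listed under Findings; whether the paper's model uses separate heads or a single output is not stated in this summary.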
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Video Surveillance and Tracking Methods
