Learning on the Job: Long-Term Behavioural Adaptation in Human-Robot Interactions
Francesco Del Duchetto, Marc Hanheide

TL;DR
This paper presents a reinforcement learning framework that lets robots deployed in public spaces adapt their behaviour over time, improving user engagement and tour completion rates during long-term deployments.
Contribution
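It introduces an online adaptive RL approach that uses the Upper-Confidence-Bound Value-Iteration (UCBVI) algorithm for planning and an engagement model for the reward signal, in order to optimise robot behaviour in real-world, long-term human-robot interactions.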
Findings
22.8% increase in items visited during tours
30% increase in tour completion probability
Effective long-term behavioral adaptation demonstrated
Abstract
In this work, we propose a framework for allowing autonomous robots deployed for extended periods of time in public spaces to adapt their own behaviour online from user interactions. The robot behaviour planning is embedded in a Reinforcement Learning (RL) framework, where the objective is maximising the level of overall user engagement during the interactions. We use the Upper-Confidence-Bound Value-Iteration (UCBVI) algorithm, which gives a helpful way of managing the exploration-exploitation trade-off for real-time interactions. An engagement model trained end-to-end generates the reward function in real-time during policy execution. We test this approach in a public museum in Lincoln (UK), where the robot is deployed as a tour guide for the visitors. Results show that after a couple of months of exploration, the robot policy learned to maintain the engagement of users for longer,…
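To make the exploration-exploitation mechanism mentioned in the abstract more concrete, the following is a minimal sketch of a UCBVI-style planner for a small tabular, finite-horizon problem. It is not the authors' implementation. The class name `UCBVISketch`, the integer state and action encoding, the simplified Hoeffding-style bonus, and the use of an empirical mean reward (standing in for an engagement score assumed to lie in [0, 1]) are all assumptions made for illustration.

```python
import numpy as np


class UCBVISketch:
    """Illustrative UCBVI-style planner (hypothetical, simplified)."""

    def __init__(self, n_states, n_actions, horizon, bonus_scale=1.0, delta=0.1):
        self.S, self.A, self.H = n_states, n_actions, horizon
        self.bonus_scale = bonus_scale  # assumed tuning constant
        self.delta = delta              # assumed confidence parameter
        self.n_sa = np.zeros((n_states, n_actions))             # visit counts
        self.n_sas = np.zeros((n_states, n_actions, n_states))  # transition counts
        self.r_sum = np.zeros((n_states, n_actions))            # summed rewards
        self.total_steps = 0
        # Optimistic initialisation: every value starts at the maximum return H.
        self.q = np.full((horizon, n_states, n_actions), float(horizon))

    def update(self, s, a, reward, s_next):
        """Record one observed transition; reward is assumed to be in [0, 1]."""
        self.n_sa[s, a] += 1
        self.n_sas[s, a, s_next] += 1
        self.r_sum[s, a] += reward
        self.total_steps += 1

    def plan(self):
        """Backward value iteration on the empirical model plus a bonus."""
        visited = self.n_sa > 0
        n = np.maximum(self.n_sa, 1)         # avoid division by zero
        p_hat = self.n_sas / n[:, :, None]   # empirical transition model
        r_hat = self.r_sum / n               # empirical mean reward
        log_term = np.log(self.S * self.A * max(self.total_steps, 1) / self.delta)
        # The bonus shrinks as a state-action pair is visited more often.
        bonus = self.bonus_scale * self.H * np.sqrt(log_term / n)
        v_next = np.zeros(self.S)
        for h in reversed(range(self.H)):
            q_h = np.minimum(r_hat + p_hat @ v_next + bonus, self.H)
            q_h[~visited] = self.H           # unvisited pairs stay fully optimistic
            self.q[h] = q_h
            v_next = q_h.max(axis=1)

    def act(self, s, h):
        """Greedy action under the optimistic Q-values at step h."""
        return int(np.argmax(self.q[h, s]))
```

In an online setting one would typically call `update` after each interaction step, with the engagement model's output as `reward`, and call `plan` between episodes so that rarely tried behaviours keep a high optimistic value until enough data has been collected.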
Taxonomy
Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Transportation and Mobility Innovations
