SingingBot: An Avatar-Driven System for Robotic Face Singing Performance

Zhuoxiong Xu; Xuanchen Li; Yuhao Cheng; Fei Xu; Yichao Yan; Xiaokang Yang

arXiv:2601.02125 · cs.RO · January 6, 2026

Open Access

TL;DR

SingingBot introduces an avatar-driven framework that synthesizes vivid singing avatars and transfers their expressions to robotic faces, enabling emotionally rich and synchronized robotic singing performances.

Contribution

The paper presents a novel avatar-driven approach that uses synthesized singing avatars for expression and emotion guidance, together with a new metric, Emotion Dynamic Range, for evaluating the emotional breadth of robotic singing.

Findings

01 Achieves rich emotional expression in robotic singing

02 Maintains lip-audio synchronization effectively

03 Outperforms existing methods in emotional richness

Abstract

Equipping robotic faces with singing capabilities is crucial for empathetic Human-Robot Interaction. However, existing robotic face driving research primarily focuses on conversations or mimicking static expressions, struggling to meet the high demands for continuous emotional expression and coherence in singing. To address this, we propose a novel avatar-driven framework for appealing robotic singing. We first leverage portrait video generation models embedded with extensive human priors to synthesize vivid singing avatars, providing reliable expression and emotion guidance. Subsequently, these facial features are transferred to the robot via semantic-oriented mapping functions that span a wide expression space. Furthermore, to quantitatively evaluate the emotional richness of robotic singing, we propose the Emotion Dynamic Range metric to measure the emotional breadth within the…
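The abstract as shown here is cut off before it defines the Emotion Dynamic Range metric, so the following is only a hypothetical illustration of what "emotional breadth" over a performance could mean, not the paper's formula. It assumes each frame has already been scored as a (valence, arousal) pair by some emotion estimator, and it measures breadth as the diagonal of the bounding box that the trajectory covers. The function name `emotion_span` and the valence-arousal representation are assumptions made for this sketch.

```python
import math
from typing import Sequence, Tuple

# Hypothetical stand-in for an "emotional breadth" score.
# NOT the paper's Emotion Dynamic Range definition, which is not given here.
def emotion_span(trajectory: Sequence[Tuple[float, float]]) -> float:
    """Return the bounding-box diagonal of a (valence, arousal) trajectory.

    Each element is one frame's assumed (valence, arousal) estimate.
    A constant expression yields 0.0; wider emotional variation yields more.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one frame")
    valences = [v for v, _ in trajectory]
    arousals = [a for _, a in trajectory]
    return math.hypot(max(valences) - min(valences),
                      max(arousals) - min(arousals))

# Toy data: a flat performance versus one that moves through emotion space.
flat = [(0.1, 0.2)] * 4
varied = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
```

Under these assumptions, `emotion_span(flat)` is zero because the expression never changes, while `emotion_span(varied)` grows with the extent of the trajectory. A score of this kind rewards coverage of emotion space but says nothing about lip-audio synchronization, which would need a separate measure.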

Taxonomy

Topics: Social Robot Interaction and HRI · Face recognition and analysis · Emotion and Mood Recognition