LUCY: Linguistic Understanding and Control Yielding Early Stage of Her
Heting Gao, Hang Shao, Xiong Wang, Chaofan Qiu, Yunhang Shen, Siqi Cai, Yuchen Shi, Zihan Xu, Zuwei Long, Yike Zhang, Shaoqi Dong, Chaoyou Fu, Ke Li, Long Ma, Xing Sun

TL;DR
LUCY is an advanced end-to-end speech model that understands and responds to emotional cues in human speech, producing natural, emotionally aware responses and integrating external tools for real-time inquiries.
Contribution
It introduces a novel E2E speech system capable of emotion sensing, natural response generation, and external tool integration, advancing emotional and functional capabilities of AI audio agents.
Findings
LUCY outperforms peer models in emotion control.
LUCY generates more natural responses, as judged by external language models.
LUCY effectively uses external tools for real-time question answering.
Abstract
The film Her features Samantha, a sophisticated AI audio agent capable of understanding both linguistic and paralinguistic information in human speech and delivering real-time responses that are natural, informative, and sensitive to emotional subtleties. Moving one step toward a more sophisticated audio agent, building on recent advances in end-to-end (E2E) speech systems, we propose LUCY, an E2E speech model that (1) senses and responds to the user's emotion, (2) delivers responses in a succinct and natural style, and (3) uses external tools to answer real-time inquiries. Experimental results show that LUCY is better at emotion control than peer models, generating emotional responses based on linguistic emotional instructions and responding to paralinguistic emotional cues. LUCY is also able to generate responses in a more natural style, as judged by external language models, without sacrificing…
Taxonomy
Topics: Gender Studies in Language
