EVOLVE: Emotion and Visual Output Learning via LLM Evaluation
Jordan Sinclair, Christopher Reardon

TL;DR
This paper introduces EVOLVE, a framework that uses large language models and vision-language models to generate emotionally aligned verbal and nonverbal responses, enhancing social robot empathy and communication.
Contribution
It presents a novel approach integrating LLMs with vision-language models for open-ended emotional response and nonverbal behavior generation in social robots.
Findings
Improved emotional expression alignment with user input
Enhanced robot-human communication through multimodal responses
Demonstrated feasibility of LLM-driven nonverbal behavior in real-world scenarios
Abstract
Human acceptance of social robots is greatly affected by empathy and perceived understanding. This necessitates accurate and flexible responses to various input data from the user. While such systems can become increasingly complex as more states or response types are included, new research applying large language models (LLMs) to human-robot interaction has allowed for more streamlined perception and reaction pipelines. LLM-selected actions and emotional expressions can help reinforce the realism of displayed empathy and allow for improved communication between the robot and user. Beyond portraying empathy in spoken or written responses, this shows the possibilities of using LLMs in actuated, real-world scenarios. In this work, we extend research in LLM-driven nonverbal behavior for social robots by considering more open-ended emotional response selection leveraging…
Taxonomy
Topics: Video Analysis and Summarization
