UniLS: End-to-End Audio-Driven Avatars for Unified Listening and Speaking

Xuangeng Chu; Ruicong Liu; Yifei Huang; Yun Liu; Yichen Peng; Bo Zheng

arXiv:2512.09327·cs.CV·March 31, 2026

TL;DR

UniLS is an end-to-end framework that generates realistic speaking and listening facial expressions from dual-track audio, improving naturalness and diversity in digital human avatars.

Contribution

UniLS introduces a two-stage training paradigm for unified speak-listen facial animation driven solely by audio, enabling real-time, high-fidelity avatar generation.

Findings

01 Achieves state-of-the-art speaking accuracy.

02 Improves listening expression diversity by up to 44.1%.

03 Mitigates stiffness in listening motions.

Abstract

Generating lifelike conversational avatars requires modeling not just isolated speakers but the dynamic, reciprocal interaction of speaking and listening. However, modeling the listener is exceptionally challenging: direct audio-driven training fails, producing stiff, static listening motions. This failure stems from a fundamental imbalance: the speaker's motion is strongly driven by speech audio, while the listener's motion primarily follows an internal motion prior and is only loosely guided by external speech. This challenge has led most methods to focus on speak-only generation. The only prior attempt at joint generation relies on the speaker's motion as an additional input to produce the listener's motion. That design is not end-to-end, which hinders real-time applicability. To address this limitation, we present UniLS, the first end-to-end framework for generating unified speak-listen expressions,…

Code & Models

Repositories

https://xg-chu.site/project_unils
github

Models

xg-chu/UniLS (Hugging Face model)
