RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation

Peng Chen; Xiaobao Wei; Yi Yang; Naiming Yao; Hui Chen; Feng Tian

arXiv:2601.10606 · cs.CV · January 16, 2026

TL;DR

RSATalker is a novel framework that combines 3D Gaussian Splatting with social relationship encoding to generate realistic, socially-aware multi-turn talking head videos efficiently.

Contribution

It introduces the first socially-aware 3DGS-based talking head generation framework supporting multi-turn conversations and constructs a new dataset with social relationship annotations.

Findings

1. Achieves state-of-the-art realism in talking head generation.
2. Effectively encodes social relationships into avatar videos.
3. Supports multi-turn conversations with high fidelity.

Abstract

Talking head generation is increasingly important in virtual reality (VR), especially for social scenarios involving multi-turn conversation. Existing approaches face notable limitations: mesh-based 3D methods can model dual-person dialogue but lack realistic textures, while large-model-based 2D methods produce natural appearances but incur prohibitive computational costs. More recently, methods based on 3D Gaussian Splatting (3DGS) have achieved efficient and realistic rendering, but they remain speaker-only and ignore social relationships. We introduce RSATalker, the first framework that leverages 3DGS for realistic and socially-aware talking head generation with support for multi-turn conversation. Our method first drives mesh-based 3D facial motion from speech, then binds 3D Gaussians to mesh facets to render high-fidelity 2D avatar videos. To capture interpersonal dynamics, we propose a socially-aware…
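The abstract says that 3D Gaussians are bound to mesh facets but does not state how the binding is parameterized. The sketch below illustrates one common parameterization used for mesh-rigged Gaussians, offered as an assumption rather than as RSATalker's actual method: each Gaussian stores a position in the local frame of a triangle, and that frame is recomputed whenever the speech-driven mesh moves. The function names `triangle_frame` and `local_to_world` are hypothetical, and the choice of centroid origin and mean-edge-length scale is illustrative.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def length(a):
    return math.sqrt(sum(x * x for x in a))

def normalize(a):
    n = length(a)
    return tuple(x / n for x in a)

def triangle_frame(v0, v1, v2):
    """Build a local frame for one triangle (hypothetical helper).

    Returns (origin, (e1, e2, n), scale), where the origin is the centroid,
    e1 follows the first edge, n is the face normal, e2 completes the
    right-handed basis, and scale is the mean length of the two edges at v0.
    """
    origin = tuple((a + b + c) / 3.0 for a, b, c in zip(v0, v1, v2))
    edge_a, edge_b = sub(v1, v0), sub(v2, v0)
    e1 = normalize(edge_a)
    n = normalize(cross(e1, edge_b))
    e2 = cross(n, e1)
    scale = 0.5 * (length(edge_a) + length(edge_b))
    return origin, (e1, e2, n), scale

def local_to_world(local_pos, frame):
    """Map a Gaussian center from triangle-local to world coordinates."""
    origin, (e1, e2, n), scale = frame
    return tuple(
        o + scale * (local_pos[0] * a + local_pos[1] * b + local_pos[2] * c)
        for o, a, b, c in zip(origin, e1, e2, n)
    )

# A Gaussian stored slightly above the face, in triangle-local coordinates.
local_center = (0.5, 0.0, 0.1)

# Rest pose and a deformed pose (the same triangle rotated 90 degrees about z).
rest = triangle_frame((0, 0, 0), (1, 0, 0), (0, 1, 0))
posed = triangle_frame((0, 0, 0), (0, 1, 0), (-1, 0, 0))

rest_center = local_to_world(local_center, rest)
posed_center = local_to_world(local_center, posed)
```

Because the Gaussian is stored in local coordinates, it follows the facet as the mesh animates without any per-frame optimization; a full system would typically transform each Gaussian's rotation and scale in the same way.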

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Social Robot Interaction and HRI