ProAct: A Dual-System Framework for Proactive Embodied Social Agents
Zeyi Zhang, Zixi Kang, Ruijie Zhao, Yusen Feng, Biao Jiang, Libin Liu

TL;DR
ProAct introduces a dual-system framework for embodied social agents that enables proactive, long-term social reasoning while maintaining real-time responsiveness, improving interaction quality and social presence.
Contribution
The paper presents ProAct, a novel dual-system approach that separates reactive and proactive behaviors, allowing seamless integration of long-horizon social reasoning into real-time embodied agents.
Findings
Participants preferred ProAct over reactive systems in user studies.
ProAct improved perceived proactivity and social presence.
The framework effectively balances responsiveness with long-term social planning.
Abstract
Embodied social agents have recently advanced in generating synchronized speech and gestures. However, most interactive systems remain fundamentally reactive, responding only to current sensory inputs within a short temporal window. Proactive social behavior, in contrast, requires deliberation over accumulated context and intent inference, which conflicts with the strict latency budget of real-time interaction. We present ProAct, a dual-system framework that reconciles this time-scale conflict by decoupling a low-latency Behavioral System for streaming multimodal interaction from a slower Cognitive System that performs long-horizon social reasoning and produces high-level proactive intentions. To translate deliberative intentions into continuous non-verbal behaviors without disrupting fluency, we introduce a streaming flow-matching model conditioned on intentions…
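The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class names `CognitiveSystem` and `BehavioralSystem`, the `period` parameter, and the string-valued intentions are all hypothetical, the "reasoning" is a placeholder, and the flow-matching motion model is omitted entirely. The sketch is a single-threaded simulation in which the slow system's latency is modeled by letting it publish a new intention only every `period` ticks, while the fast system produces an output on every tick using whatever intention is currently available.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CognitiveSystem:
    """Hypothetical slow loop: deliberates over accumulated context."""
    period: int  # assumed: ticks between completed deliberations
    context: List[str] = field(default_factory=list)

    def observe(self, event: str) -> None:
        # Accumulate interaction history for later deliberation.
        self.context.append(event)

    def maybe_deliberate(self, tick: int) -> Optional[str]:
        # Placeholder for long-horizon reasoning: only reports how much
        # context has accumulated. Returns None while "still thinking".
        if tick > 0 and tick % self.period == 0:
            return f"intent_after_{len(self.context)}_events"
        return None


@dataclass
class BehavioralSystem:
    """Hypothetical fast loop: responds on every tick without waiting."""
    intention: str = "idle"

    def step(self, event: str) -> str:
        # React immediately, conditioned on the latest available intention.
        return f"react({event})|{self.intention}"


def run(events: List[str], period: int) -> List[str]:
    cognitive = CognitiveSystem(period)
    behavioral = BehavioralSystem()
    outputs = []
    for tick, event in enumerate(events):
        new_intention = cognitive.maybe_deliberate(tick)
        if new_intention is not None:
            # The fast loop picks up a new intention when one is ready.
            behavioral.intention = new_intention
        outputs.append(behavioral.step(event))
        cognitive.observe(event)
    return outputs


if __name__ == "__main__":
    for line in run(["a", "b", "c", "d"], period=2):
        print(line)
```

The point of the design is that `BehavioralSystem.step` never blocks on deliberation: it emits one output per input, and the intention it conditions on simply changes whenever the slower system finishes. In a real system the two loops would presumably run concurrently rather than in one thread.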
Taxonomy
Topics: Social Robot Interaction and HRI · Action Observation and Synchronization · Embodied and Extended Cognition
