Diffusion Forcing for Multi-Agent Interaction Sequence Modeling

Vongani H. Maluleke; Kie Horiuchi; Lea Wilken; Evonne Ng; Jitendra Malik; Angjoo Kanazawa

arXiv:2512.17900 · cs.CV · March 27, 2026

TL;DR

MAGNet is a unified diffusion-based model that generates multi-agent interactions, capturing complex social behaviors and coordinating multiple agents over long sequences with flexible task support.

Contribution

Introduces MAGNet, a versatile autoregressive diffusion framework capable of modeling diverse multi-agent interactions within a single unified model.

Findings

01. Performs on par with specialized methods on dyadic benchmarks.
02. Extends naturally to polyadic multi-agent scenarios.
03. Generates coherent long-duration multi-agent sequences.

Abstract

Understanding and generating multi-person interactions is a fundamental challenge with broad implications for robotics and social computing. While humans naturally coordinate in groups, modeling such interactions remains difficult due to long temporal horizons, strong inter-agent dependencies, and variable group sizes. Existing motion generation methods are largely task-specific and do not generalize to flexible multi-agent generation. We introduce MAGNet (Multi-Agent Generative Network), a unified autoregressive diffusion framework for multi-agent motion generation that supports a wide range of interaction tasks through flexible conditioning and sampling. MAGNet performs dyadic and polyadic prediction, partner inpainting, partner prediction, and agentic generation all within a single model, and can autoregressively generate ultra-long sequences spanning hundreds of motion steps. We…
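As a rough illustration of how one model might cover several tasks through conditioning alone, the sketch below uses the general diffusion-forcing idea of assigning each token its own noise level: observed tokens stay clean and tokens to be generated are fully noised. Everything here is an assumption made for illustration and is not taken from the paper: the function names `noise_level_mask` and `noise_tokens`, the task labels, the reading of "partner inpainting" as "one agent observed, the others generated", and the simple linear blend used in place of a real noise schedule.

```python
import numpy as np

def noise_level_mask(task, num_agents, num_steps, context_steps, max_level=1.0):
    """Return a (num_agents, num_steps) array of per-token noise levels.

    0.0 marks a clean (observed) token; max_level marks a token to be generated.
    The task names and their masks are hypothetical, not the paper's definitions.
    """
    levels = np.full((num_agents, num_steps), max_level, dtype=np.float32)
    if task == "joint_prediction":
        # Assumed: every agent's past is observed, every agent's future is generated.
        levels[:, :context_steps] = 0.0
    elif task == "partner_inpainting":
        # Assumed: agent 0 is observed over the whole window, the others are generated.
        levels[0, :] = 0.0
    else:
        raise ValueError(f"unknown task: {task}")
    return levels

def noise_tokens(x, levels, rng):
    """Blend clean tokens with Gaussian noise according to per-token levels.

    x has shape (num_agents, num_steps, feature_dim). The linear blend is a
    stand-in for a real diffusion noise schedule.
    """
    eps = rng.standard_normal(x.shape).astype(x.dtype)
    k = levels[..., None]  # broadcast each token's level over its features
    return (1.0 - k) * x + k * eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    motion = np.zeros((2, 6, 4), dtype=np.float32)  # 2 agents, 6 steps, 4 features
    mask = noise_level_mask("joint_prediction", 2, 6, context_steps=3)
    noisy = noise_tokens(motion, mask, rng)
    print(mask)
    print(noisy.shape)
```

Under these assumptions, switching tasks only changes which entries of the mask are zero; the denoiser itself would be unchanged.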

Taxonomy

Topics: Human Motion and Animation · Generative Adversarial Networks and Image Synthesis · Social Robot Interaction and HRI