InTraGen: Trajectory-controlled Video Generation for Object Interactions
Zuhao Liu, Aleksandar Yanev, Ahmad Mahmood, Ivan Nikolov, Saman Motamed, Wei-Shi Zheng, Xi Wang, Lei Sun, Luc Van Gool, Danda Pani Paudel

TL;DR
InTraGen is a pipeline for trajectory-controlled video generation of realistic object interactions, introduced together with four new datasets and a trajectory quality metric for evaluating interaction quality.
Contribution
The paper presents InTraGen, a pipeline for trajectory-based generation of object interaction videos, along with four new datasets and a novel trajectory quality metric for evaluating it.
Findings
Improved visual fidelity in generated videos.
Introduction of four new datasets for evaluation.
Enhanced quantitative performance in object interaction scenarios.
Abstract
Advances in video generation have significantly improved the realism and quality of created scenes. This has fueled interest in developing intuitive tools that let users leverage video generation as world simulators. Text-to-video (T2V) generation is one such approach, enabling video creation from text descriptions only. Yet, due to the inherent ambiguity in texts and the limited temporal information offered by text prompts, researchers have explored additional control signals, like trajectory-guided systems, for more accurate T2V generation. Nonetheless, methods to evaluate whether T2V models can generate realistic interactions between multiple objects are lacking. We introduce InTraGen, a pipeline for improved trajectory-based generation of object interaction scenarios. We propose 4 new datasets and a novel trajectory quality metric to evaluate the performance of the proposed InTraGen.…
