Spatial-Temporal State Propagation Autoregressive Model for 4D Object Generation
Liying Yang, Jialun Liu, Jiakui Hu, Chenhao Guan, Haibin Huang, Fangqiu Yi, Chi Zhang, Yanyan Liang

TL;DR
This paper introduces 4DSTAR, an autoregressive model that generates 4D objects with spatial-temporal consistency by propagating states across time and space, achieving performance competitive with diffusion-based methods.
Contribution
The paper proposes an autoregressive framework that combines dynamic spatial-temporal state propagation with a 4D VQ-VAE for coherent 4D object generation, targeting the spatial-temporal inconsistency of existing diffusion-based methods.
Findings
Achieves spatial-temporal consistency in 4D object generation.
Demonstrates performance competitive with diffusion models.
Introduces a dynamic state propagation mechanism for long-term dependencies.
Abstract
Generating high-quality 4D objects with spatial-temporal consistency remains a formidable challenge. Existing diffusion-based methods often struggle with spatial-temporal inconsistency, as they fail to leverage outputs from all previous timesteps to guide generation at the current timestep. We therefore propose a Spatial-Temporal State Propagation AutoRegressive Model (4DSTAR), which generates 4D objects while maintaining spatial-temporal consistency. 4DSTAR formulates the generation problem as the prediction of tokens that represent the 4D object. It consists of two key components: (1) A dynamic spatial-temporal state propagation autoregressive model (STAR), which achieves spatially and temporally consistent generation. Unlike standard autoregressive models, STAR divides prediction tokens into groups based on timesteps. It models long-term dependencies by propagating spatial-temporal states…
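The abstract is truncated before it specifies how states are propagated, so the following is only a toy sketch of the general idea it describes: tokens are split into per-timestep groups, and a running state summarizing earlier groups is carried forward to condition the next one. The function names, the fixed number of tokens per timestep, and the running-sum update are all hypothetical placeholders, not the paper's method.

```python
def group_tokens_by_timestep(tokens, tokens_per_timestep):
    """Split a flat token sequence into consecutive per-timestep groups.

    Assumption: every timestep contributes the same number of tokens.
    """
    if tokens_per_timestep <= 0:
        raise ValueError("tokens_per_timestep must be positive")
    if len(tokens) % tokens_per_timestep != 0:
        raise ValueError("token count must be a multiple of tokens_per_timestep")
    return [
        tokens[i:i + tokens_per_timestep]
        for i in range(0, len(tokens), tokens_per_timestep)
    ]


def propagate_states(groups, update_state, initial_state):
    """Visit groups in time order, carrying a running state forward.

    Returns the state available *before* each group is processed (the
    summary of all earlier timesteps that would condition that group),
    plus the final state after the last group.
    """
    state = initial_state
    conditioning = []
    for group in groups:
        conditioning.append(state)
        state = update_state(state, group)
    return conditioning, state


# Toy usage: six tokens, two per timestep, with a running sum standing in
# for whatever learned state update a real model would use.
tokens = list(range(6))
groups = group_tokens_by_timestep(tokens, tokens_per_timestep=2)
conditioning, final_state = propagate_states(
    groups, lambda state, group: state + sum(group), initial_state=0
)
```

In this sketch each group sees a summary of every earlier timestep rather than only the immediately preceding one, which is the property the abstract contrasts with diffusion-based methods.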
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Face recognition and analysis
