Autoregressive Appearance Prediction for 3D Gaussian Avatars

Michael Steiner; Zhang Chen; Alexander Richard; Vasu Agrawal; Markus Steinberger; Michael Zollhöfer

arXiv:2604.00928 · cs.CV · April 2, 2026


TL;DR

This paper introduces a 3D Gaussian Splatting avatar model conditioned on both pose and an appearance latent. The latent is learned by an encoder during training and inferred autoregressively at driving time, yielding stable, high-fidelity renderings with temporally smooth appearance changes.

Contribution

It presents an autoregressive appearance prediction method: a spatial MLP backbone is conditioned on pose and a learned appearance latent, and a predictor infers that latent autoregressively at driving time, which improves avatar stability and realism in pose-driven rendering.

Findings

01. The learned appearance latent improves reconstruction quality.

02. Autoregressive latent prediction yields temporally stable, smooth avatar appearance.

03. The model renders high-fidelity avatars robustly under novel driving poses.

Abstract

A photorealistic and immersive human avatar experience demands capturing fine, person-specific details such as cloth and hair dynamics, subtle facial expressions, and characteristic motion patterns. Achieving this requires large, high-quality datasets, which often introduce ambiguities and spurious correlations when very similar poses correspond to different appearances. Models that fit these details during training can overfit and produce unstable, abrupt appearance changes for novel poses. We propose a 3D Gaussian Splatting avatar model with a spatial MLP backbone that is conditioned on both pose and an appearance latent. The latent is learned during training by an encoder, yielding a compact representation that improves reconstruction quality and helps disambiguate pose-driven renderings. At driving time, our predictor autoregressively infers the latent, producing temporally smooth…
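
The driving-time loop described in the abstract can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the authors' implementation: the layer sizes, the random untrained weights, and the names `init_mlp`, `mlp`, `rollout`, and `decode_colors` are all hypothetical, and the decoder here outputs a single RGB value per Gaussian rather than full Gaussian parameters. It shows the two ideas the abstract names: a predictor that infers each frame's latent from the current pose and the previous latent, and a spatial MLP that maps per-Gaussian position, pose, and latent to appearance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these values come from the paper.
POSE_DIM, LATENT_DIM, HIDDEN = 6, 4, 16


def init_mlp(sizes):
    """Random weights for a small MLP; a stand-in for trained parameters."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(params, x):
    """Forward pass: tanh on hidden layers, linear output layer."""
    for w, b in params[:-1]:
        x = np.tanh(x @ w + b)
    w, b = params[-1]
    return x @ w + b


# Hypothetical predictor: (pose_t, z_{t-1}) -> z_t
predictor = init_mlp([POSE_DIM + LATENT_DIM, HIDDEN, LATENT_DIM])
# Hypothetical spatial decoder: (xyz, pose, z) -> RGB per Gaussian
decoder = init_mlp([3 + POSE_DIM + LATENT_DIM, HIDDEN, 3])


def rollout(poses, z0):
    """Autoregressively infer one latent per frame from pose and previous latent."""
    z, latents = z0, []
    for pose in poses:
        z = mlp(predictor, np.concatenate([pose, z]))
        latents.append(z)
    return np.stack(latents)


def decode_colors(positions, pose, z):
    """Per-Gaussian appearance conditioned on position, pose, and latent."""
    cond = np.broadcast_to(np.concatenate([pose, z]),
                           (positions.shape[0], POSE_DIM + LATENT_DIM))
    return mlp(decoder, np.concatenate([positions, cond], axis=1))


poses = rng.normal(size=(5, POSE_DIM))      # 5 driving frames
positions = rng.normal(size=(100, 3))       # 100 Gaussian centers
latents = rollout(poses, np.zeros(LATENT_DIM))
colors = decode_colors(positions, poses[-1], latents[-1])
```

Because each latent depends on the previous one, two frames with near-identical poses can still decode to different appearances depending on the motion history, which is one plausible reading of how the latent helps disambiguate pose-driven renderings.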
