3D Gaussian and Diffusion-Based Gaze Redirection
Abiram Panchalingam, Indu Bodala, Stuart Middleton

TL;DR
DiT-Gaze combines a Diffusion Transformer (DiT), weak supervision across gaze angles, and an orthogonality constraint loss to improve the quality and accuracy of 3D gaze redirection, which in turn yields better synthetic data for training gaze estimators.
Contribution
The paper presents DiT-Gaze, a new approach that enhances 3D gaze redirection by integrating diffusion models, weak supervision, and disentanglement constraints, achieving state-of-the-art results.
Findings
Reduces gaze error by 4.1%, to 6.353 degrees.
Achieves higher perceptual quality in gaze redirection.
Provides a superior synthetic data generation method.
Abstract
High-fidelity gaze redirection is critical for generating augmented data to improve the generalization of gaze estimators. 3D Gaussian Splatting (3DGS) models like GazeGaussian represent the state-of-the-art but can struggle with rendering subtle, continuous gaze shifts. In this paper, we propose DiT-Gaze, a framework that enhances 3D gaze redirection models using a novel combination of Diffusion Transformer (DiT), weak supervision across gaze angles, and an orthogonality constraint loss. DiT allows higher-fidelity image synthesis, while our weak supervision strategy using synthetically generated intermediate gaze angles provides a smooth manifold of gaze directions during training. The orthogonality constraint loss mathematically enforces the disentanglement of internal representations for gaze, head pose, and expression. Comprehensive experiments show that DiT-Gaze sets a new…
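The abstract names two training ingredients without giving their formulas: an orthogonality constraint over the gaze, head pose, and expression representations, and weak supervision from intermediate gaze angles. The sketch below is one plausible reading of each, not the authors' implementation. The function names, the squared-cosine penalty, and the linear interpolation of (pitch, yaw) pairs are all assumptions made for illustration.

```python
import numpy as np


def orthogonality_loss(gaze, head, expr, eps=1e-8):
    """Hypothetical penalty: mean squared cosine similarity over the three
    pairs of embeddings. Each input has shape (batch, dim). The value is 0
    when the three embeddings of every sample are mutually orthogonal."""
    def unit(x):
        # L2-normalise each row; eps guards against division by zero.
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

    g, h, e = unit(gaze), unit(head), unit(expr)
    pairs = [(g, h), (g, e), (h, e)]
    # Row-wise dot product of unit vectors is the cosine similarity.
    per_pair = [np.mean(np.sum(a * b, axis=1) ** 2) for a, b in pairs]
    return float(np.mean(per_pair))


def intermediate_gaze(src, tgt, steps):
    """Hypothetical helper: linearly interpolate (pitch, yaw) between a
    source and a target gaze, returning `steps` interior points with the
    endpoints excluded. Linear interpolation is an assumption; the abstract
    does not say how intermediate angles are generated."""
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    t = np.linspace(0.0, 1.0, steps + 2)[1:-1, None]
    return (1.0 - t) * src + t * tgt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gaze, head, expr = (rng.normal(size=(4, 16)) for _ in range(3))
    print("orthogonality loss:", orthogonality_loss(gaze, head, expr))
    print("intermediate angles:\n", intermediate_gaze([0.0, 0.0], [0.4, -0.2], 3))
```

In a training loop, a term like `orthogonality_loss` would typically be added to the reconstruction objective with a small weight, and the interpolated angles would serve as extra redirection targets that have no ground-truth image.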
Taxonomy
Topics: Gaze Tracking and Assistive Technology · Visual Attention and Saliency Detection · Vestibular and auditory disorders
