GATS: Gaussian Aware Temporal Scaling Transformer for Invariant 4D Spatio-Temporal Point Cloud Representation
Jiayi Tian, Jiaze Wang

TL;DR
GATS introduces a dual invariant framework combining Gaussian aware temporal scaling and uncertainty-guided convolution to improve robustness and invariance in 4D point cloud video understanding.
Contribution
The paper proposes GATS, a novel approach that explicitly addresses distributional inconsistencies and temporal biases in 4D point cloud videos, enhancing robustness and invariance.
Findings
Improves MSR-Action3D accuracy by 6.62%
Improves NTU RGBD accuracy by 1.4%
Improves Synthia4D mIoU by 1.8%
Abstract
Understanding 4D point cloud videos is essential for enabling intelligent agents to perceive dynamic environments. However, temporal scale bias across varying frame rates and distributional uncertainty in irregular point clouds make it highly challenging to design a unified and robust 4D backbone. Existing CNN- or Transformer-based methods are constrained either by limited receptive fields or by quadratic computational complexity, and they neglect these implicit distortions. To address this problem, we propose a novel dual invariant framework, termed Gaussian Aware Temporal Scaling (GATS), which explicitly resolves both distributional inconsistencies and temporal scale bias. The proposed Uncertainty Guided Gaussian Convolution (UGGC) incorporates local Gaussian statistics and uncertainty-aware gating into point convolution, thereby achieving robust neighborhood aggregation under…
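The abstract is cut off before it gives any formulas, so the following is only a rough illustration of the general idea it names (neighborhood aggregation that uses local Gaussian statistics and an uncertainty gate), not the authors' UGGC. Everything here is an assumption made for the sketch: the function names `knn_indices` and `gaussian_gated_aggregate`, the diagonal Gaussian fitted to each k-nearest-neighbor patch, and the choice of `1 / (1 + total variance)` as the gate.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (each point includes itself)."""
    # Pairwise squared Euclidean distances, shape (N, N).
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, :k]


def gaussian_gated_aggregate(points, feats, k=8, eps=1e-6):
    """Hypothetical sketch: Gaussian-weighted neighbor averaging with a variance-based gate.

    points: (N, 3) coordinates, feats: (N, C) per-point features.
    Returns an (N, C) array of aggregated features.
    """
    idx = knn_indices(points, k)                     # (N, k)
    nbr_xyz = points[idx]                            # (N, k, 3)
    nbr_feat = feats[idx]                            # (N, k, C)

    # Local Gaussian statistics of each neighborhood (diagonal covariance, an assumption).
    mu = nbr_xyz.mean(axis=1, keepdims=True)         # (N, 1, 3)
    var = nbr_xyz.var(axis=1, keepdims=True) + eps   # (N, 1, 3)

    # Weight each neighbor by its Gaussian likelihood under the local fit.
    m2 = np.sum((nbr_xyz - mu) ** 2 / var, axis=-1)  # squared Mahalanobis distance, (N, k)
    w = np.exp(-0.5 * m2)
    w = w / w.sum(axis=1, keepdims=True)             # normalize weights per neighborhood

    agg = np.sum(w[..., None] * nbr_feat, axis=1)    # (N, C)

    # Assumed uncertainty measure: total positional variance of the neighborhood.
    # High variance -> small gate -> rely more on the point's own feature.
    uncertainty = var.sum(axis=-1)                   # (N, 1)
    gate = 1.0 / (1.0 + uncertainty)                 # values in (0, 1)

    return gate * agg + (1.0 - gate) * feats


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(64, 3))
    x = rng.normal(size=(64, 16))
    print(gaussian_gated_aggregate(pts, x).shape)
```

Because the neighbor weights are normalized and the gate forms a convex combination, the output stays within the range of the input features. A learned version would replace the fixed gate and weights with trainable functions of the same statistics.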
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Advanced Vision and Imaging
