STS-Mixer: Spatio-Temporal-Spectral Mixer for 4D Point Cloud Video Understanding

Wenhao Li; Xueying Jiang; Gongjie Zhang; Xiaoqin Zhang; Ling Shao; Shijian Lu

arXiv:2604.11637 · cs.CV · April 14, 2026

TL;DR

STS-Mixer introduces a spectral domain approach to 4D point cloud video understanding, decomposing signals into frequency bands to better capture geometric and dynamic scene information.

Contribution

It proposes a novel spectral analysis framework and a unified spatio-temporal-spectral mixer for improved 4D point cloud video understanding.

Findings

01. Achieves superior performance on 3D action recognition benchmarks.

02. Outperforms existing methods in 4D semantic segmentation.

03. Captures coarse shapes in low-frequency bands and fine-grained geometric details in high-frequency bands through spectral decomposition.

Abstract

4D point cloud videos capture rich spatial and temporal dynamics of scenes, which makes them uniquely valuable for various 4D understanding tasks. However, most existing methods work in the spatiotemporal domain, where the underlying geometric characteristics of 4D point cloud videos are hard to capture, leading to degraded representation learning and understanding. We address this challenge from a complementary spectral perspective. By transforming 4D point cloud videos into graph spectral signals, we can decompose them into multiple frequency bands, each of which captures distinct geometric structures of point cloud videos. Our spectral analysis reveals that the decomposed low-frequency signals capture coarser shapes, while high-frequency signals encode more fine-grained geometric details. Building on these observations, we design Spatio-Temporal-Spectral Mixer…
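
The band decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single frame of points, an unweighted k-nearest-neighbour graph, the combinatorial Laplacian, and equal-size bands split by eigenvalue order. The function names `knn_adjacency` and `spectral_bands`, and the parameters `k` and `num_bands`, are hypothetical choices made for illustration only.

```python
import numpy as np


def knn_adjacency(points, k=8):
    """Symmetric, unweighted k-nearest-neighbour adjacency (assumed graph construction)."""
    n = points.shape[0]
    # Pairwise squared Euclidean distances between all points.
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)           # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]    # k closest neighbours per point
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)          # symmetrise so the Laplacian is symmetric


def spectral_bands(points, num_bands=3, k=8):
    """Split point coordinates into graph-frequency bands, lowest frequencies first."""
    adj = knn_adjacency(points, k)
    lap = np.diag(adj.sum(axis=1)) - adj   # combinatorial Laplacian L = D - W
    eigvals, eigvecs = np.linalg.eigh(lap) # eigenvalues in ascending order
    coeffs = eigvecs.T @ points            # graph Fourier transform of the coordinates
    bands = []
    # Hypothetical banding rule: equal-size groups of consecutive eigenvectors.
    for idx in np.array_split(np.arange(len(eigvals)), num_bands):
        bands.append(eigvecs[:, idx] @ coeffs[idx])  # inverse transform of one band
    return bands


# Illustrative input: one synthetic frame of 128 points in 3D.
frame = np.random.default_rng(0).normal(size=(128, 3))
bands = spectral_bands(frame)
```

Because the eigenvectors form an orthonormal basis, the bands sum back to the original coordinates. Under these assumptions, the first band holds the smoothest components of the signal, which corresponds to the coarse shape described in the abstract, and the last band holds the fastest-varying components.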

Code & Models

Repositories

Vegetebird/STS-Mixer (GitHub)
