CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner
Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou

TL;DR
CrossGLG introduces a novel approach that leverages high-level text descriptions from large language models to guide skeleton-based 3D action recognition, improving accuracy and generalization in a cross-level manner.
Contribution
The paper proposes a global-local-guided framework using LLM-generated text to enhance feature learning in one-shot skeleton-based action recognition, with a dual-branch architecture for efficient inference.
Findings
Outperforms state-of-the-art methods on three benchmarks.
Achieves significant accuracy improvements with minimal inference cost.
Can be integrated with existing skeleton encoders to boost performance.
Abstract
Most existing one-shot skeleton-based action recognition methods focus on raw low-level information (e.g., joint locations) and may suffer from local information loss and low generalization ability. To alleviate these issues, we propose to leverage text descriptions generated by large language models (LLMs), which contain high-level human knowledge, to guide feature learning in a global-local-global way. In particular, during training, we design prompts to obtain global and local text descriptions of each action from an LLM. We first use the global text description to guide the skeleton encoder to focus on informative joints (i.e., global-to-local). Then we build non-local interaction between local text and joint features to form the final global representation (i.e., local-to-global). To mitigate the asymmetry issue between the training and inference phases, we further design a dual-branch…
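The two guidance steps named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the guidance or the non-local interaction is computed, so scaled dot-product attention is assumed here, and the function names (global_to_local, local_to_global) and the toy embeddings are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def global_to_local(joint_feats, global_text):
    """Assumed global-to-local step: weight each joint feature by its
    similarity to a single global text embedding."""
    d = len(global_text)
    scores = [dot(j, global_text) / math.sqrt(d) for j in joint_feats]
    weights = softmax(scores)
    weighted = [[w * x for x in j] for w, j in zip(weights, joint_feats)]
    return weighted, weights


def local_to_global(joint_feats, local_texts):
    """Assumed local-to-global step: each local text embedding attends over
    all joints (non-local interaction); the attended vectors are averaged
    into one global representation."""
    d = len(joint_feats[0])
    attended = []
    for t in local_texts:
        attn = softmax([dot(t, j) / math.sqrt(d) for j in joint_feats])
        attended.append(
            [sum(a * j[k] for a, j in zip(attn, joint_feats)) for k in range(d)]
        )
    return [sum(v[k] for v in attended) / len(attended) for k in range(d)]


# Toy example: 3 joints with 2-dimensional features (made-up values).
joints = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_text = [1.0, 1.0]                  # hypothetical global description embedding
local_texts = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical per-part description embeddings

weighted, weights = global_to_local(joints, global_text)
representation = local_to_global(weighted, local_texts)
print(weights)
print(representation)
```

In this toy setup the third joint aligns best with the global text embedding, so it receives the largest weight before the local text embeddings aggregate the joints into a single vector. A real system would use learned text and skeleton encoders in place of these hand-written vectors.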
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis
Methods: Focus · Balanced Selection
