LORTSAR: Low-Rank Transformer for Skeleton-based Action Recognition

Soroush Oraki; Harry Zhuang; Jie Liang

arXiv:2407.14655 · cs.CV · July 23, 2024

TL;DR

LORTSAR applies SVD-based low-rank approximation followed by fine-tuning to shrink pre-trained Transformer models for skeleton-based action recognition, maintaining or improving accuracy on the NTU RGB+D and NTU RGB+D 120 datasets.

Contribution

This paper introduces LORTSAR, a low-rank compression method that applies SVD to pre-trained Transformer models (Hyperformer and STEP-CATFormer) and then fine-tunes them, reducing model size while preserving or improving recognition accuracy.

Findings

01. Substantial parameter reduction with negligible accuracy loss.

02. Performance improvement on NTU RGB+D datasets.

03. Enhanced model efficiency and sustainability.
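
To see where the parameter reduction in the first finding comes from, the following worked count may help. It is a generic illustration of truncated SVD applied to one weight matrix; the symbols m, n, r and the example sizes are assumptions made here, not values taken from the paper.

```latex
% Truncated SVD of a single weight matrix (illustrative sizes, not from the paper)
W \in \mathbb{R}^{m \times n}, \qquad
W \approx U_r \Sigma_r V_r^{\top}, \qquad
U_r \in \mathbb{R}^{m \times r},\;
\Sigma_r \in \mathbb{R}^{r \times r},\;
V_r \in \mathbb{R}^{n \times r}

% Storing the two factors (U_r \Sigma_r) and V_r^{\top} instead of W:
\underbrace{mn}_{\text{dense}} \;\longrightarrow\; \underbrace{r(m+n)}_{\text{factored}},
\qquad \text{which is smaller whenever } r < \frac{mn}{m+n}

% Example: m = n = 256,\; r = 64
256 \cdot 256 = 65536 \;\longrightarrow\; 64 \cdot (256 + 256) = 32768
\quad (\text{a } 50\% \text{ reduction for that layer})
```

In this example the break-even rank is 128, so any rank below that saves parameters; how much accuracy is lost at a given rank depends on how quickly the singular values of the layer decay.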

Abstract

The complexity of state-of-the-art Transformer-based models for skeleton-based action recognition poses significant challenges in terms of computational efficiency and resource utilization. In this paper, we explore the application of Singular Value Decomposition (SVD) to effectively reduce the model sizes of these pre-trained models, aiming to minimize their resource consumption while preserving accuracy. Our method, LORTSAR (LOw-Rank Transformer for Skeleton-based Action Recognition), also includes a fine-tuning step to compensate for any potential accuracy degradation caused by model compression, and is applied to two leading Transformer-based models, "Hyperformer" and "STEP-CATFormer". Experimental results on the "NTU RGB+D" and "NTU RGB+D 120" datasets show that our method can reduce the number of model parameters substantially with negligible degradation or even performance…
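
The SVD step described in the abstract can be sketched for a single linear layer as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `low_rank_factors`, the matrix sizes, the chosen rank, and the synthetic weight matrix are all assumptions made for the example, and the fine-tuning step is not shown.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Split `weight` (out x in) into A (out x rank) and B (rank x in).

    Uses a truncated SVD, so A @ B is the best rank-`rank`
    approximation of `weight` in the Frobenius norm.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # absorb singular values into the left factor
    b = vt[:rank, :]
    return a, b


rng = np.random.default_rng(0)

# Hypothetical weight matrix: close to rank 8, plus a little noise.
w = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 48))
w += 0.01 * rng.standard_normal((64, 48))

a, b = low_rank_factors(w, rank=8)

# Relative Frobenius error of the rank-8 approximation.
rel_err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)

# The dense layer y = w @ x becomes two smaller layers y = a @ (b @ x).
x = rng.standard_normal(48)
y_dense = w @ x
y_factored = a @ (b @ x)
```

Replacing one dense layer with the two factors keeps the layer's input and output shapes unchanged, which is why a compressed model can be fine-tuned afterwards with the same training setup to recover any lost accuracy.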

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections