MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition

Weichao Zhao; Hezhen Hu; Wengang Zhou; Yunyao Mao; Min Wang; Houqiang Li

arXiv:2405.20666·cs.CV·June 3, 2024

TL;DR

MASA introduces a self-supervised learning framework for sign language recognition that explicitly models motion cues and aligns global semantic information, significantly improving representation capabilities and achieving state-of-the-art results.

Contribution

The paper proposes a novel MASA framework combining motion-aware masked autoencoding and semantic alignment for enhanced sign language recognition.

Findings

01. Achieves state-of-the-art performance on four benchmarks.

02. Effectively models dynamic motion cues in sign sequences.

03. Enhances global semantic understanding in sign language recognition.

Abstract

Sign language recognition (SLR) has long been plagued by insufficient model representation capabilities. Although current pre-training approaches have alleviated this dilemma to some extent and yielded promising performance by employing various pretext tasks on sign pose data, these methods still suffer from two primary limitations: 1) Explicit motion information is usually disregarded in previous pretext tasks, leading to partial information loss and limited representation capability. 2) Previous methods focus on the local context of a sign pose sequence, without incorporating the guidance of the global meaning of lexical signs. To this end, we propose a Motion-Aware masked autoencoder with Semantic Alignment (MASA) that integrates rich motion cues and global semantic information in a self-supervised learning paradigm for SLR. Our framework contains two crucial components, i.e., a…
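
The abstract is cut off before it names the framework's components, so the following is only a rough illustration of the general idea it describes: masking part of a sign pose sequence and exposing explicit motion as a training signal. It is a minimal sketch, not the authors' method. It assumes that "motion" means the frame-to-frame displacement of 2D keypoints and that masking zeroes out whole frames; the function names `frame_motion` and `mask_frames` are hypothetical.

```python
import random


def frame_motion(poses):
    """Return per-joint displacement between consecutive frames.

    poses: list of frames, each a list of (x, y) keypoints.
    The result has len(poses) - 1 entries.
    Assumption: motion is a simple temporal difference.
    """
    return [
        [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, nxt)]
        for prev, nxt in zip(poses, poses[1:])
    ]


def mask_frames(poses, ratio, rng):
    """Zero out a random subset of frames (hypothetical masking scheme).

    Returns the masked copy and the sorted indices of the masked frames.
    """
    num_masked = int(len(poses) * ratio)
    masked_idx = sorted(rng.sample(range(len(poses)), num_masked))
    masked = [list(frame) for frame in poses]
    for t in masked_idx:
        masked[t] = [(0.0, 0.0)] * len(poses[t])
    return masked, masked_idx


# Toy sequence: 4 frames, 2 joints.
# Joint 0 accelerates along x; joint 1 drifts along y at constant speed.
poses = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(1.0, 0.0), (1.0, 2.0)],
    [(3.0, 0.0), (1.0, 3.0)],
    [(6.0, 0.0), (1.0, 4.0)],
]
motion = frame_motion(poses)
masked, masked_idx = mask_frames(poses, ratio=0.5, rng=random.Random(0))
```

Under these assumptions, a model would receive `masked` as input and be trained to recover the original keypoints, the entries of `motion`, or both at the positions in `masked_idx`. The actual masking strategy and reconstruction targets used by MASA are described in the paper, not here.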

Code & Models

Repositories

sakura2233565548/masa (PyTorch, official)

Taxonomy

Topics: Hand Gesture Recognition Systems · Gait Recognition and Analysis · Human Pose and Action Recognition

Methods: Focus · Surrogate Lagrangian Relaxation