CanonSLR: Canonical-View Guided Multi-View Continuous Sign Language Recognition

Xu Wang; Shengeng Tang; Wan Jiang; Yaxiong Wang; Lechao Cheng; Richang Hong

arXiv:2604.18184 · cs.CV · April 21, 2026

TL;DR

CanonSLR introduces a multi-view CSLR framework using canonical-view guidance, teacher-student learning, and motion modeling to improve robustness across viewpoints in sign language recognition.

Contribution

The paper proposes a multi-view CSLR approach that combines a frontal-view-anchored teacher-student strategy, Sequence-Level Soft-Target Distillation to reduce cross-view semantic discrepancy, motion modeling, and new multi-view benchmarks.

Findings

01 Outperforms existing methods on multi-view benchmarks.

02 Shows increased robustness to non-frontal viewpoints.

03 Provides a new multi-view sign language dataset pipeline.

Abstract

Continuous Sign Language Recognition (CSLR) has achieved remarkable progress in recent years; however, most existing methods are developed under single-view settings and thus remain insufficiently robust to viewpoint variations in real-world scenarios. To address this limitation, we propose CanonSLR, a canonical-view guided framework for multi-view CSLR. Specifically, we introduce a frontal-view-anchored teacher-student learning strategy, in which a teacher network trained on frontal-view data provides canonical temporal supervision for a student network trained on all viewpoints. To further reduce cross-view semantic discrepancy, we propose Sequence-Level Soft-Target Distillation, which transfers structured temporal knowledge from the frontal view to non-frontal samples, thereby alleviating gloss boundary ambiguity and category confusion caused by occlusion and projection variation. In…
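
The abstract names Sequence-Level Soft-Target Distillation but does not give its formula here. As a rough illustration only, the sketch below uses a standard temperature-scaled soft-target loss averaged over the frames of a sequence: a frontal-view teacher and a multi-view student each produce per-frame gloss logits, and the student is penalized by the KL divergence from the teacher's softened distribution. The function names, the `temperature` parameter, and the choice of per-frame KL are assumptions for illustration, not the authors' definition.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one frame's gloss logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_soft_target_loss(teacher_logits, student_logits, temperature=2.0):
    """Mean per-frame KL(teacher || student) with temperature scaling.

    Hypothetical stand-in for a sequence-level soft-target loss; the
    paper's exact formulation is not stated in the abstract.
    Both inputs are lists of frames, each frame a list of gloss logits.
    """
    if not teacher_logits or len(teacher_logits) != len(student_logits):
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for t_frame, s_frame in zip(teacher_logits, student_logits):
        p = softmax(t_frame, temperature)  # softened teacher distribution
        q = softmax(s_frame, temperature)  # softened student distribution
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    # The T^2 factor is the usual gradient-scale correction in distillation.
    return (temperature ** 2) * total / len(teacher_logits)


# Toy example: 2 frames, 3 gloss classes (values are made up).
frontal_teacher = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]
side_view_student = [[1.0, 1.0, 0.0], [0.5, 1.5, 0.5]]
loss = sequence_soft_target_loss(frontal_teacher, side_view_student)
```

In this sketch the loss is zero when the student matches the teacher on every frame and grows as the two distributions diverge. In a real training setup it would typically be added to the recognition loss for non-frontal samples, though how CanonSLR weights or aligns the terms is not stated in the text above.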
