SF-Net: Structured Feature Network for Continuous Sign Language Recognition
Zhaoyang Yang, Zhenmei Shi, Xiaoyong Shen, Yu-Wing Tai

TL;DR
SF-Net is a structured feature learning framework for continuous sign language recognition. It encodes semantic information at the frame, gloss, and sentence levels, trains end-to-end without pre-training, and is reported to outperform previous methods.
Contribution
The paper introduces SF-Net, a new end-to-end trainable model that encodes features at multiple semantic levels for improved sign language recognition.
Findings
Outperforms previous sequence supervision methods in accuracy
Effective in learning multi-level semantic features
Demonstrates adaptability across two large-scale public SLR datasets collected from different scenarios
Abstract
Continuous sign language recognition (SLR) aims to translate a signing sequence into a sentence. It is very challenging because sign language has a rich vocabulary, and many signs contain similar gestures and motions. Moreover, it is weakly supervised, as the alignment of signing glosses is not available. In this paper, we propose Structured Feature Network (SF-Net) to address these challenges by effectively learning multiple levels of semantic information in the data. The proposed SF-Net extracts features in a structured manner and gradually encodes information at the frame level, the gloss level and the sentence level into the feature representation. The proposed SF-Net can be trained end-to-end without the help of other models or pre-training. We tested the proposed SF-Net on two large-scale public SLR datasets collected from different continuous SLR scenarios. Results show that the…
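The frame, gloss, and sentence levels described in the abstract can be pictured with a toy sketch. This is not the authors' architecture: SF-Net learns its features with trained networks, whereas the code below uses plain mean pooling only to show how information could be aggregated level by level. The function names, the window size, the stride, and the pooling itself are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def mean_vectors(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gloss_level(frames: List[Vector], window: int = 4, stride: int = 2) -> List[Vector]:
    """Pool short windows of frame features into gloss-level features.

    Assumption: a gloss spans a few consecutive frames, so a sliding
    window stands in for whatever the real model learns.
    """
    if len(frames) < window:
        return [mean_vectors(frames)]
    return [
        mean_vectors(frames[start:start + window])
        for start in range(0, len(frames) - window + 1, stride)
    ]


def sentence_level(glosses: List[Vector]) -> List[Vector]:
    """Append a whole-sequence summary to every gloss-level feature.

    Assumption: sentence context is modeled here as the mean over all
    gloss-level features, concatenated onto each one.
    """
    context = mean_vectors(glosses)
    return [gloss + context for gloss in glosses]


def structured_features(frames: List[Vector]) -> List[Vector]:
    """Frame level -> gloss level -> sentence level, in that order."""
    return sentence_level(gloss_level(frames))


# Ten hypothetical frames, each with a 2-dimensional feature vector.
frames = [[float(t), float(t % 2)] for t in range(10)]
features = structured_features(frames)
print(len(features), len(features[0]))
```

With ten frames, a window of 4, and a stride of 2, the sketch yields four gloss-level features, each extended from 2 to 4 dimensions by the sentence-level context. In a trained model each of these steps would be a learned layer rather than a fixed average.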
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Gait Recognition and Analysis
