SlowFast Network for Continuous Sign Language Recognition
Junseok Ahn, Youngjoon Jang, Joon Son Chung

TL;DR
This paper introduces a SlowFast network architecture with novel feature fusion methods for improved continuous sign language recognition, effectively capturing spatial and dynamic features at different temporal resolutions.
Contribution
The paper proposes a two-pathway SlowFast network with Bi-directional Feature Fusion (BFF) and Pathway Feature Enhancement (PFE) for better spatial and dynamic feature extraction in continuous sign language recognition (CSLR).
Findings
Outperforms prior state-of-the-art methods on the PHOENIX14, PHOENIX14-T, and CSL-Daily datasets.
Captures spatial and dynamic features in separate pathways that operate at different temporal resolutions.
Enriches feature representations through auxiliary subnetworks without increasing inference time.
Abstract
The objective of this work is the effective extraction of spatial and dynamic features for Continuous Sign Language Recognition (CSLR). To accomplish this, we utilise a two-pathway SlowFast network, where each pathway operates at distinct temporal resolutions to separately capture spatial (hand shapes, facial expressions) and dynamic (movements) information. In addition, we introduce two distinct feature fusion methods, carefully designed for the characteristics of CSLR: (1) Bi-directional Feature Fusion (BFF), which facilitates the transfer of dynamic semantics into spatial semantics and vice versa; and (2) Pathway Feature Enhancement (PFE), which enriches dynamic and spatial representations through auxiliary subnetworks, while avoiding the need for extra inference time. As a result, our model further strengthens spatial and dynamic representations in parallel. We demonstrate that the…
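The two-pathway idea and the bi-directional exchange described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the temporal stride `alpha`, the use of plain lists as per-frame feature vectors, the equal feature width in both pathways, and the choice of strided selection and nearest-neighbour repetition for fusion are all assumptions made for illustration only.

```python
from typing import List, Tuple

Feature = List[float]  # toy stand-in for one frame's feature vector


def split_pathways(frames: List[Feature], alpha: int = 4) -> Tuple[List[Feature], List[Feature]]:
    """Slow pathway keeps every alpha-th frame; Fast pathway keeps every frame."""
    slow = frames[::alpha]
    fast = list(frames)
    return slow, fast


def fast_to_slow(slow: List[Feature], fast: List[Feature], alpha: int = 4) -> List[Feature]:
    """Assumed fusion: subsample Fast features in time, then add them to Slow."""
    picked = fast[::alpha]
    return [[s + f for s, f in zip(sv, fv)] for sv, fv in zip(slow, picked)]


def slow_to_fast(slow: List[Feature], fast: List[Feature], alpha: int = 4) -> List[Feature]:
    """Assumed fusion: repeat each Slow feature alpha times, then add it to Fast."""
    return [[f + s for f, s in zip(fv, slow[t // alpha])] for t, fv in enumerate(fast)]


def bidirectional_fusion(slow: List[Feature], fast: List[Feature], alpha: int = 4):
    """Exchange information in both directions, using the pre-fusion features."""
    return fast_to_slow(slow, fast, alpha), slow_to_fast(slow, fast, alpha)


# Eight toy frames, each with a 2-dimensional feature equal to its frame index.
frames = [[float(t)] * 2 for t in range(8)]
slow, fast = split_pathways(frames, alpha=4)
fused_slow, fused_fast = bidirectional_fusion(slow, fast, alpha=4)
```

In this sketch the Slow pathway sees 2 of the 8 frames and the Fast pathway sees all 8. A real network would use learned layers for both pathways and for the fusion, and the two pathways would typically differ in channel width.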
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Gait Recognition and Analysis
