SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition
Tinh Nguyen, Minh Khue Phan Tran

TL;DR
This paper introduces SignBart, a novel skeleton sequence-based model for isolated sign language recognition that independently encodes x and y coordinates, achieving high accuracy with fewer parameters and demonstrating strong generalization across multiple datasets.
Contribution
The study presents a new approach that uses the BART encoder-decoder architecture to encode the x and y skeleton coordinates separately, with cross-attention relating the two, improving efficiency and accuracy over traditional models.
Findings
Achieves 96.04% accuracy on the LSA-64 dataset with only 749,888 parameters
Outperforms previous models that use over one million parameters
Demonstrates strong generalization across the WLASL and ASL-Citizen datasets
Abstract
Sign language recognition (SLR) is crucial for individuals with hearing impairments to break communication barriers. However, previous approaches have had to choose between efficiency and accuracy. Earlier models, such as RNNs, LSTMs, and GCNs, had problems with vanishing gradients and high computational costs. Despite improving performance, transformer-based methods were not commonly used. This study presents a novel SLR approach that overcomes the challenge of independently extracting meaningful information from the x and y coordinates of skeleton sequences, which traditional models often treat as inseparable. By utilizing the encoder-decoder of the BART architecture, the model independently encodes the x and y coordinates, while cross-attention ensures their interrelation is maintained. With only 749,888 parameters, the model achieves 96.04% accuracy on the LSA-64 dataset, significantly outperforming…
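The idea in the abstract, encoding the x and y coordinates separately and relating them through cross-attention, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The assignment of x to the encoder and y to the decoder, the single attention layer per side, the mean pooling over frames, the 75 keypoints, and every name here (`init_params`, `classify`, `proj_x`, `dec_cross`, and so on) are assumptions made for illustration. Layer normalization, dropout, feed-forward blocks, positional embeddings, and any decoder masking are omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q_in, kv_in, wq, wk, wv):
    # Single-head scaled dot-product attention.
    # Self-attention when q_in is kv_in; cross-attention otherwise.
    q, k, v = q_in @ wq, kv_in @ wk, kv_in @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def init_params(n_joints, d_model, n_classes, seed=0):
    # Hypothetical random weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    w = lambda *shape: rng.normal(0.0, 0.1, shape)
    qkv = lambda: (w(d_model, d_model), w(d_model, d_model), w(d_model, d_model))
    return {
        "proj_x": w(n_joints, d_model),   # embeds per-frame x coordinates
        "proj_y": w(n_joints, d_model),   # embeds per-frame y coordinates
        "enc_self": qkv(),
        "dec_self": qkv(),
        "dec_cross": qkv(),
        "head": w(d_model, n_classes),
    }

def classify(skeleton, params):
    # skeleton: (frames, joints, 2) array of keypoint coordinates.
    x, y = skeleton[..., 0], skeleton[..., 1]       # each (frames, joints)
    hx = x @ params["proj_x"]                       # (frames, d_model)
    hy = y @ params["proj_y"]
    # Assumed here: the x stream goes through the encoder side.
    enc = hx + attention(hx, hx, *params["enc_self"])
    # Assumed here: the y stream goes through the decoder side.
    dec = hy + attention(hy, hy, *params["dec_self"])
    # Cross-attention: y queries attend to the encoded x sequence.
    dec = dec + attention(dec, enc, *params["dec_cross"])
    # Assumed pooling: average over frames, then a linear classifier.
    return softmax(dec.mean(axis=0) @ params["head"])

# Usage with a random 30-frame clip and 64 classes (as in LSA-64).
params = init_params(n_joints=75, d_model=16, n_classes=64)
clip = np.random.default_rng(1).random((30, 75, 2))
probs = classify(clip, params)
```

The point of the sketch is the data flow: each coordinate axis gets its own embedding and self-attention, and the only place the two streams meet is the cross-attention step, which is how the x and y information stays related despite being encoded independently.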
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Face recognition and analysis
Methods: Layer Normalization · Dropout · Byte Pair Encoding · Softmax · Dense Connections · BART · Surrogate Lagrangian Relaxation
