CSF: Contrastive Semantic Features for Direct Multilingual Sign Language Generation
Tran Sy Bao

TL;DR
This paper introduces CSF, a language-agnostic semantic framework for direct multilingual sign language translation that bypasses English, using a lightweight transformer to extract semantic slots accurately across typologically diverse languages.
Contribution
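The paper proposes CSF, a language-agnostic semantic representation built on nine semantic slots and a condition taxonomy of 35 condition types in eight semantic categories, and demonstrates a lightweight (0.74 MB) transformer-based extractor that reaches 99.03% average slot extraction accuracy with real-time performance.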
Findings
Achieved 99.03% average slot extraction accuracy across four typologically diverse languages.
Demonstrated 99.4% accuracy on condition classification.
Enabled real-time sign language generation with 3.02 ms latency.
Abstract
Sign language translation systems typically require English as an intermediary language, creating barriers for non-English speakers in the global deaf community. We present Canonical Semantic Form (CSF), a language-agnostic semantic representation framework that enables direct translation from any source language to sign language without English mediation. CSF decomposes utterances into nine universal semantic slots: event, intent, time, condition, agent, object, location, purpose, and modifier. A key contribution is our comprehensive condition taxonomy comprising 35 condition types across eight semantic categories, enabling nuanced representation of conditional expressions common in everyday communication. We train a lightweight transformer-based extractor (0.74 MB) that achieves 99.03% average slot extraction accuracy across four typologically diverse languages: English, Vietnamese,…
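As a rough illustration of the nine-slot decomposition described in the abstract, the sketch below models a CSF frame as a plain Python dataclass. Only the nine slot names come from the abstract; the class name `CSFFrame`, the helper `filled_slots`, and every slot value shown are hypothetical placeholders, not the paper's actual schema, label set, or extractor output.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class CSFFrame:
    """Hypothetical container for the nine CSF slots named in the abstract."""
    event: Optional[str] = None
    intent: Optional[str] = None
    time: Optional[str] = None
    condition: Optional[str] = None
    agent: Optional[str] = None
    object: Optional[str] = None
    location: Optional[str] = None
    purpose: Optional[str] = None
    modifier: Optional[str] = None

    def filled_slots(self) -> Dict[str, str]:
        """Return only the slots that carry a value."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


# Slot names in declaration order, taken from the dataclass itself.
SLOT_NAMES = tuple(f.name for f in fields(CSFFrame))

# Made-up placeholder values for an utterance such as
# "If it rains tomorrow, I will stay home."
frame = CSFFrame(
    event="stay",
    agent="speaker",
    location="home",
    time="tomorrow",
    condition="rain",  # placeholder; not one of the paper's 35 condition types
)

print(SLOT_NAMES)
print(frame.filled_slots())
```

Leaving unfilled slots as `None` keeps the frame independent of the source language: an extractor for any input language could populate the same structure, which is the property the abstract attributes to CSF.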
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Interactive and Immersive Displays
