# An Intelligent Real-Time System for Sentence-Level Recognition of Continuous Saudi Sign Language Using Landmark-Based Temporal Modeling

**Authors:** Adel BenAbdennour, Mohammed Mukhtar, Osama Almolike, Bilal A. Khawaja, Abdulmajeed M. Alenezi

PMC · DOI: 10.3390/s26051652 · Sensors (Basel, Switzerland) · 2026-03-05

## TL;DR

This paper introduces a real-time system that recognizes continuous Saudi Sign Language (SSL) at the sentence level and maps it to spoken Arabic, using MediaPipe Holistic landmark features and a BiLSTM classifier.

## Contribution

The system classifies continuous SSL directly at the sentence level over a 50-sentence vocabulary, running in real time and reaching 94.2% mean accuracy under Leave-One-Signer-Out (LOSO) cross-validation.

## Key findings

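- A BiLSTM network on 225 landmark features reaches 94.2% mean sentence-level accuracy under LOSO cross-validation, compared with 92.07% on the fixed signer-independent split.
- The system runs in real time, and an idle-based segmentation mechanism allows natural, uninterrupted signing.
- An optional LLM-based refinement stage targets linguistic fluency; it runs strictly as post-processing and is excluded from the reported recognition accuracy.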
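According to the abstract, the 94.2% figure comes from a Leave-One-Signer-Out (LOSO) protocol: each signer is held out once as the test set and the per-fold accuracies are averaged. The following is a minimal sketch of that evaluation loop, not the authors' code. The names `loso_splits`, `loso_mean_accuracy`, and the `fit_predict` callback are hypothetical and stand in for whatever training routine is used.

```python
from collections import defaultdict
from statistics import mean


def loso_splits(signer_ids):
    """Yield (held_out_signer, train_indices, test_indices), one fold per signer."""
    by_signer = defaultdict(list)
    for idx, signer in enumerate(signer_ids):
        by_signer[signer].append(idx)
    for held_out in sorted(by_signer):
        test_idx = by_signer[held_out]
        train_idx = [i for s, idxs in by_signer.items() if s != held_out for i in idxs]
        yield held_out, train_idx, test_idx


def loso_mean_accuracy(signer_ids, labels, fit_predict):
    """Average sentence-level accuracy over all held-out signers.

    fit_predict(train_idx, test_idx) is a hypothetical callback that trains on
    train_idx and returns one predicted label per entry of test_idx.
    """
    fold_accuracies = []
    for _, train_idx, test_idx in loso_splits(signer_ids):
        predictions = fit_predict(train_idx, test_idx)
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        fold_accuracies.append(correct / len(test_idx))
    return mean(fold_accuracies), fold_accuracies
```

Because no sample from the held-out signer is ever seen during training, the averaged score reflects generalization to unseen signers rather than to unseen recordings of known signers.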

## Abstract

A persistent challenge for Deaf and Hard-of-Hearing individuals is the communication gap between sign language users and the hearing community, particularly in regions with limited automated translation resources. In Saudi Arabia, this gap is amplified by the reliance on Saudi Sign Language (SSL) and the scarcity of real-time, sentence-level translation systems. This paper presents a real-time system for sentence-level recognition of continuous SSL and direct mapping to natural spoken Arabic. The proposed system operates end-to-end on live video streams or pre-recorded content, extracting spatio-temporal landmark features using the MediaPipe Holistic framework. For classification, the input feature vector consists of 225 features derived from hand and body pose landmarks. These features are processed by a Bidirectional Long Short-Term Memory (BiLSTM) network trained on the ArabSign (ArSL) dataset to perform direct sentence-level classification over a vocabulary of 50 continuous Arabic sign language sentences, supported by an idle-based segmentation mechanism that enables natural, uninterrupted signing. Experimental evaluation demonstrates robust generalization: under a Leave-One-Signer-Out (LOSO) cross-validation protocol, the model attains a mean sentence-level accuracy of 94.2%, outperforming the fixed signer-independent split baseline of 92.07%, while maintaining real-time performance suitable for interactive use. To enhance linguistic fluency, an optional post-recognition refinement stage is incorporated using a large language model (LLM), followed by text-to-speech synthesis to produce audible Arabic output; this refinement operates strictly as post-processing and is not included in the reported recognition accuracy metrics. The results demonstrate that direct sentence-level modeling, combined with landmark-based feature extraction and real-time segmentation, provides an effective and practical solution for continuous SSL sentence recognition in real-time.
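Two steps of the pipeline described in the abstract can be sketched in plain Python: assembling the 225-value frame vector and cutting a stream into sentences at idle periods. Both parts rest on assumptions that this summary does not confirm. The layout assumes 33 pose landmarks plus 21 landmarks per hand, each with x, y, z coordinates (33×3 + 2×21×3 = 225), which matches the stated size but may not be the paper's exact ordering. The segmentation assumes "idle" means no hand is detected for a run of frames. The names `frame_features` and `segment_on_idle` and the thresholds `idle_frames` and `min_length` are hypothetical.

```python
POSE_POINTS = 33  # pose landmarks per frame (assumed layout)
HAND_POINTS = 21  # landmarks per hand (assumed layout)
COORDS = 3        # x, y, z per landmark
FEATURES_PER_FRAME = (POSE_POINTS + 2 * HAND_POINTS) * COORDS  # 99 + 126 = 225


def frame_features(pose, left_hand, right_hand):
    """Flatten one frame into 225 floats; an undetected part (None) becomes zeros."""
    def flatten(points, count):
        if points is None:
            return [0.0] * (count * COORDS)
        if len(points) != count:
            raise ValueError(f"expected {count} landmarks, got {len(points)}")
        return [float(c) for point in points for c in point]

    return (flatten(pose, POSE_POINTS)
            + flatten(left_hand, HAND_POINTS)
            + flatten(right_hand, HAND_POINTS))


def segment_on_idle(hands_visible, idle_frames=15, min_length=10):
    """Split a stream into (start, end) frame ranges, end exclusive.

    hands_visible holds one boolean per frame. A segment closes once
    idle_frames consecutive frames show no hands (assumed idle criterion);
    segments shorter than min_length frames are discarded as noise.
    """
    segments = []
    start = None
    idle_run = 0
    for i, visible in enumerate(hands_visible):
        if visible:
            if start is None:
                start = i
            idle_run = 0
        elif start is not None:
            idle_run += 1
            if idle_run >= idle_frames:
                end = i - idle_run + 1  # first frame of the idle run
                if end - start >= min_length:
                    segments.append((start, end))
                start = None
                idle_run = 0
    if start is not None:  # stream ended while a segment was still open
        end = len(hands_visible) - idle_run
        if end - start >= min_length:
            segments.append((start, end))
    return segments
```

Each returned range would then be turned into a sequence of frame vectors and passed to the sentence classifier, so the signer does not need to press a key or pause artificially between sentences.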

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12987092/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12987092/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12987092/full.md

---
Source: https://tomesphere.com/paper/PMC12987092