Isharah: A Large-Scale Multi-Scene Dataset for Continuous Sign Language Recognition

Sarah Alyami; Hamzah Luqman; Sadam Al-Azani; Maad Alowaifeer; Yazeed Alharbi; and Yaser Alonaizan

arXiv:2506.03615 · cs.CV · June 5, 2025

TL;DR

Isharah is a large, diverse multi-scene dataset for continuous sign language recognition (CSLR), recorded in unconstrained settings. Its annotations and benchmarks are intended to support more robust real-world CSLR and sign language translation systems.

Contribution

This paper introduces Isharah, the first large-scale, multi-scene CSLR dataset collected in real-world conditions with rich annotations and multiple benchmarks for sign language understanding.

Findings

01. Dataset contains 30,000 videos from 18 signers.

02. Includes benchmarks for signer-independent and unseen-sentence CSLR.

03. Supports development of sign language translation systems.
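
To make the two benchmark settings concrete, here is a minimal sketch of how signer-independent and unseen-sentence splits can be built in general. The `Sample` record, its field names, and the toy data are hypothetical and are not the dataset's actual file format or the paper's official split protocol.

```python
from collections import namedtuple

# Hypothetical record layout; the real dataset's metadata may differ.
Sample = namedtuple("Sample", ["video_id", "signer_id", "sentence_id"])


def signer_independent_split(samples, test_signers):
    """Hold out every clip performed by the given signers."""
    held_out = set(test_signers)
    train = [s for s in samples if s.signer_id not in held_out]
    test = [s for s in samples if s.signer_id in held_out]
    return train, test


def unseen_sentence_split(samples, test_sentences):
    """Hold out every clip of the given sentences."""
    held_out = set(test_sentences)
    train = [s for s in samples if s.sentence_id not in held_out]
    test = [s for s in samples if s.sentence_id in held_out]
    return train, test


# Toy data: 12 clips, 3 signers, 4 sentences (illustrative only).
samples = [
    Sample(f"v{i}", signer_id=i % 3, sentence_id=i % 4) for i in range(12)
]

si_train, si_test = signer_independent_split(samples, test_signers=[2])
us_train, us_test = unseen_sentence_split(samples, test_sentences=[3])

# No signer (or sentence) appears on both sides of its split.
assert not {s.signer_id for s in si_train} & {s.signer_id for s in si_test}
assert not {s.sentence_id for s in us_train} & {s.sentence_id for s in us_test}
```

In the signer-independent setting a model is tested on people it has never seen; in the unseen-sentence setting it is tested on sentences that never occur in training, even if the signers do.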

Abstract

Current benchmarks for sign language recognition (SLR) focus mainly on isolated SLR, while there are limited datasets for continuous SLR (CSLR), which recognizes sequences of signs in a video. Additionally, existing CSLR datasets are collected in controlled settings, which restricts their effectiveness in building robust real-world CSLR systems. To address these limitations, we present Isharah, a large multi-scene dataset for CSLR. It is the first dataset of its type and size that has been collected in an unconstrained environment using signers' smartphone cameras. This setup resulted in high variations of recording settings, camera distances, angles, and resolutions. This variation helps with developing sign language understanding models capable of handling the variability and complexity of real-world scenarios. The dataset consists of 30,000 video clips performed by 18 deaf and…

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Interactive and Immersive Displays