Towards Large-Scale Data Mining for Data-Driven Analysis of Sign Languages
Boris Mocialov, Graham Turner, Helen Hastie

TL;DR
This paper presents a method for large-scale collection and analysis of sign language data from social media platforms, enabling better understanding and modeling of sign languages like ASL and Libras.
Contribution
It introduces a data collection pipeline that filters and analyzes social media videos to facilitate large-scale sign language research.
Findings
Collected data from TikTok, Instagram, and YouTube for ASL and Libras.
Identified patterns in sign language parameters such as orientation and location.
Compared differences and similarities between ASL and Libras signs.
Abstract
Access to sign language data is far from adequate. We show that it is possible to collect the data from social networking services such as TikTok, Instagram, and YouTube by applying data filtering to enforce quality standards and by discovering patterns in the filtered data, making it easier to analyse and model. Using our data collection pipeline, we collect and examine the interpretation of songs in both the American Sign Language (ASL) and the Brazilian Sign Language (Libras). We explore their differences and similarities by looking at the co-dependence of the orientation and location phonological parameters.
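The abstract does not say how the co-dependence of orientation and location is measured. As a minimal sketch of one possible approach, assuming each sign has already been annotated with one categorical orientation label and one categorical location label, mutual information can quantify how strongly the two parameters depend on each other. The function name `mutual_information` and the labels below (`palm_up`, `chest`, and so on) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information in bits between two categorical variables.

    `pairs` is a list of (orientation, location) labels, one per sign.
    This is an illustrative measure, not the paper's stated method.
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one observation")
    joint = Counter(pairs)                      # counts of (a, b)
    left = Counter(a for a, _ in pairs)         # marginal counts of a
    right = Counter(b for _, b in pairs)        # marginal counts of b
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (count / n) * log2(count * n / (left[a] * right[b]))
    return mi


# Hypothetical annotations: orientation fully determines location.
dependent = [("palm_up", "chest"), ("palm_down", "head")] * 50

# Hypothetical annotations: every combination is equally frequent.
independent = [
    (o, l) for o in ("palm_up", "palm_down") for l in ("chest", "head")
] * 25

print(mutual_information(dependent))
print(mutual_information(independent))
```

A value near zero suggests the two parameters vary independently in the sample, while larger values suggest that knowing the orientation narrows down the location. Computing the score separately for ASL and Libras samples would give one way to compare the two languages.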
Taxonomy
Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Human Pose and Action Recognition
