Large Sign Language Models: Toward 3D American Sign Language Translation

Sen Zhang; Xiaoxiao He; Di Liu; Zhaoyang Xia; Mingyu Zhao; Chaowei Tan; Vivian Li; Bo Liu; Dimitris N. Metaxas; and Mubbasir Kapadia

arXiv:2511.08535 · cs.CV · November 12, 2025

TL;DR

This paper introduces Large Sign Language Models (LSLM), a framework that uses a large language model backbone to translate American Sign Language directly from 3D sign data rather than 2D video, with the aim of more accurate translation and more accessible digital communication for the hearing-impaired community.

Contribution

The paper presents a framework that combines 3D sign data with large language models for ASL translation, moving beyond 2D video-based methods and extending LLMs from purely text-based inputs toward embodied, multimodal language.

Findings

01. Enhanced translation accuracy from 3D sign data
02. Flexible translation via instruction-guided prompts
03. Foundation for multimodal language processing

Abstract

We present Large Sign Language Models (LSLM), a novel framework for translating 3D American Sign Language (ASL) by leveraging Large Language Models (LLMs) as the backbone, which can benefit hearing-impaired individuals' virtual communication. Unlike existing sign language recognition methods that rely on 2D video, our approach directly utilizes 3D sign language data to capture rich spatial, gestural, and depth information in 3D scenes. This enables more accurate and resilient translation, enhancing digital communication accessibility for the hearing-impaired community. Beyond the task of ASL translation, our work explores the integration of complex, embodied multimodal languages into the processing capabilities of LLMs, moving beyond purely text-based inputs to broaden their understanding of human communication. We investigate both direct translation from 3D gesture features to text and…

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Interactive and Immersive Displays