Evaluation of Pose Estimation Systems for Sign Language Translation

Catherine O'Brien; Gerard Sant; Mathias Müller; Sarah Ebling

arXiv:2604.24609 · cs.CL · April 28, 2026

TL;DR

This paper systematically compares pose estimation systems for sign language translation (SLT), analyzing how the choice of estimator affects translation quality, robustness to occlusion, and temporal stability, and identifying which models perform best.

Contribution

The paper offers a controlled evaluation of pose estimators for SLT and highlights that Sapiens and SDPose yield better translation quality than the widely used MediaPipe baseline.

Findings

01. SDPose and Sapiens outperform MediaPipe in BLEU scores (~11.5 vs. 10).

02. Sapiens correctly handles all occlusion cases tested, unlike OpenPifPaf.

03. Estimators missing hand keypoints correlate with lower translation quality.

Abstract

Many sign language translation (SLT) systems operate on pose sequences instead of raw video to reduce input dimensionality, improve portability, and partially anonymize signers. The choice of pose estimator is often treated as an implementation detail, with systems defaulting to widely available tools such as MediaPipe Holistic or OpenPose. We present a systematic comparison of pose estimators for pose-based SLT, covering widely used baselines (MediaPipe Holistic, OpenPose) and newer whole-body/high-capacity models (MMPose WholeBody, OpenPifPaf, AlphaPose, SDPose, Sapiens, SMPLest-X). We quantify downstream impact by training a controlled SLT pipeline on RWTH-PHOENIX-Weather 2014 where only the pose representation varies, evaluating with BLEU and BLEURT. To contextualize translation outcomes, we analyze temporal stability, missing hand keypoints, and robustness to occlusion using…
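
The abstract names temporal stability and missing hand keypoints as diagnostics but does not define them in the visible text. The sketch below shows one plausible way such quantities could be computed from a pose sequence. Everything in it is an assumption made for illustration rather than the paper's method: the `(x, y, confidence)` keypoint layout, the `min_conf` threshold, and the function names `missing_rate` and `mean_jitter` are all hypothetical.

```python
from math import hypot
from typing import Sequence, Tuple

# Hypothetical layout: each keypoint is (x, y, confidence).
Keypoint = Tuple[float, float, float]
Frame = Sequence[Keypoint]


def missing_rate(frames: Sequence[Frame],
                 hand_indices: Sequence[int],
                 min_conf: float = 0.1) -> float:
    """Fraction of (frame, hand keypoint) slots with confidence below min_conf."""
    total = 0
    missing = 0
    for frame in frames:
        for i in hand_indices:
            total += 1
            if frame[i][2] < min_conf:
                missing += 1
    return missing / total if total else 0.0


def mean_jitter(frames: Sequence[Frame], min_conf: float = 0.1) -> float:
    """Mean frame-to-frame displacement of keypoints detected in both frames.

    This is a crude proxy: it mixes genuine signing motion with estimator
    noise, so it is only meaningful when comparing estimators on the same video.
    """
    dists = []
    for prev, curr in zip(frames, frames[1:]):
        for (x0, y0, c0), (x1, y1, c1) in zip(prev, curr):
            if c0 >= min_conf and c1 >= min_conf:
                dists.append(hypot(x1 - x0, y1 - y0))
    return sum(dists) / len(dists) if dists else 0.0


# Toy sequence: 3 frames, 2 keypoints; index 1 stands in for a hand keypoint.
frames = [
    [(0.0, 0.0, 0.9), (1.0, 1.0, 0.8)],
    [(3.0, 4.0, 0.9), (1.0, 1.0, 0.0)],  # hand keypoint not detected
    [(3.0, 4.0, 0.9), (1.0, 2.0, 0.7)],
]

print(missing_rate(frames, hand_indices=[1]))
print(mean_jitter(frames))
```

Because only the pose representation varies in the described pipeline, per-estimator values like these could be set alongside the BLEU and BLEURT scores to look for the kind of correlation the findings report for missing hand keypoints.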
