SignAttention: On the Interpretability of Transformer Models for Sign Language Translation

Pedro Alejandro Dal Bianco; Oscar Agustín Stanchi; Facundo Manuel Quiroga; Franco Ronchetti; Enzo Ferrante

arXiv:2410.14506 · cs.CL · October 21, 2024

TL;DR

This paper analyzes the interpretability of Transformer models in Sign Language Translation, revealing how attention mechanisms align visual input with glosses and how focus shifts during decoding, aiding transparency.

Contribution

It provides the first comprehensive interpretability analysis of Transformer-based Sign Language Translation models, highlighting attention patterns and their implications for model transparency.

Findings

01

Models attend to frame clusters rather than individual frames

02

Diagonal alignment pattern between poses and glosses observed

03

Focus shifts from frames to previous tokens during decoding
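The first two findings can be illustrated with a toy example. The sketch below is not the paper's analysis code: the function names and the attention matrix are hypothetical, and the values are invented for illustration. It computes, for each gloss, the attention-weighted mean frame index (a "centroid") and checks whether the centroids increase from one gloss to the next, which is what a diagonal alignment would look like.

```python
def attention_centroids(cross_attention):
    """Return the attention-weighted mean frame index for each gloss row."""
    centroids = []
    for row in cross_attention:
        total = sum(row)  # normalise in case a row does not sum exactly to 1
        centroids.append(sum(i * w for i, w in enumerate(row)) / total)
    return centroids


def is_diagonal(centroids):
    """True when centroids strictly increase: later glosses attend to later frames."""
    return all(a < b for a, b in zip(centroids, centroids[1:]))


# Hypothetical cross-attention weights: 3 glosses (rows) over 6 frames (columns).
# Each row spreads its mass over a small cluster of neighbouring frames.
toy_attention = [
    [0.5, 0.4, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.4, 0.4, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.1, 0.4, 0.5],
]

centroids = attention_centroids(toy_attention)
diagonal = is_diagonal(centroids)
print(centroids, diagonal)
```

In this toy matrix every gloss attends to several adjacent frames rather than a single one, and the centroids move forward in time, so the check reports a diagonal pattern. With real attention maps the pattern would be expected to be noisier, especially for longer gloss sequences.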

Abstract

This paper presents the first comprehensive interpretability analysis of a Transformer-based Sign Language Translation (SLT) model, focusing on the translation from video-based Greek Sign Language to glosses and text. Leveraging the Greek Sign Language Dataset, we examine the attention mechanisms within the model to understand how it processes and aligns visual input with sequential glosses. Our analysis reveals that the model pays attention to clusters of frames rather than individual ones, with a diagonal alignment pattern emerging between poses and glosses, which becomes less distinct as the number of glosses increases. We also explore the relative contributions of cross-attention and self-attention at each decoding step, finding that the model initially relies on video frames but shifts its focus to previously predicted tokens as the translation progresses. This work contributes to…
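The shift from video frames to previously predicted tokens can be pictured with a small sketch. This is an assumption-laden illustration, not the paper's procedure: how the authors measure each attention type's contribution is not stated in this listing, so the code simply assumes one non-negative score per decoding step for cross-attention and for self-attention (hypothetical inputs) and reports the cross-attention share at each step.

```python
def cross_attention_share(cross_scores, self_scores):
    """Fraction of the combined score attributed to cross-attention at each step."""
    shares = []
    for c, s in zip(cross_scores, self_scores):
        shares.append(c / (c + s))
    return shares


# Hypothetical per-step contribution scores for a 4-token translation.
toy_cross = [0.9, 0.7, 0.5, 0.3]  # reliance on video frames
toy_self = [0.1, 0.3, 0.5, 0.7]   # reliance on previously predicted tokens

shares = cross_attention_share(toy_cross, toy_self)

# A strictly decreasing share means the decoder leans less on frames over time.
shifts_to_tokens = all(a > b for a, b in zip(shares, shares[1:]))
print(shares, shifts_to_tokens)
```

With these invented numbers the cross-attention share falls at every step, mirroring the trend described in the abstract.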

Code & Models

Repositories

pedroodb/sign_attention
PyTorch · Official

Taxonomy

Topics: Hand Gesture Recognition Systems · Hearing Impairment and Communication · Human Pose and Action Recognition

Methods: Softmax · Attention Is All You Need · Focus