Hierarchical I3D for Sign Spotting

Ryan Wong; Necati Cihan Camgöz; Richard Bowden

arXiv:2210.00951·cs.CV·October 4, 2022

Open Access

TL;DR

This paper introduces a hierarchical I3D model for sign spotting in continuous sign language videos, achieving state-of-the-art results by learning multi-level spatio-temporal features for precise sign localization.

Contribution

The paper proposes a novel hierarchical sign spotting approach with a hierarchical network head attached to I3D, improving sign localization in continuous videos.

Findings

01 Achieved an F1 score of 0.607 on the ChaLearn 2022 Sign Spotting Challenge.

02 Ranked first (top-1 winning solution) in the MSSL track of the challenge.

03 Demonstrated the effectiveness of hierarchical features for sign localization.
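
For readers unfamiliar with how an F1 score applies to sign spotting, the following is a minimal sketch and not the challenge's official evaluation code. It assumes each prediction and each ground-truth annotation is a (label, (start, end)) pair, that a prediction counts as a true positive when its label matches and its temporal IoU with an unmatched ground-truth sign reaches a threshold, and that the threshold is 0.5. The names temporal_iou and spotting_f1 and the threshold value are hypothetical choices made for illustration.

```python
def temporal_iou(a, b):
    """Intersection-over-union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def spotting_f1(preds, gts, iou_threshold=0.5):
    """F1 over spotted signs.

    preds, gts: lists of (label, (start, end)).
    iou_threshold: assumed matching threshold (hypothetical, not from the paper).
    Each ground-truth sign can be matched by at most one prediction.
    """
    matched = set()  # indices of ground-truth signs already claimed
    tp = 0
    for p_label, p_span in preds:
        for i, (g_label, g_span) in enumerate(gts):
            if i in matched or p_label != g_label:
                continue
            if temporal_iou(p_span, g_span) >= iou_threshold:
                matched.add(i)
                tp += 1
                break
    fp = len(preds) - tp  # predictions with no matching ground truth
    fn = len(gts) - tp    # ground-truth signs that were never spotted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    preds = [("A", (0, 10)), ("B", (20, 30)), ("A", (50, 60))]
    gts = [("A", (1, 11)), ("B", (40, 50))]
    print(round(spotting_f1(preds, gts), 3))
```

In this toy input only the first prediction overlaps a ground-truth sign of the same class, so precision is 1/3, recall is 1/2, and their harmonic mean is 0.4.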

Abstract

Most of the vision-based sign language research to date has focused on Isolated Sign Language Recognition (ISLR), where the objective is to predict a single sign class given a short video clip. Although there has been significant progress in ISLR, its real-life applications are limited. In this paper, we focus on the challenging task of Sign Spotting instead, where the goal is to simultaneously identify and localise signs in continuous co-articulated sign videos. To address the limitations of current ISLR-based models, we propose a hierarchical sign spotting approach which learns coarse-to-fine spatio-temporal sign features to take advantage of representations at various temporal levels and provide more precise sign localisation. Specifically, we develop the Hierarchical Sign I3D model (HS-I3D), which consists of a hierarchical network head that is attached to the existing spatio-temporal…


Taxonomy

Topics: Hand Gesture Recognition Systems · Gait Recognition and Analysis · Hearing Impairment and Communication