# Attentive Spatio-Temporal Representation Learning for Diving Classification

**Authors:** Gagan Kanojia, Sudhakar Kumawat, Shanmuganathan Raman

arXiv: 1905.00050 · 2019-05-02

## TL;DR

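This paper introduces an attention-guided LSTM network for classifying diving actions in videos. On Diving48 (over 18,000 clips across 48 diving classes), it beats the classification accuracy of state-of-the-art 2D and 3D models by 11.54% and 4.24%, respectively, and it localizes the diver in video frames without being trained with localization supervision.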

## Contribution

It presents a novel attention-guided LSTM architecture that improves diving classification accuracy and enables diver localization without additional annotations.

## Key findings

- Outperforms the classification accuracy of state-of-the-art models on Diving48 by 11.54% in the 2D framework and 4.24% in the 3D framework.
- Effectively localizes divers in videos without explicit supervision.
- Evaluated on Diving48, a recently introduced competitive diving dataset with over 18,000 video clips covering 48 diving classes.

## Abstract

Competitive diving is a well-recognized aquatic sport in which a person dives from a platform or a springboard into the water. Based on the acrobatics performed during the dive, diving is classified into a finite set of action classes, which are standardized by FINA. In this work, we propose an attention-guided LSTM-based neural network architecture for the task of diving classification. The network takes the frames of a diving video as input and determines its class. We evaluate the performance of the proposed model on a recently introduced competitive diving dataset, Diving48. It contains over 18,000 video clips that cover 48 classes of diving. The proposed model outperforms the classification accuracy of the state-of-the-art models in both 2D and 3D frameworks by 11.54% and 4.24%, respectively. We show that the network is able to localize the diver in the video frames during the dive without being trained with such supervision.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.00050/full.md

## Figures

48 figures with captions in the complete paper: https://tomesphere.com/paper/1905.00050/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1905.00050/full.md

---
Source: https://tomesphere.com/paper/1905.00050