# Submission to ActivityNet Challenge 2019: Task B Spatio-temporal Action Localization

**Authors:** Chunfei Ma, Joonhyang Choi, Byeongwon Lee, Seungji Yang

arXiv: 1907.10837 · 2019-07-26

## TL;DR

This technical report describes an end-to-end, RGB-only spatio-temporal action localization system for ActivityNet Challenge 2019. It builds on SlowFast Networks and adds a correlation-preserving data augmentation method and random label subsampling to counter class imbalance and overfitting.

## Contribution

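The report explores an end-to-end trainable architecture that uses only RGB images, built on the previously proposed SlowFast Networks, and proposes correlation-preserving data augmentation and random label subsampling methods to improve spatio-temporal action localization (SAL) performance.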

## Key findings

- Effective spatio-temporal feature extraction with SlowFast Networks
- Improved performance through correlation-preserving data augmentation
- Reduced overfitting with random label subsampling

## Abstract

This technical report presents an overview of our system for the spatio-temporal action localization (SAL) task in ActivityNet Challenge 2019. Unlike previous two-stream works, we focus on an end-to-end trainable architecture that uses only sequential RGB images. To this end, we employ SlowFast Networks, a previously proposed, simple yet effective two-branch network capable of capturing both short- and long-term spatiotemporal features. Moreover, to handle severe class imbalance and overfitting, we propose a correlation-preserving data augmentation method and a random label subsampling method, both of which we found to reduce overfitting and improve performance.
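To make the two-branch idea concrete, the sketch below shows how a SlowFast-style model might pick frames for its two pathways from one raw clip: the slow pathway samples sparsely and the fast pathway samples densely. The function name `slowfast_indices` is hypothetical, and the parameters `tau` (slow-pathway temporal stride) and `alpha` (frame-rate ratio between the pathways) follow the notation of the original SlowFast paper. The values used here are illustrative assumptions, not settings confirmed by this report.

```python
def slowfast_indices(num_frames, tau=16, alpha=8):
    """Return (slow, fast) frame indices for one raw clip.

    Hypothetical helper for illustration only.
    tau:   temporal stride of the slow pathway (assumed value).
    alpha: how many times denser the fast pathway samples (assumed value).
    """
    if tau % alpha != 0:
        raise ValueError("tau must be divisible by alpha")
    # Slow pathway: one frame every `tau` raw frames.
    slow = list(range(0, num_frames, tau))
    # Fast pathway: one frame every `tau // alpha` raw frames.
    fast = list(range(0, num_frames, tau // alpha))
    return slow, fast


slow, fast = slowfast_indices(64)
print(len(slow), len(fast))  # 4 32
```

With a 64-frame clip, the slow pathway sees 4 frames and the fast pathway sees 32, so the fast pathway has `alpha` times the temporal resolution. Both pathways read the same RGB clip, which is why no separate optical-flow stream is needed.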

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.10837/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1907.10837/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1907.10837/full.md

---
Source: https://tomesphere.com/paper/1907.10837