# CHAM: action recognition using convolutional hierarchical attention model

**Authors:** Shiyang Yan, Jeremy S. Smith, Wenjin Lu, Bailing Zhang

arXiv: 1705.03146 · 2017-05-22

## TL;DR

This paper introduces CHAM, a convolutional hierarchical attention model for action recognition in videos. It combines soft attention with a convolutional LSTM in a hierarchical architecture and reports improved results on three public datasets.

## Contribution

The paper proposes a novel convolutional hierarchical attention model that explicitly reasons on multiple granularities for action recognition in videos.

## Key findings

- Improved results on three public datasets: UCF Sports, Olympic Sports, and HMDB51.
- Effective integration of convolutional LSTM with hierarchical attention.
- Demonstrated superiority over existing models in action recognition tasks.

## Abstract

Recently, the soft attention mechanism, which was originally proposed in language processing, has been applied in computer vision tasks like image captioning. This paper presents improvements to the soft attention model by combining a convolutional LSTM with a hierarchical system architecture to recognize action categories in videos. We call this model the Convolutional Hierarchical Attention Model (CHAM). The model applies a convolutional operation inside the LSTM cell and an attention map generation process to recognize actions. The hierarchical architecture of this model is able to explicitly reason on multi-granularities of action categories. The proposed architecture achieved improved results on three publicly available datasets: the UCF sports dataset, the Olympic sports dataset and the HMDB51 dataset.
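The attention map generation mentioned in the abstract can be illustrated with a generic soft attention step. The sketch below is not the paper's formulation: the function names `attention_map` and `attend`, the scoring vector `w`, and the 7x7x16 feature shape are all assumptions chosen for illustration. It scores each spatial location with a 1x1 convolution, normalizes the scores with a softmax, and reweights the feature map while keeping its spatial layout, which is the form of input a convolutional LSTM would expect.

```python
import numpy as np


def attention_map(features, w):
    """Return a soft attention map of shape (H, W) that sums to 1.

    features: array of shape (H, W, C), e.g. CNN features for one frame.
    w: array of shape (C,), a hypothetical scoring vector. Applying it at
       every location is equivalent to a 1x1 convolution with one output
       channel.
    """
    scores = features @ w            # (H, W) unnormalized scores
    scores = scores - scores.max()   # subtract the max for numerical stability
    exp = np.exp(scores)
    return exp / exp.sum()           # softmax over all spatial locations


def attend(features, w):
    """Reweight features by the attention map, preserving the spatial grid."""
    amap = attention_map(features, w)
    return amap, features * amap[..., None]


# Illustrative random inputs (assumed shapes, not taken from the paper).
rng = np.random.default_rng(0)
features = rng.standard_normal((7, 7, 16))
w = rng.standard_normal(16)
amap, attended = attend(features, w)
print(amap.shape, attended.shape)
```

In a full model the scores would normally also depend on the recurrent hidden state, and the scoring weights would be learned; both are left out here to keep the example short.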

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.03146/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1705.03146/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1705.03146/full.md

---
Source: https://tomesphere.com/paper/1705.03146