# Spatio-Temporal Action Graph Networks

**Authors:** Roei Herzig, Elad Levi, Huijuan Xu, Hang Gao, Eli Brosh, Xiaolong Wang, Amir Globerson, Trevor Darrell

arXiv: 1812.01233 · 2019-10-01

## TL;DR

This paper introduces a spatio-temporal graph network for activity recognition that models object interactions explicitly. It targets events with few labeled examples and improves over baselines on Charades and on a new driving dataset of near-collision events.

## Contribution

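The paper proposes an inter-object graph representation for activity recognition, built on a disentangled graph embedding with direct observation of edge appearance. The embedding is factored so that the representation hierarchy over spatial dimensions is separated from the one over temporal variation.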
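As a rough illustration of what a factored, spatial-then-temporal embedding over an object graph could look like, the minimal Python sketch below first pools directly observed edge features within each frame and then pools the resulting frame vectors over time. Everything in it is an assumption made for illustration: the function names (`mean_pool`, `spatial_embed`, `temporal_embed`, `video_embedding`), the choice of mean pooling, and the toy feature sizes are not taken from the paper, whose actual architecture is described in the full text.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def spatial_embed(nodes: List[Vector], edges: List[List[Vector]]) -> Vector:
    """Embed one frame (hypothetical spatial stage).

    nodes[i] is the feature of object i; edges[i][j] is a feature observed
    directly for the ordered pair (i, j). Assumes at least two objects.
    """
    per_object = []
    for i, node in enumerate(nodes):
        # Gather the edge features from object i to every other object.
        outgoing = [edges[i][j] for j in range(len(nodes)) if j != i]
        context = mean_pool(outgoing)
        # Concatenate the object's own feature with its edge context.
        per_object.append(node + context)
    # Pool over objects to get a single frame-level vector.
    return mean_pool(per_object)


def temporal_embed(frame_vectors: List[Vector]) -> Vector:
    """Pool frame-level vectors over time (hypothetical temporal stage)."""
    return mean_pool(frame_vectors)


def video_embedding(
    node_feats: List[List[Vector]], edge_feats: List[List[List[Vector]]]
) -> Vector:
    """Spatial pooling per frame, followed by temporal pooling across frames."""
    frames = [spatial_embed(n, e) for n, e in zip(node_feats, edge_feats)]
    return temporal_embed(frames)


# Toy input: 2 frames, 2 objects, 1-dimensional node and edge features.
node_feats = [[[1.0], [3.0]], [[5.0], [7.0]]]
edge_feats = [
    [[[0.0], [2.0]], [[4.0], [0.0]]],
    [[[0.0], [6.0]], [[8.0], [0.0]]],
]
print(video_embedding(node_feats, edge_feats))  # [4.0, 5.0]
```

Keeping the spatial and temporal stages as separate functions mirrors the idea of a factored embedding: either stage could be swapped for a learned module without touching the other.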

## Key findings

- Significantly outperforms baselines without object-graph representations, as well as previous graph-based models, on the Charades benchmark.
- Also effective on a new dataset of driving activities that focuses on multi-object interactions with near-collision events.
- Demonstrates robustness with limited labeled data.

## Abstract

Events defined by the interaction of objects in a scene are often of critical importance; yet important events may have insufficient labeled examples to train a conventional deep model to generalize to future object appearance. Activity recognition models that represent object interactions explicitly have the potential to learn in a more efficient manner than those that represent scenes with global descriptors. We propose a novel inter-object graph representation for activity recognition based on a disentangled graph embedding with direct observation of edge appearance. We employ a novel factored embedding of the graph structure, disentangling a representation hierarchy formed over spatial dimensions from that found over temporal variation. We demonstrate the effectiveness of our model on the Charades activity recognition benchmark, as well as a new dataset of driving activities focusing on multi-object interactions with near-collision events. Our model offers significantly improved performance compared to baseline approaches without object-graph representations, or with previous graph-based models.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.01233/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/1812.01233/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1812.01233/full.md

---
Source: https://tomesphere.com/paper/1812.01233