# Investigation on Combining 3D Convolution of Image Data and Optical Flow to Generate Temporal Action Proposals

**Authors:** Patrick Schlosser, David Münch, Michael Arens

arXiv: 1903.04176 · 2019-03-15

## TL;DR

This paper explores four two-stream architectures that combine 3D convolutions over image sequences with optical flow to generate temporal action proposals in long, untrimmed videos, achieving state-of-the-art results on THUMOS'14.

## Contribution

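The paper systematically investigates four two-stream architectures built on the Single-Stream Temporal Action Proposals (SST) architecture, each fusing an image stream and an optical flow stream at a different depth in the model. For each architecture an optimal parametrization is determined empirically, and all four outperform the original single-stream SST.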

## Key findings

- All four architectures outperform the original SST.
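- Exchanging the optical flow method of Brox for FlowNet2 still yields improvements, so the gains are not tied to a single optical flow method.
- For each architecture, a broad range of parameters is explored systematically and an optimal parametrization is determined empirically.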

## Abstract

In this paper, several variants of two-stream architectures for temporal action proposal generation in long, untrimmed videos are presented. Inspired by the recent advances in the field of human action recognition utilizing 3D convolutions in combination with two-stream networks, and based on the Single-Stream Temporal Action Proposals (SST) architecture, four different two-stream architectures are investigated, utilizing sequences of images on one stream and sequences of optical flow images on the other. The four architectures fuse the two separate streams at different depths in the model; for each of them, a broad range of parameters is investigated systematically and an optimal parametrization is empirically determined. The experiments on the THUMOS'14 dataset show that all four two-stream architectures are able to outperform the original single-stream SST and achieve state-of-the-art results. Additional experiments show that the improvements are not restricted to a single method of calculating optical flow: exchanging the previously used method of Brox for FlowNet2 still yields improvements.
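To make "fusing the two streams at different depths" concrete, the following is a minimal sketch of the two extremes: fusing per-time-step features before scoring, versus scoring each stream separately and averaging the results. It is not the paper's implementation. The function names, the feature size `D`, the number of proposal scores `K` per time step, the single dense scoring layer, and the averaging rule are all assumptions made for illustration; the four architectures and their parameters are described only in the complete paper.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def early_fusion(rgb_feat, flow_feat, weights, bias):
    """Concatenate both streams' features, then score proposals jointly."""
    fused = rgb_feat + flow_feat  # list concatenation, length 2 * D
    return [sigmoid(z) for z in linear(fused, weights, bias)]


def late_fusion(rgb_feat, flow_feat, rgb_head, flow_head):
    """Score each stream with its own head, then average the confidences."""
    rgb_scores = [sigmoid(z) for z in linear(rgb_feat, *rgb_head)]
    flow_scores = [sigmoid(z) for z in linear(flow_feat, *flow_head)]
    return [(a + b) / 2.0 for a, b in zip(rgb_scores, flow_scores)]


def random_layer(n_out, n_in, rng):
    """Hypothetical untrained layer, used only to exercise the functions."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
D, K = 4, 3  # assumed feature size and proposal scores per time step

# Stand-ins for one time step of 3D-convolution features from each stream.
rgb_feat = [rng.uniform(-1.0, 1.0) for _ in range(D)]
flow_feat = [rng.uniform(-1.0, 1.0) for _ in range(D)]

early_scores = early_fusion(rgb_feat, flow_feat, *random_layer(K, 2 * D, rng))
late_scores = late_fusion(rgb_feat, flow_feat,
                          random_layer(K, D, rng), random_layer(K, D, rng))
print(early_scores)
print(late_scores)
```

Both variants return `K` confidence values in the open interval (0, 1) for the time step. They differ in where information from the two streams is allowed to interact, which is the design axis the four architectures vary.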

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.04176/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.04176/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1903.04176/full.md

---
Source: https://tomesphere.com/paper/1903.04176