# Knowledge Distillation for Human Action Anticipation

**Authors:** Vinh Tran, Yang Wang, Minh Hoai

arXiv: 1904.04868 · 2021-10-05

## TL;DR

This paper proposes a knowledge distillation framework for anticipating human actions in video. An action recognition network supervises the training of an action anticipation network through a new loss function that accounts for positional shifts of semantic concepts, and the framework can take advantage of unlabeled data.

## Contribution

A knowledge distillation method in which an action recognition network supervises an action anticipation network, together with a loss function designed to account for the positional shifts of semantic concepts that occur in dynamic video.

## Key findings

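- Experiments on the JHMDB and EPIC-KITCHENS datasets show the effectiveness of the approach.
- The framework is a form of self-supervised learning, so it can take advantage of unlabeled data.
- A new loss function accounts for positional shifts of semantic concepts in a dynamic video.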
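
To make the idea of a shift-tolerant distillation loss concrete, here is a minimal sketch in plain Python. It is not the paper's loss, which this summary does not reproduce. The function name `shift_tolerant_loss`, the `radius` parameter, and the nested-list feature-map layout are all assumptions made for illustration. The sketch compares each feature vector of the student (anticipation) network with the closest teacher (recognition) vector in a small spatial neighborhood, so a concept that has moved by a cell or two is not penalized.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # H x W x C nested lists (assumed layout)


def sq_dist(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def shift_tolerant_loss(student: FeatureMap, teacher: FeatureMap, radius: int = 1) -> float:
    """Hypothetical loss, not the paper's: mean over student locations of the
    smallest squared distance to any teacher vector within `radius` cells."""
    height, width = len(student), len(student[0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            best = float("inf")
            # Search the teacher map in a (2*radius+1)^2 window around (i, j).
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ti, tj = i + di, j + dj
                    if 0 <= ti < height and 0 <= tj < width:
                        best = min(best, sq_dist(student[i][j], teacher[ti][tj]))
            total += best
    return total / (height * width)


# Toy 1 x 3 maps with one channel: the teacher sees the same "concept"
# one cell to the right of where the student sees it.
student_map = [[[1.0], [0.0], [0.0]]]
teacher_map = [[[0.0], [1.0], [0.0]]]
strict = shift_tolerant_loss(student_map, teacher_map, radius=0)    # position-wise
tolerant = shift_tolerant_loss(student_map, teacher_map, radius=1)  # allows a 1-cell shift
```

With `radius=0` the sketch reduces to an ordinary position-wise squared error, and the toy maps give a loss of 2/3. With `radius=1` the one-cell shift is absorbed and the loss is 0. A one-directional minimum like this is easy to satisfy, so a practical loss would likely need additional constraints.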

## Abstract

We consider the task of training a neural network to anticipate human actions in video. This task is challenging given the complexity of video data, the stochastic nature of the future, and the limited amount of annotated training data. In this paper, we propose a novel knowledge distillation framework that uses an action recognition network to supervise the training of an action anticipation network, guiding the latter to attend to the relevant information needed for correctly anticipating the future actions. This framework is possible thanks to a novel loss function that accounts for positional shifts of semantic concepts in a dynamic video. The knowledge distillation framework is a form of self-supervised learning, and it takes advantage of unlabeled data. Experimental results on the JHMDB and EPIC-KITCHENS datasets show the effectiveness of our approach.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.04868/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1904.04868/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1904.04868/full.md

---
Source: https://tomesphere.com/paper/1904.04868