RNN Fisher Vectors for Action Recognition and Image Annotation
Guy Lev, Gil Sadeh, Benjamin Klein, Lior Wolf

TL;DR
This paper encodes sequences as Fisher Vectors in which an RNN serves as the generative model. It reports state-of-the-art results in video action recognition and image annotation, and shows that a model trained for image annotation transfers to video action recognition.
Contribution
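The paper presents a sequence encoding method that uses an RNN as the generative probabilistic model within the Fisher Vector framework, with the required partial derivatives computed by backpropagation. The resulting representations improve performance on sequence-based tasks.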
Findings
State-of-the-art results in action recognition and image annotation
Effective sequence encoding with RNN Fisher Vectors
Transfer learning from image annotation to video action recognition
Abstract
Recurrent Neural Networks (RNNs) have had considerable success in classifying and predicting sequences. We demonstrate that RNNs can be effectively used in order to encode sequences and provide effective representations. The methodology we use is based on Fisher Vectors, where the RNNs are the generative probabilistic models and the partial derivatives are computed using backpropagation. State of the art results are obtained in two central but distant tasks, which both rely on sequences: video action recognition and image annotation. We also show a surprising transfer learning result from the task of image annotation to the task of video action recognition.
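As a rough illustration of the idea in the abstract, the sketch below treats a small RNN as a generative model over a symbol sequence and uses the gradient of the sequence log-likelihood with respect to the RNN weights, computed by backpropagation through time, as the encoding of that sequence. The tiny Elman-style network, the function names (`init_params`, `log_likelihood`, `rnn_fisher_vector`), the discrete-symbol input, and the random untrained weights are all assumptions made for illustration. The paper's actual architecture, training procedure, and any normalization applied to the vectors are not reproduced here.

```python
import numpy as np

def init_params(vocab, hidden, seed=0):
    # Hypothetical tiny RNN; in practice the weights would be trained first.
    rng = np.random.default_rng(seed)
    return {
        "Wxh": rng.normal(0.0, 0.1, (hidden, vocab)),   # input -> hidden
        "Whh": rng.normal(0.0, 0.1, (hidden, hidden)),  # hidden -> hidden
        "Why": rng.normal(0.0, 0.1, (vocab, hidden)),   # hidden -> output
    }

def log_likelihood(params, seq):
    # Sum over t of log p(seq[t+1] | seq[0..t]) under the RNN.
    h = np.zeros(params["Whh"].shape[0])
    total = 0.0
    for x, y in zip(seq[:-1], seq[1:]):
        h = np.tanh(params["Wxh"][:, x] + params["Whh"] @ h)
        logits = params["Why"] @ h
        logits = logits - logits.max()
        total += logits[y] - np.log(np.exp(logits).sum())
    return total

def rnn_fisher_vector(params, seq):
    # Gradient of log_likelihood w.r.t. all weights, flattened into one vector.
    Wxh, Whh, Why = params["Wxh"], params["Whh"], params["Why"]
    hs, probs = [np.zeros(Whh.shape[0])], []

    # Forward pass, caching hidden states and next-symbol probabilities.
    for x in seq[:-1]:
        h = np.tanh(Wxh[:, x] + Whh @ hs[-1])
        logits = Why @ h
        p = np.exp(logits - logits.max())
        hs.append(h)
        probs.append(p / p.sum())

    # Backward pass (backpropagation through time).
    gWxh, gWhh, gWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dh_next = np.zeros(Whh.shape[0])
    for t in reversed(range(len(seq) - 1)):
        x, y = seq[t], seq[t + 1]
        dlogits = -probs[t]          # d log p(y) / d logits = onehot(y) - p
        dlogits[y] += 1.0
        gWhy += np.outer(dlogits, hs[t + 1])
        dh = Why.T @ dlogits + dh_next
        da = dh * (1.0 - hs[t + 1] ** 2)  # through tanh
        gWxh[:, x] += da
        gWhh += np.outer(da, hs[t])
        dh_next = Whh.T @ da
    return np.concatenate([gWxh.ravel(), gWhh.ravel(), gWhy.ravel()])

# Usage: every sequence maps to a vector of the same length (one entry per
# weight), regardless of sequence length.
params = init_params(vocab=4, hidden=3)
fv_short = rnn_fisher_vector(params, [0, 2, 1])
fv_long = rnn_fisher_vector(params, [0, 2, 1, 3, 2, 0, 1])
print(fv_short.shape, fv_long.shape)
```

Because the vector has one entry per model parameter, sequences of different lengths map to encodings of the same size, which can then be fed to a standard classifier.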
Taxonomy
Topics: Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis · Neural Networks and Applications
