Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer

Nguyen Huu Phong; Bernardete Ribeiro

arXiv:2302.09187·cs.CV·September 22, 2023·1 cites

Open Access · 2 Repos

TL;DR

This paper introduces a dynamic PSO-ConvNet model combined with Transformers and RNNs for video human action recognition, reporting accuracy improvements of up to 9% on UCF-101 and showing that collaborative learning outperforms non-collaborative training.

Contribution

The paper presents a dynamic PSO-ConvNet framework integrated with temporal models for improved video action recognition, extending prior image recognition work to spatio-temporal video analysis.

Findings

01. Up to 9% accuracy improvement on the UCF-101 dataset.
02. Collaborative learning outperforms non-collaborative approaches.
03. Effective capture of spatio-temporal dynamics in videos.

Abstract

Recognizing human actions in video sequences, known as Human Action Recognition (HAR), is a challenging task in pattern recognition. While Convolutional Neural Networks (ConvNets) have shown remarkable success in image recognition, they are not always directly applicable to HAR, as temporal features are critical for accurate classification. In this paper, we propose a novel dynamic PSO-ConvNet model for learning actions in videos, building on our recent work in image recognition. Our approach leverages a framework where the weight vector of each neural network represents the position of a particle in phase space, and particles share their current weight vectors and gradient estimates of the Loss function. To extend our approach to video, we integrate ConvNets with state-of-the-art temporal methods such as Transformer and Recurrent Neural Networks. Our experimental results on the UCF-101…
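The collaborative idea in the abstract, where each network's weight vector is treated as a particle position and particles share their current weights, can be illustrated with a toy sketch. This is not the authors' algorithm: the quadratic `loss`, the update rule in `collaborative_step`, and the `lr` and `pull` values are all assumptions chosen for illustration, and the velocity term and the sharing of gradient estimates are left out for brevity.

```python
import random

def loss(w):
    # Toy quadratic loss with its minimum at (1, -2); a stand-in for a
    # network's training loss (an assumption, not the paper's objective).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

def grad(w):
    # Analytic gradient of the toy loss above.
    return [2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)]

def collaborative_step(positions, lr=0.1, pull=0.3):
    """One synchronous update of all particles (hypothetical rule).

    Each particle takes a gradient step on its own weights and is also
    pulled toward the weights shared by the currently best particle.
    """
    best = min(positions, key=loss)  # weights of the lowest-loss particle
    updated = []
    for w in positions:
        g = grad(w)
        updated.append([
            wi - lr * gi + pull * (bi - wi)
            for wi, gi, bi in zip(w, g, best)
        ])
    return updated

def run(n_particles=4, steps=200, seed=0):
    # Random initial "weight vectors", one per particle.
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5), rng.uniform(-5, 5)]
                 for _ in range(n_particles)]
    for _ in range(steps):
        positions = collaborative_step(positions)
    return positions

if __name__ == "__main__":
    for w in run():
        print(w, loss(w))
```

In this toy setting every particle ends up near the shared minimum; in the paper's setting each particle would be a full ConvNet trained on video data rather than a two-dimensional vector.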

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications

Methods: Transformer