Optimization Planning for 3D ConvNets

Zhaofan Qiu; Ting Yao; Chong-Wah Ngo; Tao Mei

arXiv:2201.04021 · cs.CV · January 12, 2022 · 6 cites

TL;DR

This paper introduces an automated optimization planning method for training 3D ConvNets, decomposing training into states and using dynamic programming to find the best sequence, leading to improved video recognition performance.

Contribution

It proposes a novel optimization planning approach for 3D ConvNets training, including a new dual-head classifier design and state transition strategy.

Findings

01 Achieved top-1 accuracy of 80.5% on Kinetics-400

02 Outperformed state-of-the-art methods on seven benchmarks

03 Demonstrated effectiveness of dynamic programming in training optimization

Abstract

It is not trivial to optimally learn 3D Convolutional Neural Networks (3D ConvNets) due to the high complexity and the many options of the training scheme. The most common hand-tuning process starts by learning 3D ConvNets on short video clips and then learns long-term temporal dependency on lengthy clips, while gradually decaying the learning rate from high to low as training progresses. The fact that such a process comes with several heuristic settings motivates this study to seek an optimal "path" to automate the entire training. In this paper, we decompose the path into a series of training "states" and specify the hyper-parameters, e.g., learning rate and the length of input clips, in each state. The estimation of the knee point on the performance-epoch curve triggers the transition from one state to another. We perform dynamic programming over all the…
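The two mechanisms named in the abstract, knee-point estimation and dynamic programming over training states, can be illustrated with a small sketch. It is not the authors' implementation. The knee heuristic (the point farthest above the chord joining the first and last scores), the function names `knee_epoch` and `plan_path`, the `train_and_score` callback, and every state and score in the toy example are assumptions made only for illustration.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Hypothetical state encoding: (learning rate, input clip length).
State = Tuple[float, int]


def knee_epoch(scores: Sequence[float]) -> int:
    """Return the index farthest above the chord from the first to the last
    score. This is one simple knee heuristic, assumed here for illustration."""
    n = len(scores)
    if n < 3:
        return n - 1
    first, last = scores[0], scores[-1]
    best_idx, best_gap = 0, float("-inf")
    for i, s in enumerate(scores):
        chord = first + (last - first) * i / (n - 1)
        if s - chord > best_gap:
            best_idx, best_gap = i, s - chord
    return best_idx


def plan_path(
    states: List[State],
    predecessors: Dict[State, List[State]],
    train_and_score: Callable[[Optional[State], State], float],
) -> Tuple[List[State], float]:
    """Dynamic programming over states given in topological order.

    For each state, keep the predecessor whose transition scores highest,
    then backtrack from the last state to recover the path.
    """
    best_score: Dict[State, float] = {}
    best_prev: Dict[State, Optional[State]] = {}
    for s in states:
        preds = predecessors.get(s, [])
        if not preds:
            best_prev[s] = None
            best_score[s] = train_and_score(None, s)
            continue
        prev = max(preds, key=lambda p: train_and_score(p, s))
        best_prev[s] = prev
        best_score[s] = train_and_score(prev, s)
    path: List[State] = []
    node: Optional[State] = states[-1]
    while node is not None:
        path.append(node)
        node = best_prev[node]
    path.reverse()
    return path, best_score[states[-1]]


# Toy example with made-up states and scores (not results from the paper).
A, B, C, D = (0.04, 16), (0.004, 16), (0.04, 32), (0.004, 32)
toy_scores = {(None, A): 0.60, (A, B): 0.70, (A, C): 0.66,
              (B, D): 0.74, (C, D): 0.76}
path, score = plan_path(
    [A, B, C, D],
    {B: [A], C: [A], D: [B, C]},
    lambda prev, cur: toy_scores[(prev, cur)],
)
knee = knee_epoch([0.1, 0.5, 0.7, 0.75, 0.78, 0.8])
```

In this toy graph the planner prefers lengthening the clips before lowering the learning rate, because the transition from `C` to `D` scores higher than the one from `B` to `D`. In a real setting `train_and_score` would train the network in the target state from the predecessor's checkpoint, which is far more expensive than the table lookup used here.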

Code & Models

Repositories

zhaofanqiu/optimization-planning-for-3d-convnets
PyTorch · Official

Videos

Optimization Planning for 3D ConvNets · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Neural Network Applications