Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks
Yu-Gang Jiang, Zuxuan Wu, Jun Wang, Xiangyang Xue, Shih-Fu Chang

TL;DR
This paper introduces a regularized deep neural network framework that leverages feature and class relationships to enhance video categorization accuracy, outperforming existing methods on multiple benchmarks.
Contribution
The paper proposes a novel regularized DNN that jointly exploits feature and class relationships, improving semantic modeling in video categorization tasks.
Findings
rDNN achieves state-of-the-art performance on Hollywood2 and Columbia datasets.
The approach is efficiently implemented on GPU with affordable training costs.
A new large-scale benchmark dataset, FCVID, is introduced for future research.
Abstract
In this paper, we study the challenging problem of categorizing videos according to high-level semantics such as the existence of a particular human action or a complex event. Although extensive efforts have been devoted in recent years, most existing works combined multiple video features using simple fusion strategies and neglected the utilization of inter-class semantic relationships. This paper proposes a novel unified framework that jointly exploits the feature relationships and the class relationships for improved categorization performance. Specifically, these two types of relationships are estimated and utilized by rigorously imposing regularizations in the learning process of a deep neural network (DNN). Such a regularized DNN (rDNN) can be efficiently realized using a GPU-based implementation with an affordable training cost. Through arming the DNN with better capability of…
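To make the idea of "imposing regularizations in the learning process" concrete, here is a minimal NumPy sketch of a loss that adds two penalties to a standard cross-entropy term. It is an illustration under stated assumptions, not the paper's exact formulation: the class-relationship penalty is assumed to take the form tr(W Ω⁻¹ Wᵀ) with a given positive-definite class-covariance matrix Ω, and the feature-relationship penalty is assumed to be a group (ℓ2,1-style) norm over the fusion-layer weights. The function name `regularized_loss`, its parameters, and the single tanh fusion layer are all hypothetical.

```python
import numpy as np


def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def regularized_loss(feats, labels, fusion_ws, w_out, omega,
                     lam_class=1e-3, lam_feat=1e-3):
    """Cross-entropy plus two illustrative relationship penalties.

    feats     : list of arrays, one per video feature, each of shape (n, d_k)
    labels    : integer class labels, shape (n,)
    fusion_ws : list of arrays mapping each feature to a shared
                fusion layer, each of shape (d_k, h)
    w_out     : output-layer weights, shape (h, c)
    omega     : assumed positive-definite class-relationship matrix, (c, c)
    """
    # Fuse all features in one shared hidden layer (assumed architecture).
    hidden = np.tanh(sum(x @ w for x, w in zip(feats, fusion_ws)))
    probs = softmax(hidden @ w_out)

    n = labels.shape[0]
    cross_entropy = -np.log(probs[np.arange(n), labels] + 1e-12).mean()

    # Class relationships: tr(W Omega^{-1} W^T). Related classes (large
    # entries in omega) are encouraged to have similar weight vectors.
    class_penalty = np.trace(w_out @ np.linalg.solve(omega, w_out.T))

    # Feature relationships: stack the fusion weights of all features and
    # sum the l2 norms of the columns, so each fusion unit forms one group
    # spanning every feature.
    stacked = np.concatenate(fusion_ws, axis=0)      # (sum of d_k, h)
    feature_penalty = np.linalg.norm(stacked, axis=0).sum()

    return cross_entropy + lam_class * class_penalty + lam_feat * feature_penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = [rng.normal(size=(8, 5)), rng.normal(size=(8, 3))]
    labels = rng.integers(0, 4, size=8)
    fusion_ws = [rng.normal(size=(5, 6)) * 0.1, rng.normal(size=(3, 6)) * 0.1]
    w_out = rng.normal(size=(6, 4)) * 0.1
    omega = np.eye(4)  # identity: no class relationship assumed
    print(regularized_loss(feats, labels, fusion_ws, w_out, omega))
```

With `omega` set to the identity, the class penalty reduces to the squared Frobenius norm of `w_out`, i.e. ordinary weight decay; a non-diagonal `omega` is what would let related classes share structure. How Ω is estimated during training is not covered in this sketch.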
