CT-Net: Channel Tensorization Network for Video Classification

Kunchang Li; Xianhang Li; Yali Wang; Jun Wang; Yu Qiao

arXiv:2106.01603 · cs.CV · June 4, 2021 · 26 citations

TL;DR

This paper introduces CT-Net, a novel channel tensorization network for video classification that balances efficiency and feature interaction by tensorizing channels and integrating a tensor excitation mechanism, achieving state-of-the-art results.

Contribution

Proposes a new Channel Tensorization Network (CT-Net) that factorizes channels into multiple sub-dimensions and incorporates a Tensor Excitation mechanism for improved video classification.

Findings

01 Outperforms recent SOTA methods on Kinetics-400 and Something-Something benchmarks.

02 Achieves better accuracy and efficiency compared to existing approaches.

03 Effectively enlarges the 3D receptive field through channel tensorization.

Abstract

3D convolution is powerful for video classification but often computationally expensive, so recent studies mainly focus on decomposing it along the spatial-temporal and/or channel dimensions. Unfortunately, most approaches fail to achieve a good balance between convolutional efficiency and feature-interaction sufficiency. For this reason, we propose a concise and novel Channel Tensorization Network (CT-Net), which treats the channel dimension of the input feature as a product of K sub-dimensions. On one hand, it naturally factorizes convolution across multiple dimensions, leading to a light computation burden. On the other hand, it can effectively enhance feature interaction among different channels and progressively enlarge the 3D receptive field of such interaction to boost classification accuracy. Furthermore, we equip our CT-Module with a Tensor Excitation (TE) mechanism. It can…
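
The efficiency argument in the abstract can be illustrated with a rough weight count. The sketch below is not taken from the paper or its repository: it assumes that each sub-convolution mixes channels along only one of the K sub-dimensions (comparable to a grouped convolution with C / C_k groups) and that every sub-convolution uses the same 3x3x3 kernel, which may differ from the kernel shapes CT-Net actually uses. The function names are hypothetical.

```python
from math import prod


def dense_conv_params(channels, kernel=(3, 3, 3)):
    """Weights in a dense 3D conv with `channels` inputs and outputs (no bias)."""
    return channels * channels * prod(kernel)


def tensorized_conv_params(sub_dims, kernel=(3, 3, 3)):
    """Weights when C = prod(sub_dims) and one sub-convolution runs per sub-dimension.

    Assumption (illustrative only): the k-th sub-convolution connects each
    output channel to just the C_k input channels that share its other
    sub-dimension indices.
    """
    channels = prod(sub_dims)
    return sum(channels * c_k * prod(kernel) for c_k in sub_dims)


dense = dense_conv_params(64)              # C = 64, no factorization
tensorized = tensorized_conv_params((8, 8))  # C = 8 * 8, K = 2
print(dense, tensorized, dense / tensorized)
```

Under these assumptions, factorizing 64 channels as 8 x 8 needs a quarter of the weights of the dense convolution, while stacking the two sub-convolutions still lets every output channel depend on every input channel. With K = 1 the count reduces to the dense case.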

Code & Models

Repositories

Andy1621/CT-Net (PyTorch, official)

Videos

CT-Net: Channel Tensorization Network for Video Classification · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications

Methods: Batch Normalization · Residual Connection · Average Pooling · Global Average Pooling · Kaiming Initialization · 1x1 Convolution · Residual Block · Bottleneck Residual Block · Max Pooling