Gated-ViGAT: Efficient Bottom-Up Event Recognition and Explanation Using a New Frame Selection Policy and Gating Mechanism

Nikolaos Gkalelis; Dimitrios Daskalakis; Vasileios Mezaris

arXiv:2301.07565 · cs.CV · January 19, 2023 · 1 citation

TL;DR

Gated-ViGAT introduces an efficient bottom-up video event recognition method that uses a novel frame selection policy and gating mechanism to reduce computation while maintaining high accuracy and explainability.

Contribution

The paper proposes a new frame sampling policy based on weighted in-degrees and a gating mechanism for early decision-making in video event recognition.
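
As a rough illustration of what a weighted in-degree measures, the sketch below sums the incoming edge weights of each node in a small attention-style adjacency matrix. It assumes that rows index source nodes and columns index target nodes, so each WiD is a column sum; the function name `weighted_in_degrees` and the toy matrix are hypothetical and are not taken from the paper or its repository.

```python
def weighted_in_degrees(adjacency):
    """Return the column sums of a square adjacency matrix (list of rows).

    Assumption: adjacency[i][j] is the weight of the edge from node i to
    node j, so column j collects all weights flowing into node j.
    """
    n = len(adjacency)
    if any(len(row) != n for row in adjacency):
        raise ValueError("adjacency matrix must be square")
    return [sum(adjacency[i][j] for i in range(n)) for j in range(n)]


# Toy 3-node attention matrix (illustrative values only).
attention = [
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.5, 0.2],
]
wids = weighted_in_degrees(attention)

# The node that attracts the most attention is treated as the most salient.
most_salient = max(range(len(wids)), key=wids.__getitem__)
```

In this toy matrix every node attends mostly to the second node, so it receives the largest WiD and would be ranked as the most salient.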

Findings

1. Significant reduction in computational complexity compared to previous methods.

2. Maintains high event recognition accuracy and explainability.

3. Effective selection of salient and diverse frames for analysis.

Abstract

This paper proposes Gated-ViGAT, an efficient approach to video event recognition that combines bottom-up (object) information, a new frame sampling policy, and a gating mechanism. Specifically, the frame sampling policy uses weighted in-degrees (WiDs), derived from the adjacency matrices of graph attention networks (GATs), together with a dissimilarity measure to select the frames that are both the most salient and the most diverse in representing the event in the video. The gating mechanism then fetches the selected frames sequentially and exits early once a sufficiently confident decision is reached. In this way, only a few frames are processed by the computationally expensive branch of the network that extracts the bottom-up information. The experimental evaluation on two large, publicly available video datasets (MiniKinetics, ActivityNet) demonstrates…
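
The two ideas in the abstract, selecting salient yet diverse frames and exiting early once a decision is confident enough, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the scoring rule (WiD multiplied by the smallest cosine dissimilarity to the frames already chosen), the `predict` callable, the confidence threshold, and all toy values are hypothetical.

```python
import math


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_frames(wids, features, k):
    """Greedily pick up to k frame indices that are salient and diverse.

    Assumed rule: the first pick has the highest WiD; each later pick
    maximizes WiD times its minimum dissimilarity to the frames chosen so far.
    """
    remaining = list(range(len(wids)))
    selected = []
    while remaining and len(selected) < k:
        def score(i):
            if not selected:
                return wids[i]
            diversity = min(
                cosine_dissimilarity(features[i], features[j]) for j in selected
            )
            return wids[i] * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def recognize_with_early_exit(frame_order, predict, threshold):
    """Feed frames one at a time; stop once confidence reaches the threshold.

    `predict` is a hypothetical stand-in for the expensive branch plus gate:
    it receives the frames processed so far and returns (label, confidence).
    """
    label, confidence, used = None, 0.0, []
    for frame_index in frame_order:
        used.append(frame_index)
        label, confidence = predict(used)
        if confidence >= threshold:
            break
    return label, confidence, len(used)


# Toy data: frame 1 is salient but nearly identical to frame 0.
wids = [0.9, 0.85, 0.2, 0.6]
features = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0], [0.5, 0.5]]


def toy_predict(frames_so_far):
    # Stand-in classifier whose confidence grows with each frame seen.
    return "event", min(1.0, 0.4 + 0.25 * len(frames_so_far))


order = select_frames(wids, features, k=3)
label, confidence, frames_used = recognize_with_early_exit(
    order, toy_predict, threshold=0.8
)
```

In this toy run the near-duplicate frame 1 is skipped despite its high WiD, and the loop stops before all selected frames are consumed, which is the source of the computational savings the abstract describes.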

Code & Models

Repositories

bmezaris/gated-vigat (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Human Pose and Action Recognition · Multimodal Machine Learning Applications