GraphVid: It Only Takes a Few Nodes to Understand a Video

Eitan Kosman; Dotan Di Castro

arXiv:2207.01375·cs.CV·July 21, 2022

GraphVid: It Only Takes a Few Nodes to Understand a Video

Eitan Kosman, Dotan Di Castro

PDF

Open Access

TL;DR

GraphVid introduces a graph-based video representation using superpixels and GCNs, significantly reducing computational costs while maintaining competitive performance, enabling efficient video understanding on limited hardware.

Contribution

The paper presents a novel superpixel-based graph representation for videos processed with GCNs, achieving high efficiency and comparable accuracy with fewer resources.

Findings

01

Reduces computational requirements 10-fold

02

Maintains competitive accuracy on Kinetics-400 and Charades

03

Enables resource-limited hardware to perform video analysis

Abstract

We propose a concise representation of videos that encode perceptually meaningful features into graphs. With this representation, we aim to leverage the large amount of redundancies in videos and save computations. First, we construct superpixel-based graph representations of videos by considering superpixels as graph nodes and create spatial and temporal connections between adjacent superpixels. Then, we leverage Graph Convolutional Networks to process this representation and predict the desired output. As a result, we are able to train models with much fewer parameters, which translates into short training periods and a reduction in computation resource requirements. A comprehensive experimental study on the publicly available datasets Kinetics-400 and Charades shows that the proposed method is highly cost-effective and uses limited commodity hardware during training and inference. It…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Graph Neural Networks · Visual Attention and Saliency Detection · Image and Video Quality Assessment