Spiking Variational Graph Representation Inference for Video Summarization

Wenrui Li; Wei Han; Liang-Jian Deng; Ruiqin Xiong; Xiaopeng Fan

arXiv:2508.15389 · cs.CV · August 22, 2025

TL;DR

This paper introduces SpiVG, a novel spiking neural network-based framework for video summarization that captures global temporal dependencies, reduces noise impact, and improves summarization accuracy across multiple datasets.

Contribution

The paper presents a new Spiking Variational Graph Network with a keyframe extractor, dynamic graph reasoning, and variational inference modules, advancing video summarization techniques.

Findings

01. Outperforms existing methods on SumMe, TVSum, VideoXum, and QFVS datasets.

02. Effectively captures global temporal dependencies and semantic coherence.

03. Reduces noise influence during multi-channel feature fusion.

Abstract

With the rise of short video content, efficient video summarization techniques for extracting key information have become crucial. However, existing methods struggle to capture global temporal dependencies and to maintain the semantic coherence of video content. They are also affected by noise during multi-channel feature fusion. We propose a Spiking Variational Graph (SpiVG) Network, which enhances information density and reduces computational complexity. First, we design a keyframe extractor based on Spiking Neural Networks (SNNs), leveraging the event-driven computation mechanism of SNNs to learn keyframe features autonomously. To enable fine-grained and adaptable reasoning across video frames, we introduce a Dynamic Aggregation Graph Reasoner, which decouples contextual object consistency from semantic perspective coherence. We present a Variational…
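
The event-driven computation that the abstract attributes to SNNs can be illustrated with a single leaky integrate-and-fire (LIF) neuron. The sketch below is a generic textbook-style LIF update, not the SpiVG keyframe extractor: the function name `lif_spikes`, the parameters `tau`, `v_threshold`, and `v_reset`, and the idea of feeding it one scalar score per frame are all assumptions made for illustration. It shows only how a neuron stays silent on weak input and emits a spike when its membrane potential crosses a threshold.

```python
import numpy as np


def lif_spikes(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over a 1-D input sequence.

    Hypothetical helper for illustration; not taken from the paper.
    Returns a 0/1 array with a 1 at every time step where the neuron fires.
    """
    v = v_reset
    spikes = np.zeros(len(inputs), dtype=int)
    for t, x in enumerate(inputs):
        # Leaky integration: the potential moves a fraction 1/tau of the way
        # from its current value toward the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            # Threshold crossed: emit a spike and hard-reset the potential.
            spikes[t] = 1
            v = v_reset
    return spikes


# Assumed per-frame scores: small values for static frames, large for salient ones.
frame_scores = [0.1, 0.1, 3.0, 0.1, 0.1, 3.0, 3.0]
print(lif_spikes(frame_scores))  # [0 0 1 0 0 1 1]
```

In this toy setting the output is non-zero only at the frames with strong input, which is the sense in which the computation is event-driven. How SpiVG actually maps spikes to keyframes is not specified in the text above.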

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis