Self-Attention Based Generative Adversarial Networks For Unsupervised Video Summarization
Maria Nektaria Minaidi, Charilaos Papaioannou, Alexandros Potamianos

TL;DR
This paper introduces SUM-GAN-AED, an unsupervised video summarization model that uses a self-attention mechanism for frame selection within a GAN framework. It outperforms the state of the art on SumMe and achieves comparable results on TVSum and COGNIMUSE.
Contribution
The paper presents a self-attention based GAN architecture for unsupervised video summarization, in which self-attention handles frame selection and LSTMs handle encoding and decoding.
Findings
Outperforms the state of the art on the SumMe dataset.
Achieves comparable results on the TVSum and COGNIMUSE datasets.
Demonstrates the effectiveness of self-attention in video frame selection.
Abstract
In this paper, we study the problem of producing a comprehensive video summary following an unsupervised approach that relies on adversarial learning. We build on a popular method where a Generative Adversarial Network (GAN) is trained to create representative summaries, indistinguishable from the originals. The introduction of the attention mechanism into the architecture for the selection, encoding and decoding of video frames, shows the efficacy of self-attention and transformer in modeling temporal relationships for video summarization. We propose the SUM-GAN-AED model that uses a self-attention mechanism for frame selection, combined with LSTMs for encoding and decoding. We evaluate the performance of the SUM-GAN-AED model on the SumMe, TVSum and COGNIMUSE datasets. Experimental results indicate that using a self-attention mechanism as the frame selection mechanism outperforms the…
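To make the frame-selection idea concrete, the following is a minimal sketch of how scaled dot-product self-attention can turn a sequence of frame features into per-frame importance scores. It is not the authors' implementation: the function name `self_attention_frame_scores`, the projection matrices, the sigmoid scoring head, and all dimensions are assumptions chosen for illustration, and the random features stand in for real CNN frame descriptors.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_frame_scores(features, w_q, w_k, w_v, w_out):
    """Score each frame using scaled dot-product self-attention.

    features: (T, d) array, one feature vector per frame.
    w_q, w_k, w_v: (d, d_k) projection matrices (hypothetical parameters).
    w_out: (d_k,) vector mapping each attended frame to a scalar logit.
    Returns (scores, attn): scores has shape (T,) with values in (0, 1),
    attn has shape (T, T) and each row sums to 1.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    d_k = q.shape[-1]
    # Every frame attends to every other frame in the sequence.
    attn = softmax(q @ k.T / np.sqrt(d_k), axis=-1)
    context = attn @ v
    # Illustrative scoring head: linear map followed by a sigmoid.
    logits = context @ w_out
    scores = 1.0 / (1.0 + np.exp(-logits))
    return scores, attn


# Toy input: 8 frames with 16-dimensional features (random placeholders).
rng = np.random.default_rng(0)
T, d, d_k = 8, 16, 4
features = rng.normal(size=(T, d))
w_q = rng.normal(size=(d, d_k)) * 0.1
w_k = rng.normal(size=(d, d_k)) * 0.1
w_v = rng.normal(size=(d, d_k)) * 0.1
w_out = rng.normal(size=(d_k,))

scores, attn = self_attention_frame_scores(features, w_q, w_k, w_v, w_out)

# Assumption: the scores weight the frame features before they are passed
# to a downstream sequence encoder (an LSTM in the model described above).
weighted = features * scores[:, None]
```

In a trained model the projection matrices would be learned jointly with the encoder, decoder, and discriminator; here they are random, so the resulting scores carry no meaning beyond showing the shapes and data flow.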
Taxonomy
Topics: Video Analysis and Summarization · Music and Audio Processing · Natural Language Processing Techniques
