EdgeVidSum: Real-Time Personalized Video Summarization at the Edge

Ghulam Mujtaba; Eun-Seok Ryu

arXiv:2506.03171 · cs.CV · June 5, 2025

Open Access

TL;DR

EdgeVidSum is a lightweight, real-time video summarization method that runs on edge devices, providing personalized summaries while ensuring user privacy through local processing and efficient neural architectures.

Contribution

The paper introduces a novel thumbnail-based hierarchical approach for real-time, personalized video summarization directly on resource-constrained edge devices.

Findings

01 Operates in real time on a Jetson Nano

02 Reduces computational complexity by analyzing thumbnail containers instead of every video frame

03 Maintains high semantic relevance in summaries

Abstract

EdgeVidSum is a lightweight method that generates personalized, fast-forward summaries of long-form videos directly on edge devices. The proposed approach enables real-time video summarization while safeguarding user privacy through local data processing using innovative thumbnail-based techniques and efficient neural architectures. Unlike conventional methods that process entire videos frame by frame, the proposed method uses thumbnail containers to significantly reduce computational complexity without sacrificing semantic relevance. The framework employs a hierarchical analysis approach, where a lightweight 2D CNN model identifies user-preferred content from thumbnails and generates timestamps to create fast-forward summaries. Our interactive demo highlights the system's ability to create tailored video summaries for long-form videos, such as movies, sports events, and TV shows, based…
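The timestamp-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the lightweight 2D CNN has already produced one preference score per thumbnail, and the names `scores_to_segments`, `seconds_per_thumbnail`, and `threshold` are hypothetical, introduced here only for illustration.

```python
from typing import List, Tuple


def scores_to_segments(
    scores: List[float],
    seconds_per_thumbnail: float = 1.0,  # assumed sampling interval, not from the paper
    threshold: float = 0.5,              # assumed decision threshold, not from the paper
) -> List[Tuple[float, float]]:
    """Merge consecutive above-threshold thumbnails into (start, end) timestamps."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first thumbnail in the currently open segment
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # open a new segment
        elif start is not None:
            # The preferred run ended at thumbnail i; close the segment.
            segments.append((start * seconds_per_thumbnail, i * seconds_per_thumbnail))
            start = None
    if start is not None:
        # The video ended while a segment was still open.
        segments.append((start * seconds_per_thumbnail, len(scores) * seconds_per_thumbnail))
    return segments


if __name__ == "__main__":
    # Stand-in scores for five thumbnails sampled every 2 seconds.
    print(scores_to_segments([0.1, 0.9, 0.8, 0.2, 0.7], seconds_per_thumbnail=2.0))
```

A player could then show only the returned segments, or play them at normal speed and fast-forward through the rest. Because only thumbnails are scored, the cost grows with the number of thumbnails rather than the number of frames.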

Taxonomy

Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques