EdgeVidSum: Real-Time Personalized Video Summarization at the Edge
Ghulam Mujtaba, Eun-Seok Ryu

TL;DR
EdgeVidSum is a lightweight, real-time video summarization method that runs on edge devices, providing personalized summaries while ensuring user privacy through local processing and efficient neural architectures.
Contribution
The paper introduces a novel thumbnail-based hierarchical approach for real-time, personalized video summarization directly on resource-constrained edge devices.
Findings
Operates in real time on a Jetson Nano
Reduces computational complexity significantly by analyzing thumbnail containers instead of processing every frame
Maintains high semantic relevance in summaries
Abstract
EdgeVidSum is a lightweight method that generates personalized, fast-forward summaries of long-form videos directly on edge devices. The proposed approach enables real-time video summarization while safeguarding user privacy through local data processing using innovative thumbnail-based techniques and efficient neural architectures. Unlike conventional methods that process entire videos frame by frame, the proposed method uses thumbnail containers to significantly reduce computational complexity without sacrificing semantic relevance. The framework employs a hierarchical analysis approach, where a lightweight 2D CNN model identifies user-preferred content from thumbnails and generates timestamps to create fast-forward summaries. Our interactive demo highlights the system's ability to create tailored video summaries for long-form videos, such as movies, sports events, and TV shows, based…
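The timestamp-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a classifier has already produced one relevance score per thumbnail, and the names `scores_to_segments`, `interval_s`, and `threshold` are hypothetical. The sketch only shows how consecutive user-preferred thumbnails might be merged into (start, end) timestamps for a fast-forward summary.

```python
from typing import List, Tuple


def scores_to_segments(
    scores: List[float],
    interval_s: float = 1.0,   # assumed: seconds of video covered by one thumbnail
    threshold: float = 0.5,    # assumed: minimum score for "user-preferred" content
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive above-threshold thumbnails into time segments."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first thumbnail in the current run, if any
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new run of preferred thumbnails begins here
        elif start is not None:
            # The run ended at the previous thumbnail; close the segment.
            segments.append((start * interval_s, i * interval_s))
            start = None
    if start is not None:
        # The last run extends to the end of the video.
        segments.append((start * interval_s, len(scores) * interval_s))
    return segments


# Hypothetical scores for five thumbnails sampled every 10 seconds.
example = scores_to_segments([0.1, 0.9, 0.8, 0.2, 0.7], interval_s=10.0)
print(example)  # [(10.0, 30.0), (40.0, 50.0)]
```

Because only per-thumbnail scores are scanned, the cost of this step grows with the number of thumbnails rather than the number of video frames, which is the kind of saving the abstract attributes to thumbnail containers.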
Taxonomy
Topics: Video Analysis and Summarization · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques
