PRISM: Perceptual Recognition for Identifying Standout Moments in Human-Centric Keyframe Extraction

Mert Can Cakmak; Nitin Agarwal; Diwash Poudel

arXiv:2506.19168·cs.CV·June 25, 2025

TL;DR

PRISM is a lightweight, interpretable, and perceptually-aligned keyframe extraction framework that efficiently identifies impactful moments in videos, aiding content moderation and summarization without relying on deep learning.

Contribution

The paper introduces a keyframe extraction method based on perceptual color difference. The method is training-free, computationally efficient, and reported to be effective across diverse video datasets.

Findings

1. Achieves high accuracy and fidelity in keyframe extraction
2. Maintains high compression ratios across datasets
3. Effective in both structured and unstructured video content

Abstract

Online videos play a central role in shaping political discourse and amplifying cyber social threats such as misinformation, propaganda, and radicalization. Detecting the most impactful or "standout" moments in video content is crucial for content moderation, summarization, and forensic analysis. In this paper, we introduce PRISM (Perceptual Recognition for Identifying Standout Moments), a lightweight and perceptually-aligned framework for keyframe extraction. PRISM operates in the CIELAB color space and uses perceptual color difference metrics to identify frames that align with human visual sensitivity. Unlike deep learning-based approaches, PRISM is interpretable, training-free, and computationally efficient, making it well suited for real-time and resource-constrained environments. We evaluate PRISM on four benchmark datasets: BBC, TVSum, SumMe, and ClipShots, and demonstrate that it…
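
To make the idea of selecting frames by perceptual color difference in CIELAB concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which color difference formula or selection rule PRISM uses, so this example assumes the simple CIE76 formula (Euclidean distance in CIELAB), a fixed hypothetical threshold, and a rule that always keeps the first frame. The function names (`rgb_to_lab`, `mean_delta_e`, `select_keyframes`) are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]
Lab = Tuple[float, float, float]

# D65 reference white, the standard white point for sRGB.
_XN, _YN, _ZN = 0.95047, 1.0, 1.08883


def _srgb_to_linear(c: float) -> float:
    """Undo the sRGB transfer curve for a channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t: float) -> float:
    """Piecewise cube-root function from the CIELAB definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb_to_lab(pixel: RGB) -> Lab:
    """Convert one 8-bit sRGB pixel to CIELAB (D65)."""
    r, g, b = (_srgb_to_linear(v / 255.0) for v in pixel)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = _f(x / _XN), _f(y / _YN), _f(z / _ZN)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def mean_delta_e(frame_a: Sequence[RGB], frame_b: Sequence[RGB]) -> float:
    """Mean per-pixel CIE76 color difference between two equal-size frames."""
    total = 0.0
    for pa, pb in zip(frame_a, frame_b):
        total += math.dist(rgb_to_lab(pa), rgb_to_lab(pb))
    return total / len(frame_a)


def select_keyframes(
    frames: Sequence[Sequence[RGB]], threshold: float = 10.0
) -> List[int]:
    """Return indices of frames that differ strongly from their predecessor.

    Assumptions (not from the paper): the first frame is always kept, and
    `threshold` is a hypothetical fixed cutoff on the mean Delta E.
    """
    if not frames:
        return []
    keyframes = [0]
    for i in range(1, len(frames)):
        if mean_delta_e(frames[i - 1], frames[i]) > threshold:
            keyframes.append(i)
    return keyframes


if __name__ == "__main__":
    black = [(0, 0, 0)] * 4
    white = [(255, 255, 255)] * 4
    # Frames are flat lists of pixels here; a real pipeline would decode video.
    print(select_keyframes([black, black, white, white, black]))
```

Each frame here is a flat list of sRGB pixels to keep the sketch dependency-free. A practical version would decode video with a library, vectorize the conversion, and could swap in a more perceptually uniform formula such as CIEDE2000.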
