STEC: A Reference-Free Spatio-Temporal Entropy Coverage Metric for Evaluating Sampled Video Frames

Shih-Yao Lin

arXiv:2601.13974·cs.CV·January 21, 2026

TL;DR

STEC is a reference-free metric that scores a set of sampled video frames by how well it covers the video's spatio-temporal information, giving a way to compare frame samplers used in video understanding pipelines.

Contribution

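We introduce STEC, a lightweight, reference-free metric that assesses sampled video frames based on spatial information, temporal coverage, and redundancy. It fills a gap left by existing metrics, which target perceptual quality or reconstruction fidelity rather than how well the sampled frames represent the video.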

Findings

01

STEC effectively differentiates various sampling strategies.

02

It reveals robustness patterns not captured by average performance.

03

STEC serves as a diagnostic tool for frame sampling quality.

Abstract

Frame sampling is a fundamental component in video understanding and video-language model pipelines, yet evaluating the quality of sampled frames remains challenging. Existing evaluation metrics primarily focus on perceptual quality or reconstruction fidelity, and are not designed to assess whether a set of sampled frames adequately captures informative and representative video content. We propose Spatio-Temporal Entropy Coverage (STEC), a simple and non-reference metric for evaluating the effectiveness of video frame sampling. STEC builds upon Spatio-Temporal Frame Entropy (STFE), which measures per-frame spatial information via entropy-based structural complexity, and evaluates sampled frames based on their temporal coverage and redundancy. By jointly modeling spatial information strength, temporal dispersion, and non-redundancy, STEC provides a principled and lightweight measure…
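
The abstract names three ingredients (per-frame spatial entropy, temporal dispersion, and non-redundancy) but this listing does not give the formulas. The sketch below is only an illustration of how such terms could be computed and combined, not the paper's definition of STFE or STEC. Everything in it is an assumption made for the example: the function names, the histogram-based entropy, the segment-hit definition of coverage, the pixel-difference definition of redundancy, and the product used to combine them.

```python
import numpy as np


def frame_entropy(frame, bins=32):
    """Shannon entropy in bits of one 8-bit grayscale frame's intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def temporal_coverage(indices, num_total):
    """Fraction of k equal-length timeline segments hit by the k sampled indices."""
    idx = np.asarray(indices)
    k = len(idx)
    segments = (idx * k) // num_total
    return len(np.unique(segments)) / k


def non_redundancy(frames, indices):
    """Mean absolute pixel difference between consecutive sampled frames, in [0, 1]."""
    if len(indices) < 2:
        return 1.0  # assumption: a single frame has nothing to be redundant with
    picked = [frames[i].astype(float) for i in indices]
    diffs = [np.abs(a - b).mean() / 255.0 for a, b in zip(picked, picked[1:])]
    return float(np.mean(diffs))


def toy_coverage_score(frames, indices, bins=32):
    """Product of the three terms; this combination rule is an assumption."""
    # Normalise entropy by its maximum, log2(bins), so the term lies in [0, 1].
    spatial = np.mean([frame_entropy(frames[i], bins) for i in indices]) / np.log2(bins)
    coverage = temporal_coverage(indices, len(frames))
    return float(spatial * coverage * non_redundancy(frames, indices))


# Synthetic 8-frame clip of random 16x16 grayscale frames (illustration only).
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(8, 16, 16), dtype=np.uint8)
spread = toy_coverage_score(video, [0, 2, 5, 7])     # indices spread over the clip
clustered = toy_coverage_score(video, [0, 1, 2, 3])  # indices bunched at the start
print(spread, clustered)
```

In this toy setup the spread-out indices score higher than the clustered ones, mainly because the clustered sample hits only half of the timeline segments. A score of this kind needs no reference frames or labels, which is the property the abstract refers to as non-reference.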

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Human Pose and Action Recognition