VisTopics: A Visual Semantic Unsupervised Approach to Topic Modeling of Video and Image Data

Ayse D Lokmanoglu; Dror Walter

arXiv:2505.14868·cs.IR·September 18, 2025

TL;DR

VisTopics is an unsupervised framework that combines caption-based semantic analysis and clustering to extract topics from large-scale visual datasets, supporting media studies research on visual narratives.

Contribution

VisTopics introduces an end-to-end pipeline that integrates frame extraction, deduplication, and semantic clustering for the analysis of visual media.

Findings

01

Reduced 11,070 frames from 452 NBC News videos to 6,928 deduplicated frames for analysis

02

Identified 35 topics, ranging from political events to environmental crises

03

Validated the semantic clustering against human coding

Abstract

Understanding visual narratives is crucial for examining the evolving dynamics of media representation. This study introduces VisTopics, a computational framework designed to analyze large-scale visual datasets through an end-to-end pipeline encompassing frame extraction, deduplication, and semantic clustering. Applying VisTopics to a dataset of 452 NBC News videos reduced 11,070 extracted frames to 6,928 deduplicated frames, which were then semantically analyzed to uncover 35 topics ranging from political events to environmental crises. By integrating Latent Dirichlet Allocation with caption-based semantic analysis, VisTopics demonstrates its potential to reveal patterns in visual framing across diverse contexts. This approach enables longitudinal studies and cross-platform comparisons, shedding light on the intersection of media, technology, and public discourse. The study…
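The deduplication stage can be illustrated with a minimal sketch. The abstract does not say which deduplication method VisTopics uses, so the code below assumes a simple average-hash comparison purely for illustration. The names `average_hash`, `hamming`, `deduplicate`, and the `max_distance` threshold are hypothetical, and frames are represented as tiny grayscale grids rather than real video frames.

```python
from typing import List, Sequence

# Hypothetical frame representation: a small grayscale grid with values 0-255.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def deduplicate(frames: List[Frame], max_distance: int = 1) -> List[int]:
    """Return indices of kept frames.

    A frame is dropped when its hash lies within max_distance bits of the
    hash of a frame that was already kept (an assumed rule, not the paper's).
    """
    kept: List[int] = []
    kept_hashes: List[int] = []
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if all(hamming(h, k) > max_distance for k in kept_hashes):
            kept.append(i)
            kept_hashes.append(h)
    return kept


frames = [
    [[10, 200], [10, 200]],  # dark left column, bright right column
    [[12, 198], [11, 201]],  # near-duplicate of the first frame
    [[200, 200], [10, 10]],  # bright top row, dark bottom row
]
kept = deduplicate(frames)  # the near-duplicate at index 1 is dropped
```

For scale, the reduction reported in the abstract (11,070 frames to 6,928) corresponds to keeping roughly 62.6% of the extracted frames.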
