I Spy With My Little Eye: A Minimum Cost Multicut Investigation of Dataset Frames

Katharina Prasse; Isaac Bravo; Stefanie Walter; Margret Keuper

arXiv:2412.01296·cs.CV·December 3, 2024

TL;DR

This paper presents a novel approach to visual framing analysis by formulating image clustering as a Minimum Cost Multicut Problem, leveraging different embedding spaces to improve automated detection of visual frames in social science datasets.

Contribution

It introduces the use of the Minimum Cost Multicut formulation for image clustering in visual frame analysis and compares embedding spaces for optimal clustering performance.

Findings

01. DINOv2 is effective for broad visual frames
02. ConvNeXt V2 detects fine-grained differences
03. Multicut clustering improves visual frame detection

Abstract

Visual framing analysis is a key method in the social sciences for determining common themes and concepts in a given discourse. To reduce manual effort, image clustering can significantly speed up the annotation process. In this work, we phrase the clustering task as a Minimum Cost Multicut Problem (MP). Solutions to the MP have been shown to provide clusterings that maximize the posterior probability, solely from provided local, pairwise probabilities of two images belonging to the same cluster. We discuss the efficacy of numerous embedding spaces to detect visual frames and show the superiority of the MP over other clustering methods. To this end, we employ the climate change dataset ClimateTV, which contains images commonly used for visual frame analysis. For broad visual frames, DINOv2 is a suitable embedding space, while ConvNeXt V2 returns a larger number of clusters which contain…
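
To make the idea of clustering from pairwise probabilities concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the probability of two images sharing a cluster is a logistic function of the cosine similarity of their embeddings (the `threshold` and `sharpness` parameters are hypothetical), uses the log-odds of that probability as a signed edge cost, and approximates the multicut with a simple greedy edge-contraction heuristic instead of an exact solver. Note that the number of clusters is not specified in advance; it follows from the signs of the costs.

```python
import numpy as np


def pairwise_cut_costs(embeddings, threshold=0.5, sharpness=10.0):
    """Signed edge costs from embeddings (assumed calibration, not from the paper).

    Assumes p(same cluster) = sigmoid(sharpness * (cosine_sim - threshold)).
    The cost is the log-odds of p, which simplifies to the sigmoid's argument:
    positive costs favour joining two images, negative costs favour cutting.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cosine_sim = normed @ normed.T
    costs = sharpness * (cosine_sim - threshold)
    np.fill_diagonal(costs, 0.0)
    return costs


def multicut_objective(costs, labels):
    """Sum of costs over all cut edges (pairs in different clusters); lower is better."""
    cut = labels[:, None] != labels[None, :]
    return float(np.sum(np.triu(costs * cut, k=1)))


def greedy_additive_edge_contraction(costs):
    """Heuristic multicut: repeatedly merge the cluster pair with the largest
    positive summed cost, and stop once no pair has a positive cost."""
    weights = costs.astype(float).copy()
    n = weights.shape[0]
    np.fill_diagonal(weights, -np.inf)  # never select a self-edge
    labels = np.arange(n)               # each image starts as its own cluster
    active = np.ones(n, dtype=bool)     # clusters that still exist

    while True:
        masked = np.where(np.outer(active, active), weights, -np.inf)
        i, j = np.unravel_index(np.argmax(masked), masked.shape)
        if masked[i, j] <= 0:
            break  # only repulsive (or no) edges remain
        # Contract j into i: parallel edges to every other cluster are summed.
        weights[i, :] += weights[j, :]
        weights[:, i] += weights[:, j]
        weights[i, i] = -np.inf
        active[j] = False
        labels[labels == j] = i

    # Relabel clusters as consecutive integers 0..k-1.
    _, labels = np.unique(labels, return_inverse=True)
    return labels
```

Calling `greedy_additive_edge_contraction(pairwise_cut_costs(embeddings))` on an `(n_images, dim)` array returns one cluster label per image, and `multicut_objective` can be used to compare the result against alternative labelings. The dense cost matrix keeps the sketch short but limits it to small image sets.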

Code & Models

Repositories

kathpra/mp4visualframedetection (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: ConvNeXt · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings