Assessing Graphical Perception of Image Embedding Models using Channel Effectiveness

Soohyun Lee; Minsuk Chang; Seokhyeon Park; Jinwook Seo

arXiv:2407.20845·cs.CV·July 31, 2024

TL;DR

This paper introduces a new framework to evaluate how image embedding models perceive graphical components in charts, focusing on channel accuracy and discriminability, to better understand and improve model comprehension of complex visual data.

Contribution

It proposes a novel evaluation method for assessing the graphical perception of image embedding models, emphasizing channel effectiveness in chart understanding tasks.

Findings

01. CLIP perceives channel accuracy differently from humans

02. CLIP shows unique discriminability in length, tilt, and curvature channels

03. Framework paves the way for broader benchmarks for visual encoders

Abstract

Recent advancements in vision models have greatly improved their ability to handle complex chart understanding tasks, like chart captioning and question answering. However, it remains challenging to assess how these models process charts. Existing benchmarks only roughly evaluate model performance without evaluating the underlying mechanisms, such as how models extract image embeddings. This limits our understanding of the model's ability to perceive fundamental graphical components. To address this, we introduce a novel evaluation framework to assess the graphical perception of image embedding models. For chart comprehension, we examine two main aspects of channel effectiveness: accuracy and discriminability of various visual channels. Channel accuracy is assessed through the linearity of embeddings, measuring how well the perceived magnitude aligns with the size of the stimulus.…
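The abstract describes channel accuracy as the linearity of embeddings with respect to stimulus size. The sketch below is one plausible way to quantify that idea, not the authors' implementation: it projects a set of embeddings onto their first principal direction and reports the R² of a linear relationship between that projection and the stimulus magnitude. The function name `linearity_score`, the use of a principal-component projection, and the synthetic vectors standing in for real model embeddings are all assumptions made for illustration.

```python
import numpy as np


def linearity_score(magnitudes, embeddings):
    """R^2 between stimulus magnitude and a 1-D projection of the embeddings.

    Hypothetical helper: the projection onto the first principal direction
    is an assumption, not necessarily the measure used in the paper.
    """
    x = np.asarray(magnitudes, dtype=float)
    e = np.asarray(embeddings, dtype=float)
    centered = e - e.mean(axis=0)
    # First right singular vector = direction of greatest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ vt[0]
    # Squaring the correlation removes the arbitrary sign of the SVD direction.
    r = np.corrcoef(x, projection)[0, 1]
    return r ** 2


if __name__ == "__main__":
    # Synthetic stand-ins for embeddings of stimuli with growing magnitude
    # (for example, bars of increasing length). No real model is queried here.
    sizes = np.linspace(0.0, 1.0, 20)
    direction = np.array([1.0, -2.0, 0.5])

    linear_embeddings = np.outer(sizes, direction) + 3.0
    compressed_embeddings = np.outer(sizes ** 4, direction)

    print("linear channel:    ", linearity_score(sizes, linear_embeddings))
    print("non-linear channel:", linearity_score(sizes, compressed_embeddings))
```

Under this reading, a score near 1 means the embedding moves in proportion to the stimulus, while a lower score indicates that the model compresses or distorts the perceived magnitude for that channel.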

Taxonomy

Topics: Cell Image Analysis Techniques

Methods: Contrastive Language-Image Pre-training