LLM-based Hierarchical Concept Decomposition for Interpretable Fine-Grained Image Classification

Renyi Qu; Mark Yatskar

arXiv:2405.18672·cs.CV·June 4, 2024

TL;DR

This paper introduces Hi-CoDe, a hierarchical concept decomposition framework for fine-grained image classification. It improves interpretability by organizing visual concepts into a structured hierarchy and classifying with simple linear classifiers, while maintaining high accuracy.

Contribution

The novel Hi-CoDe framework combines structured concept hierarchies with linear classifiers, improving interpretability without sacrificing performance in image classification.

Findings

01

Achieves state-of-the-art interpretability in fine-grained classification.

02

Provides clear insight into the model's decision-making process.

03

Maintains accuracy competitive with more complex models.

Abstract

(Renyi Qu's Master's Thesis) Recent advancements in interpretable models for vision-language tasks have achieved competitive performance; however, their interpretability often suffers due to the reliance on unstructured text outputs from large language models (LLMs). This introduces randomness and compromises both transparency and reliability, which are essential for addressing safety issues in AI systems. We introduce Hi-CoDe (Hierarchical Concept Decomposition), a novel framework designed to enhance model interpretability through structured concept analysis. Our approach consists of two main components: (1) We use GPT-4 to decompose an input image into a structured hierarchy of visual concepts, thereby forming a visual concept tree. (2) We then employ an ensemble of simple linear classifiers that operate on concept-specific features derived from CLIP to perform…
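
The two components described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the concept tree is hand-written rather than produced by GPT-4, the feature vectors are made-up numbers standing in for CLIP-derived concept features, and combining the classifiers by averaging their logits is an assumption (the abstract is truncated before it says how the ensemble is combined). All names (`leaf_paths`, `linear_logits`, `ensemble_predict`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]
# One linear classifier: (weights[class][dim], bias[class]).
Linear = Tuple[List[List[float]], List[float]]


def leaf_paths(tree: dict, prefix: Path = ()) -> List[Path]:
    """Flatten a nested concept tree into root-to-leaf concept paths."""
    paths: List[Path] = []
    for name, children in tree.items():
        here = prefix + (name,)
        if children:  # internal node: recurse into sub-concepts
            paths.extend(leaf_paths(children, here))
        else:  # an empty dict marks a leaf concept
            paths.append(here)
    return paths


def linear_logits(clf: Linear, x: Sequence[float]) -> List[float]:
    """Compute W x + b for a single linear classifier."""
    weights, bias = clf
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def ensemble_predict(classifiers: Dict[Path, Linear],
                     features: Dict[Path, Sequence[float]]):
    """Average per-concept logits (an assumed combination rule)."""
    per_concept = {p: linear_logits(clf, features[p])
                   for p, clf in classifiers.items()}
    n_classes = len(next(iter(per_concept.values())))
    avg = [sum(l[c] for l in per_concept.values()) / len(per_concept)
           for c in range(n_classes)]
    pred = max(range(n_classes), key=lambda c: avg[c])
    return pred, avg, per_concept


# Hand-written stand-in for an LLM-generated visual concept tree.
tree = {"bird": {"head": {"beak": {}, "eye": {}}, "wing": {}}}
paths = leaf_paths(tree)

# Toy two-class classifiers and placeholder "concept-specific" features.
classifiers = {
    paths[0]: ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    paths[1]: ([[0.5, 0.5], [0.5, 0.5]], [0.0, 0.0]),
    paths[2]: ([[0.0, 1.0], [1.0, 0.0]], [0.0, 0.0]),
}
features = {paths[0]: [0.9, 0.1], paths[1]: [0.4, 0.6], paths[2]: [0.2, 0.7]}

pred, avg, per_concept = ensemble_predict(classifiers, features)
```

Because each classifier is linear and tied to one path in the tree, the `per_concept` logits show which concepts pushed the prediction toward which class, which is the kind of inspection a structured, concept-level design makes possible.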

Taxonomy

Topics: Image Retrieval and Classification Techniques · Machine Learning and Data Classification

Methods: Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections