How well does CLIP understand texture?

Chenyun Wu; Subhransu Maji

arXiv:2203.11449 · cs.CV · November 8, 2022 · 1 citation

Open Access 1 Repo

TL;DR

This paper evaluates CLIP's understanding of texture in natural images through zero-shot classification, compositional property representation, and fine-grained categorization, revealing its strengths and limitations in texture comprehension.

Contribution

It provides a comprehensive analysis of CLIP's ability to understand and utilize texture information in natural language descriptions and image classification tasks.

Findings

1. CLIP performs well on zero-shot texture classification tasks.

2. CLIP can represent compositional texture properties like color and pattern.

3. Texture information can aid fine-grained bird species categorization.

Abstract

We investigate how well CLIP understands texture in natural images described by natural language. To this end, we analyze CLIP's ability to: (1) perform zero-shot learning on various texture and material classification datasets; (2) represent compositional properties of texture, such as red dots or yellow stripes, on the Describable Texture in Detail (DTDD) dataset; and (3) aid fine-grained categorization of birds in photographs described by the color and texture of their body parts.
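As a rough illustration of the zero-shot setup the abstract refers to, the sketch below scores one image embedding against text embeddings for compositional prompts and picks the best match. Everything concrete here is an assumption for illustration: the prompt template, the hand-made 4-dimensional vectors standing in for CLIP embeddings, the `zero_shot_probs` helper, and the temperature value. It is not the authors' code; a real run would obtain the embeddings from CLIP's image and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def zero_shot_probs(image_emb, text_embs, temperature=0.01):
    """Softmax over cosine similarities between one image and each prompt.

    The temperature is an assumed value, not one taken from the paper.
    """
    img = normalize(image_emb)
    logits = []
    for t in text_embs:
        t = normalize(t)
        logits.append(sum(a * b for a, b in zip(img, t)) / temperature)
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Compositional prompts: every color paired with every pattern.
# The template "a photo of {color} {pattern}" is a hypothetical choice.
colors = ["red", "yellow"]
patterns = ["dots", "stripes"]
prompts = [f"a photo of {c} {p}" for c in colors for p in patterns]

# Toy stand-ins for text embeddings (axes: red, yellow, dots, stripes).
text_embs = [
    [1.0, 0.0, 1.0, 0.0],  # red dots
    [1.0, 0.0, 0.0, 1.0],  # red stripes
    [0.0, 1.0, 1.0, 0.0],  # yellow dots
    [0.0, 1.0, 0.0, 1.0],  # yellow stripes
]

# Toy stand-in for the embedding of an image showing yellow stripes.
image_emb = [0.1, 0.9, 0.2, 0.8]

probs = zero_shot_probs(image_emb, text_embs)
best = prompts[probs.index(max(probs))]
print(best)
```

No training on the target classes is involved: the class set is defined entirely by the list of prompts, which is what makes the evaluation zero-shot.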

Code & Models

Repositories

chenyunwu/clip_texture
PyTorch · Official

Taxonomy

Topics: Image Retrieval and Classification Techniques

Methods: Contrastive Language-Image Pre-training