Describing Textures using Natural Language
Chenyun Wu, Mikayla Timm, Subhransu Maji

TL;DR
This paper introduces a new dataset and analysis methods for describing textures in natural images using language, revealing current models' limitations and enabling improved interpretability and fine-grained categorization.
Contribution
The paper provides a novel dataset with rich texture descriptions, systematically evaluates models for language grounding, and demonstrates improved interpretability and categorization using texture attributes.
Findings
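Current generative and discriminative models for grounding language to images capture some texture properties but miss several compositional ones, such as the colors of dots.
Synthetic but realistic textures, generated to match different descriptions, support a critical analysis of existing models.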
Texture attributes improve fine-grained categorization.
Abstract
Textures in natural images can be characterized by color, shape, periodicity of elements within them, and other attributes that can be described using natural language. In this paper, we study the problem of describing visual attributes of texture on a novel dataset containing rich descriptions of textures, and conduct a systematic study of current generative and discriminative models for grounding language to images on this dataset. We find that while these models capture some properties of texture, they fail to capture several compositional properties, such as the colors of dots. We provide critical analysis of existing models by generating synthetic but realistic textures with different descriptions. Our dataset also allows us to train interpretable models and generate language-based explanations of what discriminative features are learned by deep networks for fine-grained…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
