On Synthetic Texture Datasets: Challenges, Creation, and Curation
Blaine Hoak, Patrick McDaniel

TL;DR
This paper introduces a new large-scale, diverse texture dataset generated using text-to-image models, addressing challenges in texture synthesis, quality validation, and bias detection, to support texture-based machine learning tasks.
Contribution
The authors develop a novel pipeline for generating high-quality, diverse texture images using text prompts and diffusion models, creating the Prompted Textures Dataset (PTD) with over 246,000 images.
Findings
NSFW filters are highly sensitive to texture images, which reveals a bias in those filters.
Standard metrics and a human evaluation both indicate that the dataset is high quality and diverse.
The generated textures can support a broad set of texture-based machine learning tasks.
Abstract
The influence of textures on machine learning models has been an ongoing investigation, specifically in texture bias/learning, interpretability, and robustness. However, due to the lack of large and diverse texture data available, the findings in these works have been limited, as more comprehensive evaluations have not been feasible. Image generative models are able to provide data creation at scale, but utilizing these models for texture synthesis has been unexplored and poses additional challenges both in creating accurate texture images and validating those images. In this work, we introduce an extensible methodology and corresponding new dataset for generating high-quality, diverse texture images capable of supporting a broad set of texture-based tasks. Our pipeline consists of: (1) developing prompts from a range of descriptors to serve as input to text-to-image models, (2)…
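Step (1) of the pipeline, building prompts from a range of descriptors, can be illustrated with a minimal sketch. This is not the authors' code: the function name build_prompts, the prompt template, and the example descriptor lists are all hypothetical, chosen only to show how descriptor lists might be combined into text-to-image prompts.

```python
from itertools import product


def build_prompts(textures, modifiers, template="a {modifier} {texture} texture"):
    """Return one prompt string per (texture, modifier) pair.

    The template and both descriptor lists are illustrative assumptions,
    not values taken from the paper.
    """
    return [
        template.format(modifier=modifier, texture=texture)
        for texture, modifier in product(textures, modifiers)
    ]


# Hypothetical descriptor lists; the paper's actual descriptors are not shown here.
textures = ["woven", "cracked"]
modifiers = ["close-up", "seamless"]

prompts = build_prompts(textures, modifiers)
for prompt in prompts:
    print(prompt)
```

Each resulting string would then be passed to a text-to-image model. Keeping the template separate from the descriptor lists makes it straightforward to extend the set of textures without touching the generation code.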
Taxonomy
Topics: Image Retrieval and Classification Techniques · Image Processing and 3D Reconstruction · Generative Adversarial Networks and Image Synthesis
Methods: Sparse Evolutionary Training · Diffusion
