SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis

Shuxian Zhao; Jie Gui; Baosheng Yu; Dacheng Tao

arXiv:2603.21824·cs.CV·May 11, 2026

TL;DR

SteelDefectX is a vision-language dataset and benchmark for steel surface defect analysis, comprising 7,778 images across 25 defect categories with multi-form textual annotations. It supports fine-grained semantic understanding and systematic evaluation of vision-language models in industrial settings.

Contribution

It provides a dataset with multi-form textual annotations (free-form natural language descriptions, structured attribute annotations, and template-based sentences) and establishes a benchmark for vision-language tasks in steel defect analysis.

Findings

01. Structured attributes yield stable semantic alignment.

02. Natural language descriptions enhance transferability.

03. Textual representation design impacts model performance.

Abstract

Steel surface defect analysis is critical for industrial quality control, yet existing benchmarks rely primarily on label-only annotations, limiting fine-grained semantic understanding and systematic evaluation of vision-language models. To address this gap, we introduce SteelDefectX, a vision-language dataset with multi-form textual annotations for steel surface defect analysis, comprising 7,778 images across 25 defect categories. At the class level, the dataset provides defect names, representative visual attributes, and industrial causes. At the sample level, each image is annotated with three forms of textual representations: (1) free-form natural language descriptions, (2) structured attribute annotations, and (3) template-based sentences. These annotations provide flexible textual supervision with varying levels of expressiveness and controllability. We further establish a…
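
To make the three sample-level annotation forms concrete, here is a minimal Python sketch of how one annotated image might be represented. The class name, field names, attribute keys, example values, and template wording are all assumptions made for illustration. They are not taken from the released dataset schema.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SampleAnnotation:
    """Hypothetical record for one image; not the dataset's actual schema."""
    image_path: str
    category: str
    # (1) free-form natural language description
    description: str
    # (2) structured attribute annotations (keys are illustrative)
    attributes: Dict[str, str] = field(default_factory=dict)

    def template_sentence(self) -> str:
        """(3) Build a template-based sentence from the structured attributes."""
        if not self.attributes:
            return f"A steel surface with a {self.category} defect."
        # Sort keys so the rendered sentence is deterministic.
        parts = ", ".join(f"{k}: {v}" for k, v in sorted(self.attributes.items()))
        return f"A steel surface with a {self.category} defect ({parts})."


# All values below are made up for the example.
sample = SampleAnnotation(
    image_path="images/example_0001.jpg",
    category="scratch",
    description="A thin bright line runs diagonally across the plate.",
    attributes={"shape": "linear", "location": "center"},
)
print(sample.template_sentence())
```

In this sketch the template sentence is derived entirely from the structured attributes, which is one way to obtain the controllability the abstract mentions, while the free-form description remains the most expressive of the three forms.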

Code & Models

Repositories

Zhaosxian/SteelDefectX (GitHub)
