PipeMFL-240K: A Large-scale Dataset and Benchmark for Object Detection in Pipeline Magnetic Flux Leakage Imaging
Tianyi Qu, Songxiao Yang, Haolin Wang, Huadong Song, Xiaoting Guo, Wenguang Hu, Guanlin Liu, Honghe Chen, Yafei Ou

TL;DR
PipeMFL-240K is a large, annotated dataset and benchmark for object detection in pipeline Magnetic Flux Leakage images, addressing real-world challenges and enabling progress in automated pipeline inspection.
Contribution
It introduces the first large-scale public dataset and benchmark for pipeline MFL object detection, facilitating fair comparison and reproducible research.
Findings
Modern detectors struggle with MFL data complexity.
The dataset contains 249,320 images with 200,020 annotations.
Results highlight significant room for improvement in detection algorithms.
Abstract
Pipeline integrity is critical to industrial safety and environmental protection, with Magnetic Flux Leakage (MFL) detection being a primary non-destructive testing technology. Despite the promise of deep learning for automating MFL interpretation, progress toward reliable models has been constrained by the absence of a large-scale public dataset and benchmark, making fair comparison and reproducible evaluation difficult. We introduce \textbf{PipeMFL-240K}, a large-scale, meticulously annotated dataset and benchmark for complex object detection in pipeline MFL pseudo-color images. PipeMFL-240K reflects real-world inspection complexity and poses several unique challenges: (i) an extremely long-tailed distribution over \textbf{12} categories, (ii) a high prevalence of tiny objects that often comprise only a handful of pixels and (iii) substantial intra-class variability. The dataset…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
