LPCANet: Lightweight Pyramid Cross-Attention Network for Rail Surface Defect Detection Using RGB-D Data
Jackie Alex, Guoqiang Huan

TL;DR
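LPCANet is a lightweight network that fuses RGB and depth (RGB-D) data through a lightweight pyramid module, a cross-attention mechanism, and a spatial feature extractor to detect rail surface defects. The authors report that it outperforms 18 existing methods in accuracy while running at 162.60 fps.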
Contribution
The paper introduces LPCANet, a lightweight pyramid cross-attention network that fuses RGB-D data for rail defect detection, with reported state-of-the-art accuracy at 9.90 million parameters and 2.50 G FLOPs.
Findings
Achieves state-of-the-art accuracy with only 9.90 million parameters.
Demonstrates high inference speed of 162.60 fps.
Outperforms 18 existing methods on three RGB-D rail datasets (NEU-RSDDS-AUG, RSDD-TYPE1, RSDD-TYPE2).
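To put the reported figures in more familiar units, the short calculation below converts the frame rate into per-frame latency and the parameter count into approximate weight storage. It is only a back-of-the-envelope sketch: the 4 bytes per parameter (float32 weights) is an assumption, not something the paper states here, and the reported fps depends on hardware that this summary does not specify.

```python
# Reported figures from the Findings list.
FPS = 162.60          # inference speed, frames per second
PARAMS = 9.90e6       # number of parameters

# Assumption (not stated in the source): weights stored as float32.
BYTES_PER_PARAM = 4

latency_ms = 1000.0 / FPS                       # time budget per frame
weights_mb = PARAMS * BYTES_PER_PARAM / 1e6     # decimal megabytes

print(f"per-frame latency: {latency_ms:.2f} ms")   # 6.15 ms
print(f"float32 weights:   {weights_mb:.1f} MB")   # 39.6 MB
```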
Abstract
This paper addresses the limitations of current vision-based rail defect detection methods, including high computational complexity, excessive parameter counts, and suboptimal accuracy. We propose a Lightweight Pyramid Cross-Attention Network (LPCANet) that leverages RGB-D data for efficient and accurate defect identification. The architecture integrates MobileNetV2 as a backbone for RGB feature extraction with a lightweight pyramid module (LPM) for depth processing, coupled with a cross-attention mechanism (CAM) for multimodal fusion and a spatial feature extractor (SFE) for enhanced structural analysis. Comprehensive evaluations on three unsupervised RGB-D rail datasets (NEU-RSDDS-AUG, RSDD-TYPE1, RSDD-TYPE2) demonstrate that LPCANet achieves state-of-the-art performance with only 9.90 million parameters, 2.50 G FLOPs, and 162.60 fps inference speed. Compared to 18 existing methods,…
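The abstract names a cross-attention mechanism (CAM) for fusing RGB and depth features but does not give its exact form. As a rough illustration of the general idea, the sketch below implements generic scaled dot-product cross-attention in NumPy, with RGB tokens as queries and depth tokens as keys and values. Everything in it is an assumption made for illustration: the function names, the choice of which modality supplies the queries, the random projection matrices, the residual connection, and the feature sizes. It should not be read as the authors' CAM.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(rgb_tokens, depth_tokens, w_q, w_k, w_v):
    """Generic cross-attention (hypothetical stand-in for the paper's CAM).

    rgb_tokens:   (N, C) flattened RGB feature map, used as queries
    depth_tokens: (M, C) flattened depth feature map, used as keys/values
    w_q, w_k, w_v: (C, d) projection matrices
    Returns the attended features (N, d) and the attention map (N, M).
    """
    q = rgb_tokens @ w_q
    k = depth_tokens @ w_k
    v = depth_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # similarity of each RGB token to each depth token
    attn = softmax(scores, axis=-1)           # each row sums to 1
    return attn @ v, attn

# Illustrative sizes only: an 8x8 feature map with 32 channels per modality.
rng = np.random.default_rng(0)
H, W, C = 8, 8, 32
rgb = rng.standard_normal((H * W, C))
depth = rng.standard_normal((H * W, C))
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

attended, attn = cross_attention(rgb, depth, w_q, w_k, w_v)
fused = rgb + attended   # assumed residual fusion; requires d == C

print(fused.shape)       # (64, 32)
print(attn.shape)        # (64, 64)
```

In a real model the projections would be learned, and the RGB and depth tokens would come from the MobileNetV2 backbone and the LPM respectively rather than from random arrays.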
Taxonomy
Topics: Railway Engineering and Dynamics · Infrastructure Maintenance and Monitoring · Advanced Neural Network Applications
