Automated Detection of Defects on Metal Surfaces using Vision Transformers
Toqa Alaa, Mostafa Kotb, Arwa Zakaria, Mariam Diab, and Walid Gomaa

TL;DR
This paper presents a deep learning model based on Vision Transformers for automated detection and localization of surface defects in metal manufacturing, aiming to improve efficiency and accuracy over manual inspection.
Contribution
It introduces a novel ViT-based architecture that simultaneously classifies and localizes defects on metal surfaces, enhancing defect detection capabilities.
Findings
Achieved high classification accuracy in defect detection
Reduced localization errors with low MSE and MAE
Demonstrated potential for operational efficiency improvements
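The localization metrics named in the findings can be illustrated with a minimal sketch. It assumes each defect location is a bounding box of four numbers in an (x_min, y_min, x_max, y_max) layout, which this summary does not specify; the function names `mse`, `mae`, and `accuracy` are illustrative and not taken from the paper.

```python
from typing import Sequence

# Assumption: a box is four coordinates (x_min, y_min, x_max, y_max).
Box = Sequence[float]


def mse(pred: Sequence[Box], true: Sequence[Box]) -> float:
    """Mean squared error averaged over every coordinate of every box."""
    errors = [(p - t) ** 2 for pb, tb in zip(pred, true) for p, t in zip(pb, tb)]
    return sum(errors) / len(errors)


def mae(pred: Sequence[Box], true: Sequence[Box]) -> float:
    """Mean absolute error averaged over every coordinate of every box."""
    errors = [abs(p - t) for pb, tb in zip(pred, true) for p, t in zip(pb, tb)]
    return sum(errors) / len(errors)


def accuracy(pred_labels: Sequence[int], true_labels: Sequence[int]) -> float:
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(1 for p, t in zip(pred_labels, true_labels) if p == t)
    return correct / len(true_labels)


# Made-up example values, for illustration only (not results from the paper).
predicted_boxes = [[0.0, 0.0, 10.0, 10.0]]
true_boxes = [[0.0, 0.0, 8.0, 12.0]]
print(mse(predicted_boxes, true_boxes))  # 2.0
print(mae(predicted_boxes, true_boxes))  # 1.0
print(accuracy([1, 0, 2, 2], [1, 0, 2, 1]))  # 0.75
```

Because MSE squares each coordinate error, it penalizes a few large misses more heavily than MAE does, which is one reason to report both.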
Abstract
Metal manufacturing often yields defective products, which creates operational challenges. Because traditional manual inspection is time-consuming and resource-intensive, automated solutions are needed. This study uses deep learning to develop a model for detecting metal surface defects with Vision Transformers (ViTs). The proposed model classifies and localizes defects, using a ViT for feature extraction. After the shared feature extractor, the architecture branches into two paths: one for classification and one for localization. The model aims for high classification accuracy while keeping the Mean Squared Error (MSE) and Mean Absolute Error (MAE) of the localization branch as low as possible. Experimental results show that the model can be used for automated defect detection, improving operational efficiency and reducing errors in metal manufacturing.
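The two-path design described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the ViT backbone is replaced by a fixed stand-in feature vector, and the feature size, number of defect classes, four-coordinate box output, and the `TwoBranchHead` class itself are all assumptions made for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Apply a dense layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_linear(n_in, n_out, rng):
    """Small random weights and zero biases for an untrained layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TwoBranchHead:
    """Hypothetical head: shared features feed a classifier and a box regressor."""

    def __init__(self, feature_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.cls_w, self.cls_b = init_linear(feature_dim, num_classes, rng)
        # Assumption: localization is regressed as 4 bounding-box coordinates.
        self.box_w, self.box_b = init_linear(feature_dim, 4, rng)

    def forward(self, features):
        class_probs = softmax(linear(features, self.cls_w, self.cls_b))  # classification path
        box = linear(features, self.box_w, self.box_b)  # localization path
        return class_probs, box


# Stand-in for the feature vector a ViT backbone would produce for one image.
features = [0.5] * 8
head = TwoBranchHead(feature_dim=8, num_classes=6)  # 6 classes is an assumed value
probs, box = head.forward(features)
print(len(probs), len(box))
```

In a trained model, the classification path would typically be optimized with a cross-entropy loss and the localization path with a regression loss such as MSE, with both gradients flowing back into the shared feature extractor.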
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
