High-Resolution Vision Transformers for Pixel-Level Identification of Structural Components and Damage
Kareem Eltouny, Seyedomid Sajedi, and Xiao Liang

TL;DR
This paper introduces a high-resolution vision transformer-based semantic segmentation network that efficiently processes detailed inspection images, preserving fine damage features and global context for civil structure assessment.
Contribution
The study proposes a novel vision transformer architecture with Laplacian pyramids that effectively handles high-resolution images without losing detail or increasing computational load.
Findings
Achieved accurate pixel-wise damage detection on bridge inspection images.
Outperformed traditional downsampling methods in preserving fine details.
Demonstrated computational efficiency in processing high-resolution visual data.
Abstract
Visual inspection is the predominant method for evaluating the state of civil structures, and recent developments in unmanned aerial vehicles (UAVs) and artificial intelligence have increased the speed, safety, and reliability of the inspection process. In this study, we develop a semantic segmentation network based on vision transformers and Laplacian pyramids scaling networks for efficiently parsing high-resolution visual inspection images. The massive number of high-resolution images collected during inspections can slow down investigation efforts. While extensive studies have been dedicated to the use of deep learning models for damage segmentation, processing high-resolution visual data can pose major computational difficulties. Traditionally, images are either uniformly downsampled or partitioned to cope with computational demands. However, the input is at risk of losing…
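For readers unfamiliar with the Laplacian pyramid mentioned in the abstract, the following is a minimal sketch of the classical decomposition, not the authors' network. It assumes a single-channel image whose sides are divisible by 2 to the power of the number of levels, and it replaces the usual Gaussian filter with plain 2x2 averaging to stay dependency-light. The function names (`downsample`, `upsample`, `build_laplacian_pyramid`, `reconstruct`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def downsample(img):
    """Halve height and width by averaging each 2x2 block.

    Assumption: a box filter stands in for the Gaussian blur plus
    decimation used in the classical Laplacian pyramid.
    """
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double height and width by nearest-neighbor repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_laplacian_pyramid(img, levels):
    """Return [residual_0, ..., residual_{levels-1}, coarse_image].

    Each residual holds the detail lost when moving to the next
    coarser scale; the final entry is the low-resolution base image.
    """
    factor = 2 ** levels
    if img.ndim != 2 or img.shape[0] % factor or img.shape[1] % factor:
        raise ValueError("expected a 2D image with sides divisible by 2**levels")
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller))  # high-frequency detail
        current = smaller
    pyramid.append(current)  # coarse base carrying the global context
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition by adding residuals back, coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current
```

In this sketch the coarse base image is small enough to process cheaply, while the residuals retain the fine detail that uniform downsampling alone would discard. Calling `reconstruct(build_laplacian_pyramid(image, 3))` recovers the original image up to floating-point error.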
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Non-Destructive Testing Techniques · Industrial Vision Systems and Defect Detection
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Linear Layer · Dense Connections · Residual Connection · Layer Normalization · Vision Transformer
