High-Resolution Vision Transformers for Pixel-Level Identification of Structural Components and Damage

Kareem Eltouny, Seyedomid Sajedi, and Xiao Liang

arXiv:2308.03006·cs.CV·August 8, 2023

Open Access

TL;DR

This paper introduces a high-resolution vision transformer-based semantic segmentation network that efficiently processes detailed inspection images, preserving fine damage features and global context for civil structure assessment.

Contribution

The study proposes a novel vision transformer architecture with Laplacian pyramids that effectively handles high-resolution images without losing detail or increasing computational load.
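As background for the Laplacian pyramids mentioned above, the following is a minimal sketch of the classical decomposition, not the network proposed in the paper. It assumes a grayscale image stored as nested lists, uses a 2x2 box filter in place of the usual Gaussian blur, and all function names (`downsample`, `upsample`, `build_pyramid`, `reconstruct`) are hypothetical. The point it illustrates is that a low-resolution base plus per-level residuals retains the fine detail that plain downsampling discards.

```python
# Simplified Laplacian pyramid on a 2D grayscale image (nested lists).
# Assumptions: a 2x2 box filter stands in for the Gaussian blur of the
# classical construction, and image sides are divisible by 2**levels.


def downsample(img):
    """Halve the resolution by averaging each 2x2 block."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def upsample(img):
    """Double the resolution by repeating each pixel into a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramid(img, levels):
    """Return (residuals, coarse): detail bands from fine to coarse, plus the low-res base."""
    residuals = []
    current = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        low = downsample(current)
        approx = upsample(low)
        # The residual holds exactly the detail lost by downsampling this level.
        residuals.append(
            [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(current, approx)]
        )
        current = low
    return residuals, current


def reconstruct(residuals, coarse):
    """Invert build_pyramid by upsampling and adding residuals back, coarse to fine."""
    current = coarse
    for res in reversed(residuals):
        up = upsample(current)
        current = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(up, res)]
    return current


# Synthetic 8x8 test image (illustrative values only).
image = [[(3 * r + 5 * c) % 17 for c in range(8)] for r in range(8)]
residuals, coarse = build_pyramid(image, levels=2)
restored = reconstruct(residuals, coarse)
```

Here `coarse` is a 2x2 image that a model could process cheaply, while `residuals` carries the high-frequency detail at 8x8 and 4x4. Adding the residuals back recovers the original image, which is the property that makes the representation attractive when fine features such as thin cracks matter.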

Findings

01

Achieved accurate pixel-wise damage detection on bridge inspection images.

02

Outperformed traditional downsampling methods in preserving fine details.

03

Demonstrated computational efficiency in processing high-resolution visual data.

Abstract

Visual inspection is predominantly used to evaluate the state of civil structures, but recent developments in unmanned aerial vehicles (UAVs) and artificial intelligence have increased the speed, safety, and reliability of the inspection process. In this study, we develop a semantic segmentation network based on vision transformers and Laplacian pyramids scaling networks for efficiently parsing high-resolution visual inspection images. The massive number of high-resolution images collected during inspections can slow down investigation efforts. While extensive studies have been dedicated to deep learning models for damage segmentation, processing high-resolution visual data can pose major computational difficulties. Traditionally, images are either uniformly downsampled or partitioned to cope with computational demands. However, the input is at risk of losing…

Taxonomy

Topics: Infrastructure Maintenance and Monitoring · Non-Destructive Testing Techniques · Industrial Vision Systems and Defect Detection

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Linear Layer · Dense Connections · Residual Connection · Layer Normalization · Vision Transformer