Object Detection using Oriented Window Learning Vision Transformer: Roadway Assets Recognition
Taqwa Alhadidi, Ahmed Jaber, Shadi Jaradat, Huthaifa I Ashqar, and Mohammed Elhenawy

TL;DR
This paper introduces OWL-ViT, a novel vision transformer model with oriented window learning, tailored for roadway asset detection in transportation systems, demonstrating high efficiency and robustness across diverse scenarios.
Contribution
It presents a new method integrating OWL-ViT within a one-shot learning framework for effective roadway asset recognition, addressing challenges of data variability and object diversity.
Findings
High detection accuracy across multiple roadway assets
Robust performance under varying resolutions and contexts
Enhanced detection consistency and reliability
Abstract
Object detection is a critical component of transportation systems, particularly for applications such as autonomous driving, traffic monitoring, and infrastructure maintenance. Traditional object detection methods often struggle with limited data and variability in object appearance. The Oriented Window Learning Vision Transformer (OWL-ViT) offers a novel approach by adapting window orientations to the geometry and existence of objects, making it highly suitable for detecting diverse roadway assets. This study leverages OWL-ViT within a one-shot learning framework to recognize transportation infrastructure components, such as traffic signs, poles, pavement, and cracks. This study presents a novel method for roadway asset detection using OWL-ViT. We conducted a series of experiments to evaluate the performance of the model in terms of detection consistency, semantic flexibility, visual…
Taxonomy
Topics: Neural Networks and Applications · Industrial Vision Systems and Defect Detection · Vehicle License Plate Recognition
Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
