Towards Light Weight Object Detection System

Dharma KC; Venkata Ravi Kiran Dayana; Meng-Lin Wu; Venkateswara Rao Cherukuri; Hau Hwang

arXiv:2210.03861·cs.CV·October 11, 2022·1 cites

TL;DR

This paper introduces a lightweight transformer-based approach for object detection that reduces latency, improves accuracy through multi-resolution feature fusion, and offers a generalized architecture for future design.

Contribution

It proposes an approximation of self-attention layers, a transformer encoder for feature fusion, and the gFormer abstraction to advance lightweight object detection.

Findings

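01. Reduced latency of transformer-based classification with minimal loss in accuracy

02. Improved accuracy of a lightweight object detection system through multi-resolution feature fusion, without significantly increasing the number of parameters

03. Provided gFormer, an abstraction that can guide the design of new transformer-like architectures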

Abstract

Transformers are a popular choice for classification tasks and as backbones for object detection tasks. However, their high latency brings challenges in their adaptation to lightweight object detection systems. We present an approximation of the self-attention layers used in the transformer architecture. This approximation reduces the latency of the classification system while incurring minimal loss in accuracy. We also present a method that uses a transformer encoder layer for multi-resolution feature fusion. This feature fusion improves the accuracy of the state-of-the-art lightweight object detection system without significantly increasing the number of parameters. Finally, we provide an abstraction for the transformer architecture called Generalized Transformer (gFormer) that can guide the design of novel transformer-like architectures.
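
The abstract does not spell out how the encoder layer is wired into the detector, so the following is only a minimal sketch of the general idea of fusing multi-resolution features with a transformer encoder layer, not the authors' implementation. It assumes each feature map has already been projected to the same channel width, flattens all maps into one token sequence, applies a single-head, post-norm encoder layer with random weights, and splits the result back into the original resolutions. The function names (`encoder_layer`, `fuse_multi_resolution`), the sizes, and the omission of positional encodings and learnable normalization parameters are all assumptions made for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each token over its channel dimension (no learnable gain/bias).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_encoder_params(dim, hidden, rng):
    # Hypothetical random weights; a real system would learn these.
    s = 1.0 / np.sqrt(dim)
    return {
        "wq": rng.normal(0.0, s, (dim, dim)),
        "wk": rng.normal(0.0, s, (dim, dim)),
        "wv": rng.normal(0.0, s, (dim, dim)),
        "wo": rng.normal(0.0, s, (dim, dim)),
        "w1": rng.normal(0.0, s, (dim, hidden)),
        "w2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, dim)),
    }


def encoder_layer(tokens, p):
    # tokens: (num_tokens, dim). Single-head self-attention + feed-forward,
    # each followed by a residual connection and layer normalization.
    q, k, v = tokens @ p["wq"], tokens @ p["wk"], tokens @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    x = layer_norm(tokens + (attn @ v) @ p["wo"])
    ffn = np.maximum(x @ p["w1"], 0.0) @ p["w2"]  # ReLU feed-forward
    return layer_norm(x + ffn)


def fuse_multi_resolution(feature_maps, p):
    # feature_maps: list of (H_i, W_i, C) arrays sharing the same C.
    shapes = [f.shape for f in feature_maps]
    # Flatten every map into tokens and concatenate across resolutions,
    # so attention can mix information between scales.
    tokens = np.concatenate(
        [f.reshape(-1, f.shape[-1]) for f in feature_maps], axis=0
    )
    fused = encoder_layer(tokens, p)
    # Split the fused tokens back into the original spatial layouts.
    outputs, start = [], 0
    for h, w, c in shapes:
        outputs.append(fused[start:start + h * w].reshape(h, w, c))
        start += h * w
    return outputs


rng = np.random.default_rng(0)
maps = [rng.normal(size=(8, 8, 16)),
        rng.normal(size=(4, 4, 16)),
        rng.normal(size=(2, 2, 16))]
params = init_encoder_params(dim=16, hidden=32, rng=rng)
fused = fuse_multi_resolution(maps, params)
print([f.shape for f in fused])
```

Because every token attends to tokens from all resolutions, each output map keeps its own spatial size while carrying information from the other scales. The only parameters added are those of the single encoder layer, which is in the spirit of the abstract's claim that fusion does not significantly increase the parameter count.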

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Infrared Target Detection Methodologies

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Dropout · Softmax · Label Smoothing · Multi-Head Attention · Adam · Dense Connections