CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection
Woojin Shin, Donghwa Kang, Byeongyun Park, Brent Byunghoon Kang, Jinkyu Lee, Hyeongboo Baek

TL;DR
CF-DETR introduces a coarse-to-fine Transformer architecture with a real-time scheduling framework to improve accuracy and meet strict timing constraints in autonomous vehicle perception systems.
Contribution
The paper proposes a coarse-to-fine Transformer design and a real-time scheduling method that together dynamically balance detection accuracy against deadline requirements for AV object detection.
Findings
Successfully meets real-time deadlines for critical objects.
Achieves higher overall detection accuracy.
Demonstrates effectiveness on embedded and AV platforms.
Abstract
Detection Transformers (DETR) are increasingly adopted in autonomous vehicle (AV) perception systems due to their superior accuracy over convolutional networks. However, concurrently executing multiple DETR tasks presents significant challenges in meeting firm real-time deadlines (R1) and high accuracy requirements (R2), particularly for safety-critical objects, while navigating the inherent latency-accuracy trade-off under resource constraints. Existing real-time DNN scheduling approaches often treat models generically, failing to leverage Transformer-specific properties for efficient resource allocation. To address these challenges, we propose CF-DETR, an integrated system featuring a novel coarse-to-fine Transformer architecture and a dedicated real-time scheduling framework NPFP**. CF-DETR employs three key strategies (A1: coarse-to-fine inference, A2: selective fine inference, A3:…
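
The abstract names coarse-to-fine inference (A1) and selective fine inference (A2) but is cut off before describing them, so the following is only a minimal sketch of the general idea, not the authors' method and not the NPFP** scheduler. It assumes that every task must run a cheap coarse pass, and that an optional fine pass is added in fixed priority order only when it still fits a per-cycle time budget. The `Task` fields, the `plan_passes` function, and all timing values are hypothetical names and numbers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical detection task; none of these fields come from the paper."""
    name: str
    coarse_ms: float  # assumed cost of the mandatory coarse pass
    fine_ms: float    # assumed cost of the optional fine pass
    priority: int     # smaller number = higher priority (assumption)


def plan_passes(tasks, budget_ms):
    """Decide which passes each task runs within one time budget.

    Returns (plan, used_ms), where plan maps a task name to the list of
    passes it will run, e.g. ["coarse"] or ["coarse", "fine"].
    """
    plan = {}
    used_ms = 0.0

    # Step 1: the coarse pass is treated as mandatory for every task.
    for task in tasks:
        used_ms += task.coarse_ms
        plan[task.name] = ["coarse"]
    if used_ms > budget_ms:
        raise ValueError("budget cannot cover the mandatory coarse passes")

    # Step 2: add fine passes in priority order while they still fit.
    for task in sorted(tasks, key=lambda t: t.priority):
        if used_ms + task.fine_ms <= budget_ms:
            used_ms += task.fine_ms
            plan[task.name].append("fine")

    return plan, used_ms


if __name__ == "__main__":
    cameras = [
        Task("front", coarse_ms=10.0, fine_ms=20.0, priority=1),
        Task("rear", coarse_ms=10.0, fine_ms=20.0, priority=2),
    ]
    plan, used = plan_passes(cameras, budget_ms=45.0)
    print(plan, used)
```

In this toy setup both coarse passes always run, and only the higher-priority task gets a fine pass because a second one would exceed the 45 ms budget. A real system would also need measured execution times and a schedulability analysis, which this sketch does not attempt.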
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Autonomous Vehicle Technology and Safety
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Convolution · Softmax · Feedforward Network · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Detection Transformer · Label Smoothing
