Butter: Frequency Consistency and Hierarchical Fusion for Autonomous Driving Object Detection

Xiaojian Lin; Wenxin Zhang; Yuchu Jiang; Wangyu Wu; Yiran Guo; Kangxu Wang; Zongzheng Zhang; Guijin Wang; Lei Jin; Hao Zhao

arXiv:2507.13373·cs.CV·August 1, 2025

TL;DR

Butter is a novel object detection framework for autonomous driving that enhances hierarchical feature consistency and fusion, leading to improved accuracy and efficiency in detecting pedestrians, vehicles, and traffic signs.

Contribution

Butter introduces two modules, FAFCE and PHFFNet, which refine multi-scale features and integrate hierarchical information to address the feature inconsistency and semantic gaps found in existing detectors.

Findings

01. Outperforms existing methods on the BDD100K, KITTI, and Cityscapes datasets.

02. Improves detection accuracy while reducing model complexity.

03. Achieves a balance between robustness and computational efficiency.

Abstract

Hierarchical feature representations play a pivotal role in computer vision, particularly in object detection for autonomous driving. Multi-level semantic understanding is crucial for accurately identifying pedestrians, vehicles, and traffic signs in dynamic environments. However, existing architectures, such as YOLO and DETR, struggle to maintain feature consistency across different scales while balancing detection precision and computational efficiency. To address these challenges, we propose Butter, a novel object detection framework designed to enhance hierarchical feature representations for improving detection robustness. Specifically, Butter introduces two key innovations: Frequency-Adaptive Feature Consistency Enhancement (FAFCE) Component, which refines multi-scale feature consistency by leveraging adaptive frequency filtering to enhance structural and boundary precision, and…
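The abstract names adaptive frequency filtering, but this page does not describe how FAFCE implements it. As a rough, generic illustration of the underlying idea only (not the authors' method), the sketch below splits a 2-D feature map into a low-frequency part (a box blur) and a high-frequency residual that carries edges, then recombines them with a gain on the high-frequency part. The function names, the box-blur filter, and the fixed `high_gain` parameter are all assumptions made for this example.

```python
# Generic illustration of frequency splitting on a feature map.
# This is NOT the paper's FAFCE module; all names here are hypothetical.

def low_pass(feature, radius=1):
    """Box-blur a 2-D feature map (list of rows), averaging in-bounds neighbours only."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            vals = [feature[r][c] for r in rows for c in cols]
            out[i][j] = sum(vals) / len(vals)
    return out

def split_frequencies(feature, radius=1):
    """Return (low, high), where high = feature - low holds edges and fine detail."""
    low = low_pass(feature, radius)
    high = [[x - l for x, l in zip(f_row, l_row)]
            for f_row, l_row in zip(feature, low)]
    return low, high

def recombine(low, high, high_gain=1.0):
    """Rebuild the map; high_gain > 1 emphasises boundaries, 1.0 restores the input."""
    return [[l + high_gain * h for l, h in zip(l_row, h_row)]
            for l_row, h_row in zip(low, high)]

# A 4x4 map with a vertical edge between columns 1 and 2.
feature = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = split_frequencies(feature)
sharpened = recombine(low, high, high_gain=2.0)
```

With `high_gain=1.0` the input is recovered unchanged, and larger gains increase the contrast across the edge. A learned, input-dependent filter would replace the fixed blur and gain in any real detector.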

Code & Models

Models

Christopher-Lim/Butter (Hugging Face model, ♡ 1)
