Ultra-High Resolution Segmentation via Boundary-Enhanced Patch-Merging Transformer
Haopeng Sun, Yingwei Zhang, Lumin Xu, Sheng Jin, Yiqiang Chen

TL;DR
This paper introduces BPT, a novel transformer-based method for ultra-high resolution image segmentation that effectively combines global context and fine boundary details without extra computational cost.
Contribution
The paper proposes the Boundary-enhanced Patch-merging Transformer (BPT), integrating a dynamic token allocation mechanism and boundary information to improve UHR segmentation.
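As a rough illustration of what "dynamic token allocation" can mean, the sketch below keeps a single coarse token for regions with no boundary pixels and splits regions that contain boundaries into finer tokens. It is a toy quadtree-style scheme and not the authors' PMT implementation. The function name `allocate_patches`, its parameters (`coarse`, `fine`, `threshold`), and the binary boundary map are all assumptions made for this example.

```python
from typing import List, Tuple

Patch = Tuple[int, int, int]  # (row, col, size) of a square patch


def allocate_patches(boundary: List[List[int]], coarse: int = 4,
                     fine: int = 2, threshold: int = 1) -> List[Patch]:
    """Toy token allocation (hypothetical, for illustration only).

    `boundary` is a binary map where 1 marks a boundary pixel. Coarse
    patches with at least `threshold` boundary pixels are split into
    fine patches; all other coarse patches stay as one merged token.
    """
    height, width = len(boundary), len(boundary[0])
    patches: List[Patch] = []
    for r in range(0, height, coarse):
        for c in range(0, width, coarse):
            # Count boundary pixels inside this coarse patch.
            hits = sum(boundary[i][j]
                       for i in range(r, min(r + coarse, height))
                       for j in range(c, min(c + coarse, width)))
            if hits >= threshold:
                # Informative region: spend more tokens on it.
                for fr in range(r, min(r + coarse, height), fine):
                    for fc in range(c, min(c + coarse, width), fine):
                        patches.append((fr, fc, fine))
            else:
                # Uninformative region: one merged token suffices.
                patches.append((r, c, coarse))
    return patches


# 8x8 map with a single boundary pixel in the top-left quadrant.
boundary_map = [[0] * 8 for _ in range(8)]
boundary_map[1][2] = 1
tokens = allocate_patches(boundary_map)
print(len(tokens))  # 7 tokens, versus 16 for a uniform 2x2 grid
```

In this toy setting the token budget concentrates on the one quadrant that contains a boundary, which is the general intuition behind allocating tokens to informative regions. The actual mechanism in BPT may differ substantially.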
Findings
BPT outperforms previous state-of-the-art methods on multiple benchmarks.
BPT achieves superior segmentation accuracy without additional computational overhead.
The boundary-enhanced module effectively captures fine details in UHR images.
Abstract
Segmentation of ultra-high resolution (UHR) images is a critical task with numerous applications, yet it poses significant challenges due to high spatial resolution and rich fine details. Recent approaches adopt a dual-branch architecture, where a global branch learns long-range contextual information and a local branch captures fine details. However, they struggle to handle the conflict between global and local information while adding significant extra computational cost. Inspired by the human visual system's ability to rapidly orient attention to important areas with fine details and filter out irrelevant information, we propose a novel UHR segmentation method called Boundary-enhanced Patch-merging Transformer (BPT). BPT consists of two key components: (1) Patch-Merging Transformer (PMT) for dynamically allocating tokens to informative regions to acquire global and local…
Taxonomy
Topics: Image Processing Techniques and Applications · CCD and CMOS Imaging Sensors · Advanced Image Processing Techniques
Methods: Attention Is All You Need · Linear Layer · ADaptive gradient method with the OPTimal convergence rate · Adam · Layer Normalization · Dropout · Position-Wise Feed-Forward Layer · Label Smoothing · Dense Connections · Byte Pair Encoding
