Rethinking Learnable Tree Filter for Generic Feature Transform
Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng

TL;DR
This paper introduces an improved learnable tree filter that relaxes geometric constraints, enabling better long-range dependency modeling and structural detail preservation in various vision tasks, with significant empirical performance gains.
Contribution
It reformulates the learnable tree filter as a Markov Random Field with a learnable unary term and proposes a differentiable spanning tree algorithm, enhancing flexibility and robustness.
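To make the idea of tree filtering with a unary term concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: it uses ordinary Kruskal minimum spanning tree construction (not the paper's learnable spanning tree), a brute-force all-pairs aggregation (not the linear-time algorithm), and it assumes the unary term enters as a per-node multiplicative weight, which this summary does not specify. All function names (minimum_spanning_tree, tree_distances, tree_filter) are hypothetical.

```python
import math
from collections import deque


def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm; edges is a list of (weight, u, v) tuples.

    Returns an adjacency dict {node: [(neighbor, weight), ...]}.
    """
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = {i: [] for i in range(num_nodes)}
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # keep the edge only if it joins two components
            parent[ru] = rv
            tree[u].append((v, w))
            tree[v].append((u, w))
    return tree


def tree_distances(tree, source):
    """Sum of edge weights along the unique tree path from source to each node."""
    dist = {source: 0.0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, w in tree[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                queue.append(v)
    return dist


def tree_filter(features, edges, unary=None):
    """Brute-force tree filter on a connected graph: y_i = sum_j s_ij * x_j / z_i.

    s_ij = exp(-tree distance between i and j) * unary[j].
    The multiplicative unary weight is an assumption made for illustration.
    """
    n = len(features)
    if unary is None:
        unary = [1.0] * n
    tree = minimum_spanning_tree(n, edges)
    out = []
    for i in range(n):
        dist = tree_distances(tree, i)
        weights = [math.exp(-dist[j]) * unary[j] for j in range(n)]
        z = sum(weights)
        out.append(sum(w * x for w, x in zip(weights, features)) / z)
    return out


# Toy 1D signal with a sharp step between index 2 and index 3.
signal = [0.0, 0.1, 0.0, 1.0, 0.9, 1.0]
# Edge weight = feature dissimilarity between neighboring positions.
chain_edges = [(abs(signal[i] - signal[i + 1]), i, i + 1)
               for i in range(len(signal) - 1)]
smoothed = tree_filter(signal, chain_edges)
```

In this toy setup the large dissimilarity across the step makes the two sides of the signal far apart along the tree, so each side is smoothed mostly with itself and the step survives. Passing a non-uniform unary list lets a distant node contribute strongly regardless of its tree distance, which is one way to picture how a unary term could relax the purely geometric weighting.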
Findings
Achieves 82.1% mIoU on Cityscapes semantic segmentation
Demonstrates consistent improvements in object detection and instance segmentation
Extends the method to multiple vision tasks with linear complexity
Abstract
The Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation. Nevertheless, the intrinsic geometric constraint forces it to focus on the regions with close spatial distance, hindering the effective long-range interactions. To relax the geometric constraint, we give the analysis by reformulating it as a Markov Random Field and introduce a learnable unary term. Besides, we propose a learnable spanning tree algorithm to replace the original non-differentiable one, which further improves the flexibility and robustness. With the above improvements, our method can better capture long-range dependencies and preserve structural details with linear complexity, which is extended to several vision tasks for more generic feature transform. Extensive experiments on object detection/instance segmentation demonstrate the consistent…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
