PolyMaX: General Dense Prediction with Mask Transformer
Xuan Yang, Liangzhe Yuan, Kimberly Wilber, Astuti Sharma, Xiuye Gu, Siyuan Qiao, Stephanie Debats, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Liang-Chieh Chen

TL;DR
PolyMaX is a unified mask transformer framework for dense prediction. It generalizes cluster-prediction methods from discrete outputs (semantic segmentation) to continuous outputs such as depth and surface normals, and reports state-of-the-art results on the NYUD-v2 benchmarks.
Contribution
The paper generalizes mask transformer-based cluster prediction to all dense prediction tasks, unifying discrete and continuous output predictions within a single framework.
Findings
State-of-the-art performance on NYUD-v2 benchmarks
Effective unification of dense prediction tasks
Motivated by the success of discretization-based depth estimation methods (DORN, AdaBins)
Abstract
Dense prediction tasks, such as semantic segmentation, depth estimation, and surface normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or regression (continuous outputs). This per-pixel prediction paradigm has remained popular due to the prevalence of fully convolutional networks. However, on the recent frontier of the segmentation task, the community has witnessed a paradigm shift from per-pixel prediction to cluster prediction with the emergence of transformer architectures, particularly mask transformers, which directly predict a label for a mask instead of a pixel. Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks on the other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction. Motivated by the success of DORN and AdaBins…
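To make the cluster-prediction idea in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes one common formulation, in the spirit of discretization-based methods such as AdaBins: each pixel gets a softmax distribution over K clusters, and the output is the probability-weighted sum of per-cluster values. The function name cluster_prediction, the array shapes, and the random inputs are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cluster_prediction(pixel_features, cluster_embeddings, cluster_values):
    """Illustrative cluster-prediction head (an assumed formulation).

    pixel_features:     (H, W, D) per-pixel embeddings
    cluster_embeddings: (K, D)    one embedding per cluster (mask)
    cluster_values:     (K, C)    value attached to each cluster
    Returns the (H, W, C) prediction and the (H, W, K) cluster probabilities.
    """
    # Similarity between every pixel and every cluster.
    logits = np.einsum("hwd,kd->hwk", pixel_features, cluster_embeddings)
    # Soft assignment of each pixel to the K clusters.
    probs = softmax(logits, axis=-1)
    # Output is the probability-weighted combination of cluster values.
    prediction = np.einsum("hwk,kc->hwc", probs, cluster_values)
    return prediction, probs


# Hypothetical toy inputs: random features stand in for a real backbone.
rng = np.random.default_rng(0)
H, W, D, K = 4, 5, 8, 3
feats = rng.normal(size=(H, W, D))
embeds = rng.normal(size=(K, D))

# Continuous output: each cluster carries a depth value (made-up numbers).
depth_centers = np.array([[1.0], [3.0], [9.0]])
depth, probs = cluster_prediction(feats, embeds, depth_centers)

# Discrete output: one-hot cluster values reduce to per-pixel class scores.
seg_scores, _ = cluster_prediction(feats, embeds, np.eye(K))
labels = seg_scores.argmax(axis=-1)
```

Under this assumption the same head serves both output types: with one-hot cluster values it behaves like a segmentation classifier, and with real-valued cluster centers it yields a continuous map whose values stay within the range spanned by the centers.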
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Advanced Vision and Imaging
