PAUMER: Patch Pausing Transformer for Semantic Segmentation

Evann Courdier; Prabhu Teja Sivaprasad; François Fleuret

arXiv:2311.00586 · cs.CV · November 2, 2023 · 1 citation

TL;DR

PAUMER is a patch pausing strategy for segmentation transformers: it stops computation on patches that need no further processing, using the entropy of intermediate predictions as the criterion, and its pausing parameters can be adjusted at inference to meet different run-time requirements.

Contribution

The paper proposes an entropy-based patch pausing mechanism that lets a single trained segmentation transformer be adapted to different run-time requirements at inference.

Findings

01

Achieves about 50% higher throughput on Cityscapes and ADE20K datasets.

02

Incurs an mIoU drop of about 0.65% on Cityscapes and about 4.6% on ADE20K at that throughput.

03

Shows that a single trained network can be adapted to different run-time requirements at inference by modulating its pausing parameters.

Abstract

We study the problem of improving the efficiency of segmentation transformers by using disparate amounts of computation for different parts of the image. Our method, PAUMER, accomplishes this by pausing computation for patches that are deemed to not need any more computation before the final decoder. We use the entropy of predictions computed from intermediate activations as the pausing criterion, and find this aligns well with semantics of the image. Our method has a unique advantage that a single network trained with the proposed strategy can be effortlessly adapted at inference to various run-time requirements by modulating its pausing parameters. On two standard segmentation datasets, Cityscapes and ADE20K, we show that our method operates with about a $50\%$ higher throughput with an mIoU drop of about $0.65\%$ and $4.6\%$ respectively.
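
To make the pausing criterion concrete, the following is a minimal sketch of entropy-based patch selection, not the authors' implementation. It assumes that each patch has per-class logits available from an intermediate prediction, that the patches with the lowest prediction entropy (the most confident ones) are the ones paused, and that the pausing parameter is a simple ratio. The names `select_paused` and `pause_ratio` are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_paused(patch_logits, pause_ratio):
    """Return (paused patch indices, per-patch entropies).

    Assumption: the lowest-entropy fraction of patches is paused;
    `pause_ratio` is a hypothetical knob that can be changed at inference.
    """
    entropies = [entropy(softmax(logits)) for logits in patch_logits]
    n_pause = int(pause_ratio * len(patch_logits))
    order = sorted(range(len(patch_logits)), key=lambda i: entropies[i])
    return sorted(order[:n_pause]), entropies


# Toy intermediate predictions for four patches over three classes.
patch_logits = [
    [8.0, 0.0, 0.0],  # confident prediction, low entropy
    [1.0, 1.0, 1.0],  # uniform prediction, maximal entropy
    [0.0, 6.0, 0.0],  # fairly confident prediction
    [2.0, 1.5, 1.8],  # ambiguous prediction
]
paused, entropies = select_paused(patch_logits, pause_ratio=0.5)
print(paused)
```

In this sketch, raising `pause_ratio` pauses more patches and so leaves fewer for the remaining layers, which illustrates how a single network could be tuned to a run-time budget without retraining.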

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Multimodal Machine Learning Applications