Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation

Abhishek Aich; Yumin Suh; Samuel Schulter; Manmohan Chandraker

arXiv:2404.14657 · cs.CV · April 1, 2025

TL;DR

This paper introduces PRO-SCALE, a strategy that progressively scales token length across the layers of the transformer encoder in Mask2Former-style segmentation models, significantly lowering computational cost while maintaining performance.

Contribution

PRO-SCALE is a plug-in strategy that progressively scales token length across encoder layers, improving the efficiency of transformer-based segmentation models.

Findings

01. 52% reduction in encoder GFLOPs without performance loss

02. 27% overall GFLOPs reduction with maintained accuracy

03. Demonstrated flexibility across different architectural configurations

Abstract

A powerful architecture for universal segmentation relies on transformers that encode multi-scale image features and decode object queries into mask predictions. With efficiency being a high priority for scaling such models, we observed that the state-of-the-art method Mask2Former uses 50% of its compute only on the transformer encoder. This is due to the retention of a full-length token-level representation of all backbone feature scales at each encoder layer. With this observation, we propose a strategy termed PROgressive Token Length SCALing for Efficient transformer encoders (PRO-SCALE) that can be plugged-in to the Mask2Former segmentation architecture to significantly reduce the computational cost. The underlying principle of PRO-SCALE is: progressively scale the length of the tokens with the layers of the encoder. This allows PRO-SCALE to reduce computations by a large margin…
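The principle stated in the abstract, scaling token length with encoder depth instead of keeping the full multi-scale token set at every layer, can be illustrated with a small token-count sketch. Everything concrete here is an assumption made for illustration: the 512×512 input, the strides (8, 16, 32), the six-layer encoder split into three stages of two layers, and the coarse-to-fine order in which scales are added. None of these values are taken from the paper or its repository, and the result is a token-count proxy, not a GFLOPs measurement.

```python
# Illustrative sketch only: input size, strides, stage lengths, and the
# coarse-to-fine ordering are assumptions, not values from the paper.

def tokens_per_scale(height, width, strides=(8, 16, 32)):
    """Number of tokens each backbone feature scale contributes."""
    return {s: (height // s) * (width // s) for s in strides}


def full_length_schedule(counts, num_layers):
    """Baseline: every encoder layer processes the tokens of all scales."""
    total = sum(counts.values())
    return [total] * num_layers


def progressive_schedule(counts, layers_per_stage):
    """Hypothetical schedule: start with the coarsest scale only and add
    the next finer scale at the start of each later stage."""
    order = sorted(counts, reverse=True)  # largest stride = fewest tokens first
    schedule, active = [], 0
    for stride, n_layers in zip(order, layers_per_stage):
        active += counts[stride]          # tokens visible from this stage on
        schedule.extend([active] * n_layers)
    return schedule


counts = tokens_per_scale(512, 512)
baseline = full_length_schedule(counts, num_layers=6)
progressive = progressive_schedule(counts, layers_per_stage=(2, 2, 2))

# Fraction of token-layer work avoided relative to the full-length baseline.
saving = 1 - sum(progressive) / sum(baseline)

print("tokens per scale:", counts)
print("baseline tokens per layer:   ", baseline)
print("progressive tokens per layer:", progressive)
print(f"token-layer saving (proxy): {saving:.0%}")
```

In this toy setting only the last stage sees the full token set, which is where the saving comes from. As a rough consistency check on the reported numbers: if the encoder accounts for about 50% of total compute, a 52% reduction in encoder GFLOPs corresponds to roughly 0.52 × 0.50 ≈ 26% of the total, in line with the reported 27% overall reduction.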

Code & Models

Repositories

abhishekaich27/proscale-pytorch (PyTorch, official)

Videos

Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation · SlidesLive

Taxonomy

Topics: Advancements in Photolithography Techniques · Optical measurement and interference techniques · Industrial Vision Systems and Defect Detection