GRASP: Guided Residual Adapters with Sample-wise Partitioning

Felix Nützel; Mischa Dombrowski; Bernhard Kainz

arXiv:2512.01675·cs.CV·May 13, 2026

TL;DR

GRASP pairs a static partition of the conditioning space with group-specific residual adapters to improve long-tail generation in text-to-image flow-matching transformers, raising fidelity, diversity, and tail-class coverage.

Contribution

It proposes a static, sample-wise partitioning approach with residual adapters that improves long-tail class generation without altering the core flow-matching objective.
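The routing idea can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption made for illustration, not the paper's implementation: the class name `GroupedResidualFFN`, the low-rank bottleneck form of the adapter, the zero initialisation of the up-projection, and the hand-written `class_to_group` table. The source only states that the partition is static and sample-wise and that adapters sit in the transformer feedforward layers.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(x):
    return [max(0.0, v) for v in x]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GroupedResidualFFN:
    """Shared feedforward block plus one residual adapter per partition group.

    Hypothetical sketch: the adapter is a low-rank bottleneck (assumption).
    """

    def __init__(self, dim, hidden, rank, class_to_group, seed=0):
        rng = random.Random(seed)
        # Shared feedforward weights, used by every sample.
        self.W1 = rand_matrix(hidden, dim, rng)
        self.W2 = rand_matrix(dim, hidden, rng)
        # Static partition: a fixed lookup from condition to group, no learned router.
        self.class_to_group = dict(class_to_group)
        self.adapters = {}
        for g in sorted(set(self.class_to_group.values())):
            down = rand_matrix(rank, dim, rng)
            # Zero-initialised up-projection, so each adapter starts as a no-op.
            up = [[0.0] * rank for _ in range(dim)]
            self.adapters[g] = (down, up)

    def base(self, x):
        return matvec(self.W2, relu(matvec(self.W1, x)))

    def forward(self, x, class_id):
        group = self.class_to_group[class_id]
        down, up = self.adapters[group]
        delta = matvec(up, matvec(down, x))
        # Residual form: shared output plus the group-specific correction.
        return [b + d for b, d in zip(self.base(x), delta)]


# Hypothetical partition: classes 0 and 1 share a group, class 2 gets its own.
layer = GroupedResidualFFN(dim=4, hidden=8, rank=2, class_to_group={0: 0, 1: 0, 2: 1})
output = layer.forward([1.0, -0.5, 0.25, 2.0], class_id=2)
```

Because the group is a fixed function of the condition, each sample only ever touches one adapter, and the shared weights are identical for all groups. How the groups are actually chosen is not specified in this summary.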

Findings

01 Reduces overall FID by up to 80%

02 Increases tail-class coverage by up to 44%

03 Outperforms alternative methods on medical and ImageNet-LT datasets

Abstract

Text-to-image flow-matching transformers degrade sharply in long-tail settings: tail-class outputs collapse in fidelity and diversity, limiting their value as synthetic augmentation for rare conditions. We trace this to low head-versus-tail gradient alignment during fine-tuning, an optimization-level pathology that conditioning- and sampling-side interventions do not address. We propose GRASP (Guided Residual Adapters with Sample-wise Partitioning): a deterministic partition of the conditioning space, paired with group-specific residual adapters in the transformer feedforward layers, that leaves the flow-matching objective and the sampler untouched. In conditional flow matching, condition values index distinct sets of probability paths, so partitioning along the conditioning is the structurally correct factorization and a suitable proxy for gradient alignment. Because the partition is static,…
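To make "head-versus-tail gradient alignment" concrete, here is a toy sketch that measures it as the cosine similarity between the gradient computed on a head-class batch and the gradient computed on a tail-class batch. The cosine metric, the linear model, the squared-error loss, and the data are all assumptions chosen to keep the example small; the abstract does not state how alignment is measured.

```python
import math


def grad_mse_linear(w, batch):
    """Gradient of mean squared error for y_hat = w . x over a batch of (x, y)."""
    g = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            g[i] += 2.0 * err * xi / len(batch)
    return g


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


# Toy data (hypothetical): head samples live mostly on feature 0, the tail sample on feature 1.
w = [0.0, 0.0]
head_batch = [([1.0, 0.0], 1.0), ([1.0, 0.1], 1.0)]
tail_batch = [([0.0, 1.0], -1.0)]

g_head = grad_mse_linear(w, head_batch)
g_tail = grad_mse_linear(w, tail_batch)
alignment = cosine(g_head, g_tail)
print(f"head-vs-tail gradient cosine: {alignment:.4f}")
```

In this toy setup the two gradients are close to orthogonal, so an update that helps the head batch does almost nothing for the tail sample. A value near 1 would mean the two groups pull the shared parameters in the same direction.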
