ORION: ORthonormal Text Encoding for Universal VLM AdaptatION

Omprakash Chakraborty; Jose Dolz; Ismail Ben Ayed

arXiv:2602.19530 · cs.CV · March 30, 2026

TL;DR

ORION fine-tunes the text encoder of a vision-language model, using only class names, so that the textual class representations of a task are close to pairwise orthogonal. This improves task discriminability across 11 benchmarks.

Contribution

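The paper introduces a novel orthogonality-based loss for fine-tuning text encoders using only class names. The loss combines a pairwise orthogonality term with a penalty on deviations from the initial class prototypes, improving the quality of the textual prototypes used by VLMs.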

Findings

01. Consistently improves performance across 11 benchmarks.

02. Enhances various VLM backbones in zero-shot, few-shot, and test-time adaptation.

03. Provides a probabilistic interpretation of the orthogonality penalty.

Abstract

Vision-language models (VLMs) have demonstrated remarkable generalization across diverse tasks, yet their performance remains constrained by the quality and geometry of the textual prototypes used to represent classes. Standard zero-shot classifiers, derived from frozen text encoders and handcrafted prompts, may yield correlated or weakly separated embeddings that limit task-specific discriminability. We introduce ORION, a text encoder fine-tuning framework that improves pretrained VLMs using only class names. Our method optimizes, via low-rank adaptation, a novel loss integrating two terms: one promoting pairwise orthogonality between the textual representations of the classes of a given task, and the other penalizing deviations from the initial class prototypes. Furthermore, we provide a probabilistic interpretation of our orthogonality penalty, connecting it to the general maximum…
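The two-term loss described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the orthogonality term is taken to be the mean squared cosine similarity between distinct class prototypes, the deviation term is taken to be one minus the cosine similarity to the initial prototype, and the weight `lam` and all function names are hypothetical. The low-rank adaptation of the text encoder is not modeled here; the sketch only evaluates the loss on given prototype vectors.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthogonality_penalty(prototypes):
    """Assumed form: mean squared cosine similarity over distinct class pairs.

    Requires at least two prototypes. The value is zero exactly when all
    prototypes are pairwise orthogonal.
    """
    unit = [normalize(p) for p in prototypes]
    k = len(unit)
    total = 0.0
    for i in range(k):
        for j in range(k):
            if i != j:
                total += dot(unit[i], unit[j]) ** 2
    return total / (k * (k - 1))

def anchor_penalty(prototypes, initial_prototypes):
    """Assumed form: mean (1 - cosine similarity) to the initial prototypes."""
    pairs = list(zip(prototypes, initial_prototypes))
    return sum(1.0 - dot(normalize(p), normalize(q)) for p, q in pairs) / len(pairs)

def sketch_loss(prototypes, initial_prototypes, lam=1.0):
    """Orthogonality term plus lam times the deviation term (lam is hypothetical)."""
    return (orthogonality_penalty(prototypes)
            + lam * anchor_penalty(prototypes, initial_prototypes))

# Toy 2-D prototypes for two classes (made-up numbers, not from the paper).
initial = [[1.0, 0.0], [1.0, 1.0]]   # correlated: cosine similarity about 0.71
adapted = [[1.0, 0.0], [0.0, 1.0]]   # pairwise orthogonal

print(sketch_loss(initial, initial))  # dominated by the orthogonality term
print(sketch_loss(adapted, initial))  # dominated by the deviation term
```

In this toy case the orthogonal prototypes incur no orthogonality penalty but pay for moving away from their initial positions, which is the trade-off the two terms are meant to balance.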
