BiCLIP: Domain Canonicalization via Structured Geometric Transformation

Pranav Mantini; Shishir K. Shah

arXiv:2603.08942 · cs.CV · April 14, 2026

TL;DR

BiCLIP is a simple, low-parameter framework that applies a structured geometric transformation to multimodal features to improve cross-modal alignment in vision-language models on specialized domains, reporting state-of-the-art results on 11 benchmarks.

Contribution

The paper proposes BiCLIP, a method for domain canonicalization that estimates a geometric transformation from a small set of few-shot anchors, improving the adaptation of vision-language models to specialized domains.

Findings

1. BiCLIP outperforms existing methods on 11 benchmarks.

2. The learned transformations exhibit orthogonality and specific angular distributions.

3. Structured geometric alignment is key to robust domain adaptation.
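
The second finding refers to orthogonality and angular structure in the learned transformations. The summary does not say how the paper measures either, so the following is only one plausible way to probe them, under assumptions: `W` stands for a learned square transformation, orthogonality is scored as the Frobenius norm of `W^T W - I`, and the "angle" is taken between each feature and its transformed version. The function names are hypothetical and are not taken from the BiCLIP repository.

```python
import numpy as np

def orthogonality_error(W):
    """Frobenius norm of W^T W - I; it is zero exactly when W is orthogonal."""
    d = W.shape[1]
    return float(np.linalg.norm(W.T @ W - np.eye(d)))

def feature_rotation_angles(X, W):
    """Angle in degrees between each row x of X and its image W x.

    Assumption: this is one possible reading of "angular distribution";
    the paper may define it differently.
    """
    Y = X @ W.T                                   # apply W to every row vector
    cos = np.sum(X * Y, axis=1) / (
        np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1)
    )
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Illustrative data only: a random orthogonal matrix and a perturbed copy.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
noisy = Q + 0.1 * rng.standard_normal((8, 8))
print("orthogonal matrix error:", orthogonality_error(Q))
print("perturbed matrix error: ", orthogonality_error(noisy))
print("angles:", feature_rotation_angles(rng.standard_normal((3, 8)), Q))
```

A histogram of the returned angles over a dataset would give an empirical angular distribution for a given `W`.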

Abstract

Recent advances in vision-language models (VLMs) have demonstrated remarkable zero-shot capabilities, yet adapting these models to specialized domains remains a significant challenge. Building on recent theoretical insights suggesting that independently trained VLMs are related by a canonical transformation, we extend this understanding to the concept of domains. We hypothesize that image features across disparate domains are related by a canonicalized geometric transformation that can be recovered using a small set of anchors. Few-shot classification provides a natural setting for this alignment, as the limited labeled samples serve as the anchors required to estimate this transformation. Motivated by this hypothesis, we introduce BiCLIP, a framework that applies a targeted transformation to multimodal features to enhance cross-modal alignment. Our approach is characterized by its…
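
The hypothesis above, that a geometric transformation between domains can be recovered from a small set of anchors, can be illustrated with a toy sketch. This is not the BiCLIP method, whose details the truncated abstract does not give. It swaps in the classical orthogonal Procrustes solution as a stand-in, and it assumes that each few-shot image feature should be mapped onto the text embedding of its class. All names (`fit_orthogonal_map`, `classify`, `text_protos`) and the synthetic data are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def fit_orthogonal_map(anchor_feats, anchor_labels, text_protos):
    """Orthogonal Procrustes: argmin over orthogonal W of ||X W - T[y]||_F.

    anchor_feats:  (n, d) few-shot image features (the anchors)
    anchor_labels: (n,)   integer class labels
    text_protos:   (C, d) one text embedding per class
    """
    targets = text_protos[anchor_labels]          # (n, d) target per anchor
    M = anchor_feats.T @ targets                  # (d, d) cross-covariance
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt                                 # closed-form orthogonal solution

def classify(image_feats, W, text_protos):
    """Transform image features, then pick the most similar text embedding."""
    logits = l2_normalize(image_feats @ W) @ text_protos.T
    return logits.argmax(axis=1)

# Synthetic stand-in for a shifted domain: image features are the class text
# embeddings under an unknown rotation R, plus a little noise.
rng = np.random.default_rng(0)
d, num_classes, shots = 16, 4, 2
T = l2_normalize(rng.standard_normal((num_classes, d)))
R, _ = np.linalg.qr(rng.standard_normal((d, d)))
y = np.repeat(np.arange(num_classes), shots)
X = T[y] @ R.T + 0.01 * rng.standard_normal((len(y), d))

W = fit_orthogonal_map(X, y, T)
print("anchor accuracy after alignment:", (classify(X, W, T) == y).mean())
```

The closed form keeps the transformation at a single d × d matrix estimated from a handful of labeled samples, which matches the low-parameter, few-shot setting the abstract describes. Whether BiCLIP constrains or estimates its transformation this way is not stated in the text above.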

Code & Models

Repositories

QuantitativeImagingLaboratory/BilinearCLIP (GitHub)
