TRUST: Leveraging Text Robustness for Unsupervised Domain Adaptation

Mattia Litrico; Mario Valerio Giuffrida; Sebastiano Battiato; Devis Tuia

arXiv:2508.06452 · cs.CV · August 11, 2025

Open Access

TL;DR

TRUST introduces a novel unsupervised domain adaptation method that leverages language modality robustness, using caption-based pseudo-labels and multimodal contrastive learning to improve vision model adaptation across complex domain shifts.

Contribution

The paper proposes TRUST, a new approach that exploits language modality robustness, uses uncertainty-aware pseudo-labeling, and employs multimodal contrastive learning for improved domain adaptation.
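As a rough illustration of what a multimodal contrastive objective can look like, the sketch below computes a symmetric InfoNCE-style loss between paired image and caption embeddings. This is a generic formulation, not necessarily the loss used in TRUST: the function names, the temperature value, and the symmetric averaging are all assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; pair k is (image_embs[k], text_embs[k]).

    Hypothetical helper: the temperature and the symmetric form are
    assumptions, not details taken from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # Cosine similarity between every image and every caption, scaled.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    def row_loss(row, target):
        # Cross-entropy of one row against its matching index,
        # using the log-sum-exp trick for numerical stability.
        m = max(row)
        log_denom = m + math.log(sum(math.exp(x - m) for x in row))
        return log_denom - row[target]

    image_to_text = sum(row_loss(logits[k], k) for k in range(n)) / n
    text_to_image = sum(
        row_loss([logits[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)
```

The loss is small when each image embedding is closest to its own caption embedding and large when the pairs are mismatched, which is the behaviour a contrastive alignment term is meant to reward.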

Findings

01 Outperforms previous methods on the DomainNet and GeoNet datasets.

02 Sets a new state of the art under complex domain shifts.

03 Effectively mitigates pseudo-label errors using uncertainty estimation.

Abstract

Recent unsupervised domain adaptation (UDA) methods have shown great success in addressing classical domain shifts (e.g., synthetic-to-real), but they still suffer under complex shifts (e.g., geographical shift), where both the background and object appearances differ significantly across domains. Prior work has shown that the language modality can help in the adaptation process, as it is more robust to such complex shifts. In this paper, we introduce TRUST, a novel UDA approach that exploits the robustness of the language modality to guide the adaptation of a vision model. TRUST generates pseudo-labels for target samples from their captions and introduces a novel uncertainty estimation strategy that uses normalised CLIP similarity scores to estimate the uncertainty of the generated pseudo-labels. This estimated uncertainty is then used to reweight the classification loss, mitigating…
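The uncertainty-based reweighting described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that each target sample already has a list of CLIP similarity scores, one per class, that "normalised" means a temperature-scaled softmax, and that the weight of a sample is the normalised score of its pseudo-label. All function names and the temperature value are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    """Normalise raw similarity scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_with_weight(similarities, temperature=0.1):
    """Return (pseudo_label, weight) from per-class similarity scores.

    Assumption: the weight is the normalised score of the top class, so
    ambiguous samples (two classes nearly tied) receive a lower weight.
    """
    probs = softmax(similarities, temperature)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def weighted_cross_entropy(batch_probs, labels, weights):
    """Cross-entropy where each sample is scaled by its confidence weight.

    batch_probs: per-sample class probabilities predicted by the vision model.
    Dividing by the sum of weights (an assumed design choice) keeps the loss
    scale comparable across batches with different average confidence.
    """
    eps = 1e-12
    losses = [
        -w * math.log(p[y] + eps)
        for p, y, w in zip(batch_probs, labels, weights)
    ]
    return sum(losses) / max(sum(weights), eps)
```

With this weighting, a sample whose pseudo-label is uncertain contributes little to the gradient, so an incorrect pseudo-label does less damage than it would under a uniformly weighted loss.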


Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications