Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2

Joel Valdivia Ortega; Lorenz Lamm; Franziska Eckardt; Benedikt Schworm; Marion Jasnin; Tingying Peng

arXiv:2511.05509·cs.CV·November 11, 2025

TL;DR

This paper introduces Randomized-MLP regularization for DINOv2, which enhances interpretability and domain adaptation in vision transformers, especially in medical imaging, without sacrificing performance.

Contribution

The paper proposes RMLP regularization, a contrastive learning-based method for fine-tuning ViTs that improves interpretability and domain robustness.

Findings

01. RMLP improves interpretability of attention maps.
02. RMLP maintains or enhances downstream performance.
03. Mathematical analysis provides insights into RMLP's role.

Abstract

Vision Transformers (ViTs), such as DINOv2, achieve strong performance across domains but often repurpose low-information patch tokens in ways that reduce the interpretability of attention and feature maps. This challenge is especially evident in medical imaging, where domain shifts can degrade both performance and transparency. In this paper, we introduce Randomized-MLP (RMLP) regularization, a contrastive learning-based method that encourages more semantically aligned representations. We apply RMLP regularization when fine-tuning DINOv2 on both medical and natural image modalities, and show that it improves or maintains downstream performance while producing more interpretable attention maps. We also provide a mathematical analysis of RMLPs, offering insights into their role in enhancing ViT-based models and advancing our understanding of contrastive learning.

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection · Generative Adversarial Networks and Image Synthesis