Language-Induced Priors for Domain Adaptation
Qiyuan Chen, Jiayu Zhou, Raed Al Kontar

TL;DR
This paper introduces a novel probabilistic framework that uses textual descriptions of target domains to improve source relevance identification in domain adaptation, especially in data-scarce scenarios.
Contribution
It proposes the Language-Induced Prior (LIP), leveraging large language models to guide source selection and improve adaptation performance.
Findings
LIP guides source relevance identification when target signals are weak.
The estimator matches the oracle MSE when the prior is correct.
The framework is validated on Gaussian, predictive, and prescriptive tasks.
Abstract
Domain adaptation faces a fundamental paradox in the cold-start regime. When target data is scarce, statistical methods fail to distinguish relevant source domains from irrelevant ones, which often leads to negative transfer. In this paper, we address this challenge by leveraging expert textual descriptions of the target domain, a resource that is often available but overlooked. We propose a probabilistic framework that translates these semantic descriptions into a choice model, namely a Language-Induced Prior (LIP), that learns the preferences from a pretrained Large Language Model (LLM). The LIP is then integrated into an Expectation-Maximization algorithm to identify source relevance. Methodologically, this framework is compatible with any parametric model where a likelihood is available. It allows the LIP to guide the selection of sources when target signals are weak, while…
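To make the idea of a prior-guided EM step concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm: it assumes a one-dimensional Gaussian mean-estimation problem with known noise level, a binary latent relevance indicator per source, and a broad zero-centered Gaussian for irrelevant sources. The function name `em_source_relevance`, the parameters `sigma` and `tau`, and the `prior` vector (standing in for whatever relevance probabilities a language-induced prior would supply) are all hypothetical choices made for illustration.

```python
import numpy as np


def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def em_source_relevance(target, source_means, source_sizes, prior,
                        sigma=1.0, tau=10.0, n_iter=50):
    """Toy EM: estimate the target mean and per-source relevance.

    Assumptions (illustrative only, not taken from the paper):
    - target samples are N(theta, sigma^2) with known sigma;
    - a relevant source's sample mean is N(theta, sigma^2 / n_k);
    - an irrelevant source's sample mean is N(0, tau^2 + sigma^2 / n_k);
    - prior[k] is the prior probability that source k is relevant.
    """
    target = np.asarray(target, dtype=float)
    m = np.asarray(source_means, dtype=float)
    s2 = sigma ** 2 / np.asarray(source_sizes, dtype=float)
    pi = np.asarray(prior, dtype=float)

    theta = target.mean()  # start from the target-only estimate
    r = pi.copy()
    for _ in range(n_iter):
        # E-step: posterior probability that each source is relevant.
        rel = pi * normal_pdf(m, theta, s2)
        irr = (1.0 - pi) * normal_pdf(m, 0.0, tau ** 2 + s2)
        r = rel / (rel + irr)
        # M-step: precision-weighted mean of target data and sources,
        # with each source down-weighted by its relevance probability.
        num = target.sum() / sigma ** 2 + np.sum(r * m / s2)
        den = target.size / sigma ** 2 + np.sum(r / s2)
        theta = num / den
    return theta, r


# Two target samples, one plausible source and one distant source.
theta_hat, relevance = em_source_relevance(
    target=[1.5, 0.9],
    source_means=[1.0, 5.0],
    source_sizes=[100, 100],
    prior=[0.9, 0.1],
)
print(theta_hat, relevance)
```

In this sketch the prior matters most when the target sample is tiny: with only two target points, the likelihood alone says little about which source to trust, so the prior weights in `prior` largely determine the starting relevance estimates, which the E-step then revises against the data.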
