Task Expansion and Cross Refinement for Open-World Conditional Modeling
Shreyas Bhat Brahmavar, Qiyang Liu, Yang Li, Junier Oliva

TL;DR
This paper introduces TEXR, a semi-supervised framework that expands and refines datasets to improve open-world conditional modeling across diverse tasks and datasets.
Contribution
TEXR is a novel semi-supervised approach that synthesizes and refines data to enhance the coverage and performance of open-world conditional models.
Findings
TEXR improves zero-shot, few-shot, and many-shot performance across benchmarks.
Structured data synthesis and cross refinement reduce confirmation bias and enhance model accuracy.
The framework effectively enlarges task coverage in open-world settings.
Abstract
Open-world conditional modeling (OCM) requires a single model to answer arbitrary conditional queries across heterogeneous datasets, where observed variables and targets vary and arise from a vast open-ended task universe. Because any finite collection of real-world datasets covers only a small fraction of this space, we propose Task Expansion and Cross Refinement (TEXR), a semi-supervised framework that enlarges effective task coverage through structured synthesis and refinement of semantic data contexts. TEXR first generates diverse uninstantiated dataset schemas and weakly instantiates them via structured probabilistic generators guided by large language models. It then performs cross-model refinement by training on disjoint data partitions and revising synthetic values across splits to reduce confirmation bias and improve pseudo-value quality. The refined synthetic datasets are…
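The cross-refinement step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-dimensional least-squares line as a stand-in for the conditional model, a two-way split, and a hypothetical `blend` weight for mixing the original pseudo-value with the other split's prediction. The names `fit_line`, `cross_refine`, and `blend` are illustrative only. The point it shows is that each synthetic value is revised by a model that never trained on it.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs.

    Toy stand-in for a conditional model (an assumption, not the paper's model).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def cross_refine(points, blend=0.5, seed=0):
    """Revise each pseudo-value using a model fitted on the other split only."""
    rng = random.Random(seed)
    idx = list(range(len(points)))
    rng.shuffle(idx)
    half = len(idx) // 2
    split_a, split_b = idx[:half], idx[half:]  # disjoint partitions

    model_a = fit_line([points[i] for i in split_a])
    model_b = fit_line([points[i] for i in split_b])

    refined = list(points)
    # Values in split A are revised by the model trained on B, and vice versa.
    for own_split, other_model in ((split_a, model_b), (split_b, model_a)):
        a, b = other_model
        for i in own_split:
            x, y = points[i]
            refined[i] = (x, (1 - blend) * y + blend * (a * x + b))
    return refined, split_a, split_b


# Hypothetical data: a known linear relation plus Gaussian noise plays the
# role of weakly instantiated synthetic values.
noise_rng = random.Random(1)
truth = [(x / 10, 2 * (x / 10) + 1) for x in range(200)]
noisy = [(x, y + noise_rng.gauss(0, 1)) for x, y in truth]
refined, split_a, split_b = cross_refine(noisy)


def mse(points):
    """Mean squared error of the y-values against the noise-free relation."""
    return sum((y - t) ** 2 for (_, y), (_, t) in zip(points, truth)) / len(points)


print("MSE before refinement:", mse(noisy))
print("MSE after refinement: ", mse(refined))
```

Because no value is revised by a model fitted on that same value, a model's errors are not fed back into its own training data, which is the intuition behind reducing confirmation bias. The real framework operates on LLM-guided synthetic datasets rather than a single noisy line.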
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Topic Modeling · Explainable Artificial Intelligence (XAI)
