Treasure Hunt: Real-time Targeting of the Long Tail using Training-Time Markers
Daniel D'souza, Julia Kreutzer, Adrien Morisot, Ahmet Üstün, Sara Hooker

TL;DR
This paper introduces a training protocol that explicitly incorporates data and task markers, so that long-tail, underrepresented use cases can be targeted at inference time. The result is finer control over model behavior and measurable gains in specialized domains.
Contribution
A training approach that uses explicit markers for data and task characteristics to improve controllability and performance on rare, underrepresented tasks.
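To make the idea of explicit markers concrete, here is a minimal Python sketch of how a training example might be tagged with data and task characteristics. It is an illustration under stated assumptions, not the paper's implementation: the tag syntax, the marker names (`task`, `domain`), their values, and the helper functions `add_markers` and `build_training_example` are all hypothetical.

```python
from typing import Dict


def add_markers(prompt: str, markers: Dict[str, str]) -> str:
    """Prepend marker tags to a prompt.

    The <name>value</name> tag syntax is an assumption made for this
    sketch; the paper's actual marker format is not given here.
    """
    if not markers:
        # No markers supplied: leave the prompt untouched.
        return prompt
    # Sort by marker name so the same markers always yield the same string.
    tags = "".join(f"<{name}>{value}</{name}>" for name, value in sorted(markers.items()))
    return f"{tags}\n{prompt}"


def build_training_example(prompt: str, completion: str, markers: Dict[str, str]) -> Dict[str, str]:
    """Pair a marker-tagged prompt with its target completion."""
    return {"input": add_markers(prompt, markers), "target": completion}


# Hypothetical marker names and values, chosen only for illustration.
example = build_training_example(
    prompt="Fix the bug in this function.",
    completion="def add(a, b):\n    return a + b",
    markers={"task": "code_repair", "domain": "code"},
)
print(example["input"])
```

At inference time, the same hypothetical `add_markers` helper could be applied to a user prompt to steer the model toward a specific underrepresented task or domain.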
Findings
Average 5.7% improvement in open-ended generation quality.
Gains of over 9.1% in underrepresented domains.
Up to 14.1% relative lift on specialized tasks.
Abstract
One of the most profound challenges of modern machine learning is performing well on the long-tail of rare and underrepresented features. Large general-purpose models are trained for many tasks, but work best on high-frequency use cases. After training, it is hard to adapt a model to perform well on specific use cases underrepresented in the training corpus. Relying on prompt engineering or few-shot examples to maximize the output quality on a particular test case can be frustrating, as models can be highly sensitive to small changes, react in unpredicted ways or rely on a fixed system prompt for maintaining performance. In this work, we ask: "Can we optimize our training protocols to both improve controllability and performance on underrepresented use cases at inference time?" We revisit the divide between training and inference techniques to improve long-tail performance while…
Taxonomy
Topics: Stock Market Forecasting Methods · Time Series Analysis and Forecasting
Methods: Sparse Evolutionary Training · Balanced Selection
