SIEVE: Sample-Efficient Parametric Learning from Natural Language
Parth Asawa, Alexandros G. Dimakis, Matei Zaharia

TL;DR
SIEVE is a method for sample-efficient parametric learning from natural language context. It needs as few as three query examples, and it gets there through synthetic data generation and context decomposition.
Contribution
The paper introduces SIEVE, a new approach that improves parametric learning efficiency from natural language with minimal data, using a synthetic data pipeline and context distillation.
Findings
SIEVE outperforms prior methods with only three query examples.
SIEVE effectively internalizes context into model weights.
The approach improves performance on reasoning tasks that require understanding the provided context.
Abstract
Natural language context, such as instructions, knowledge, or feedback, contains rich signal for adapting language models. While in-context learning provides adaptation via the prompt, parametric learning persists into model weights and can improve performance further, though it is data-hungry and relies heavily on either high-quality traces or automated verifiers. We propose SIEVE, a method for sample-efficient parametric learning from natural language context that requires as few as three query examples. SIEVE uses a novel synthetic data generation pipeline, SIEVE-GEN, that leverages the insight that context is decomposable. Decomposing context allows us to generate higher-quality rollouts by pairing synthetic queries with only the applicable context rather than the entirety, then using context distillation to internalize context into the model. We evaluate in reasoning settings where…
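The pairing step described in the abstract can be illustrated with a small toy sketch. This is not the authors' SIEVE-GEN implementation: every name below (`DistillExample`, `keyword_overlap`, `build_examples`) is hypothetical, and the keyword-overlap check is only a stand-in assumption for whatever applicability test the paper actually uses. The sketch shows the data shape that context distillation generally relies on: a teacher prompt that sees the applicable context units, and a student prompt that sees the query alone.

```python
import re
from dataclasses import dataclass

# Hypothetical stopword list, used only by the toy applicability check below.
STOPWORDS = {"a", "an", "the", "is", "are", "for", "of", "to", "in", "on", "when", "over"}


@dataclass(frozen=True)
class DistillExample:
    query: str
    teacher_prompt: str  # applicable context units followed by the query
    student_prompt: str  # the query alone, with no context


def tokens(text: str) -> set[str]:
    """Lowercase alphabetic tokens with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def keyword_overlap(query: str, unit: str) -> bool:
    """Toy applicability check (an assumption, not the paper's method)."""
    return bool(tokens(query) & tokens(unit))


def build_examples(queries, units, is_applicable=keyword_overlap):
    """Pair each query with only the context units judged applicable to it."""
    examples = []
    for query in queries:
        applicable = [u for u in units if is_applicable(query, u)]
        teacher_prompt = "\n".join(applicable + ["", query])
        examples.append(DistillExample(query, teacher_prompt, student_prompt=query))
    return examples


# Context already decomposed into independent units (hand-written here).
units = ["Refunds require a receipt.", "Shipping is free over fifty dollars."]
queries = ["Is a receipt needed for refunds?", "When is shipping free?"]

examples = build_examples(queries, units)
for ex in examples:
    print(repr(ex.teacher_prompt))
```

In a context distillation setup, responses generated from `teacher_prompt` would serve as training targets for a model that is given only `student_prompt`, so the behavior induced by the context ends up in the weights rather than in the prompt.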
