An Empirical Study of Extrapolation in Text Generation with Scalar Control
Aashi Jain, Taylor Berg-Kirkpatrick

TL;DR
This study empirically evaluates how different methods of encoding scalar control inputs affect a text generation model's ability to extrapolate to unseen control values, finding that passing the scalar in directly, without further encoding, most reliably extrapolates.
Contribution
It provides a comprehensive comparison of scalar input encoding strategies for extrapolation in text generation, highlighting the effectiveness of the simplest approach.
Findings
Direct scalar inputs enable reliable extrapolation.
Learnable and sinusoidal embeddings do not outperform simple inputs.
The simple scalar input approach is the most effective for unseen control ranges.
Abstract
We conduct an empirical evaluation of extrapolation performance when conditioning on scalar control inputs like desired output length, desired edit from an input sentence, and desired sentiment across three text generation tasks. Specifically, we examine a zero-shot setting where models are asked to generalize to ranges of control values not seen during training. We focus on evaluating popular embedding methods for scalar inputs, including both learnable and sinusoidal embeddings, as well as simpler approaches. Surprisingly, our findings indicate that the simplest strategy of using scalar inputs directly, without further encoding, most reliably allows for successful extrapolation.
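The three families of encodings named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the embedding dimension, the frequency base, and the example control values are all assumptions chosen for illustration, and a real model would feed these vectors into a trained generator rather than inspect them directly.

```python
import numpy as np


def direct_scalar(value, scale=1.0):
    """Use the control value itself as a single input feature.

    `scale` is a hypothetical normalization constant, not taken from the paper.
    """
    return np.array([value / scale], dtype=np.float64)


def sinusoidal_embedding(value, dim=8, base=10000.0):
    """Transformer-style sinusoidal features of a scalar (assumed layout:
    all sines first, then all cosines)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    # Geometrically spaced frequencies, one per sine/cosine pair.
    freqs = base ** (-np.arange(dim // 2) * 2.0 / dim)
    angles = value * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])


class LearnableEmbedding:
    """Lookup table with one row per discrete control value.

    Rows are randomly initialized here; in a real model they would be
    updated by gradient descent during training.
    """

    def __init__(self, num_values, dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.table = rng.normal(scale=0.02, size=(num_values, dim))

    def __call__(self, value):
        return self.table[int(value)]


# Hypothetical control value, e.g. a desired output length of 30 tokens.
length = 30
print(direct_scalar(length, scale=10.0))
print(sinusoidal_embedding(length).round(3))
print(LearnableEmbedding(num_values=50)(length).round(3))
```

One general property of the lookup-table variant is visible in the sketch: a row that is never indexed during training receives no gradient updates, and a value beyond `num_values` has no row at all, whereas the direct and sinusoidal encodings are defined for any real-valued input. Whether a model actually extrapolates well from them is the empirical question the paper studies.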
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
