Extrapolative Controlled Sequence Generation via Iterative Refinement
Vishakh Padmakumar, Richard Yuanzhe Pang, He He, Ankur P. Parikh

TL;DR
This paper introduces Iterative Controlled Extrapolation (ICE), a method that iteratively applies local edits to a sequence in order to reach attribute values beyond the range seen in training. It is reported to outperform existing methods on natural language and protein tasks.
Contribution
The paper presents ICE, a novel iterative approach for extrapolative sequence generation that effectively handles out-of-distribution attribute values through local edits.
Findings
ICE outperforms state-of-the-art methods on a natural language task (sentiment analysis) and on protein engineering tasks.
ICE effectively enables extrapolation beyond training attribute ranges.
The approach is simple yet highly effective in controlled sequence generation.
Abstract
We study the problem of extrapolative controlled generation, i.e., generating sequences with attribute values beyond the range seen in training. This task is of significant importance in automated design, especially drug discovery, where the goal is to design novel proteins that are better (e.g., more stable) than existing sequences. Thus, by definition, the target sequences and their attribute values are out of the training distribution, posing challenges to existing methods that aim to directly generate the target sequence. Instead, in this work, we propose Iterative Controlled Extrapolation (ICE) which iteratively makes local edits to a sequence to enable extrapolation. We train the model on synthetically generated sequence pairs that demonstrate small improvement in the attribute value. Results on one natural language task (sentiment analysis) and two protein engineering…
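The two ideas named in the abstract (training pairs that show a small attribute improvement, and repeated local edits at generation time) can be illustrated with a toy sketch. This is not the authors' implementation: the alphabet, `toy_score`, `local_edit`, `make_pairs`, and `iterative_refine` are all hypothetical names introduced here. A random single-position mutation stands in for the learned editor, and using a scorer to accept or reject each edit is an assumption of this sketch.

```python
import random
from typing import Callable, List, Tuple

ALPHABET = "AB"  # toy alphabet, purely illustrative


def toy_score(seq: str) -> float:
    """Hypothetical attribute: the number of 'A' characters in the sequence."""
    return float(seq.count("A"))


def local_edit(seq: str, rng: random.Random) -> str:
    """Stand-in for a learned editor: overwrite one random position."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def make_pairs(
    seqs: List[str],
    score: Callable[[str], float],
    rng: random.Random,
    max_gain: float = 1.0,
) -> List[Tuple[str, str]]:
    """Build (source, target) pairs whose target is slightly better than the source."""
    pairs = []
    for s in seqs:
        t = local_edit(s, rng)
        gain = score(t) - score(s)
        if 0 < gain <= max_gain:  # keep only small improvements
            pairs.append((s, t))
    return pairs


def iterative_refine(
    seq: str,
    edit: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    rng: random.Random,
    n_steps: int = 20,
    n_candidates: int = 4,
) -> List[str]:
    """Apply local edits step by step, keeping an edit only if the score improves."""
    trajectory = [seq]
    for _ in range(n_steps):
        current = trajectory[-1]
        candidates = [edit(current, rng) for _ in range(n_candidates)]
        best = max(candidates, key=score)
        if score(best) > score(current):
            trajectory.append(best)
    return trajectory


if __name__ == "__main__":
    rng = random.Random(0)
    path = iterative_refine("BBBBBBBB", local_edit, toy_score, rng)
    print(path[0], "->", path[-1], "in", len(path) - 1, "accepted edits")
```

Each accepted step changes a single position, so the sequence moves toward higher attribute values through a chain of small edits rather than one direct jump. In the setting the abstract describes, the pairs from something like `make_pairs` would be used to train the editor; nothing is trained in this sketch.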
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Machine Learning in Materials Science
