Look the Other Way: Designing 'Positive' Molecules with Negative Data via Task Arithmetic
Rıza Özçelik, Sarah de Ruiter, Francesca Grisoni

TL;DR
This paper introduces molecular task arithmetic, a novel approach that leverages negative data to generate positive molecules, improving diversity and success rates in molecular design tasks without needing positive examples.
Contribution
The paper presents a new method called molecular task arithmetic that uses negative data to guide the generation of positive molecules, enhancing diversity and effectiveness in molecular design.
Findings
Generated more diverse molecules than positive-data-trained models.
Successfully applied to dual-objective and few-shot design tasks.
Maintained complex properties like good docking scores.
Abstract
The scarcity of molecules with desirable properties (i.e., 'positive' molecules) is an inherent bottleneck for generative molecule design. To sidestep this obstacle, here we propose molecular task arithmetic: training a model on diverse and abundant negative examples to learn 'property directions' - without accessing any positively labeled data - and moving models in the opposite property directions to generate positive molecules. When analyzed on 33 design experiments with distinct molecular entities (small molecules, proteins), model architectures, and scales, molecular task arithmetic generated more diverse and successful designs than models trained on positive molecules in general. Moreover, we employed molecular task arithmetic in dual-objective and few-shot design tasks. We find that molecular task arithmetic can consistently increase the diversity of designs while maintaining…
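The weight-space idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual task-arithmetic formulation, in which a 'property direction' is the difference between fine-tuned and pretrained weights, and it uses plain dicts of floats in place of real model tensors. The function names and the `scale` parameter are hypothetical.

```python
# Illustrative sketch of task arithmetic on model weights (not the authors' code).
# Weights are plain dicts of parameter name -> list of floats; a real model
# would hold tensors, but the arithmetic is the same.

def task_vector(pretrained, finetuned):
    """Property direction: fine-tuned weights minus pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }

def move_opposite(pretrained, direction, scale=1.0):
    """Step away from the negative-data direction (`scale` is an assumed knob)."""
    return {
        name: [p - scale * d for p, d in zip(pretrained[name], direction[name])]
        for name in pretrained
    }

# Toy weights: a pretrained model and a copy fine-tuned on negative molecules.
pretrained = {"layer.weight": [0.5, -1.0, 2.0]}
negative_finetuned = {"layer.weight": [0.75, -1.5, 2.0]}

# Learn the direction from negative data only, then move the opposite way.
direction = task_vector(pretrained, negative_finetuned)
steered = move_opposite(pretrained, direction, scale=1.0)
```

In this sketch, no positively labeled data enters at any point: the steered weights are built only from the pretrained model and the model fine-tuned on negatives. How far to step (`scale`) would presumably need tuning per task.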
Peer Reviews
Decision·Submitted to ICLR 2026
- Clever use of negative data. The key insight (most real medicinal chemistry data is overwhelmingly negative, so we should learn from what “doesn’t work” and then invert the direction) is both practical and underexplored. It directly speaks to the data imbalance that plagues many molecular design tasks.
- Empirically effective on some tasks. On the tested single-property and docking-based tasks, the method can recover a nontrivial number of “successful” and diverse solutions, sometimes outperfor
- Property space is too easy / narrow. Most experiments are on relatively simple, monotonic, and well-behaved properties (e.g., logP high/low, TPSA high/low, counts of functional features). Even the docking objectives look like “flat-bottom” tasks—once you’re in a good-enough region, the signal stops being very discriminative. This makes it unclear whether the proposed method would still work on sharper, more structured, or conflicting objectives. The paper should include harder benchmarks such
Generally, the writing is very clear and communicates the arguments and ideas without issue, even to a less experienced reader. The concept of task arithmetic is very elegant, and its application to molecular optimization feels almost intuitive because of the quality of the explanations. The experiment design and results are very extensive, which additionally strengthens the main point of the paper. The graphs/tables are clear and easy to read.
The lack of mention of chemical LMs, like ChemBERTa or ether0, takes away from the credibility of the article. Though the results are impressive and interesting, LSTMs lack much of the expressive power of LMs. Though task arithmetic appears to work in the case of LLMs on natural language, it’s not clear whether that will extrapolate to models working on the chemical domain, especially given the apparent challenge with this generation task. Additionally,
1. The idea of going in the opposite direction from the "bad" model is very interesting and innovative. Being able to unlock potentially new behavior from entirely unseen domains could prove to be of huge significance and could open new avenues of research and application. The proposed solution is very elegant.
2. The central premise of the lack of "positive" samples in scientific domains, especially drug design, is a significant problem.
3. It is very interesting that the model obtains good s
1. "Current training strategies rely on rare, positive molecules." This is not true in general. This may be true for the case of fine-tuning approaches. But there is a long line of work in latent space and post-hoc optimization (such as Bayesian optimization, generative algorithms, or gradient-based steering) for generative design. So while this may be a strong candidate for a transfer learning strategy for molecule design models; the aforementioned techniques that do not require transfer learnin
Taxonomy
Topics: Various Chemistry Research Topics · Computational Drug Discovery Methods
