Can AI Scientist Agents Learn from Lab-in-the-Loop Feedback? Evidence from Iterative Perturbation Discovery
Gilles Wainrib, Barbara Bodinier, Haitem Dakhli, Josep Monserrat, Almudena Espin Perez, Sabrina Carpentier, Roberta Codato, John Klein

TL;DR
This study demonstrates that large language models can genuinely learn from lab-in-the-loop feedback in scientific experiments, significantly improving discovery outcomes when models are sufficiently capable.
Contribution
The paper provides evidence that LLM-based agents can utilize experimental feedback effectively, showing that in-context learning depends on model capability and structured feedback.
Findings
Access to feedback increases discoveries by 53.4%.
Performance gain disappears with permuted feedback, confirming feedback-driven learning.
Upgrading models reduces hallucination rates and enhances feedback utilization.
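The permuted-feedback control mentioned in the findings above can be illustrated with a minimal sketch. It assumes that feedback is a list of (perturbation, hit/miss) pairs and that the control shuffles the labels across perturbations; the function name `permute_feedback`, the data layout, and the gene names are hypothetical and not taken from the paper.

```python
import random

def permute_feedback(history, seed=0):
    """Return a copy of `history` with hit/miss labels shuffled across perturbations.

    `history` is a list of (perturbation, is_hit) pairs (assumed layout).
    The number of hits is preserved, but the link between each
    perturbation and its true outcome is broken.
    """
    rng = random.Random(seed)
    labels = [is_hit for _, is_hit in history]
    rng.shuffle(labels)  # in-place shuffle of the copied label list
    return [(pert, label) for (pert, _), label in zip(history, labels)]

# Hypothetical feedback from one round of experiments.
history = [
    ("geneA", True),
    ("geneB", False),
    ("geneC", False),
    ("geneD", True),
    ("geneE", False),
]
control = permute_feedback(history, seed=1)
```

Because the shuffled history keeps the same prompt shape and hit rate, any performance gap between real and permuted feedback under this setup would point to the agent using the content of the feedback rather than its mere presence.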
Abstract
Recent work has questioned whether large language models (LLMs) can perform genuine in-context learning (ICL) for scientific experimental design, with prior studies suggesting that LLM-based agents exhibit no sensitivity to experimental feedback. We shed new light on this question by carrying out 800 independently replicated experiments on iterative perturbation discovery in Cell Painting high-content screening. We compare an LLM agent that iteratively updates its hypotheses using experimental feedback to a zero-shot baseline that relies solely on pretraining knowledge retrieval. Access to feedback yields a 53.4% increase in discoveries per feature on average. To test whether this improvement arises from genuine feedback-driven learning rather than prompt-induced recall of pretraining knowledge, we introduce a random feedback control in which hit/miss labels are…
