What If: Generating Code to Answer Simulation Questions
Gal Peretz, Kira Radinsky

TL;DR
This paper introduces a neural program synthesis method that generates and executes code in a domain-specific language to answer complex chemistry and biology process questions, significantly improving accuracy over existing methods.
Contribution
It presents a novel dataset, a domain-specific language for process representation, and a reinforcement learning approach with a semantic reward for improved simulation-based question answering.
Findings
Achieved 88% accuracy on simulation questions.
Outperformed state-of-the-art neural program synthesis methods.
Outperformed end-to-end text-based approaches.
Abstract
Many texts, especially in chemistry and biology, describe complex processes. We focus on texts that describe a chemical reaction process and questions that ask about the process's outcome under different environmental conditions. To answer questions about such processes, one needs to understand the interactions between the different entities involved in the process and to simulate their state transitions during the process execution under different conditions. A state transition is defined as the modification the program makes to its variables in memory during execution. We hypothesize that generating code and executing it to simulate the process will allow answering such questions. We therefore define a domain-specific language (DSL) to represent processes. We contribute to the community a unique dataset curated by chemists and annotated by computer scientists. The dataset is…
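The idea in the abstract (represent a process as a program, execute it, and read the answer off the resulting state transitions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Step`, `run`, and `what_if` names, the guard/update statement form, and the toy reaction numbers are hypothetical and are not the paper's DSL or dataset.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Transition = Tuple[str, float, float]  # (variable, old value, new value)


@dataclass
class Step:
    """One statement of a hypothetical toy DSL (not the paper's DSL):
    if `guard(state)` holds, assign `update(state)` to `target`."""
    target: str
    update: Callable[[State], float]
    guard: Callable[[State], bool] = lambda state: True


def run(program: List[Step], initial: State) -> Tuple[State, List[Transition]]:
    """Execute the program and record every variable modification."""
    state = dict(initial)  # copy so the caller's dict is not mutated
    trace: List[Transition] = []
    for step in program:
        if step.guard(state):
            old = state.get(step.target, 0.0)  # unset variables start at 0.0
            new = step.update(state)
            state[step.target] = new
            trace.append((step.target, old, new))
    return state, trace


def what_if(program: List[Step], baseline: State, change: State, query: str) -> str:
    """Answer a simulation question by running the process twice,
    once under the baseline conditions and once with `change` applied."""
    before, _ = run(program, baseline)
    after, _ = run(program, {**baseline, **change})
    if after[query] > before[query]:
        return "more"
    if after[query] < before[query]:
        return "less"
    return "no effect"


# Toy process with made-up yields: a hot reaction converts 75% of the
# reactant, a cold one converts 25%; the reactant is then consumed.
reaction = [
    Step("product", lambda s: s["reactant"] * 0.75, lambda s: s["temperature"] >= 50),
    Step("product", lambda s: s["reactant"] * 0.25, lambda s: s["temperature"] < 50),
    Step("reactant", lambda s: s["reactant"] - s["product"]),
]

baseline = {"temperature": 25.0, "reactant": 8.0}
answer = what_if(reaction, baseline, {"temperature": 80.0}, "product")
final_state, transitions = run(reaction, baseline)
```

In this sketch, `transitions` plays the role of the state transitions the abstract describes: each entry records one modification to a variable during execution. The question "what happens to the product if the temperature is raised?" is then answered by comparing the final states of the two runs rather than by reasoning over the text directly.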
Taxonomy
Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques
