Exploring Interaction Paradigms for LLM Agents in Scientific Visualization
Jackson Vonderhorst, Kuangshi Ai, Haichao Miao, Shusen Liu, Chaoli Wang

TL;DR
This study evaluates three LLM agent interaction paradigms on scientific visualization tasks, highlighting tradeoffs in performance, efficiency, and robustness across interaction modalities and memory configurations.
Contribution
The paper compares three primary LLM interaction paradigms and analyzes their strengths and limitations, along with the impact of interaction modalities and memory in SciVis workflows.
Findings
General-purpose coding agents have the highest success rates but are computationally costly.
Domain-specific agents are more efficient and stable but less flexible.
Persistent memory improves performance, especially in CLI and GUI settings.
Abstract
This paper examines how different types of large language model (LLM) agents perform on scientific visualization (SciVis) tasks, where users generate visualization workflows from natural-language instructions. We compare three primary interaction paradigms (domain-specific agents with structured tool use, computer-use agents, and general-purpose coding agents) by evaluating eight representative agents across 15 benchmark tasks and measuring visualization quality, efficiency, robustness, and computational cost. We further analyze interaction modalities, including code scripts and Model Context Protocol (MCP) or API calls for structured tool use, as well as command-line interfaces (CLI) and graphical user interfaces (GUI) for more general interaction. We also study the effect of persistent memory in selected agents. The results reveal clear tradeoffs across…
