Illuminating LLM Coding Agents: Visual Analytics for Deeper Understanding and Enhancement
Junpeng Wang, Yuzhong Chen, Menghai Pan, Chin-Chia Michael Yeh, Mahashweta Das

TL;DR
This paper presents a visual analytics system for analyzing LLM-based coding agents. It supports detailed comparison of how agents debug and refine code, how their solution processes differ, and how behavior varies across models, with the aim of making agent development and debugging more efficient.
Contribution
The paper introduces a novel visual analytics system tailored for examining LLM coding agents, supporting multi-level analysis to enhance understanding and debugging of these agents.
Findings
Enables detailed analysis of how code evolves across iterations
Facilitates comparison of different solution processes
Provides insight into behavioral variations across LLMs
Abstract
Coding agents powered by large language models (LLMs) have gained traction for automating code generation through iterative problem-solving with minimal human involvement. Despite the emergence of various frameworks, e.g., LangChain, AutoML, and AIDE, ML scientists still struggle to effectively review and adjust the agents' coding process. The current approach of manually inspecting individual outputs is inefficient, making it difficult to track code evolution, compare coding iterations, and identify improvement opportunities. To address this challenge, we introduce a visual analytics system designed to enhance the examination of coding agent behaviors. Focusing on the AIDE framework, our system supports comparative analysis across three levels: (1) Code-Level Analysis, which reveals how the agent debugs and refines its code over iterations; (2) Process-Level Analysis, which contrasts…
Taxonomy
Topics: Multimedia Communication and Technology
