VisualCoder: Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning

Cuong Chi Le; Hoang-Chau Truong-Vinh; Huy Nhat Phan; Dung Duy Le; Tien N. Nguyen; Nghi D. Q. Bui

arXiv:2410.23402·cs.SE·February 11, 2025

TL;DR

VisualCoder enhances large language models' ability to reason about code by integrating multimodal Chain-of-Thought reasoning with visual Control Flow Graphs, leading to improved program behavior prediction and error detection.

Contribution

The paper introduces a multimodal CoT approach that pairs code with visual CFGs to address the dynamic-reasoning challenges of code analysis.

Findings

01. Improved accuracy in program behavior prediction
02. Enhanced error detection capabilities
03. Better output generation in code reasoning tasks

Abstract

Predicting program behavior and reasoning about code execution remain significant challenges in software engineering, particularly for large language models (LLMs) designed for code analysis. While these models excel at understanding static syntax, they often struggle with dynamic reasoning tasks. We introduce VisualCoder, a simple yet effective approach that enhances code reasoning by integrating multimodal Chain-of-Thought (CoT) reasoning with a visual Control Flow Graph (CFG). By aligning code snippets with their corresponding CFGs, VisualCoder provides deeper insights into execution flows. We address challenges in multimodal CoT integration through a reference mechanism, ensuring consistency between code and its execution path, thereby improving performance in program behavior prediction, error detection, and output generation.
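
To make the idea of aligning a snippet with its CFG concrete, here is a minimal sketch of how a statement-level CFG might be derived from Python source and written out as Graphviz DOT text. It is not the authors' implementation: `build_cfg` and `to_dot` are hypothetical helper names, labelling each node with its source line number is an illustrative assumption about how a graph node could be tied back to a line of code, and only simple statements, `if`/`else`, and `while` are handled. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast


def build_cfg(source):
    """Return (nodes, edges) for a tiny statement-level CFG.

    nodes: dict mapping node id -> label "L<lineno>: <source text>"
    edges: list of (src_id, dst_id, edge_label) tuples
    Illustrative only: no break/continue/return, for-loops, or try blocks.
    """
    tree = ast.parse(source)
    nodes = {}
    edges = []

    def add_node(stmt, text):
        node_id = len(nodes)
        nodes[node_id] = f"L{stmt.lineno}: {text}"
        return node_id

    def connect(preds, target):
        for src, label in preds:
            edges.append((src, target, label))

    def walk(stmts, preds):
        # preds holds dangling exits (node_id, edge_label) that must be
        # wired to whichever node executes next.
        for stmt in stmts:
            if isinstance(stmt, ast.If):
                cond = add_node(stmt, "if " + ast.unparse(stmt.test))
                connect(preds, cond)
                then_exits = walk(stmt.body, [(cond, "True")])
                else_exits = walk(stmt.orelse, [(cond, "False")])
                preds = then_exits + else_exits
            elif isinstance(stmt, ast.While):
                cond = add_node(stmt, "while " + ast.unparse(stmt.test))
                connect(preds, cond)
                body_exits = walk(stmt.body, [(cond, "True")])
                connect(body_exits, cond)  # back edge to the loop test
                preds = [(cond, "False")]
            else:
                node = add_node(stmt, ast.unparse(stmt))
                connect(preds, node)
                preds = [(node, "")]
        return preds

    walk(tree.body, [])
    return nodes, edges


def to_dot(nodes, edges):
    """Serialize the CFG as Graphviz DOT text."""
    lines = ["digraph CFG {"]
    for node_id, label in nodes.items():
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  n{node_id} [shape=box, label="{escaped}"];')
    for src, dst, label in edges:
        lines.append(f'  n{src} -> n{dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


SNIPPET = """\
x = 0
while x < 3:
    if x % 2 == 0:
        print(x)
    x += 1
print("done")
"""

nodes, edges = build_cfg(SNIPPET)
print(to_dot(nodes, edges))
```

If Graphviz is installed, saving the printed text to a file such as `cfg.dot` and running `dot -Tpng cfg.dot -o cfg.png` produces an image of the graph. In a setup like the one the abstract describes, that image would be supplied to a multimodal model alongside the source, and the shared line numbers give a reasoning step something to point at in both views.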

Taxonomy

Topics: Software Engineering Research · Natural Language Processing Techniques · Model-Driven Software Engineering Techniques