Bridging Code Semantic and LLMs: Semantic Chain-of-Thought Prompting for Code Generation
Yingwei Ma, Yue Yu, Shanshan Li, Yu Jiang, Yong Guo, Yuanliang Zhang, Yutao Xie, Xiangke Liao

TL;DR
This paper introduces Semantic Chain-of-Thought (SeCoT), a prompting method that incorporates semantic information of code into LLMs to improve code generation accuracy, achieving state-of-the-art results on the HumanEval, HumanEval-ET, and MBPP benchmarks.
Contribution
SeCoT automates the integration of code semantic features into LLMs via in-context learning, enhancing code generation without complex static or dynamic analysis.
Findings
SeCoT significantly improves code generation accuracy.
Achieves state-of-the-art results on HumanEval, HumanEval-ET, and MBPP benchmarks.
Effective with both ChatGPT and WizardCoder models.
Abstract
Large language models (LLMs) have showcased remarkable prowess in code generation. However, automated code generation is still challenging since it requires a high-level semantic mapping between natural language requirements and code. Most existing LLM-based approaches for code generation rely on decoder-only causal language models and often treat code merely as plain text tokens, i.e., feeding the requirements as a prompt input and outputting code as a flat sequence of tokens, potentially missing the rich semantic features inherent in source code. To bridge this gap, this paper proposes the "Semantic Chain-of-Thought" approach, named SeCoT, to introduce semantic information of code. Our motivation is that the semantic information of the source code (e.g., data flow and control flow) describes program execution behavior, intention, and function more precisely. By guiding the LLM to consider and…
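To make "data flow and control flow" concrete, here is a minimal illustrative sketch that extracts a rough version of both from a small Python function using the standard-library ast module. It is not the paper's pipeline: the names SOURCE and summarize_semantics are hypothetical, and the summary (branching constructs plus variables written and read) is a simplifying assumption about what such semantic information can look like.

```python
import ast

# Hypothetical example function to analyze (not taken from the paper).
SOURCE = '''
def count_positive(nums):
    total = 0
    for n in nums:
        if n > 0:
            total += 1
    return total
'''


def summarize_semantics(source: str) -> dict:
    """Return a rough control-flow and data-flow summary of `source`.

    Assumption: control flow is approximated by the If/For/While/Return
    statements, and data flow by the sets of names written and read.
    """
    tree = ast.parse(source)
    control_flow = []
    defined = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.Return)):
            # Record the line number and kind of each control-flow statement.
            control_flow.append((node.lineno, type(node).__name__))
        elif isinstance(node, ast.arg):
            # Function parameters count as definitions.
            defined.add(node.arg)
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    return {
        "control_flow": sorted(control_flow),
        "defined": sorted(defined),
        "used": sorted(used),
    }


if __name__ == "__main__":
    for key, value in summarize_semantics(SOURCE).items():
        print(f"{key}: {value}")
```

A token-level view of the same function carries none of this structure explicitly, which is the gap the abstract describes.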
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Scientific Computing and Data Management
