Improving Natural Language Capability of Code Large Language Model
Wei Li, Daoguang Zan, Bei Guan, Ailun Yu, Xiaolin Chen, and Yongji Wang

TL;DR
This paper introduces a novel framework that enhances code large language models by integrating natural language processing tools to better understand user requirements and generate more accurate code, validated on a new multilingual benchmark.
Contribution
The paper presents an innovative framework combining natural language understanding modules with code LLMs, addressing the gap in natural language capabilities of code generation models.
Findings
The framework significantly improves code generation accuracy.
It is effective across the five natural languages covered by the new MultiNL-H benchmark.
Experimental results demonstrate the framework's robustness and versatility.
Abstract
Code large language models (Code LLMs) have demonstrated remarkable performance in code generation. Nonetheless, most existing works focus on boosting code LLMs from the perspective of programming capabilities, while their natural language capabilities receive less attention. To fill this gap, we thus propose a novel framework, comprising two modules: AttentionExtractor, which is responsible for extracting key phrases from the user's natural language requirements, and AttentionCoder, which leverages these extracted phrases to generate target code to solve the requirement. This framework pioneers an innovative idea by seamlessly integrating code LLMs with traditional natural language processing tools. To validate the effectiveness of the framework, we craft a new code generation benchmark, called MultiNL-H, covering five natural languages. Extensive experimental results demonstrate the…
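The two-module flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `extract_key_phrases`, `build_prompt`, `generate_code`, the stopword list, and the prompt wording are all hypothetical names and choices made for illustration. A simple frequency-based keyword ranker stands in for AttentionExtractor, and the `llm` argument is assumed to be any callable that maps a prompt string to generated code. The tokenizer here handles English only, whereas the paper's benchmark covers five natural languages.

```python
import re
from collections import Counter
from typing import Callable

# Hypothetical, deliberately tiny English stopword list (an assumption for this sketch).
STOPWORDS = frozenset(
    {"the", "and", "all", "that", "for", "with", "from", "are", "this", "into"}
)


def extract_key_phrases(requirement: str, top_k: int = 5) -> list[str]:
    """Stand-in for AttentionExtractor: rank non-stopword tokens by frequency."""
    tokens = re.findall(r"[a-z]+", requirement.lower())
    # Drop stopwords and very short tokens such as "of" or "in".
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_k)]


def build_prompt(requirement: str, key_phrases: list[str]) -> str:
    """Stand-in for the AttentionCoder input: the requirement plus highlighted phrases."""
    phrases = ", ".join(key_phrases)
    return (
        f"Requirement:\n{requirement}\n\n"
        f"Key phrases to pay attention to: {phrases}\n\n"
        "Write code that satisfies the requirement."
    )


def generate_code(requirement: str, llm: Callable[[str], str], top_k: int = 5) -> str:
    """Run extraction, then pass the augmented prompt to a caller-supplied code LLM."""
    phrases = extract_key_phrases(requirement, top_k)
    return llm(build_prompt(requirement, phrases))
```

Keeping the extractor and the prompt builder as separate functions mirrors the split between the two modules named in the abstract, so a traditional NLP tool could replace the frequency ranker without touching the generation step.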
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software System Performance and Reliability
Methods: Focus
