Knowledge-Aware Code Generation with Large Language Models
Tao Huang, Zhihong Sun, Zhi Jin, Ge Li, Chen Lyu

TL;DR
This paper introduces KareCoder, a knowledge-aware approach that enhances large language models' ability to solve novel programming problems by integrating a specialized knowledge library, significantly improving performance on unseen tasks.
Contribution
The paper presents a novel dataset, CodeF, and a knowledge integration method, KareCoder, to improve LLMs' problem-solving on unfamiliar programming challenges.
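The summary does not say how the knowledge library is consulted, so the following is only a hypothetical sketch of the general idea of placing relevant algorithm and data-structure knowledge in front of the model before it writes code. Every name here (`KnowledgeEntry`, `LIBRARY`, `select_knowledge`, `build_prompt`) and the keyword-matching rule are assumptions made for illustration, not KareCoder's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeEntry:
    """One hypothetical library item: a technique plus a short reminder."""
    name: str
    keywords: tuple
    summary: str


# Toy library; the real knowledge library's contents are not given in the summary.
LIBRARY = (
    KnowledgeEntry(
        "Dijkstra's algorithm",
        ("shortest path", "weighted graph"),
        "Use a priority queue; edge weights must be non-negative.",
    ),
    KnowledgeEntry(
        "Binary search",
        ("sorted", "minimum possible", "maximum possible"),
        "Halve the search interval while an invariant holds.",
    ),
    KnowledgeEntry(
        "Union-find",
        ("connected components", "merge groups"),
        "Track disjoint sets with path compression.",
    ),
)


def select_knowledge(problem, library=LIBRARY, limit=2):
    """Pick entries whose keywords occur in the problem text.

    Plain substring matching is a stand-in assumption for whatever
    selection step the paper really uses.
    """
    text = problem.lower()
    scored = []
    for entry in library:
        hits = sum(1 for keyword in entry.keywords if keyword in text)
        if hits:
            scored.append((hits, entry))
    # sorted() is stable, so ties keep their library order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:limit]]


def build_prompt(problem, entries):
    """Prepend the selected knowledge to the problem statement."""
    lines = ["Relevant knowledge:"]
    lines += [f"- {entry.name}: {entry.summary}" for entry in entries]
    lines += ["", "Problem:", problem, "", "Write a Python solution."]
    return "\n".join(lines)


if __name__ == "__main__":
    statement = "Find the shortest path between two nodes in a weighted graph."
    print(build_prompt(statement, select_knowledge(statement)))
```

The resulting string would then be sent to the model in place of the bare problem statement. Which model call to use, and how the prompt should really be worded, is left open here.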
Findings
KareCoder improves Pass@1 by 23.3% on CodeF.
It outperforms direct ChatGPT code generation on novel problems.
It maintains strong performance on previously encountered problems.
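For readers unfamiliar with the Pass@1 metric cited above, the sketch below shows the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass all tests. This summary does not state how the paper computed its scores, so treat this as background on the metric rather than the authors' evaluation code. The function names are illustrative.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Equals the probability that at least one of k samples drawn without
    replacement from the n generated samples is correct.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Three hypothetical problems, 10 samples each, with 3, 0 and 10 passing.
    print(mean_pass_at_k([(10, 3), (10, 0), (10, 10)], k=1))
```

For k = 1 the estimator reduces to c / n, the fraction of generated samples that pass, so Pass@1 on a benchmark is the average of that fraction across its problems.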
Abstract
Large Language Models (LLMs) perform well on basic programming problems. However, they encounter challenges when dealing with complex tasks involving the use of diverse algorithmic and data structure skills, particularly programming competition-level problems. Notably, ChatGPT exhibits proficient performance on problems it has encountered during its pre-training phase, but this performance deteriorates when faced with novel problems. Consequently, enhancing the ability of LLMs to address unfamiliar problems has emerged as a pivotal research focus. The problem-solving process of LLMs mirrors human programmers' approach to a certain extent. When confronted with new programming tasks, human programmers engage in task planning and code writing with the previously acquired knowledge about algorithms and data structures. Despite having learned such knowledge, LLMs struggle to effectively…
Taxonomy
Topics: Semantic Web and Ontologies · Speech and dialogue systems · Natural Language Processing Techniques
