Statically Contextualizing Large Language Models with Typed Holes
Andrew Blinn, Xiang Li, June Hyung Kim, Cyrus Omar

TL;DR
This paper improves large language model code completion by supplying type and binding context from a language server, which helps most when the code depends on definitions that are neither in the training data nor near the cursor.
Contribution
It introduces an approach that combines IDE-like static context with LLMs for more accurate code synthesis, validated on a new benchmark and accompanied by an extension to the Language Server Protocol (LSP).
Findings
Type-aware context significantly improves code completion quality.
The approach reduces hallucinations and broken code in LLM outputs.
Porting the techniques to TypeScript demonstrates their broader applicability.
Abstract
Large language models (LLMs) have reshaped the landscape of program synthesis. However, contemporary LLM-based code completion systems often hallucinate broken code because they lack appropriate context, particularly when working with definitions not in the training data nor near the cursor. This paper demonstrates that tight integration with the type and binding structure of a language, as exposed by its language server, can address this contextualization problem in a token-efficient manner. In short, we contend that AIs need IDEs, too! In particular, we integrate LLM code generation into the Hazel live program sketching environment. The Hazel Language Server identifies the type and typing context of the hole being filled, even in the presence of errors, ensuring that a meaningful program sketch is always available. This allows prompting with codebase-wide contextual information not…
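The idea in the abstract, prompting with the expected type and typing context of a hole, can be illustrated with a minimal sketch. This is not the Hazel Language Server or the authors' implementation: the `HoleContext` shape, the `??` hole marker, the identifier-overlap relevance filter, and the prompt wording are all assumptions made for illustration. TypeScript is used only because the summary mentions a TypeScript port.

```typescript
// Hypothetical data model; none of these names come from the paper or from Hazel.
interface Binding {
  name: string;
  type: string; // the binding's type, printed as source text
}

interface HoleContext {
  expectedType: string;      // type the hole is expected to have
  typeDefinitions: string[]; // definitions of types the expected type mentions
  bindings: Binding[];       // variables in scope at the hole
}

// Extract identifier-like tokens from a printed type.
function identifiers(text: string): string[] {
  return text.match(/[A-Za-z_][A-Za-z0-9_]*/g) || [];
}

// Crude relevance filter (an assumption): keep bindings whose type shares
// at least one identifier with the expected type of the hole.
function relevantBindings(ctx: HoleContext): Binding[] {
  const wanted = identifiers(ctx.expectedType);
  return ctx.bindings.filter((b) =>
    identifiers(b.type).some((id) => wanted.indexOf(id) !== -1)
  );
}

// Assemble a prompt that carries static context alongside the program sketch.
function buildPrompt(sketch: string, ctx: HoleContext): string {
  const lines: string[] = [
    "Fill the hole marked ?? in the program sketch below.",
    "Expected type of the hole: " + ctx.expectedType,
    "Relevant type definitions:",
    ...ctx.typeDefinitions.map((d) => "  " + d),
    "Relevant bindings in scope:",
    ...relevantBindings(ctx).map((b) => "  " + b.name + ": " + b.type),
    "Program sketch:",
    sketch,
  ];
  return lines.join("\n");
}

// Example: a hole for an update function in a small model-update program.
const exampleContext: HoleContext = {
  expectedType: "(model: Model, action: Action) => Model",
  typeDefinitions: [
    "type Model = { count: number }",
    "type Action = 'inc' | 'dec'",
  ],
  bindings: [
    { name: "initial", type: "Model" },
    { name: "log", type: "(msg: string) => void" },
  ],
};

const examplePrompt = buildPrompt(
  "const update: (model: Model, action: Action) => Model = ??;",
  exampleContext
);
console.log(examplePrompt);
```

In this sketch the binding `initial` is included because its type mentions `Model`, while `log` is dropped because its type shares no identifier with the expected type. A real language server would compute relevance from the actual type structure rather than from token overlap.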
