Prompt-tuned Code Language Model as a Neural Knowledge Base for Type Inference in Statically-Typed Partial Code
Qing Huang, Zhiqiang Yuan, Zhenchang Xing, Xiwei Xu, Liming Zhu, Qinghua Lu

TL;DR
This paper introduces a prompt-tuned code language model that acts as a neural knowledge base to perform type inference on partial code, overcoming limitations of symbolic methods and enabling fuzzy, efficient type resolution.
Contribution
It formulates type inference as a fill-in-the-blank task using a prompt-tuned masked language model trained on source code, offering a lightweight, neural alternative to symbolic approaches.
Findings
Effective type inference on partial code from GitHub and Stack Overflow
Supports fuzzy neural type inference with minimal compilation requirements
Outperforms symbolic methods in handling unseen API names
Abstract
Partial code usually involves non-fully-qualified type names (non-FQNs) and undeclared receiving objects. Resolving the FQNs of these non-FQN types and undeclared receiving objects (referred to as type inference) is the prerequisite to effective search and reuse of partial code. Existing dictionary-lookup based methods build a symbolic knowledge base of API names and code contexts, which involve significant compilation overhead and are sensitive to unseen API names and code context variations. In this paper, we formulate type inference as a cloze-style fill-in-blank language task. Built on source code naturalness, our approach fine-tunes a code masked language model (MLM) as a neural knowledge base of code elements with a novel "pre-train, prompt and predict" paradigm from raw source code. Our approach is lightweight and has minimum requirements on code compilation. Unlike existing…
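As a rough illustration of the cloze-style formulation described in the abstract, the sketch below turns a line of partial Java code into a fill-in-blank prompt by placing mask slots in front of a simple type name, where a masked language model would be asked to predict the package prefix. This is not the authors' prompt template: the `<mask>` token, the fixed number of slots, and the helper name `make_cloze_prompt` are all assumptions made for the example.

```python
import re

# Assumed mask token; the real token depends on the model's tokenizer.
MASK = "<mask>"


def make_cloze_prompt(code_line: str, simple_name: str, n_masks: int = 3) -> str:
    """Prefix the first standalone occurrence of `simple_name` with mask slots.

    The slots stand in for the unknown package prefix of the type's FQN.
    `n_masks` is a hypothetical fixed budget, not a value taken from the paper.
    """
    # Match the name only when it is not part of a longer identifier
    # (e.g. skip the "List" inside "ArrayList") and not already qualified.
    pattern = re.compile(r"(?<![\w.])" + re.escape(simple_name) + r"\b")
    blanks = " ".join([MASK] * n_masks)
    return pattern.sub(lambda m: f"{blanks} . {m.group(0)}", code_line, count=1)


prompt = make_cloze_prompt("List<String> names = new ArrayList<>();", "List")
print(prompt)
```

A masked language model would then fill the slots with the tokens of the package prefix, so that the blanks plus the simple name form the fully qualified name. A line that does not contain the simple name is returned unchanged.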
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Web Data Mining and Analysis
