Automated Prompt Generation for Code Intelligence: An Empirical study and Experience in WeChat

Kexing Ji; Shiyun Fu; Cuiyun Gao; Yujia Chen; Zezhou Yang; Chaozheng Wang; Yuetang Deng

arXiv:2511.03136 · cs.SE · November 6, 2025

TL;DR

This paper investigates automated prompt generation for code intelligence tasks using large code models, demonstrating significant performance improvements through instruction generation and multi-step reasoning techniques, validated on open-source and industrial datasets.

Contribution

The paper introduces a combined APG approach for code tasks, empirically evaluates existing APG methods, and demonstrates performance gains in both open-source and industrial scenarios.

Findings

01. Both instruction generation and multi-step reasoning significantly improve performance.

02. The combined APG approach outperforms basic prompts across multiple metrics.

03. Industrial validation shows large MRR improvements on WeChat-Bench.
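
The third finding is stated in terms of MRR (mean reciprocal rank). For readers unfamiliar with the metric, here is a minimal sketch of how it is commonly computed. The function name, the data layout (one ranked candidate list and one correct answer per query), and the sample values are assumptions made for illustration; the exact evaluation protocol used on WeChat-Bench is not given on this page.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct answer over all queries.

    A query whose correct answer is missing from its ranking
    contributes 0 to the average.
    """
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if not targets:
        return 0.0

    total = 0.0
    for ranking, target in zip(rankings, targets):
        # Ranks are 1-based: the top candidate has rank 1.
        for rank, candidate in enumerate(ranking, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(targets)


# Hypothetical example: the correct answer is ranked 2nd for the first
# query and 3rd for the second query.
example_score = mean_reciprocal_rank(
    [["a", "b"], ["x", "y", "z"]],
    ["b", "z"],
)
```

Because each query contributes the reciprocal of the rank, moving a correct answer from rank 2 to rank 1 raises that query's contribution from 0.5 to 1.0, which is why prompt changes that promote the right answer to the top can shift MRR noticeably.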

Abstract

Large Code Models (LCMs) show potential in code intelligence, but their effectiveness is greatly influenced by prompt quality. Current prompt design is mostly manual, which is time-consuming and highly dependent on specific LCMs and tasks. While automated prompt generation (APG) exists in NLP, it is underexplored for code intelligence. This creates a gap, as automating the prompt process is essential for developers facing diverse tasks and black-box LCMs. To mitigate this, we empirically investigate two important parts of APG: Instruction Generation (IG) and Multi-Step Reasoning (MSR). IG provides a task-related description to instruct LCMs, while MSR guides them to produce logical steps before the final answer. We evaluate widely-used APG methods for each part on four open-source LCMs and three code intelligence tasks: code translation (PL-PL), code summarization (PL-NL), and API…
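
The two parts described in the abstract can be pictured as two optional pieces wrapped around a task input: an instruction (IG) placed before it and a reasoning cue (MSR) placed after it. The sketch below only illustrates that composition. The function name, the instruction wording, and the "Let's think step by step." cue are assumptions chosen for the example and are not taken from the paper, which generates these pieces automatically rather than hard-coding them.

```python
def build_prompt(task_input: str, instruction: str = "", reasoning_cue: str = "") -> str:
    """Assemble a prompt from an optional instruction, the task input,
    and an optional multi-step reasoning cue, separated by blank lines."""
    parts = []
    if instruction.strip():
        parts.append(instruction.strip())      # IG: task-related description
    parts.append(task_input.strip())           # the code or query itself
    if reasoning_cue.strip():
        parts.append(reasoning_cue.strip())    # MSR: ask for steps before the answer
    return "\n\n".join(parts)


# Hypothetical code summarization input.
code_snippet = "def add(a, b):\n    return a + b"

basic_prompt = build_prompt(code_snippet)
combined_prompt = build_prompt(
    code_snippet,
    instruction="Summarize what the following Python function does in one sentence.",
    reasoning_cue="Let's think step by step.",
)
```

With both pieces empty, the function returns the bare task input, which corresponds to the kind of basic prompt that the combined approach is compared against.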

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Topic Modeling