CodeExp: Explanatory Code Document Generation
Haotian Cui, Chenglong Wang, Junjie Huang, Jeevana Priya Inala, Todd Mytkowicz, Bo Wang, Jianfeng Gao, Nan Duan

TL;DR
This paper introduces CodeExp, a new task for generating detailed, implementation-level code explanations, along with a refined dataset, evaluation metrics, and baseline models whose output goes beyond the high-level summaries that existing code-to-text models produce.
Contribution
The paper presents a novel code explanation generation task, a large-scale refined docstring corpus, human-study criteria for explanation quality, automatic evaluation metrics, and a multi-stage fine-tuning strategy with baseline models.
Findings
Training on the refined dataset yields better model performance than training on larger unrefined data.
Fine-tuned models generate human-like, well-structured long docstrings.
Evaluation metrics align closely with human assessments.
Abstract
Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code that do not capture implementation-level choices essential for these scenarios. To fill in this gap, we propose the code explanation generation task. We first conducted a human study to identify the criteria for high-quality explanatory docstring for code. Based on that, we collected and refined a large-scale code docstring corpus and formulated automatic evaluation metrics that best match human assessments. Finally, we present a multi-stage fine-tuning strategy and baseline models for the task. Our experiments show that (1) our refined training dataset lets models achieve better performance in the explanation generation tasks compared to larger…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software System Performance and Reliability
