Improving Stack Overflow question title generation with copying enhanced CodeBERT model and bi-modal information
Fengji Zhang, Xiao Yu, Jacky Keung, Fuyang Li, Zhiwen Xie, Zhen Yang, Caoyuan Ma, Zhimin Zhang

TL;DR
This paper introduces CCBERT, a novel deep learning model that leverages bi-modal information and copying mechanisms to improve automatic question title generation for Stack Overflow, outperforming existing models.
Contribution
The paper presents CCBERT, a new encoder-decoder model using CodeBERT and copy attention to enhance question title generation from full question bodies.
Findings
CCBERT outperforms baseline models on a large dataset.
It maintains high performance on code-only and low-resource datasets.
Human evaluation confirms improved readability and relevance.
Abstract
Context: Stack Overflow is very helpful for software developers who are seeking answers to programming problems. Previous studies have shown that a growing number of questions are of low quality and thus obtain less attention from potential answerers. Gao et al. proposed an LSTM-based model (i.e., BiLSTM-CC) to automatically generate question titles from the code snippets to improve the question quality. However, the code snippets in the question body alone cannot provide sufficient information for title generation, and LSTMs cannot capture the long-range dependencies between tokens. Objective: This paper proposes CCBERT, a novel deep learning based model that enhances the performance of question title generation by making full use of the bi-modal information of the entire question body. Method: CCBERT follows the encoder-decoder paradigm and uses CodeBERT to encode the question body…
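The abstract is truncated before it describes the copy attention, so the following is only a minimal sketch of a generic pointer-style copy mechanism, not the authors' formulation. It assumes the decoder produces vocabulary logits, attention scores over the source tokens, and a scalar gate `p_gen`; all function and parameter names are hypothetical. It also assumes every source token already has a vocabulary id, which is plausible for a subword vocabulary such as the one CodeBERT uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_augmented_distribution(vocab_logits, attention_scores,
                                source_token_ids, p_gen):
    """Mix a generation distribution with a copy distribution.

    vocab_logits:     decoder scores, one per vocabulary entry
    attention_scores: unnormalized attention, one per source token
    source_token_ids: vocabulary id of each source token (assumed in-vocab)
    p_gen:            hypothetical gate in [0, 1]; 1.0 means "never copy"
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention_scores) != len(source_token_ids):
        raise ValueError("one attention score is needed per source token")

    gen_probs = softmax(vocab_logits)
    copy_probs = softmax(attention_scores)

    # Start from the gated generation distribution.
    final = [p_gen * p for p in gen_probs]
    # Scatter the gated attention mass onto the ids of the source tokens;
    # repeated source tokens accumulate their attention weights.
    for token_id, weight in zip(source_token_ids, copy_probs):
        final[token_id] += (1.0 - p_gen) * weight
    return final


if __name__ == "__main__":
    dist = copy_augmented_distribution(
        vocab_logits=[0.0, 0.0, 0.0, 0.0],
        attention_scores=[0.0, 0.0],
        source_token_ids=[2, 2],
        p_gen=0.5,
    )
    print(dist)
```

The design point this illustrates is that tokens appearing in the question body (for example, an identifier from a code snippet) gain extra probability mass in the generated title, while the result remains a valid probability distribution.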
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Software Engineering Research
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Dense Connections · CodeBERT · Byte Pair Encoding
