PTM4Tag+: Tag Recommendation of Stack Overflow Posts with Pre-trained Models
Junda He, Bowen Xu, Zhou Yang, DongGyun Han, Chengran Yang, Jiakun Liu, Zhipeng Zhao, David Lo

TL;DR
This paper introduces PTM4Tag+, a novel framework utilizing pre-trained language models for accurate and efficient tag recommendation on Stack Overflow posts, significantly outperforming previous methods.
Contribution
It proposes a triplet architecture leveraging multiple PTMs, especially CodeT5, for improved tag prediction, and evaluates smaller models for faster inference with minimal performance loss.
Findings
CodeT5 achieves the best performance among tested PTMs.
PTM4Tag+ outperforms state-of-the-art CNN-based approaches.
Smaller PTMs retain more than 93.96% of the performance while reducing inference time.
Abstract
Stack Overflow is one of the most influential Software Question & Answer (SQA) websites, hosting millions of programming-related questions and answers. Tags play a critical role in efficiently organizing the contents in Stack Overflow and are vital to support a range of site operations, e.g., querying relevant content. Poorly selected tags often raise problems like tag ambiguity and tag explosion. Thus, a precise and accurate automated tag recommendation technique is demanded. Inspired by the recent success of pre-trained models (PTMs) in natural language processing (NLP), we present PTM4Tag+, a tag recommendation framework for Stack Overflow posts that utilizes PTMs in language modeling. PTM4Tag+ is implemented with a triplet architecture, which considers three key components of a post, i.e., Title, Description, and Code, with independent PTMs. We utilize a number of popular…
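To make the triplet idea concrete, here is a minimal sketch of how three independently encoded components of a post could be combined into tag scores. It is an illustration under stated assumptions, not the authors' implementation: `toy_encode` is a hypothetical hash-based stand-in for a pre-trained model, the weights are random rather than trained, and combining the three vectors by concatenation followed by a sigmoid layer is an assumption, since the abstract above only says that each component gets its own PTM.

```python
import hashlib
import math
import random

DIM = 8  # toy embedding size (assumption); real PTMs emit much larger vectors


def toy_encode(text, salt):
    """Hypothetical stand-in for a PTM encoder: hash tokens, then mean-pool."""
    vec = [0.0] * DIM
    tokens = text.lower().split()
    for tok in tokens:
        digest = hashlib.sha256((salt + tok).encode("utf-8")).digest()
        for i in range(DIM):
            vec[i] += digest[i] / 255.0 - 0.5
    n = max(len(tokens), 1)
    return [v / n for v in vec]


class TripletTagger:
    """Three independent encoders (Title, Description, Code) feeding one tag scorer."""

    def __init__(self, tags, seed=0):
        rng = random.Random(seed)
        self.tags = list(tags)
        # Untrained random weights over the concatenated features (assumption).
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(3 * DIM)] for _ in self.tags
        ]
        self.bias = [0.0] * len(self.tags)

    def scores(self, title, description, code):
        # Each component is encoded separately, then the vectors are concatenated.
        features = (
            toy_encode(title, "title")
            + toy_encode(description, "description")
            + toy_encode(code, "code")
        )
        probs = []
        for w, b in zip(self.weights, self.bias):
            z = sum(wi * xi for wi, xi in zip(w, features)) + b
            probs.append(1.0 / (1.0 + math.exp(-z)))  # one sigmoid per tag
        return probs

    def recommend(self, title, description, code, k=3):
        # Rank tags by score and keep the k highest.
        ranked = sorted(
            zip(self.tags, self.scores(title, description, code)),
            key=lambda pair: pair[1],
            reverse=True,
        )
        return [tag for tag, _ in ranked[:k]]


if __name__ == "__main__":
    tagger = TripletTagger(["python", "java", "pandas", "git"])
    print(tagger.recommend(
        "How to merge two data frames",
        "I have two tables and want to join them on a key column",
        "df.merge(other, on='id')",
        k=2,
    ))
```

Each tag gets an independent sigmoid score because a post can carry several tags at once, so the task is multi-label rather than single-class. In a real system the three `toy_encode` calls would be replaced by actual pre-trained encoders and the weights would be learned from labeled posts.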
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Web Data Mining and Analysis
