PTM4Tag: Sharpening Tag Recommendation of Stack Overflow Posts with Pre-trained Models

Junda He; Bowen Xu; Zhou Yang; DongGyun Han; Chengran Yang; David Lo

arXiv:2203.10965·cs.SE·May 12, 2022

TL;DR

This paper introduces PTM4Tag, a novel framework using pre-trained language models with a triplet architecture to improve tag recommendation accuracy for Stack Overflow posts, addressing noise and redundancy issues.

Contribution

PTM4Tag is the first framework to leverage pre-trained language models specifically for tag recommendation on Stack Overflow, and it outperforms existing deep learning methods.

Findings

01. CodeBERT achieves the best performance among tested PTMs.

02. Using all post components yields the highest accuracy.

03. Title is the most influential component for tag prediction.

Abstract

Stack Overflow is often viewed as the most influential Software Question Answer (SQA) website, with millions of programming-related questions and answers. Tags play a critical role in efficiently structuring the contents of Stack Overflow and are vital to supporting a range of site operations, e.g., querying relevant contents. Poorly selected tags often introduce extra noise and redundancy, which leads to tag synonym and tag explosion problems. Thus, an automated tag recommendation technique that can accurately recommend high-quality tags is desired to alleviate these problems. Inspired by the recent success of pre-trained language models (PTMs) in natural language processing (NLP), we present PTM4Tag, a tag recommendation framework for Stack Overflow posts that utilizes PTMs with a triplet architecture, which models the components of a post, i.e., Title, Description, and Code…
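
The triplet idea in the abstract (one model per post component) can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: `toy_encode` is a hashed bag-of-words stand-in for a pre-trained encoder, the weights are random and untrained, and combining the three component vectors by concatenation followed by a per-tag sigmoid is one plausible reading of "triplet architecture", not something the abstract states.

```python
import hashlib
import math
import random

DIM = 8  # toy embedding size per component (assumption, not from the paper)


def toy_encode(text, salt, dim=DIM):
    """Stand-in for a pre-trained encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5((salt + token).encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TripletTagger:
    """Hypothetical three-encoder tagger: Title, Description, Code."""

    def __init__(self, tags, seed=0):
        rng = random.Random(seed)
        self.tags = list(tags)
        # One weight row per tag over the concatenated representation.
        # Random here; a real system would learn these from labelled posts.
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(3 * DIM)] for _ in self.tags
        ]
        self.bias = [0.0] * len(self.tags)

    def represent(self, title, description, code):
        # Each component gets its own encoder (distinct salt), then the
        # three vectors are concatenated (assumed fusion strategy).
        return (
            toy_encode(title, "title:")
            + toy_encode(description, "desc:")
            + toy_encode(code, "code:")
        )

    def scores(self, title, description, code):
        x = self.represent(title, description, code)
        # Independent sigmoid per tag, since a post can carry several tags.
        return {
            tag: sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for tag, row, b in zip(self.tags, self.weights, self.bias)
        }

    def recommend(self, title, description, code, k=3):
        s = self.scores(title, description, code)
        return sorted(s, key=s.get, reverse=True)[:k]


if __name__ == "__main__":
    tagger = TripletTagger(["python", "java", "pandas", "regex"])
    print(tagger.recommend(
        "How to merge two data frames",
        "I have two frames and want to join them on a key column",
        "df.merge(other, on='key')",
        k=2,
    ))
```

Because the weights are untrained, the printed tags are arbitrary; the sketch only shows the data flow from three separately encoded components to a ranked top-k tag list.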

Code & Models

Repositories

soarsmu/ptm4tag (PyTorch, official)
