CUPID: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection
Ting Zhang, Ivana Clairine Irsan, Ferdian Thung, David Lo

TL;DR
This paper introduces CUPID, a hybrid approach that combines a traditional duplicate bug report detection method with ChatGPT. It targets smaller bug repositories in particular and reports state-of-the-art Recall Rate@10 on the evaluated datasets.
Contribution
CUPID leverages ChatGPT to enhance traditional duplicate bug report detection, outperforming existing deep learning and traditional methods, particularly on smaller datasets.
Findings
CUPID achieves Recall Rate@10 of 0.602 to 0.654 across datasets.
CUPID improves over the previous state-of-the-art approach by 5-8% in terms of Recall Rate@10.
CUPID surpasses deep learning approaches by up to 82%.
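The findings above are reported as Recall Rate@10. As a rough illustration of how such a metric is commonly computed, the sketch below counts a query bug report as a hit when at least one of its known duplicates appears among the top k retrieved candidates. The function name, the dictionary layout, and the toy IDs are assumptions made for this example; they are not taken from the paper or its replication package.

```python
from typing import Dict, List, Set


def recall_rate_at_k(
    rankings: Dict[str, List[str]],
    duplicates: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of query reports with at least one true duplicate in the top k.

    rankings:   query report ID -> candidate report IDs, best match first (assumed layout)
    duplicates: query report ID -> IDs of reports known to duplicate it (assumed layout)
    """
    if not rankings:
        raise ValueError("rankings must contain at least one query")
    hits = 0
    for query_id, ranked in rankings.items():
        top_k = set(ranked[:k])
        # A hit means any known duplicate of this query was retrieved in the top k.
        if duplicates.get(query_id, set()) & top_k:
            hits += 1
    return hits / len(rankings)


# Toy data: q1's duplicate "b" is ranked second, q2's duplicate "m" is never retrieved.
toy_rankings = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
toy_duplicates = {"q1": {"b"}, "q2": {"m"}}
score_at_2 = recall_rate_at_k(toy_rankings, toy_duplicates, k=2)
score_at_1 = recall_rate_at_k(toy_rankings, toy_duplicates, k=1)
```

With the toy data, one of two queries is a hit at k=2 and none at k=1, so the metric rises as k grows.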
Abstract
Duplicate bug report detection (DBRD) is a long-standing challenge in both academia and industry. Over the past decades, researchers have proposed various approaches to detect duplicate bug reports more accurately. With the recent advancement of deep learning, researchers have also proposed several deep learning-based approaches to address the DBRD task. In bug repositories with many bug reports, deep learning-based approaches have shown promising performance. However, in bug repositories with fewer bug reports, i.e., around 10k, the existing deep learning approaches perform worse than the traditional approaches. Traditional approaches have limitations, too, e.g., they are usually based on the bag-of-words model, which cannot capture the semantics of bug reports. To address these challenges, we seek to leverage a state-of-the-art large…
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Hate Speech and Cyberbullying Detection
