Labeling questions inside issue trackers
Aidin Rasti

TL;DR
This paper presents an automated method to identify and label questions in issue trackers, reducing manual triage effort and spam in open source project management.
Contribution
It introduces a pattern-based text cleaning process combined with a classification approach to automatically label unrelated questions in issue trackers.
Findings
Achieved over 81% accuracy in labeling questions.
Effectively removed noise like logs and error messages from issue texts.
Reduced manual effort in issue triage for large open source projects.
Abstract
One of the problems faced by maintainers of popular open source software is the triage of newly reported issues. Many of the issues submitted to issue trackers are questions: people ask about their problem on the issue tracker instead of using a proper Q&A website such as Stack Overflow. This may seem insignificant, but for big projects with thousands of users it amounts to spamming the issue tracker. Reading and labeling issues manually is a seriously time-consuming task, and these unrelated questions add to the burden. In fact, maintainers often ask users not to submit questions to the issue tracker. To address this problem, we first leveraged dozens of patterns to clean the text of issues, removing noise such as logs, stack traces, environment variables, and error messages. Second, we have implemented a classification-based approach to…
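The pattern-based cleaning step mentioned in the abstract could look roughly like the sketch below. The paper's actual patterns are not listed on this page, so `NOISE_PATTERNS` and `clean_issue_text` are hypothetical names, and the six regular expressions are illustrative guesses at the kinds of noise named above (logs, stack traces, environment variables, error messages), not the author's pattern set.

```python
import re

# Hypothetical noise patterns; assumptions for illustration only.
NOISE_PATTERNS = [
    # Fenced code blocks, which often hold pasted logs or traces
    re.compile(r"```.*?```", re.DOTALL),
    # Python traceback header plus the indented frame lines that follow it
    re.compile(r"^Traceback \(most recent call last\):\n(?:[ \t]+.*\n?)*",
               re.MULTILINE),
    # Java-style stack frames such as "at com.foo.Bar.baz(Bar.java:12)"
    re.compile(r"^\s*at [\w.$<>]+\(.*\)\s*$", re.MULTILINE),
    # Environment variable assignments such as "PATH=/usr/bin"
    re.compile(r"^[A-Z_][A-Z0-9_]*=.*$", re.MULTILINE),
    # Log lines that start with an ISO-like timestamp
    re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}.*$", re.MULTILINE),
    # Error message lines such as "ValueError: bad value"
    re.compile(r"^\w*(?:Error|Exception):.*$", re.MULTILINE),
]


def clean_issue_text(text: str) -> str:
    """Strip noise matched by NOISE_PATTERNS and drop the blank lines left behind."""
    for pattern in NOISE_PATTERNS:
        text = pattern.sub("", text)
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        "How do I configure the proxy?\n"
        "PATH=/usr/bin\n"
        "2021-03-04 10:11:12 INFO starting\n"
        "Traceback (most recent call last):\n"
        '  File "a.py", line 1, in <module>\n'
        "    foo()\n"
        "ValueError: bad value\n"
        "Thanks!"
    )
    print(clean_issue_text(sample))
```

Applying the patterns in a fixed order keeps the sketch easy to extend: a new kind of noise is one more entry in the list. The cleaned text would then be passed to whatever classifier labels the issue as a question.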
Taxonomy
Topics: Focus Groups and Qualitative Methods · Public Relations and Crisis Communication · Privacy, Security, and Data Protection
