TL;DR
This paper analyzes the link types used in issue tracking systems, characterizing their graph structures and the challenges of predicting non-duplicate links, and evaluates how robust existing duplicate detection methods are across link categories.
Contribution
It analyzes the 75 link types used in 15 public JIRA repositories, groups them into five general categories (General Relation, Duplication, Composition, Temporal / Causal, and Workflow), and assesses how well current duplicate detection approaches distinguish Duplication from other link types.
Findings
Duplication links often form simple, two-component graphs.
Composition links predominantly form hierarchical trees (97.7%).
Deep learning duplicate detection approaches confuse Duplication links with other link types, reducing classification accuracy by up to 12%.
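The structural findings above describe the shape of the graphs that links form between issues. As a rough illustration only, and not the paper's actual analysis code, the following Python sketch groups issues into connected components and flags the tree-shaped ones. It assumes links are treated as undirected and that "tree" means connected and acyclic; the function name `component_stats` and the issue keys are hypothetical.

```python
from collections import defaultdict


def component_stats(links):
    """Group issues into connected components and flag tree-shaped ones.

    `links` is an iterable of (issue_a, issue_b) pairs. Direction is
    ignored here, which is an assumption made for this sketch.
    """
    # Deduplicate links, ignoring direction and dropping self-links.
    edges = {frozenset(pair) for pair in links if pair[0] != pair[1]}

    adjacency = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    stats = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, nodes = [start], set()
        while stack:
            node = stack.pop()
            if node in nodes:
                continue
            nodes.add(node)
            stack.extend(adjacency[node] - nodes)
        seen |= nodes
        # Each undirected link is counted once from each endpoint.
        n_links = sum(len(adjacency[n]) for n in nodes) // 2
        stats.append({
            "issues": len(nodes),
            "links": n_links,
            # A connected graph is a tree iff it has exactly n - 1 links.
            "is_tree": n_links == len(nodes) - 1,
        })
    return stats


# Hypothetical issue keys, for illustration only.
example_links = [
    ("A-1", "A-2"),                                   # a pair of issues
    ("B-1", "B-2"), ("B-1", "B-3"),                   # a small tree
    ("C-1", "C-2"), ("C-2", "C-3"), ("C-3", "C-1"),   # a cycle
]
example_stats = component_stats(example_links)
```

Running such a function separately on the links of each category would give per-category counts of component sizes and tree shapes, which is the kind of comparison the findings summarize.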
Abstract
Software projects use Issue Tracking Systems (ITS) like JIRA to track issues and organize the workflows around them. Issues are often inter-connected via different links such as the default JIRA link types Duplicate, Relate, Block, or Subtask. While previous research has mostly focused on analyzing and predicting duplication links, this work aims at understanding the various other link types, their prevalence, and characteristics towards a more reliable link type prediction. For this, we studied 607,208 links connecting 698,790 issues in 15 public JIRA repositories. Besides the default types, the custom types Depend, Incorporate, Split, and Cause were also common. We manually grouped all 75 link types used in the repositories into five general categories: General Relation, Duplication, Composition, Temporal / Causal, and Workflow. Comparing the structures of the corresponding graphs, we…
