An Alternative Issue Tracking Dataset of Public Jira Repositories
Lloyd Montgomery, Clara Lüders, Walid Maalej

TL;DR
This paper introduces a new, extensive dataset of public Jira repositories, providing valuable data for research on issue tracking, evolution, and cross-tool analysis in software engineering.
Contribution
The paper releases a large, publicly accessible Jira dataset with detailed issue and change data, filling a gap in available repositories for research.
Findings
The dataset includes 16 Jira instances, 1822 projects, and 2.7 million issues.
It contains 32 million changes, 9 million comments, and 1 million issue links.
Enables new research opportunities in issue evolution and cross-tool analysis.
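To give a rough sense of scale, the headline counts above can be turned into per-issue and per-project averages. The sketch below is only illustrative: it uses the rounded figures quoted in this summary (not exact counts from the dataset), and the names `COUNTS` and `per_issue` are hypothetical.

```python
# Rounded headline counts as quoted in the summary; exact values may differ.
COUNTS = {
    "jira_instances": 16,
    "projects": 1822,
    "issues": 2_700_000,
    "changes": 32_000_000,
    "comments": 9_000_000,
    "issue_links": 1_000_000,
}


def per_issue(counts: dict) -> dict:
    """Return the average number of changes, comments, and links per issue."""
    issues = counts["issues"]
    return {key: counts[key] / issues for key in ("changes", "comments", "issue_links")}


averages = per_issue(COUNTS)
issues_per_project = COUNTS["issues"] / COUNTS["projects"]

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name} per issue: {value:.1f}")
    print(f"issues per project: {issues_per_project:.0f}")
```

Because the inputs are rounded, the resulting averages are order-of-magnitude estimates only; they say nothing about how activity is distributed across individual Jira instances or projects.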
Abstract
Organisations use issue tracking systems (ITSs) to track and document their projects' work in units called issues. This style of documentation encourages evolutionary refinement, as each issue can be independently improved, commented on, linked to other issues, and progressed through the organisational workflow. Commonly studied ITSs so far include GitHub, GitLab, and Bugzilla, while Jira, one of the most popular ITSs in practice with a wealth of additional information, has yet to receive similar attention. Unfortunately, diverse public Jira datasets are rare, likely due to the difficulty in finding and accessing these repositories. With this paper, we release a dataset of 16 public Jiras with 1822 projects, spanning 2.7 million issues with a combined total of 32 million changes, 9 million comments, and 1 million issue links. We believe this Jira dataset will lead to many fruitful…
