Using Consensual Biterms from Text Structures of Requirements and Code to Improve IR-Based Traceability Recovery

Hui Gao; Hongyu Kuang; Kexin Sun; Xiaoxing Ma; Alexander Egyed; Patrick Mäder; Guoping Rong; Dong Shao; He Zhang

arXiv:2209.01734·cs.SE·September 7, 2022

TL;DR

This paper uses consensual biterms, co-occurring word pairs drawn from the text structures of both requirements and code, to enhance IR-based traceability recovery and improve the accuracy of automated trace link recovery between software artifacts.

Contribution

It proposes a new approach to extract and filter co-occurring word pairs from requirements and code texts to improve IR-based traceability recovery performance.

Findings

01

Outperforms the baseline by 21.9% in AP and 9.3% in MAP when used alone.

02

Enhances IR techniques by enriching input corpus and IR calculations.

03

Combines effectively with other enhancement strategies, outperforming baselines by 5.9% in AP.
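
The findings report AP and MAP without defining them on this page. As a rough illustration only, the sketch below assumes the standard IR definitions: average precision over one ranked list of candidate links, and MAP as the mean of per-query average precision. The paper may compute these differently, and the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average precision of one ranked list.

    ranked_relevance: booleans in rank order, True where the candidate
    link at that rank is a real trace link (assumed ground truth).
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_query_relevance):
    """Mean of average precision over several ranked lists (one per query)."""
    if not per_query_relevance:
        return 0.0
    scores = [average_precision(r) for r in per_query_relevance]
    return sum(scores) / len(scores)
```

For example, a ranked list with true links at ranks 1 and 3 gives an average precision of (1/1 + 2/3) / 2.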

Abstract

Traceability approves trace links among software artifacts based on whether two artifacts are related by system functionalities. The traces are valuable for software development, but are difficult to obtain manually. To cope with the costly and fallible manual recovery, automated approaches are proposed to recover traces through textual similarities among software artifacts, such as those based on Information Retrieval (IR). However, the low quality & quantity of artifact texts negatively impact the calculated IR values, thus greatly hindering the performance of IR-based approaches. In this study, we propose to extract co-occurred word pairs from the text structures of both requirements and code (i.e., consensual biterms) to improve IR-based traceability recovery. We first collect a set of biterms based on the part-of-speech of requirement texts, and then filter them through the code…
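
The pipeline described in the abstract (collect biterms from requirements by part of speech, then filter them through the code) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input is assumed to be already POS-tagged, the pairing rule (any two content words in one sentence), the filter rule (both words appear in one code identifier), and the enrichment step (appending joined tokens) are all simplifications, and every function name is hypothetical.

```python
import re
from itertools import combinations

# Assumed set of content-word tags; the paper's actual POS rules are not given here.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ"}


def requirement_biterms(tagged_sentences):
    """Collect unordered pairs of content words co-occurring in one sentence."""
    biterms = set()
    for sentence in tagged_sentences:
        words = sorted({w.lower() for w, tag in sentence if tag in CONTENT_TAGS})
        biterms.update(combinations(words, 2))
    return biterms


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", identifier)
    return {p.lower() for p in parts}


def consensual_biterms(req_biterms, code_identifiers):
    """Keep only biterms whose two words also co-occur in a code identifier."""
    consensual = set()
    for identifier in code_identifiers:
        parts = split_identifier(identifier)
        for a, b in req_biterms:
            if a in parts and b in parts:
                consensual.add((a, b))
    return consensual


def enrich(tokens, biterms):
    """Append one joined token per biterm whose words both occur in the document."""
    present = set(tokens)
    extra = [f"{a}_{b}" for a, b in sorted(biterms) if a in present and b in present]
    return list(tokens) + extra


# Hypothetical pre-tagged requirement sentence and code identifiers.
tagged = [[("The", "DET"), ("system", "NOUN"), ("shall", "AUX"),
           ("add", "VERB"), ("patient", "NOUN"), ("record", "NOUN")]]
identifiers = ["addPatient", "PatientRecord", "getName"]

shared = consensual_biterms(requirement_biterms(tagged), identifiers)
enriched = enrich(["add", "patient", "record"], shared)
```

The enriched token list could then be passed to any IR model (for example, a vector space model) in place of the original document tokens. A real pipeline would also need stemming or lemmatization so that forms such as "adds" and "add" match.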

Code & Models

Repositories

huialex/tarot (Official)