Large-scale information retrieval in software engineering -- an experience report from industrial application
Michael Unterkalmsteiner, Tony Gorschek, Robert Feldt, Niklas Lavesson

TL;DR
This paper reports on applying and evaluating information retrieval techniques for test case selection in an industrial software engineering setting, highlighting challenges and lessons learned from large-scale experiments.
Contribution
It provides an empirical case study on the application of IR techniques in industry, revealing scalability issues and methodological challenges.
Findings
Scaling IR techniques to industry data is difficult.
Latent semantic analysis faces particular challenges in industrial contexts.
There is a lack of research on scalable parameter optimization for IR in software engineering.
Abstract
Software Engineering activities are information intensive. Research proposes Information Retrieval (IR) techniques to support engineers in their daily tasks, such as establishing and maintaining traceability links, fault identification, and software maintenance. We describe an engineering task, test case selection, and illustrate our problem analysis and solution discovery process. The objective of the study is to understand to what extent IR techniques (one potential solution) can be applied to test case selection and provide decision support in a large-scale, industrial setting. We analyze, in the context of the studied company, how test case selection is performed and design a series of experiments evaluating the performance of different IR techniques. Each experiment provides lessons learned from implementation, execution, and results, feeding into its successor. The…
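To make the idea of IR-based test case selection concrete, here is a minimal sketch that ranks test case descriptions against a textual query using a TF-IDF vector space model and cosine similarity. It is an illustration only, not the method or tooling evaluated in the paper: the test case identifiers, their descriptions, the query, and the helper names (`tokenize`, `tfidf_vectors`, `rank_test_cases`) are all hypothetical, and the sketch deliberately uses a plain vector space model rather than latent semantic analysis.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on any non-alphanumeric character.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (a dict) per document.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_test_cases(query, test_cases):
    # Return (name, score) pairs, most similar test case first.
    names = list(test_cases)
    vectors, idf = tfidf_vectors([test_cases[n] for n in names])
    # Query terms unseen in the corpus carry no weight and are dropped.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scored = [(name, cosine(q, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical test case descriptions, invented for illustration.
test_cases = {
    "TC-1": "verify video streaming starts after network handover",
    "TC-2": "check login fails with an expired password",
    "TC-3": "verify audio and video stay in sync during streaming",
}
ranking = rank_test_cases("video streaming handover", test_cases)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On a toy corpus like this the whole computation is instantaneous. The paper's findings concern what changes at industrial scale, where corpus size and the number of parameter combinations to evaluate make such experiments considerably harder to run.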
