TL;DR
This paper introduces S3M, a deep learning-based Siamese network for measuring similarity between stack traces in crash reports, improving bug report aggregation accuracy.
Contribution
It presents the first deep learning approach for stack trace similarity using a Siamese biLSTM architecture, outperforming existing methods.
Findings
S3M outperforms state-of-the-art approaches on both open-source and private datasets.
Stack trace trimming affects similarity measurement quality.
Deep learning enhances crash report clustering accuracy.
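To make the trimming finding concrete, here is a minimal Python sketch of how trimming can change a similarity score. It rests on assumptions: this summary does not define trimming, so the sketch uses two hypothetical variants (dropping trailing components of a frame's qualified name, and keeping only the top frames). Jaccard overlap on frame sets is a simple stand-in, not S3M's learned similarity. The function names and sample traces are made up for illustration.

```python
from typing import List, Optional


def trim_frame(frame: str, drop_last: int) -> str:
    """Drop the last `drop_last` dot-separated components of a frame name.

    Hypothetical trimming rule, e.g. "a.b.C.method" -> "a.b.C" for drop_last=1.
    """
    if drop_last <= 0:
        return frame
    parts = frame.split(".")
    kept = parts[:-drop_last]
    # Never trim a frame down to nothing: fall back to the first component.
    return ".".join(kept) if kept else parts[0]


def trim_trace(trace: List[str], drop_last: int = 0,
               top_k: Optional[int] = None) -> List[str]:
    """Optionally keep only the top `top_k` frames, then trim each frame name."""
    frames = trace if top_k is None else trace[:top_k]
    return [trim_frame(f, drop_last) for f in frames]


def jaccard(a: List[str], b: List[str]) -> float:
    """Set overlap of two frame lists (a stand-in metric, not S3M itself)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


# Two made-up traces that differ only in the method name of the top frame.
trace_a = [
    "com.app.ui.Editor.paint",
    "com.app.ui.Window.refresh",
    "java.awt.EventQueue.dispatch",
]
trace_b = [
    "com.app.ui.Editor.repaint",
    "com.app.ui.Window.refresh",
    "java.awt.EventQueue.dispatch",
]

untrimmed = jaccard(trim_trace(trace_a), trim_trace(trace_b))
trimmed = jaccard(trim_trace(trace_a, drop_last=1),
                  trim_trace(trace_b, drop_last=1))
print(untrimmed, trimmed)
```

With these sample traces the score rises from 0.5 untrimmed to 1.0 once method names are dropped. The same preprocessing choice can make two reports look like distinct issues or like duplicates, which is why trimming matters for any similarity measure built on top of it.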
Abstract
Automatic crash reporting systems have become a de-facto standard in software development. These systems monitor target software, and if a crash occurs they send details to a backend application. Later on, these reports are aggregated and used in the development process to 1) understand whether it is a new or an existing issue, 2) assign these bugs to appropriate developers, and 3) gain a general overview of the application's bug landscape. The efficiency of report aggregation and subsequent operations heavily depends on the quality of the report similarity metric. However, a distinctive feature of this kind of report is that no textual input from the user (i.e., bug description) is available: it contains only stack trace information. In this paper, we present S3M ("extreme") -- the first approach to computing stack trace similarity based on deep learning. It is based on a siamese…
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Bidirectional LSTM
