Exploring the Effectiveness of Abstract Syntax Tree Patterns for Algorithm Recognition
Denis Neumüller, Florian Sihler, Raphael Straub, Matthias Tichy

TL;DR
This paper presents an approach that uses abstract syntax tree patterns for automatic algorithm recognition, reporting a higher F1-score than the large language model Codellama and higher recall than code clone detection tools.
Contribution
The authors introduce a domain-specific language and matching algorithm for recognizing algorithms in code, with a prototype evaluated on benchmark datasets.
Findings
Achieved an average F1-score of 0.74 on algorithm recognition.
Outperformed Codellama, which reached an F1-score of 0.35.
Outperformed code clone detection tools in recall.
Abstract
The automated recognition of algorithm implementations can support many software maintenance and re-engineering activities by providing knowledge about the concerns present in the code base. Moreover, recognizing inefficient algorithms like Bubble Sort and suggesting superior alternatives from a library can help in assessing and improving the quality of a system. Approaches from related work suffer from usability as well as scalability issues and their accuracy is not evaluated. In this paper, we investigate how well our approach based on the abstract syntax tree of a program performs for automatic algorithm recognition. To this end, we have implemented a prototype consisting of: A domain-specific language designed to capture the key features of an algorithm and used to express a search pattern on the abstract syntax tree, a matching algorithm to find these features, and an initial…
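To make the idea of searching an abstract syntax tree for an algorithm's key features more concrete, here is a minimal sketch. It is not the authors' domain-specific language or matching algorithm; it hand-codes one hypothetical pattern with Python's standard `ast` module (Python 3.9+ is assumed). The feature set chosen here (two nested loops, a comparison, and a swap of adjacent elements) and all function names are assumptions made for illustration.

```python
import ast


def _is_successor(a: ast.expr, b: ast.expr) -> bool:
    """True if b is syntactically `a + 1`."""
    return (
        isinstance(b, ast.BinOp)
        and isinstance(b.op, ast.Add)
        and isinstance(b.right, ast.Constant)
        and b.right.value == 1
        and ast.unparse(b.left) == ast.unparse(a)
    )


def is_adjacent_swap(node: ast.AST) -> bool:
    """True for statements shaped like `xs[j], xs[j + 1] = xs[j + 1], xs[j]`."""
    if not (isinstance(node, ast.Assign) and len(node.targets) == 1):
        return False
    target, value = node.targets[0], node.value
    if not (isinstance(target, ast.Tuple) and isinstance(value, ast.Tuple)):
        return False
    if len(target.elts) != 2 or len(value.elts) != 2:
        return False
    first, second = target.elts
    if not (isinstance(first, ast.Subscript) and isinstance(second, ast.Subscript)):
        return False
    # Compare source text so the Load/Store context of the nodes is ignored.
    left = [ast.unparse(e) for e in target.elts]
    right = [ast.unparse(e) for e in value.elts]
    if left != right[::-1]:
        return False
    same_sequence = ast.unparse(first.value) == ast.unparse(second.value)
    adjacent = (_is_successor(first.slice, second.slice)
                or _is_successor(second.slice, first.slice))
    return same_sequence and adjacent


def looks_like_bubble_sort(func: ast.FunctionDef) -> bool:
    """Hypothetical pattern: loop > loop > if <comparison> > adjacent swap."""
    loops = (ast.For, ast.While)
    for outer in ast.walk(func):
        if not isinstance(outer, loops):
            continue
        for inner in ast.walk(outer):
            if inner is outer or not isinstance(inner, loops):
                continue
            for cond in ast.walk(inner):
                if (isinstance(cond, ast.If)
                        and isinstance(cond.test, ast.Compare)
                        and any(is_adjacent_swap(s) for s in cond.body)):
                    return True
    return False


def find_candidates(source: str) -> list[str]:
    """Return the names of functions whose AST matches the pattern."""
    tree = ast.parse(source)
    return [n.name for n in ast.walk(tree)
            if isinstance(n, ast.FunctionDef) and looks_like_bubble_sort(n)]


SAMPLE = '''
def sort_items(xs):
    n = len(xs)
    for i in range(n):
        for j in range(n - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
'''

print(find_candidates(SAMPLE))
```

A hand-written matcher like this is brittle: a swap through a temporary variable, for example, would not match. A dedicated pattern language could help with exactly this, since it lets such variations be stated declaratively instead of being coded case by case.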
