TL;DR
funcGNN is a graph neural network approach that estimates program similarity from control flow graphs (CFGs). It reports a lower error rate and faster runtime than traditional graph edit distance (GED) approximation methods, targeting software engineering tasks such as clone identification and code search.
Contribution
This is the first application of graph neural networks on labeled CFGs for program similarity estimation, providing a scalable and accurate alternative to NP-hard graph edit distance calculations.
Findings
Error rate of 0.00194 in GED estimation, lower than the methods it is compared against
23 times faster than the quickest traditional GED approximation method
Effective generalization to unseen programs
Abstract
Program similarity is a fundamental concept, central to the solution of software engineering tasks such as software plagiarism, clone identification, code refactoring and code search. Accurate similarity estimation between programs requires an in-depth understanding of their structure, semantics and flow. A control flow graph (CFG) is a graphical representation of a program which captures its logical control flow and hence its semantics. A common approach is to estimate program similarity by analysing CFGs using graph similarity measures, e.g. graph edit distance (GED). However, computing graph edit distance is NP-hard and computationally expensive, making the application of graph similarity techniques to complex software programs impractical. This study intends to examine the effectiveness of graph neural networks to estimate program similarity, by analysing the associated control…
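To make the cost of exact GED concrete, here is a minimal sketch of a brute-force computation on two toy labeled CFGs. It is not the funcGNN method and is not taken from the paper: the function name `exact_ged`, the node labels, the two example graphs, and the unit edit costs are all assumptions chosen for illustration. The loop visits every node alignment, so the work grows factorially with graph size, which is the practical obstacle the abstract refers to.

```python
from itertools import permutations


def exact_ged(labels1, edges1, labels2, edges2):
    """Brute-force graph edit distance for small directed, node-labeled graphs.

    Assumes unit costs (hypothetical choice): 1 per node insertion, deletion,
    or relabeling, and 1 per edge insertion or deletion.
    labels*: list of node labels, where the list index is the node id.
    edges*:  set of (source, target) node-id pairs.
    """
    n = max(len(labels1), len(labels2))
    # Pad the smaller graph with placeholder nodes (None). Aligning a real
    # node with a placeholder models a node deletion or insertion.
    a = list(labels1) + [None] * (n - len(labels1))
    b = list(labels2) + [None] * (n - len(labels2))

    best = None
    # n! candidate alignments: this is what makes exact GED impractical.
    for perm in permutations(range(n)):
        # Node cost: every aligned pair whose labels differ costs 1.
        cost = sum(1 for i, j in enumerate(perm) if a[i] != b[j])
        # Edge cost: edges present in exactly one graph after alignment.
        mapped = {(perm[u], perm[v]) for (u, v) in edges1}
        cost += len(mapped ^ edges2)
        if best is None or cost < best:
            best = cost
    return best


# Toy CFG of an if-then construct (labels are illustrative only).
if_then_labels = ["entry", "branch", "stmt", "exit"]
if_then_edges = {(0, 1), (1, 2), (1, 3), (2, 3)}

# Toy CFG of an if-then-else construct.
if_else_labels = ["entry", "branch", "stmt", "stmt", "exit"]
if_else_edges = {(0, 1), (1, 2), (1, 3), (2, 4), (3, 4)}

print(exact_ged(if_then_labels, if_then_edges, if_else_labels, if_else_edges))  # 4
```

The result of 4 corresponds to inserting one `stmt` node, inserting its two edges, and deleting the direct branch-to-exit edge. Even at 10 nodes per graph this loop would visit over 3.6 million alignments, which is why learned estimators such as the one described here are attractive.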
Taxonomy
Methods: Graph Neural Network
