Neural architectures for resolving references in program code
Gergő Szalay, Gergely Zsolt Kovács, Sándor Teleki, Balázs Pintér, Tibor Gregorics

TL;DR
This paper introduces new sequence-to-sequence architectures for reference resolution in code that outperform existing models on synthetic benchmarks and reduce the error rate in a real-world decompilation task.
Contribution
The paper presents novel sequence-to-sequence architectures for direct and indirect indexing by permutation that outperform baseline models in both robustness and scalability.
Findings
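The proposed models handle examples ten times longer than those the best baseline can handle.
On the real-world task of decompiling switch statements, the extended model decreases the error rate by 42%.
Ablation studies indicate that all architectural components are essential for performance.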
Abstract
Resolving and rewriting references is fundamental in programming languages. Motivated by a real-world decompilation task, we abstract reference rewriting into the problems of direct and indirect indexing by permutation. We create synthetic benchmarks for these tasks and show that well-known sequence-to-sequence machine learning architectures struggle on these benchmarks. We introduce new sequence-to-sequence architectures for both problems. Our measurements show that our architectures outperform the baselines in both robustness and scalability: our models can handle examples that are ten times longer than those the best baseline can handle. We measure the impact of our architecture on the real-world task of decompiling switch statements, which has an indexing subtask. According to our measurements, the extended model decreases the error rate by 42%. Multiple ablation studies show that all…
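The abstract does not define its two indexing tasks, so the following is only a minimal sketch of what indexing by a permutation means in general. The function names `gather_by_permutation` and `scatter_by_permutation` are hypothetical, and no claim is made about how they map onto the paper's "direct" and "indirect" variants or onto its benchmark format.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def _check_permutation(perm: Sequence[int]) -> None:
    """Raise ValueError unless perm contains each of 0..n-1 exactly once."""
    if sorted(perm) != list(range(len(perm))):
        raise ValueError("perm must be a permutation of 0..n-1")


def gather_by_permutation(values: Sequence[T], perm: Sequence[int]) -> List[T]:
    """Hypothetical helper: output position i reads values[perm[i]]."""
    _check_permutation(perm)
    if len(values) != len(perm):
        raise ValueError("values and perm must have the same length")
    return [values[p] for p in perm]


def scatter_by_permutation(values: Sequence[T], perm: Sequence[int]) -> List[T]:
    """Hypothetical helper: values[i] is written to output position perm[i].

    This is the same as gathering with the inverse permutation.
    """
    _check_permutation(perm)
    if len(values) != len(perm):
        raise ValueError("values and perm must have the same length")
    out: List[Optional[T]] = [None] * len(perm)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out  # every slot is filled because perm is a permutation


if __name__ == "__main__":
    labels = ["a", "b", "c", "d"]
    perm = [2, 0, 3, 1]
    print(gather_by_permutation(labels, perm))
    print(scatter_by_permutation(labels, perm))
```

The two helpers are inverses of each other, so scattering a gathered list with the same permutation restores the original order. A sequence model trained on such a task has to learn this lookup from token sequences alone, which is presumably why input length matters for the scalability comparison.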
