TL;DR
This paper introduces a general path-based syntactic representation of programs using AST paths, improving the prediction of program properties across multiple languages and models.
Contribution
It presents a novel, language-agnostic, path-based program representation that enhances learning effectiveness for various prediction tasks.
Findings
Outperforms task-specific handcrafted representations
Works across JavaScript, Java, Python, and C#
Effective with different learning algorithms, such as CRFs and word2vec
Abstract
Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our representation is purely syntactic and extracted automatically. The main idea is to represent a program using paths in its abstract syntax tree (AST). This allows a learning model to leverage the structured nature of code rather than treating it as a flat sequence of tokens. We show that this representation is general and can: (i) cover different prediction tasks, (ii) drive different learning algorithms (for both generative and discriminative models), and (iii) work across different programming…
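To make the main idea concrete, here is a minimal sketch of extracting AST paths between pairs of leaves, using Python's standard `ast` module. It is an illustration under stated assumptions, not the authors' implementation: the function names (`leaf_chains`, `path_between`, `path_contexts`), the `^`/`_` markers for moving up and down the tree, and the choice to treat only identifiers and constants as leaves are all hypothetical.

```python
import ast
from itertools import combinations

UP, DOWN = "^", "_"  # assumed markers for moving up / down the tree


def leaf_chains(tree):
    """Collect (token, root-to-leaf node chain) for each identifier or constant."""
    leaves = []

    def visit(node, chain):
        chain = chain + [node]
        if isinstance(node, ast.Name):
            leaves.append((node.id, chain))
        elif isinstance(node, ast.Constant):
            leaves.append((repr(node.value), chain))
        else:
            # Only non-leaf nodes are expanded; operator nodes such as Add
            # have no children here and therefore contribute no leaves.
            for child in ast.iter_child_nodes(node):
                visit(child, chain)

    visit(tree, [])
    return leaves


def path_between(chain_a, chain_b):
    """Render the path from leaf A up to the lowest common ancestor, then down to leaf B."""
    i = 0
    while i < min(len(chain_a), len(chain_b)) and chain_a[i] is chain_b[i]:
        i += 1
    # chain_a[i - 1] is the lowest common ancestor of the two leaves.
    up = [type(n).__name__ for n in reversed(chain_a[i:])]
    top = type(chain_a[i - 1]).__name__
    down = [type(n).__name__ for n in chain_b[i:]]
    return UP.join(up + [top]) + "".join(DOWN + name for name in down)


def path_contexts(source):
    """Return (token_a, path, token_b) triples for every pair of leaves."""
    leaves = leaf_chains(ast.parse(source))
    return [
        (tok_a, path_between(chain_a, chain_b), tok_b)
        for (tok_a, chain_a), (tok_b, chain_b) in combinations(leaves, 2)
    ]


for triple in path_contexts("x = y + 1"):
    print(triple)
# ('x', 'Name^Assign_BinOp_Name', 'y')
# ('x', 'Name^Assign_BinOp_Constant', '1')
# ('y', 'Name^BinOp_Constant', '1')
```

Each triple pairs two tokens with the syntactic route connecting them, which is the kind of structured feature a learning model can consume in place of a flat token sequence. This sketch enumerates all leaf pairs; a practical extractor would likely bound the number or length of paths to keep the output manageable.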
