Learning Lenient Parsing & Typing via Indirect Supervision
Toufique Ahmed, Premkumar Devanbu, Vincent Hellendoorn

TL;DR
This paper introduces a novel, indirectly supervised transformer-based approach to train a lenient parser and typer for imperfect code fragments, improving parsing and error detection in real-world coding scenarios.
Contribution
The paper presents a new method that leverages large-scale, mostly correct code data and artificial corruption to train a lenient parser without human-curated datasets.
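The idea of producing training data by artificial corruption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the token set `STRUCTURAL_TOKENS`, the functions `corrupt` and `make_pair`, and the `drop_prob` parameter are all hypothetical names, and dropping punctuation is only one assumed form of corruption. The sketch takes an already-tokenized, correct fragment and returns a (corrupted, original) pair that a model could be trained on.

```python
import random

# Assumption: punctuation that is dropped to simulate imperfect code.
# The paper's actual corruption scheme is not specified in this summary.
STRUCTURAL_TOKENS = {"{", "}", "(", ")", ";"}


def corrupt(tokens, drop_prob, rng):
    """Return a copy of `tokens` with structural tokens randomly removed."""
    return [
        tok
        for tok in tokens
        # Only structural tokens are candidates for removal.
        if not (tok in STRUCTURAL_TOKENS and rng.random() < drop_prob)
    ]


def make_pair(tokens, drop_prob=0.3, seed=None):
    """Build one (corrupted input, original target) training pair."""
    rng = random.Random(seed)
    return corrupt(tokens, drop_prob, rng), list(tokens)


if __name__ == "__main__":
    fragment = "int x = foo ( 1 ) ;".split()
    noisy, clean = make_pair(fragment, drop_prob=0.5, seed=0)
    print("input :", " ".join(noisy))
    print("target:", " ".join(clean))
```

Because the corrupted input is derived mechanically from code that is already mostly correct, pairs like these can be generated at scale without manual repair annotations, which is the sense in which the supervision is indirect.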
Findings
Achieves reasonable performance on STACKOVERFLOW fragments
Outperforms previous student error correction tools with 77% accuracy
Performs well on long, complex code snippets
Abstract
Both professional coders and teachers frequently deal with imperfect (fragmentary, incomplete, ill-formed) code. Such fragments are common in STACKOVERFLOW; students also frequently produce ill-formed code, for which instructors, TAs (or students themselves) must find repairs. In either case, the developer experience could be greatly improved if such code could somehow be parsed & typed; this makes such code more amenable to use within IDEs and allows early detection and repair of potential errors. We introduce a lenient parser, which can parse & type fragments, even ones with simple errors. Training a machine learner to leniently parse and type imperfect code requires a large training set including many pairs of imperfect code and its repair; such training sets are limited by human effort and curation. In this paper, we present a novel, indirectly supervised, approach to train a…
