Novice Type Error Diagnosis with Natural Language Models

Chuqin Geng, Haolin Ye, Yixuan Li, Tianyu Han, Brigitte Pientka, and Xujie Si

arXiv:2210.03682 · cs.PL · October 10, 2022

TL;DR

This paper explores using natural language models to diagnose type errors in novice programs, showing significant improvement over the previous state-of-the-art data-driven approach without relying on hand-engineered features.

Contribution

The paper introduces an end-to-end natural language model approach to type error localization that is more accurate than the previous state-of-the-art data-driven approach.

Findings

01. The language model predicts type errors correctly 62% of the time.

02. It outperforms the previous state-of-the-art data-driven approach by 11%.

03. Structural probes explain performance differences.

Abstract

Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the…

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software System Performance and Reliability