On the Applicability of Language Models to Block-Based Programs
Elisabeth Griebl, Benedikt Fein, Florian Obermüller, Gordon Fraser, René Just

TL;DR
This paper investigates how well language models, specifically n-gram models and transformers, apply to Scratch, a block-based programming language that removes much of the syntactic overhead of textual languages, using tasks such as code completion and bug detection.
Contribution
It demonstrates the feasibility of using language models on block-based languages and provides insights into their predictability and potential for tooling improvements.
Findings
Blocks inhibit the predictability of code for language models
Language models are feasible for Scratch tasks
Results support future tooling development
Abstract
Block-based programming languages like Scratch are increasingly popular for programming education and end-user programming. Recent program analyses build on the insight that source code can be modelled using techniques from natural language processing. Many of the regularities of source code that support this approach are due to the syntactic overhead imposed by textual programming languages. This syntactic overhead, however, is precisely what block-based languages remove in order to simplify programming. Consequently, it is unclear how well this modelling approach performs on block-based programming languages. In this paper, we investigate the applicability of language models for the popular block-based programming language Scratch. We model Scratch programs using n-gram models, the most essential type of language model, and transformers, a popular deep learning model. Evaluation on…
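To make the n-gram idea in the abstract concrete, here is a minimal sketch of a bigram model over Scratch scripts. It is not the authors' implementation. It assumes each script is flattened into a sequence of block opcodes, uses an invented three-script corpus, and applies add-one smoothing; the helper names (`train_bigram`, `probability`, `suggest_next`, `cross_entropy`) are hypothetical.

```python
from collections import Counter, defaultdict
import math

# Assumption: a script is represented as a flat sequence of block opcodes.
# This toy corpus is invented for illustration only.
SCRIPTS = [
    ["event_whenflagclicked", "control_forever", "motion_movesteps", "motion_ifonedgebounce"],
    ["event_whenflagclicked", "control_forever", "motion_movesteps", "motion_turnright"],
    ["event_whenflagclicked", "looks_sayforsecs", "motion_movesteps"],
]

START, END = "<s>", "</s>"


def train_bigram(scripts):
    """Count how often each block follows each preceding block."""
    counts = defaultdict(Counter)
    for script in scripts:
        tokens = [START] + script + [END]
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts


def probability(counts, vocab, prev, cur):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (counts[prev][cur] + 1) / (sum(counts[prev].values()) + len(vocab))


def suggest_next(counts, prev):
    """Code completion: most frequent successor of prev, or None if prev is unseen."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None


def cross_entropy(counts, vocab, script):
    """Average negative log2-probability per token; lower means more predictable."""
    tokens = [START] + script + [END]
    log_probs = [
        math.log2(probability(counts, vocab, prev, cur))
        for prev, cur in zip(tokens, tokens[1:])
    ]
    return -sum(log_probs) / len(log_probs)


counts = train_bigram(SCRIPTS)
# Every token that can follow another one: all opcodes plus the end marker.
vocab = {tok for script in SCRIPTS for tok in script} | {END}
```

On this toy corpus, `suggest_next(counts, "control_forever")` returns `"motion_movesteps"`, which is the code-completion use of the model. For bug detection, one plausible use is to flag scripts whose `cross_entropy` is unusually high, since an improbable block sequence may indicate a mistake; the threshold for "unusually high" would have to be chosen empirically.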
Taxonomy
Topics: Software Engineering Research · Teaching and Learning Programming · Online Learning and Analytics
