Learning Program Embeddings to Propagate Feedback on Student Code
Chris Piech, Jonathan Huang, Andy Nguyen, Mike Phulsuksombati, Mehran Sahami, Leonidas Guibas

TL;DR
This paper introduces a neural network-based method to encode student programs as linear maps between embedded pre- and postconditions, enabling scalable feedback propagation in large online courses.
Contribution
It presents a novel neural encoding of programs as linear maps and an algorithm to propagate feedback efficiently across massive student submissions.
Findings
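Human comments on student assignments are propagated to orders of magnitude more submissions
Applied to assessments from the Code.org Hour of Code and Stanford University's CS1 course
The learned linear maps serve as features for an algorithm that gives feedback at scale in massive online classes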
Abstract
Providing feedback, both assessing final work and giving hints to stuck students, is difficult for open-ended assignments in massive online classes which can range from thousands to millions of students. We introduce a neural network method to encode programs as a linear mapping from an embedded precondition space to an embedded postcondition space and propose an algorithm for feedback at scale using these linear maps as features. We apply our algorithm to assessments from the Code.org Hour of Code and Stanford University's CS1 course, where we propagate human comments on student assignments to orders of magnitude more submissions.
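The idea in the abstract, a program represented as a linear map from embedded preconditions to embedded postconditions and then used as a feature vector, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embeddings are taken as given 2-dimensional vectors (the paper learns them with a neural network), the matrix is fitted by plain gradient descent on squared error, and nearest-neighbor lookup stands in for the feedback propagation algorithm, which the abstract does not spell out. The names `fit_linear_map`, `nearest_comment`, and the toy programs are hypothetical.

```python
# Illustrative sketch only, not the authors' implementation.
# Assumption: precondition/postcondition embeddings are already given as vectors.

def apply_map(M, p):
    """Multiply matrix M (list of rows) by vector p."""
    return [sum(m * x for m, x in zip(row, p)) for row in M]

def fit_linear_map(pre, post, dim, steps=500, lr=0.05):
    """Fit M minimizing the mean of ||M p - q||^2 over (p, q) pairs by gradient descent."""
    M = [[0.0] * dim for _ in range(dim)]
    n = len(pre)
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for p, q in zip(pre, post):
            pred = apply_map(M, p)
            for i in range(dim):
                err = pred[i] - q[i]
                for j in range(dim):
                    grad[i][j] += 2.0 * err * p[j] / n
        for i in range(dim):
            for j in range(dim):
                M[i][j] -= lr * grad[i][j]
    return M

def flatten(M):
    """Use the matrix entries as a flat feature vector for the program."""
    return [x for row in M for x in row]

def nearest_comment(features, labeled):
    """Return the comment attached to the closest labeled feature vector."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(labeled, key=lambda item: sq_dist(features, item[0]))[1]

# Toy embedded preconditions: a 3x3 grid of 2-D points.
PRE = [[x, y] for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]

# Hypothetical "programs", each defined by how it transforms an embedded state.
TRUE_MAPS = {
    "swap": [[0.0, 1.0], [1.0, 0.0]],
    "near_swap": [[0.0, 1.0], [1.0, 0.1]],
    "double": [[2.0, 0.0], [0.0, 2.0]],
}

# Fit one matrix per program from its (precondition, postcondition) pairs.
FITTED = {
    name: fit_linear_map(PRE, [apply_map(T, p) for p in PRE], dim=2)
    for name, T in TRUE_MAPS.items()
}

# Two submissions carry human comments; the third receives the nearest one.
LABELED = [
    (flatten(FITTED["swap"]), "Correct: swaps the two values."),
    (flatten(FITTED["double"]), "Scales the values instead of swapping them."),
]
PROPAGATED = nearest_comment(flatten(FITTED["near_swap"]), LABELED)
print(PROPAGATED)
```

In this toy setting the unlabeled `near_swap` submission lies closest to `swap` in matrix space, so it inherits that comment. A real system would need learned embeddings and a more careful rule for deciding when a comment is safe to propagate.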
Taxonomy
Topics: Online Learning and Analytics · Teaching and Learning Programming · Machine Learning and Algorithms
