PyEvalAI: AI-assisted evaluation of Jupyter Notebooks for immediate personalized feedback
Nils Wandel, David Stotko, Alexander Schier, Reinhard Klein

TL;DR
PyEvalAI is an open-source AI-powered system that automates grading of Jupyter notebooks, providing immediate personalized feedback while preserving privacy and giving tutors full control.
Contribution
The paper introduces a privacy-preserving, open-source evaluation system that combines unit tests with locally hosted language models to grade Jupyter notebooks.
Findings
Improves feedback speed and grading efficiency in university courses.
Maintains privacy by hosting models locally.
Supports Markdown, LaTeX, and Python code in notebooks.
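As a rough illustration of the unit-test half of such a pipeline, the sketch below separates the Markdown and code cells of a notebook and scores one student function against a few test cases. It is not PyEvalAI's implementation: the helper names `extract_cells` and `score_code`, the test-case format, and the pass-fraction scoring rule are all assumptions made for this example. It relies only on the standard Jupyter notebook JSON layout (a top-level `cells` list whose entries carry `cell_type` and `source`). The language-model step that would assess Markdown and LaTeX answers is left out.

```python
import json


def extract_cells(notebook_json: str) -> dict:
    """Group the cell sources of a notebook (nbformat 4 JSON) by cell type."""
    nb = json.loads(notebook_json)
    grouped = {"code": [], "markdown": []}
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # nbformat allows 'source' to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") in grouped:
            grouped[cell["cell_type"]].append(source)
    return grouped


def score_code(code: str, func_name: str, cases: list) -> float:
    """Hypothetical scorer: fraction of (args, expected) cases the function passes."""
    namespace = {}
    try:
        # Assumption: a real system would run untrusted code in a sandbox.
        exec(code, namespace)
    except Exception:
        return 0.0
    func = namespace.get(func_name)
    if not callable(func):
        return 0.0
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case simply earns no credit
    return passed / len(cases) if cases else 0.0
```

For a notebook whose single code cell defines `square(x)` as `x * x`, calling `score_code(cells["code"][0], "square", [((2,), 4), ((3,), 9)])` on the output of `extract_cells` would return `1.0`. The Markdown cells collected alongside are the part a locally hosted language model would be asked to assess.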
Abstract
Grading student assignments in STEM courses is a laborious and repetitive task for tutors, often requiring a week to assess an entire class. For students, this delay of feedback prevents iterating on incorrect solutions, hampers learning, and increases stress when exercise scores determine admission to the final exam. Recent advances in AI-assisted education, such as automated grading and tutoring systems, aim to address these challenges by providing immediate feedback and reducing grading workload. However, existing solutions often fall short due to privacy concerns, reliance on proprietary closed-source models, lack of support for combining Markdown, LaTeX and Python code, or excluding course tutors from the grading process. To overcome these limitations, we introduce PyEvalAI, an AI-assisted evaluation system, which automatically scores Jupyter notebooks using a combination of unit…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Mathematics, Computing, and Information Processing
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
