Exposía: Teaching and Assessment of Academic Writing Skills for Research Project Proposals and Peer Feedback
Dennis Zyska, Alla Rozovskaya, Ilia Kuznetsov, Iryna Gurevych

TL;DR
Exposía is a novel dataset linking student research proposals and peer feedback, enabling research on automated scoring and assessment of academic writing using large language models.
Contribution
The paper introduces Exposía, the first public dataset connecting writing and feedback in higher education, and benchmarks LLMs on automated scoring of both proposals and reviews.
Findings
No single LLM is best at both tasks: the models that excel at scoring proposals differ from those that excel at scoring reviews.
Closed-source models outperform open-weight models on these scoring tasks.
Multi-aspect scoring prompts are most effective for classroom use.
Abstract
We present Exposía, the first public dataset that connects writing and feedback in higher education, enabling research on educationally grounded computational approaches to teaching and evaluating academic writing. Exposía includes student research project proposals and peer and instructor feedback consisting of comments and free-text reviews. The dataset was collected in the Computer Science course "Introduction to Scientific Work". Exposía reflects the multi-stage nature of the academic writing process, which includes drafting, receiving feedback, and revising the writing based on the feedback received. Both the project proposals and the peer feedback are accompanied by human assessment scores based on a fine-grained, pedagogically grounded schema for writing and feedback assessment that we develop. We use Exposía to benchmark state-of-the-art large language models (LLMs)…
