An Investigation into the Pedagogical Features of Documents
Emily Sheng, Prem Natarajan, Jonathan Gordon, and Gully Burns

TL;DR
This paper introduces the concept of pedagogical roles in documents to computationally analyze their educational utility, creating an annotated corpus and testing baseline prediction methods for these roles.
Contribution
The paper defines pedagogical roles, creates the first annotated corpus for these roles, and evaluates baseline NLP techniques for automatic role prediction.
Findings
The authors successfully annotated a corpus of documents with pedagogical roles
Baseline models can predict pedagogical roles with moderate accuracy
The work provides a foundation for future research on educational document analysis
Abstract
Characterizing the content of a technical document in terms of its learning utility can be useful for applications related to education, such as generating reading lists from large collections of documents. We refer to this learning utility as the "pedagogical value" of the document to the learner. While pedagogical value is an important concept that has been studied extensively within the education domain, there has been little work exploring it from a computational, i.e., natural language processing (NLP), perspective. To allow a computational exploration of this concept, we introduce the notion of "pedagogical roles" of documents (e.g., Tutorial and Survey) as an intermediary component for the study of pedagogical value. Given the lack of available corpora for our exploration, we create the first annotated corpus of pedagogical roles and use it to test baseline techniques for…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
