AGGA: A Dataset of Academic Guidelines for Generative AI and Large Language Models

Junfeng Jiao; Saleh Afroogh; Kevin Chen; David Atkinson; Amit Dhurandhar

arXiv:2501.02063·cs.CL·March 19, 2025


TL;DR

AGGA is a comprehensive dataset of 80 academic guidelines for Generative AI and LLMs, collected from diverse global universities, supporting NLP tasks and benchmarking in requirements engineering.

Contribution

This paper presents AGGA, a dataset of 80 academic guidelines on the use of Generative AI (GAI) and LLMs, enabling new research and evaluation in requirements engineering and related NLP tasks.

Findings

01. The dataset includes 188,674 words from diverse institutions.

02. It supports tasks such as ambiguity detection and requirements categorization.

03. It enables benchmarking on academic guidelines for AI use.

Abstract

This study introduces AGGA, a dataset comprising 80 academic guidelines for the use of Generative AIs (GAIs) and Large Language Models (LLMs) in academic settings, meticulously collected from official university websites. The dataset contains 188,674 words and serves as a valuable resource for natural language processing tasks commonly applied in requirements engineering, such as model synthesis, abstraction identification, and document structure assessment. Additionally, AGGA can be further annotated to function as a benchmark for various tasks, including ambiguity detection, requirements categorization, and the identification of equivalent requirements. Our methodologically rigorous approach ensured a thorough examination, with a selection of universities that represent a diverse range of global institutions, including top-ranked universities across six continents. The dataset…
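To make one of the tasks named in the abstract concrete, here is a minimal sketch of lexicon-based ambiguity detection over guideline text. It is not the authors' method: the `VAGUE_TERMS` lexicon, the helper names, and the sample sentences are all hypothetical and chosen only for illustration. A real benchmark would rely on human annotation of the AGGA texts.

```python
import re

# Hypothetical lexicon of vague terms; an assumption, not taken from the AGGA paper.
VAGUE_TERMS = {"appropriate", "reasonable", "significant", "as needed", "may"}


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def word_count(text):
    """Count word tokens (runs of letters, digits, or underscores)."""
    return len(re.findall(r"\b\w+\b", text))


def flag_ambiguous(text, lexicon=VAGUE_TERMS):
    """Return (sentence, matched_terms) pairs for sentences containing a lexicon term."""
    flagged = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        hits = sorted(
            term for term in lexicon
            if re.search(r"\b" + re.escape(term) + r"\b", lowered)
        )
        if hits:
            flagged.append((sentence, hits))
    return flagged


if __name__ == "__main__":
    # Made-up guideline text, not drawn from the dataset.
    sample = (
        "Students may use generative AI tools where appropriate. "
        "All AI-generated text must be cited. "
        "Instructors should set reasonable limits."
    )
    print("words:", word_count(sample))
    for sentence, hits in flag_ambiguous(sample):
        print(hits, "->", sentence)
```

A lexicon match only marks a sentence as a candidate for review; deciding whether the wording is actually ambiguous in context still requires annotation.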

Code & Models

Repositories

salehafroogh/agga_notebook (official)

Taxonomy

Topics: Topic Modeling