Unlocking Model Insights: A Dataset for Automated Model Card Generation

Shruti Singh; Hitesh Lodwal; Husain Malwat; Rakesh Thakur; Mayank Singh

arXiv:2309.12616·cs.CL·September 25, 2023

Open Access

TL;DR

This paper introduces a dataset of question-answer pairs about ML models to automate model card generation, aiming to improve transparency and reduce manual effort in documenting model details.

Contribution

The paper provides a new dataset for training models to automatically generate comprehensive model cards from research papers.

Findings

01. Current LMs show limited understanding of research papers.

02. Automated model card generation can be improved with specialized datasets.

03. The dataset enables training models to better extract model details from papers.

Abstract

Language models (LMs) are no longer restricted to the ML community, and instruction-tuned LMs have led to a rise in autonomous AI agents. As the accessibility of LMs grows, it is imperative that an understanding of their capabilities, intended usage, and development cycle also improves. Model cards are a popular practice for documenting detailed information about an ML model. To automate model card generation, we introduce a dataset of 500 question-answer pairs for 25 ML models that cover crucial aspects of the model, such as its training configurations, datasets, biases, architecture details, and training resources. We employ annotators to extract the answers from the original paper. Further, we explore the capabilities of LMs in generating model cards by answering questions. Our initial experiments with ChatGPT-3.5, LLaMa, and Galactica showcase a significant gap in the understanding of…
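
The dataset layout described in the abstract can be pictured as a list of question-answer records grouped by model. The following is a minimal sketch only: the field names (`model`, `aspect`, `question`, `answer`), the helper `render_model_card`, and the `ExampleNet` records are hypothetical and chosen for illustration, since the paper's actual schema and file format are not given here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One annotated question-answer pair (field names are assumptions)."""
    model: str     # name of the ML model the question is about
    aspect: str    # e.g. "datasets", "biases", "architecture"
    question: str
    answer: str    # answer an annotator extracted from the original paper


def render_model_card(pairs, model):
    """Group one model's QA pairs by aspect and render a plain-text card."""
    by_aspect = defaultdict(list)
    for p in pairs:
        if p.model == model:
            by_aspect[p.aspect].append(p)
    lines = [f"Model card: {model}"]
    for aspect in sorted(by_aspect):
        lines.append(f"[{aspect}]")
        for p in by_aspect[aspect]:
            lines.append(f"Q: {p.question}")
            lines.append(f"A: {p.answer}")
    return "\n".join(lines)


# Placeholder records for illustration only; not taken from the dataset.
pairs = [
    QAPair("ExampleNet", "datasets",
           "Which datasets was the model trained on?", "<answer from paper>"),
    QAPair("ExampleNet", "architecture",
           "How many layers does the model have?", "<answer from paper>"),
]
card = render_model_card(pairs, "ExampleNet")

# 500 pairs over 25 models works out to 20 questions per model on average.
QUESTIONS_PER_MODEL = 500 // 25
```

Grouping by aspect mirrors the categories the abstract lists (training configurations, datasets, biases, architecture details, training resources); a generated card would fill the `answer` slots with LM output instead of annotator text.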

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Galactica