Automatic Generation of Model and Data Cards: A Step Towards Responsible AI

Jiarui Liu; Wenkai Li; Zhijing Jin; Mona Diab

arXiv:2405.06258 · cs.CL · June 21, 2024 · 1 citation

TL;DR

This paper introduces an automated approach using Large Language Models to generate comprehensive, objective, and faithful model and data cards, addressing documentation gaps in AI to promote responsible and accountable AI practices.

Contribution

It presents CardGen, a novel pipeline for automated documentation, and CardBench, a large dataset for training and evaluation, improving AI transparency.

Findings

01 Enhanced completeness of generated cards
02 Improved objectivity and faithfulness
03 Effective retrieval-based generation pipeline

Abstract

In an era of model and data proliferation in machine learning/AI, marked especially by the rapid advancement of open-source technologies, there is a critical need for standardized, consistent documentation. Our work addresses the incompleteness of information in current human-generated model and data cards. We propose an automated generation approach using Large Language Models (LLMs). Our key contributions are CardBench, a comprehensive dataset aggregated from over 4.8k model cards and 1.4k data cards, and the CardGen pipeline, which comprises a two-step retrieval process. Our approach exhibits enhanced completeness, objectivity, and faithfulness in generated model and data cards, a significant step in responsible AI documentation practices that ensures better accountability and traceability.
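
The abstract names a two-step retrieval process but does not spell out its details. The sketch below is only an illustration of one common way such a process can be organized, not the authors' implementation: a coarse step that selects the most relevant sections of a source document, followed by a fine step that selects chunks within those sections to place in an LLM prompt. The function names (`two_step_retrieve`, `build_prompt`), the word-overlap scoring, and the sample data are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, text):
    """Count query tokens that also occur in the text (toy relevance score)."""
    q = Counter(tokenize(query))
    t = Counter(tokenize(text))
    return sum(min(q[w], t[w]) for w in q)


def two_step_retrieve(question, sections, top_sections=1, top_chunks=2):
    """Hypothetical two-step retrieval over {section name: [chunks]}.

    Step 1 (coarse): keep the sections whose title and body best match.
    Step 2 (fine): rank individual chunks inside the kept sections.
    """
    ranked = sorted(
        sections,
        key=lambda name: overlap_score(
            question, name + " " + " ".join(sections[name])
        ),
        reverse=True,
    )
    kept = ranked[:top_sections]

    candidates = [(name, chunk) for name in kept for chunk in sections[name]]
    candidates.sort(key=lambda pair: overlap_score(question, pair[1]), reverse=True)
    return candidates[:top_chunks]


def build_prompt(question, retrieved):
    """Assemble a prompt that asks an LLM to answer from retrieved text only."""
    context = "\n".join(f"[{name}] {chunk}" for name, chunk in retrieved)
    return (
        "Answer using only the context below.\n"
        f"{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Made-up source document, split into sections and chunks.
    sections = {
        "Training": [
            "The model was trained on 8 GPUs for 3 days.",
            "Training data comes from a web crawl.",
        ],
        "Evaluation": ["Accuracy is reported on a held-out test set."],
    }
    question = "What training data was used?"
    hits = two_step_retrieve(question, sections, top_sections=1, top_chunks=1)
    print(build_prompt(question, hits))
```

A real system would likely replace the word-overlap score with a learned retriever and send the prompt to an LLM; the toy scoring is used here only so that the two stages stay visible and the example runs without external dependencies.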

Code & Models

Repositories

jiarui-liu/AutomatedModelCardGeneration
Official

Datasets

Jerry999/CardBench
dataset · 108 downloads

Videos

Automatic Generation of Model and Data Cards: A Step Towards Responsible AI · Underline

Taxonomy

Topics: Business Process Modeling and Analysis · Model-Driven Software Engineering Techniques · Simulation Techniques and Applications