What's documented in AI? Systematic Analysis of 32K AI Model Cards
Weixin Liang, Nazneen Rajani, Xinyu Yang, Ezinwanne Ozoani, Eric Wu, Yiqun Chen, Daniel Scott Smith, James Zou

TL;DR
This study analyzes 32,111 AI model cards on Hugging Face to understand documentation practices, revealing uneven informativeness and showing that detailed model cards can moderately boost model popularity.
Contribution
It provides a large-scale systematic analysis of AI model documentation practices and evaluates the impact of detailed model cards on model downloads.
Findings
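Most models with substantial downloads have model cards, but the cards vary widely in informativeness.
Sections on environmental impact, limitations, and evaluation have the lowest fill-out rates, while the training section is the most consistently filled out.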
Adding detailed model cards moderately increases download rates.
Abstract
The rapid proliferation of AI models has underscored the importance of thorough documentation, as it enables users to understand, trust, and effectively utilize these models in various applications. Although developers are encouraged to produce model cards, it's not clear how much information or what information these cards contain. In this study, we conduct a comprehensive analysis of 32,111 AI model documentations on Hugging Face, a leading platform for distributing and deploying AI models. Our investigation sheds light on the prevailing model card documentation practices. Most of the AI models with substantial downloads provide model cards, though the cards have uneven informativeness. We find that sections addressing environmental impact, limitations, and evaluation exhibit the lowest filled-out rates, while the training section is the most consistently filled-out. We analyze the…
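As an illustration of what a per-section "filled-out rate" could look like in practice, here is a minimal Python sketch. It is not the authors' pipeline: the keyword map `SECTION_KEYWORDS`, the `min_words` threshold, and the helper names `filled_sections` and `fill_rates` are all assumptions made for this example, and the paper's actual section taxonomy and criteria may differ.

```python
import re
from collections import Counter

# Hypothetical keyword map (an assumption, not the paper's taxonomy):
# a heading is assigned to a section if it contains one of these keywords.
SECTION_KEYWORDS = {
    "training": ("training",),
    "evaluation": ("evaluation", "results"),
    "limitations": ("limitation", "bias", "risk"),
    "environmental impact": ("environmental", "carbon"),
}

# Matches markdown headings such as "## Training procedure".
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def filled_sections(card_text, min_words=5):
    """Return the section labels whose body has at least `min_words` words.

    The threshold is an arbitrary illustrative choice for telling a real
    section apart from an empty or placeholder one.
    """
    filled = set()
    current = None   # label of the section being read, or None
    word_count = 0   # words seen in the current section body

    for line in card_text.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous section before starting a new one.
            if current is not None and word_count >= min_words:
                filled.add(current)
            title = match.group(1).lower()
            current = next(
                (label for label, keywords in SECTION_KEYWORDS.items()
                 if any(keyword in title for keyword in keywords)),
                None,
            )
            word_count = 0
        else:
            word_count += len(line.split())

    # Close the last section in the card.
    if current is not None and word_count >= min_words:
        filled.add(current)
    return filled


def fill_rates(cards):
    """Return, per section, the fraction of cards in which it is filled out."""
    if not cards:
        return {label: 0.0 for label in SECTION_KEYWORDS}
    counts = Counter()
    for card in cards:
        counts.update(filled_sections(card))
    return {label: counts[label] / len(cards) for label in SECTION_KEYWORDS}
```

Calling `fill_rates` on a list of model card README strings returns a dictionary mapping each section label to a fraction between 0 and 1. Note that in this sketch any heading, including a subheading, starts a new section, so a real analysis would likely need a more careful treatment of nested headings.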
Taxonomy
Topics: Artificial Intelligence in Healthcare · Advanced Data Processing Techniques
