ArchiveGPT: A human-centered evaluation of using a vision language model for image cataloguing
Line Abele, Gerrit Anders, Tolgahan Aydın, Jürgen Buder, Helen Fischer, Dominik Kimmel, Markus Huff

TL;DR
This study evaluates the effectiveness of a vision language model in generating photographic catalog descriptions, highlighting the importance of human review and trust for successful integration into archival workflows.
Contribution
The paper provides a human-centered evaluation of AI-generated catalog descriptions, emphasizing the need for human oversight and trust-building in specialized archival contexts.
Findings
AI descriptions were often indistinguishable from human ones
Expert trust in AI tools was limited due to concerns about preservation
Human review is essential to ensure accuracy and quality of AI-generated metadata
Abstract
The accelerating growth of photographic collections has outpaced manual cataloguing, motivating the use of vision language models (VLMs) to automate metadata generation. This study examines whether AI-generated catalogue descriptions can approximate human-written quality and how generative AI might integrate into cataloguing workflows in archival and museum collections. A VLM (InternVL2) generated catalogue descriptions for photographic prints on labelled cardboard mounts with archaeological content, evaluated by archive and archaeology experts and non-experts in a human-centered, experimental framework. Participants classified descriptions as AI-generated or expert-written, rated quality, and reported willingness to use and trust in AI tools. Classification performance was above chance level, with both groups underestimating their ability to detect AI-generated descriptions. OCR errors…
Taxonomy
Topics: Digital Humanities and Scholarship · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
