SDUs DAISY: A Benchmark for Danish Culture
Jacob Nielsen, Stine L. Beltoft, Peter Schneider-Kamp, Lukas Galke Poech

TL;DR
This paper presents DAISY, a benchmark dataset for Danish culture. Questions are generated by a language model from the Wikipedia pages of artifacts in the Danish Culture Canon and then approved or corrected by humans. The artifacts span more than three millennia, and the dataset is intended to aid cultural heritage research.
Contribution
It introduces a novel dataset for Danish cultural heritage, combining curated topics with AI-generated questions validated by humans, covering diverse historical periods and cultural domains.
Findings
Dataset contains 741 validated question-answer pairs.
Covers topics from 1300 BC to contemporary culture.
Enables research in cultural heritage and AI understanding.
Abstract
We introduce Daisy, a new benchmark for Danish culture via cultural heritage, based on the curated topics of the 2006 Danish Culture Canon. For each artifact in the Culture Canon, we query the corresponding Wikipedia page and have a language model generate random questions. This yields a sampling strategy within each work that mixes central and peripheral questions, covering not only mainstream information but also the in-depth cornerstones that define the heritage of Danish culture as set out by the Canon committee. Each question-answer pair is approved or corrected by a human. The final dataset consists of 741 closed-ended question-answer pairs covering topics from archaeological findings dated to 1300 BC, through poems and musical pieces from the 1700s, to contemporary pop music and Danish design and architecture.
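The generate-then-review workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the names `QAPair`, `generate_candidates`, `review`, `final_dataset`, and `stub_model` are hypothetical, the language model is replaced by a deterministic stub, the page text is a placeholder rather than a real Wikipedia query, and the "rejected" status is an assumption (the abstract only mentions approval and correction).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class QAPair:
    """One candidate question-answer pair tied to a Culture Canon artifact."""
    artifact: str
    question: str
    answer: str
    status: str = "pending"  # assumed states: pending, approved, corrected, rejected


def generate_candidates(
    artifact: str,
    page_text: str,
    ask_model: Callable[[str], Tuple[str, str]],
    n_questions: int,
) -> List[QAPair]:
    """Ask the (hypothetical) model for n_questions pairs about one page."""
    pairs = []
    for _ in range(n_questions):
        question, answer = ask_model(page_text)
        pairs.append(QAPair(artifact, question, answer))
    return pairs


def review(pair: QAPair, verdict: str, corrected_answer: Optional[str] = None) -> QAPair:
    """Record a human verdict; a correction replaces the model's answer."""
    if verdict not in {"approved", "corrected", "rejected"}:
        raise ValueError(f"unknown verdict: {verdict}")
    if verdict == "corrected":
        if corrected_answer is None:
            raise ValueError("a correction needs a corrected answer")
        pair.answer = corrected_answer
    pair.status = verdict
    return pair


def final_dataset(pairs: List[QAPair]) -> List[QAPair]:
    """Keep only pairs that a human approved or corrected."""
    return [p for p in pairs if p.status in {"approved", "corrected"}]


def stub_model(page_text: str) -> Tuple[str, str]:
    """Deterministic stand-in for a language model (assumption, for illustration)."""
    first_sentence = page_text.split(".")[0].strip()
    return ("What does the page say first?", first_sentence)
```

In this sketch only reviewed pairs reach the final set, which mirrors the abstract's statement that every pair in the dataset was approved or corrected by a human.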
Taxonomy
Topics: Image Processing and 3D Reconstruction · Wikis in Education and Collaboration · Digital Humanities and Scholarship
