QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture

Shvetank Prakash; Andrew Cheng; Jason Yik; Arya Tschand; Radhika Ghosal; Ikechukwu Uchendu; Jessica Quaye; Jeffrey Ma; Shreyas Grampurohit; Sofia Giannuzzi; Arnav Balyan; Fin Amin; Aadya Pipersenia; Yash Choudhary; Ankita Nayak; Amir Yazdanbakhsh; Vijay Janapa Reddi

arXiv:2501.01892·cs.AR·January 7, 2025

TL;DR

QuArch is a new dataset of 1500 question-answer pairs aimed at evaluating and improving AI models' understanding of computer architecture, revealing performance gaps and aiding future research.

Contribution

The paper introduces QuArch, a novel dataset for assessing and enhancing language models' knowledge of computer architecture topics.

Findings

01 Best closed-source model achieves 84% accuracy

02 Small open-source model reaches 72% accuracy

03 Fine-tuning improves small model accuracy by up to 8%

Abstract

We introduce QuArch, a dataset of 1500 human-validated question-answer pairs designed to evaluate and enhance language models' understanding of computer architecture. The dataset covers areas including processor design, memory systems, and performance optimization. Our analysis highlights a significant performance gap: the best closed-source model achieves 84% accuracy, while the top small open-source model reaches 72%. We observe notable struggles in memory systems, interconnection networks, and benchmarking. Fine-tuning with QuArch improves small model accuracy by up to 8%, establishing a foundation for advancing AI-driven computer architecture research. The dataset and leaderboard are at https://harvard-edge.github.io/QuArch/.

Taxonomy

Topics: Topic Modeling · Semantic Web and Ontologies · Natural Language Processing Techniques