SAIBench: Benchmarking AI for Science

Yatao Li; Jianfeng Zhan

arXiv:2206.05418·cs.AI·June 14, 2022

TL;DR

SAIBench is a unified benchmarking system for scientific AI that uses a domain-specific language to enable flexible, modular evaluation across multiple scientific disciplines.

Contribution

The paper introduces SAIBench, a benchmarking system for scientific AI, and SAIL, a domain-specific language that standardizes and simplifies how AI solutions for scientific research are benchmarked.

Findings

01

SAIBench is proposed as a single system for benchmarking scientific AI efforts that are currently scattered across disciplines.

02

SAIL decouples research problems, AI models, ranking criteria, and software/hardware configuration into reusable benchmarking modules.

03

The approach adapts to problems, AI models, and evaluation methods defined from different perspectives.

Abstract

Scientific research communities are embracing AI-based solutions to target tractable scientific tasks and improve research workflows. However, the development and evaluation of such solutions are scattered across multiple disciplines. We formalize the problem of scientific AI benchmarking, and propose a system called SAIBench in the hope of unifying the efforts and enabling low-friction on-boarding of new disciplines. The system approaches this goal with SAIL, a domain-specific language to decouple research problems, AI models, ranking criteria, and software/hardware configuration into reusable modules. We show that this approach is flexible and can adapt to problems, AI models, and evaluation methods defined from different perspectives. The project homepage is https://www.computercouncil.org/SAIBench
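
The decoupling described in the abstract can be illustrated with a minimal Python sketch. This is not SAIL syntax and not the SAIBench implementation: the class names (Problem, Model, Metric, RunConfig), the run_benchmark function, and the toy data are all hypothetical, chosen only to show how the four kinds of module might be defined independently and then combined.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical module types; none of these names come from SAIL or SAIBench.

@dataclass(frozen=True)
class Problem:
    """A research problem: inputs paired with reference answers."""
    name: str
    inputs: Sequence[float]
    targets: Sequence[float]

@dataclass(frozen=True)
class Model:
    """An AI model, reduced here to a single prediction function."""
    name: str
    predict: Callable[[float], float]

@dataclass(frozen=True)
class Metric:
    """A ranking criterion that scores predictions against targets."""
    name: str
    score: Callable[[Sequence[float], Sequence[float]], float]
    lower_is_better: bool = True

@dataclass(frozen=True)
class RunConfig:
    """Software/hardware settings; 'device' is recorded but unused in this toy."""
    device: str = "cpu"
    repeats: int = 1

def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)

def run_benchmark(problem: Problem, models: Sequence[Model],
                  metric: Metric, config: RunConfig) -> List[Tuple[str, float]]:
    """Score every model on one problem and return them best-first."""
    results = []
    for model in models:
        scores = []
        for _ in range(config.repeats):
            preds = [model.predict(x) for x in problem.inputs]
            scores.append(metric.score(preds, problem.targets))
        results.append((model.name, sum(scores) / len(scores)))
    # Sort ascending when lower is better, descending otherwise.
    return sorted(results, key=lambda r: r[1], reverse=not metric.lower_is_better)

# Toy usage: each module is defined once and can be swapped independently.
problem = Problem("squares", inputs=[0.0, 1.0, 2.0, 3.0], targets=[0.0, 1.0, 4.0, 9.0])
models = [Model("identity", lambda x: x), Model("square", lambda x: x * x)]
metric = Metric("MAE", mean_absolute_error)
ranking = run_benchmark(problem, models, metric, RunConfig(repeats=2))
print(ranking)
```

Because the problem, the models, the metric, and the run configuration are separate values, replacing any one of them (for example, a different metric with lower_is_better=False) does not require touching the others, which is the property the abstract attributes to SAIL's modules.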
