EvalAI: Towards Better Evaluation Systems for AI Agents

Deshraj Yadav; Rishabh Jain; Harsh Agrawal; Prithvijit Chattopadhyay; Taranjeet Singh; Akash Jain; Shiv Baran Singh; Stefan Lee; Dhruv Batra

arXiv:1902.03570 · cs.AI · February 12, 2019 · 44 cites

TL;DR

EvalAI is an open-source platform for evaluating and comparing AI and machine learning models at scale, intended to promote collaboration and standardized benchmarking in AI research.

Contribution

The paper introduces a scalable, open-source platform that simplifies benchmarking of AI models and agents, with the aim of fostering global collaboration and accelerating progress in AI research.

Findings

01 Enables large-scale evaluation of AI agents
02 Facilitates global AI challenges and competitions
03 Standardizes benchmarking processes

Abstract

We introduce EvalAI, an open-source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI is built to give the research community a scalable way to meet a critical need: evaluating machine learning models, and agents acting in an environment, either against annotations or with a human in the loop. This will help researchers, students, and data scientists create, collaborate on, and participate in AI challenges organized around the globe. By simplifying and standardizing the process of benchmarking these models, EvalAI seeks to lower the barrier to entry for participating in the global scientific effort to push the frontiers of machine learning and artificial intelligence, thereby increasing the rate of measurable progress in this domain.

Taxonomy

Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Data Stream Mining Techniques