Comprehensive Algorithm Portfolio Evaluation using Item Response Theory

Sevvandi Kandanaarachchi; Kate Smith-Miles

arXiv:2307.15850·stat.ML·August 1, 2023

TL;DR

This paper introduces a modified Item Response Theory framework for evaluating a portfolio of algorithms across a repository of datasets. It yields richer insight into algorithm performance and characteristics without requiring additional dataset feature computations.

Contribution

The paper presents a novel IRT-based method for assessing algorithm portfolios, capturing additional performance traits like consistency and anomalousness.

Findings

01 Effective evaluation of algorithm portfolios across datasets.

02 Enhanced understanding of algorithm characteristics.

03 Broad applicability demonstrated across diverse applications.

Abstract

Item Response Theory (IRT) has been proposed within the field of Educational Psychometrics to assess student ability as well as test question difficulty and discrimination power. More recently, IRT has been applied to evaluate machine learning algorithm performance on a single classification dataset, where the student is now an algorithm, and the test question is an observation to be classified by the algorithm. In this paper we present a modified IRT-based framework for evaluating a portfolio of algorithms across a repository of datasets, while simultaneously eliciting a richer suite of characteristics - such as algorithm consistency and anomalousness - that describe important aspects of algorithm performance. These characteristics arise from a novel inversion and reinterpretation of the traditional IRT model without requiring additional dataset feature computations. We test this…
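For readers unfamiliar with the terms ability, difficulty, and discrimination used above, the following is a minimal sketch of the standard two-parameter logistic (2PL) IRT model from the psychometrics literature. It is not the paper's modified framework, which the abstract does not specify in detail. The function name `two_pl_probability` and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def two_pl_probability(ability: float, difficulty: float, discrimination: float) -> float:
    """Probability of a correct response under the standard 2PL IRT model.

    P(correct) = 1 / (1 + exp(-discrimination * (ability - difficulty)))
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


# Hypothetical setting in the spirit of the single-dataset use of IRT:
# the "student" is an algorithm and the "test question" is an observation.
algorithm_abilities = {"weak_algorithm": -1.0, "strong_algorithm": 1.5}  # assumed values
observation = {"difficulty": 0.0, "discrimination": 2.0}                 # assumed values

for name, ability in algorithm_abilities.items():
    p = two_pl_probability(ability, observation["difficulty"], observation["discrimination"])
    print(f"{name}: P(correct) = {p:.3f}")
```

When ability equals difficulty the probability is exactly 0.5, and a larger discrimination value makes the probability change more sharply as ability moves away from the difficulty.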

Code & Models

Repositories

sevvandi/airt-scripts
Official
