What's a Good Prediction? Challenges in evaluating an agent's knowledge
Alex Kearney, Anna Koop, Patrick M. Pilarski

TL;DR
This paper discusses the challenges of evaluating an agent's knowledge, critiques the reliance on accuracy metrics, and proposes an alternative evaluation method based on the relevance of features to the prediction task.
Contribution
It introduces a novel evaluation approach focusing on the usefulness of predictions via feature relevance, addressing limitations of accuracy-based assessments.
Findings
Accuracy may not reflect the usefulness of knowledge.
Feature relevance provides better insight into prediction quality.
Empirical examples demonstrate the proposed evaluation method.
Abstract
Constructing general knowledge by learning task-independent models of the world can help agents solve challenging problems. However, both constructing and evaluating such models remain open challenges. The most common approach to evaluating models is to assess their accuracy with respect to observable values. However, the prevailing reliance on estimator accuracy as a proxy for the usefulness of the knowledge has the potential to lead us astray. We demonstrate the conflict between accuracy and usefulness through a series of illustrative examples, including both a thought experiment and an empirical example in MineCraft, using the General Value Function (GVF) framework. Having identified challenges in assessing an agent's knowledge, we propose an alternate evaluation approach that arises continually in the online continual learning setting: we recommend evaluation by examining internal…
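To make the GVF setting in the abstract concrete, here is a minimal sketch of a single prediction learned online with linear TD(0), followed by a look at its learned weights rather than its error. Everything in it is an assumption made for illustration: the toy feature stream, the cumulant, the function name `td0_gvf`, and the use of weight magnitude as a crude stand-in for feature relevance. It is not the authors' experiment or their relevance measure.

```python
import random


def td0_gvf(steps=20000, alpha=0.02, gamma=0.5, seed=0):
    """Learn one GVF-style prediction with linear TD(0) on a toy stream.

    Hypothetical setup: features are [signal, noise, bias]. The cumulant
    equals the current signal bit, so only the signal (and the bias) should
    matter for the prediction; the noise bit is irrelevant by construction.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]

    def features():
        # Two independent random bits plus a constant bias feature.
        return [float(rng.randint(0, 1)), float(rng.randint(0, 1)), 1.0]

    x = features()
    for _ in range(steps):
        cumulant = x[0]          # signal of interest for this GVF
        x_next = features()
        v = sum(wi * xi for wi, xi in zip(w, x))
        v_next = sum(wi * xi for wi, xi in zip(w, x_next))
        delta = cumulant + gamma * v_next - v   # TD error
        w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]
        x = x_next
    return w


weights = td0_gvf()
# Crude relevance proxy (an assumption, not the paper's measure):
# the magnitude of each learned weight.
relevance = [abs(wi) for wi in weights]
print(dict(zip(["signal", "noise", "bias"], relevance)))
```

In this toy stream the weight on the signal feature ends up large while the weight on the noise feature stays near zero. This is the kind of internal information that an error number alone would not reveal.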
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Data Stream Mining Techniques
