TL;DR
This paper introduces a benchmark that tests whether large language models can recognize the scope of their own knowledge, and finds that this self-awareness tends to emerge as models scale.
Contribution
The study develops a novel benchmark to assess LLMs' awareness of their knowledge and demonstrates that larger models show emerging self-awareness capabilities.
Findings
All tested LLMs, given sufficient scale, show awareness of how much they know about specific topics
Model architecture influences the rate at which knowledge awareness emerges
Knowledge awareness may be a generalizable trait of LLMs
Abstract
Large Language Models (LLMs) have emerged as highly capable systems and are increasingly being integrated into various uses. However, the rapid pace of their deployment has outpaced a comprehensive understanding of their internal mechanisms and a delineation of their capabilities and limitations. A desired attribute of an intelligent system is its ability to recognize the scope of its own knowledge. To investigate whether LLMs embody this characteristic, we develop a benchmark designed to challenge these models to enumerate all information they possess on specific topics. This benchmark evaluates whether the models recall excessive, insufficient, or the precise amount of information, thereby indicating their awareness of their own knowledge. Our findings reveal that all tested LLMs, given sufficient scale, demonstrate an understanding of how much they know about specific topics. While…
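The abstract describes three outcomes for each enumeration task: the model recalls too much, too little, or exactly the right amount. The Python sketch below illustrates that three-way labeling. It is not the paper's scoring method: the function name `classify_recall` and the count-based comparison against a reference list are assumptions made for illustration only.

```python
def classify_recall(recalled, reference):
    """Label an enumeration as excessive, insufficient, or precise.

    Hypothetical helper: compares the number of distinct items the model
    listed with the number of distinct items in a reference list. The
    benchmark's real criterion may differ.
    """
    n_recalled = len(set(recalled))
    n_reference = len(set(reference))
    if n_recalled > n_reference:
        return "excessive"
    if n_recalled < n_reference:
        return "insufficient"
    return "precise"


# Example: the model lists two items on a topic that has three.
label = classify_recall(["doc1", "doc2"], ["doc1", "doc2", "doc3"])
```

A count-only comparison keeps the sketch short; a stricter check would also verify that each recalled item actually matches a reference item.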