Effective Context Selection in LLM-based Leaderboard Generation: An Empirical Study
Salomon Kabongo, Jennifer D'Souza, Sören Auer

TL;DR
This study investigates how different context selection strategies affect the accuracy and reliability of LLMs in automatically generating AI research leaderboards from scholarly articles, proposing a novel finetuning approach that outperforms traditional methods.
Contribution
Introduces a new context selection method combined with instruction finetuning that improves LLM performance in leaderboard extraction without relying on predefined taxonomies.
Findings
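Effective context selection improves LLM accuracy on leaderboard extraction.
Effective context selection also reduces hallucinations in the generated data.
The instruction-finetuned generation approach surpasses traditional NLI approaches in adapting to new developments without a predefined taxonomy.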
Abstract
This paper explores the impact of context selection on the efficiency of Large Language Models (LLMs) in generating Artificial Intelligence (AI) research leaderboards, a task defined as the extraction of (Task, Dataset, Metric, Score) quadruples from scholarly articles. By framing this challenge as a text generation objective and employing instruction finetuning with the FLAN-T5 collection, we introduce a novel method that surpasses traditional Natural Language Inference (NLI) approaches in adapting to new developments without a predefined taxonomy. Through experimentation with three distinct context types of varying selectivity and length, our study demonstrates the importance of effective context selection in enhancing LLM accuracy and reducing hallucinations, providing a new pathway for the reliable and efficient generation of AI leaderboards. This contribution not only advances the…
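To make the task definition in the abstract concrete, the following is a minimal Python sketch of leaderboard extraction framed as text generation: a selected context is wrapped in an instruction, and the generated text is parsed into (Task, Dataset, Metric, Score) quadruples. The names `LeaderboardEntry`, `build_prompt`, and `parse_entries`, the instruction wording, and the JSON output format are all assumptions made for illustration; they are not taken from the paper, and no model call is shown.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardEntry:
    """One (Task, Dataset, Metric, Score) quadruple."""
    task: str
    dataset: str
    metric: str
    score: str  # kept as text, since papers report scores in varied formats


def build_prompt(context: str) -> str:
    # Hypothetical instruction template, NOT the one used in the paper.
    # `context` stands for whichever part of the article was selected.
    return (
        "Extract every (Task, Dataset, Metric, Score) result reported in the "
        "text below. Answer with a JSON list of objects using the keys "
        "task, dataset, metric, score. Answer [] if nothing is reported.\n\n"
        f"Text: {context}"
    )


def parse_entries(generated: str) -> list[LeaderboardEntry]:
    # Assumed output format: a JSON list of objects. Malformed output is
    # treated as "no entries" instead of being repaired by guesswork.
    try:
        items = json.loads(generated)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    keys = ("task", "dataset", "metric", "score")
    entries = []
    for item in items:
        if isinstance(item, dict) and all(k in item for k in keys):
            entries.append(LeaderboardEntry(*(str(item[k]) for k in keys)))
    return entries
```

In this sketch the choice of `context` is the only input that varies, which mirrors the study's focus: the same instruction and parser can be paired with contexts of different selectivity and length, and the parsed quadruples compared against gold annotations.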
Taxonomy
Topics: Digital Rights Management and Security · Video Analysis and Summarization
