LitBench: A Graph-Centric Large Language Model Benchmarking Tool For Literature Tasks
Andreas Varvarigos, Ali Maatouk, Jiasheng Zhang, Ngoc Bui, Jialin Chen, Leandros Tassiulas, Rex Ying

TL;DR
LitBench is a flexible benchmarking tool that creates domain-specific literature graphs and tasks to evaluate and improve large language models' ability to understand and reason across specialized literature domains.
Contribution
The paper introduces LitBench, a novel tool for generating domain-specific literature datasets and tasks, enabling targeted training and evaluation of literature-focused LLMs.
Findings
Small domain-specific LLMs trained on LitBench datasets perform competitively with larger models.
LitBench supports flexible curation of literature graphs across various domains.
Open-source release facilitates community adoption and further research.
Abstract
While large language models (LLMs) have become the de facto framework for literature-related tasks, they still struggle to function as domain-specific literature agents due to their inability to connect pieces of knowledge and reason across domain-specific contexts, terminologies, and nomenclatures. This challenge underscores the need for a tool that facilitates such domain-specific adaptation and enables rigorous benchmarking across literature tasks. To that end, we introduce LitBench, a benchmarking tool designed to enable the development and evaluation of domain-specific LLMs tailored to literature-related tasks. At its core, LitBench uses a data curation process that generates domain-specific literature sub-graphs and constructs training and evaluation datasets based on the textual attributes of the resulting nodes and edges. The tool is designed for flexibility, supporting the…
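The curation idea described in the abstract can be illustrated with a small sketch: filter a literature graph down to one domain, then derive examples from node text and from edges. This is not LitBench's actual pipeline or API. The `Paper` type, the keyword filter, the task names, and the example format are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical node record: the textual attributes attached to a paper.
    paper_id: str
    title: str
    abstract: str


def domain_subgraph(papers, citations, keyword):
    """Keep papers whose title or abstract mentions the keyword,
    plus the citation edges whose endpoints both survive the filter."""
    kw = keyword.lower()
    nodes = {
        p.paper_id: p
        for p in papers
        if kw in (p.title + " " + p.abstract).lower()
    }
    edges = [(src, dst) for src, dst in citations if src in nodes and dst in nodes]
    return nodes, edges


def build_examples(nodes, edges):
    """Turn node and edge text into simple (task, input, target) records."""
    examples = []
    # Node-based task (assumed): generate a title from an abstract.
    for paper in nodes.values():
        examples.append(
            {"task": "title_generation", "input": paper.abstract, "target": paper.title}
        )
    # Edge-based task (assumed): positive citation pairs built from titles.
    for src, dst in edges:
        examples.append(
            {
                "task": "citation_link",
                "input": f"{nodes[src].title} || {nodes[dst].title}",
                "target": "yes",
            }
        )
    return examples


# Toy data, invented for this sketch only.
papers = [
    Paper("A", "Graph neural networks for molecules", "We model molecules as graphs."),
    Paper("B", "Message passing on graphs", "A study of message passing schemes."),
    Paper("C", "Protein folding with transformers", "We predict structure from sequence."),
]
citations = [("A", "B"), ("A", "C")]

nodes, edges = domain_subgraph(papers, citations, keyword="graph")
examples = build_examples(nodes, edges)
```

A real pipeline would likely replace the keyword match with a retrieval or classification step and would add negative pairs for the link task. The sketch only shows how node and edge text can feed separate task types.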
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Advanced Graph Neural Networks
