TL;DR
Knows introduces a YAML-based companion format for research papers that lets LLM agents consume structured, fine-grained information directly instead of inferring it from the full PDF, improving comprehension accuracy and efficiency.
Contribution
It presents a lightweight, verifiable sidecar specification that enhances agent understanding of research artifacts without altering original publications.
Findings
Sidecars raise weak-model accuracy from 19-25% to 47-67%.
Using sidecars reduces input tokens by 29-86%.
The community hub indexes more than ten thousand publications.
Abstract
Research artifacts are distributed primarily as reader-oriented documents like PDFs. This creates a bottleneck for increasingly agent-assisted and agent-native research workflows, in which LLM agents need to infer fine-grained, task-relevant information from lengthy full documents, a process that is expensive, repetitive, and unstable at scale. We introduce Knows, a lightweight companion specification that binds structured claims, evidence, provenance, and verifiable relations to existing research artifacts in a form LLM agents can consume directly. Knows addresses the gap with a thin YAML sidecar (KnowsRecord) that coexists with the original PDF, requires no changes to the publication itself, and is validated by a deterministic schema linter. We evaluate Knows on 140 comprehension questions across 20 papers spanning 14 academic disciplines, comparing PDF-only, sidecar-only, and hybrid…
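To make the sidecar idea concrete, the sketch below shows what a parsed record and a deterministic lint pass might look like. It is an illustration only: the summary does not give the actual KnowsRecord schema, so every field name here (`artifact`, `claims`, `evidence`, `relations`, `source`, `target`) and the `lint_record` helper are assumptions, and the record is written as a Python dict (as it might look after loading the YAML) to keep the example free of third-party dependencies.

```python
from typing import Any, Dict, List

# Hypothetical top-level fields; the real KnowsRecord schema may differ.
REQUIRED_TOP_LEVEL = ("artifact", "claims", "evidence", "relations")


def lint_record(record: Dict[str, Any]) -> List[str]:
    """Return a sorted list of problems; an empty list means the record passes.

    The checks use no randomness and no model calls, so the same record
    always yields the same report.
    """
    problems = [
        f"missing top-level field: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in record
    ]
    if problems:
        return sorted(problems)

    claim_ids = {claim.get("id") for claim in record["claims"]}
    evidence_ids = {item.get("id") for item in record["evidence"]}

    # Every claim needs non-empty text.
    for index, claim in enumerate(record["claims"]):
        if not claim.get("text"):
            problems.append(f"claim {index}: empty text")

    # Every relation must link a known evidence item to a known claim.
    for index, relation in enumerate(record["relations"]):
        if relation.get("source") not in evidence_ids:
            problems.append(
                f"relation {index}: unknown evidence id {relation.get('source')!r}"
            )
        if relation.get("target") not in claim_ids:
            problems.append(
                f"relation {index}: unknown claim id {relation.get('target')!r}"
            )
    return sorted(problems)


# A made-up record for illustration; none of this content comes from the paper.
example_record = {
    "artifact": {"title": "Example paper", "file": "paper.pdf"},
    "claims": [{"id": "c1", "text": "Method A outperforms baseline B."}],
    "evidence": [{"id": "e1", "locator": "Table 2", "provenance": "page 6"}],
    "relations": [{"type": "supports", "source": "e1", "target": "c1"}],
}

# Same record, but the relation points at a claim that does not exist.
broken_record = dict(
    example_record,
    relations=[{"type": "supports", "source": "e1", "target": "c9"}],
)

if __name__ == "__main__":
    print(lint_record(example_record))
    print(lint_record(broken_record))
```

Under these assumptions, an agent could answer a question about a claim by reading the linked evidence entry rather than scanning the whole PDF, and a failed lint pass would flag a sidecar whose relations point at entries that do not exist.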
