TL;DR
This paper introduces `paper.json`, a companion JSON file that helps LLM agents read academic papers by providing stable claim and definition IDs, an explicit does-not-claim list, and exact per-figure shell commands, with the aim of improving sub-paper citation and reproducibility.
Contribution
The paper proposes a lightweight, standardized JSON format for academic papers that addresses common failures in LLM comprehension and can be implemented quickly without altering the original PDF.
Findings
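The `paper.json` convention targets three recurring failures of LLM readers: sub-claims that cannot be cited individually, scope overextension, and figure commands buried in codebases.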
Minimum viable compliance (hand-written JSON alongside the PDF) is achievable in under an hour for a finished paper, without touching the human-readable output.
The paper's own `paper.json` is fully compliant and validated.
Abstract
LLM agents routinely serve as first (and sometimes only) readers of academic papers, skimming for sub-claims, extracting reproducibility steps, and generalizing scope. Standard prose papers produce recurring failures in this role: sub-claims that cannot be cited at sub-paper granularity, scope overextension beyond what the paper tests, and figure commands buried in codebases rather than the paper itself. We propose `paper.json`, a companion JSON file that travels with the PDF and addresses each failure with a lightweight convention: stable claim IDs (C1), an explicit does-not-claim list (C2), exact per-figure shell commands (C3), and stable definition IDs (C5). A fifth convention (C4) holds that minimum viable compliance, hand-written JSON alongside the PDF, is achievable in under an hour for a finished paper without touching the human-readable output. C1, C2, C3, and C5 are open…
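To make the conventions concrete, here is a minimal sketch of what a companion file and a basic consistency check might look like. The abstract names the conventions but not the schema, so every key (`claims`, `does_not_claim`, `figures`, `definitions`), every ID format, the script name `make_fig1.py`, and the `check_paper` helper are assumptions for illustration, not the paper's actual format or validator.

```python
import json

# Hypothetical schema: all key names and ID formats below are assumptions,
# chosen only to mirror the conventions listed in the abstract.
paper = {
    # Stable claim IDs, so a sub-claim can be cited on its own.
    "claims": [
        {"id": "claim-1", "text": "Example sub-claim stated in the paper."},
        {"id": "claim-2", "text": "Another example sub-claim."},
    ],
    # Explicit does-not-claim list, to limit scope overextension.
    "does_not_claim": ["Example statement the paper explicitly does not make."],
    # Exact per-figure shell commands (the script name is made up).
    "figures": [
        {"id": "fig-1", "command": "python make_fig1.py --out fig1.pdf"},
    ],
    # Stable definition IDs.
    "definitions": [
        {"id": "def-1", "term": "example term", "text": "Example definition."},
    ],
}

REQUIRED_KEYS = ("claims", "does_not_claim", "figures", "definitions")


def check_paper(doc):
    """Return a list of problems; an empty list means these sketch-level checks pass."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in doc:
            problems.append(f"missing key: {key}")
    # IDs must be present and unique within each section to be citable.
    for key in ("claims", "figures", "definitions"):
        ids = [entry.get("id") for entry in doc.get(key, [])]
        if None in ids:
            problems.append(f"{key}: entry without id")
        if len(ids) != len(set(ids)):
            problems.append(f"{key}: duplicate id")
    # Every figure needs a non-empty command.
    for fig in doc.get("figures", []):
        if not fig.get("command", "").strip():
            problems.append(f"figure {fig.get('id')}: missing command")
    return problems


if __name__ == "__main__":
    print(json.dumps(paper, indent=2))
    print(check_paper(paper))
```

Returning a list of problems rather than raising on the first one lets an author fix a hand-written file in a single pass, which fits the under-an-hour compliance goal.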
