PaperFit: Vision-in-the-Loop Typesetting Optimization for Scientific Documents

Bihui Yu; Xinglong Xu; Junjie Jiang; Jiabei Cheng; Caijun Jia; Siyuan Li; Conghui He; Jingxuan Wei; Cheng Tan

arXiv:2605.10341·cs.AI·May 12, 2026

TL;DR

PaperFit is a vision-in-the-loop system that optimizes LaTeX documents through iterative visual verification and source-level correction, producing publication-ready PDFs rather than merely compilable ones.

Contribution

The paper formalizes Visual Typesetting Optimization (VTO) and presents PaperFit, an agent that iteratively diagnoses and repairs layout defects using visual feedback from the rendered PDF, and that outperforms the baselines it is compared against.

Findings

01 PaperFit significantly outperforms baseline methods on VTO tasks.

02 The authors construct PaperFit-Bench, a benchmark of 200 papers spanning multiple templates and defect types.

03 Bridging source code and visual layout is essential for publication-ready document automation.

Abstract

A LaTeX manuscript that compiles without error is not necessarily publication-ready. The resulting PDFs frequently suffer from misplaced floats, overflowing equations, inconsistent table scaling, widow and orphan lines, and poor page balance, forcing authors into repetitive compile-inspect-edit cycles. Rule-based tools are blind to rendered visuals, operating only on source code and log files. Text-only LLMs perform open-loop text editing, unable to predict or verify the two-dimensional layout consequences of their changes. Reliable typesetting optimization therefore requires a visual closed loop with verification after every edit. We formalize this problem as Visual Typesetting Optimization (VTO), the task of transforming a compilable LaTeX paper into a visually polished, page-budget-compliant PDF through iterative visual verification and source-level revision, and introduce a…
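The closed loop described in the abstract (render, inspect, revise, then verify every edit) can be sketched roughly as follows. This is a minimal illustration, not PaperFit's actual implementation: `Defect`, `layout_cost`, `optimize_layout`, and the `render`, `inspect`, and `revise` callables are all hypothetical names introduced here, and the cost function is an assumed stand-in for whatever acceptance criterion the authors use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Defect:
    kind: str  # hypothetical label, e.g. "overflowing_equation"
    page: int  # zero-based index of the page where the defect was seen


def layout_cost(defects: List[Defect], pages: List[str], page_budget: int) -> int:
    # Assumed cost: one point per defect plus one per page over the budget.
    return len(defects) + max(0, len(pages) - page_budget)


def optimize_layout(
    source: str,
    render: Callable[[str], List[str]],          # source -> rendered pages
    inspect: Callable[[List[str]], List[Defect]],  # pages -> visible defects
    revise: Callable[[str, List[Defect]], str],  # source + defects -> new source
    page_budget: int,
    max_iters: int = 5,
) -> Tuple[str, List[Defect]]:
    """Closed-loop sketch: every proposed edit is re-rendered and verified."""
    pages = render(source)
    defects = inspect(pages)
    for _ in range(max_iters):
        if layout_cost(defects, pages, page_budget) == 0:
            break  # no visible defects and within the page budget
        candidate = revise(source, defects)
        cand_pages = render(candidate)
        cand_defects = inspect(cand_pages)
        # Verification step: keep the edit only if the rendered result improved.
        if layout_cost(cand_defects, cand_pages, page_budget) < layout_cost(
            defects, pages, page_budget
        ):
            source, pages, defects = candidate, cand_pages, cand_defects
    return source, defects
```

In a real pipeline, `render` would presumably compile the LaTeX source and rasterize the PDF pages, `inspect` would be a vision model looking at those pages, and `revise` would be a model proposing source-level edits. Passing them in as callables keeps the loop itself testable with simple stubs. An open-loop text-only editor corresponds to calling `revise` once and never running the verification step.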

Code & Models

Repositories

openraiser/PaperFit (GitHub)
