TL;DR
PaperFit introduces a vision-in-the-loop system for optimizing LaTeX documents, enabling iterative visual verification and correction to produce publication-ready PDFs, addressing a critical gap in document automation.
Contribution
The paper formalizes Visual Typesetting Optimization (VTO) and presents PaperFit, a novel agent that iteratively diagnoses and repairs layout issues using visual feedback, outperforming baselines.
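The iterative diagnose-and-repair loop described above can be sketched in a few lines. This is a minimal illustration, not PaperFit's implementation: the `Defect` type, the `optimize` function, and the three callables (`render`, `diagnose`, `repair`) are hypothetical names introduced here, and the toy stand-ins at the bottom exist only so the loop runs without a LaTeX toolchain or a vision model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Defect:
    """A layout problem found in the rendered output (hypothetical type)."""
    kind: str
    page: int


def optimize(
    source: str,
    render: Callable[[str], object],
    diagnose: Callable[[object], List[Defect]],
    repair: Callable[[str, Defect], str],
    max_rounds: int = 5,
) -> Tuple[str, List[Defect]]:
    """Edit `source` until no defects remain or the round budget is spent.

    Every edit is followed by a fresh render and diagnosis, so the
    returned defect list always describes the returned source.
    """
    defects = diagnose(render(source))
    rounds = 0
    while defects and rounds < max_rounds:
        source = repair(source, defects[0])   # one source-level revision
        defects = diagnose(render(source))    # verify after every edit
        rounds += 1
    return source, defects


# Toy stand-ins (assumptions for illustration only): "rendering" returns the
# source unchanged, and the only defect is a table scaled to twice the text width.
def toy_render(src: str) -> str:
    return src


def toy_diagnose(rendered: str) -> List[Defect]:
    if "\\resizebox{2\\textwidth}" in rendered:
        return [Defect(kind="oversized_table", page=1)]
    return []


def toy_repair(src: str, defect: Defect) -> str:
    return src.replace("\\resizebox{2\\textwidth}", "\\resizebox{\\textwidth}")


fixed, remaining = optimize(
    "\\resizebox{2\\textwidth}{!}{T}", toy_render, toy_diagnose, toy_repair
)
```

In a real setting, `render` would compile the LaTeX and rasterize the pages, and `diagnose` would inspect those page images; passing them in as callables keeps the loop itself independent of those choices.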
Findings
PaperFit significantly outperforms baseline methods in VTO tasks.
The authors construct PaperFit-Bench, a benchmark of 200 papers spanning multiple templates and defect types.
Bridging source code and visual layout is essential for publication-ready document automation.
Abstract
A LaTeX manuscript that compiles without error is not necessarily publication-ready. The resulting PDFs frequently suffer from misplaced floats, overflowing equations, inconsistent table scaling, widow and orphan lines, and poor page balance, forcing authors into repetitive compile-inspect-edit cycles. Rule-based tools are blind to rendered visuals, operating only on source code and log files. Text-only LLMs perform open-loop text editing, unable to predict or verify the two-dimensional layout consequences of their changes. Reliable typesetting optimization therefore requires a visual closed loop with verification after every edit. We formalize this problem as Visual Typesetting Optimization (VTO), the task of transforming a compilable LaTeX paper into a visually polished, page-budget-compliant PDF through iterative visual verification and source-level revision, and introduce a…
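To make concrete what a log-only check can and cannot see, here is a small sketch that extracts overfull-box warnings from a LaTeX log. It is an illustration of the rule-based approach the abstract contrasts against, not a tool from the paper. The function name `overfull_hboxes` is hypothetical, and the pattern covers only the common "in paragraph at lines M--N" form of the warning.

```python
import re
from typing import List, Tuple

# Matches warnings such as:
#   Overfull \hbox (12.5pt too wide) in paragraph at lines 10--12
OVERFULL = re.compile(
    r"Overfull \\hbox \((\d+(?:\.\d+)?)pt too wide\) "
    r"in paragraph at lines (\d+)--(\d+)"
)


def overfull_hboxes(log_text: str) -> List[Tuple[float, int, int]]:
    """Return (excess width in pt, first line, last line) for each warning."""
    return [
        (float(m.group(1)), int(m.group(2)), int(m.group(3)))
        for m in OVERFULL.finditer(log_text)
    ]


sample_log = (
    "Overfull \\hbox (12.5pt too wide) in paragraph at lines 10--12\n"
    "Underfull \\vbox (badness 10000) has occurred while \\output is active\n"
)
warnings = overfull_hboxes(sample_log)
```

A check like this reports that a line is too wide and where it sits in the source, but says nothing about how the page looks: float placement, page balance, and widow or orphan lines do not appear in this form, which is the gap the visual loop is meant to close.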
