LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement

Nan Jiang; Shanchao Liang; Chengxiao Wang; Jiannan Wang; Lin Tan

arXiv:2409.14201 · cs.CV · February 17, 2025

TL;DR

LATTE introduces an iterative refinement framework utilizing delta-view feedback to enhance LaTeX source recognition accuracy for complex formulae and tables from PDF images, outperforming existing methods including GPT-4V.

Contribution

The paper presents LATTE, the first iterative LaTeX recognition framework that uses delta-view feedback for fault localization and refinement, improving accuracy for formulae and tables.
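The refine-until-it-matches idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: images are toy grids of 0/1 pixels of equal size, the delta-view is taken to be a per-pixel disagreement mask between the rendered draft and the target image, and `render` and `refine` are hypothetical callables standing in for a LaTeX renderer and a refinement model.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # toy binary image: rows of 0/1 pixels (assumption)


def delta_view(rendered: Image, target: Image) -> Image:
    """Mark pixels where the rendered draft and the target image disagree.

    Assumes both images have the same dimensions.
    """
    return [
        [int(a != b) for a, b in zip(row_r, row_t)]
        for row_r, row_t in zip(rendered, target)
    ]


def has_difference(delta: Image) -> bool:
    """True if any pixel is flagged in the delta-view."""
    return any(any(row) for row in delta)


def iterative_refine(
    target: Image,
    draft: str,
    render: Callable[[str], Image],        # hypothetical LaTeX renderer
    refine: Callable[[str, Image], str],   # hypothetical refinement model
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Refine the draft source until its rendering matches the target.

    Returns the final draft and the number of refinement rounds applied.
    The draft produced by the last allowed round is returned unchecked.
    """
    for round_idx in range(max_rounds):
        delta = delta_view(render(draft), target)
        if not has_difference(delta):
            return draft, round_idx  # rendering matches, stop early
        draft = refine(draft, delta)  # feed the mismatch back as feedback
    return draft, max_rounds
```

Passing `render` and `refine` as arguments keeps the loop independent of any particular renderer or model, so the same control flow applies to formulae and tables.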

Findings

01 Outperforms existing techniques and GPT-4V by at least 7.03% in exact match accuracy.

02 Achieves a 46.08% successful refinement rate for formulae.

03 Achieves a 25.51% successful refinement rate for tables.
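For readers who want to see how figures of this kind are typically computed, here is a small sketch. The definitions are assumptions, since this page does not spell them out: exact match accuracy is taken as the fraction of samples judged to match the ground truth, and the refinement rate as the fraction of initially faulty outputs that match after refinement. The function names are hypothetical.

```python
from typing import Sequence


def exact_match_accuracy(matches: Sequence[bool]) -> float:
    """Fraction of samples whose output matches the ground truth."""
    return sum(matches) / len(matches) if matches else 0.0


def refinement_rate(before: Sequence[bool], after: Sequence[bool]) -> float:
    """Fraction of initially faulty samples that match after refinement.

    `before[i]` and `after[i]` say whether sample i matched the ground
    truth before and after refinement (assumed definition).
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    faulty = [i for i, ok in enumerate(before) if not ok]
    if not faulty:
        return 0.0  # nothing needed refinement
    fixed = sum(1 for i in faulty if after[i])
    return fixed / len(faulty)


# Toy data: 5 samples, 4 faulty at first, 2 of those fixed by refinement.
before = [True, False, False, False, False]
after = [True, True, False, True, False]
print(exact_match_accuracy(after))     # 3 of 5 samples match
print(refinement_rate(before, after))  # 2 of 4 faulty samples fixed
```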

Abstract

Portable Document Format (PDF) is the dominant format for storing and disseminating scientific research, legal documents, and tax information. LaTeX is a popular application for creating PDF documents. Despite its advantages, LaTeX is not WYSIWYG ("what you see is what you get"): the LaTeX source and the rendered PDF images look drastically different, especially for formulae and tables. This gap makes it hard to modify or export LaTeX sources for formulae and tables from PDF images, and existing work is still limited. First, prior work generates LaTeX sources in a single iteration and struggles with complex LaTeX formulae. Second, existing work mainly recognizes and extracts LaTeX sources for formulae, and is incapable of handling tables or ineffective on them. This paper proposes LATTE, the first iterative refinement framework for LaTeX recognition. Specifically, we propose delta-view as feedback,…

Taxonomy

Topics: Handwritten Text Recognition Techniques · Mathematics, Computing, and Information Processing