µgat: Improving Single-Page Document Parsing by Providing Multi-Page Context
Fabio Quattrini, Carmine Zaccagnino, Silvia Cascianelli, Laura Righi, Rita Cucchiara

TL;DR
This paper introduces µgat, an extension of the Nougat document parsing model that incorporates context from adjacent pages when parsing a page, improving the parsing of complex, multi-page regesta documents.
Contribution
We adapt the Nougat architecture to handle multi-page context, enabling more accurate parsing of visually rich, multi-page documents such as regesta, a setting that prior single-page models did not address.
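To make the idea of adjacent-page context concrete, the sketch below pairs each target page with its previous and next page. It is only an illustration under stated assumptions: the function name `adjacent_page_windows` is hypothetical, a one-page window on each side is assumed, and nothing here reflects how the authors actually feed context into the model.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def adjacent_page_windows(
    pages: Sequence[T],
) -> List[Tuple[Optional[T], T, Optional[T]]]:
    """Pair each target page with its previous and next page.

    Hypothetical helper (not from the paper): pages can be any objects,
    e.g. image paths. Missing neighbours at the document edges are None.
    """
    windows = []
    for i, target in enumerate(pages):
        previous = pages[i - 1] if i > 0 else None
        following = pages[i + 1] if i + 1 < len(pages) else None
        windows.append((previous, target, following))
    return windows


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    for window in adjacent_page_windows(["page_001.png", "page_002.png", "page_003.png"]):
        print(window)
```

A parser that works page by page would then receive the middle element as the page to transcribe and the outer elements as optional context.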
Findings
Enhanced parsing accuracy on multi-page regesta documents
Effective incorporation of adjacent page context improves structure recognition
Qualitative and quantitative results demonstrate the model's robustness
Abstract
Regesta are catalogs of summaries of other documents and, in some cases, are the only source of information about the content of such full-length documents. For this reason, they are of great interest to scholars in many social and humanities fields. In this work, we focus on Regesta Pontificum Romanum, a large collection of papal registers. Regesta are visually rich documents, where the layout is as important as the text content to convey the contained information through the structure, and are inherently multi-page documents. Among Digital Humanities techniques that can help scholars efficiently exploit regesta and other documental sources in the form of scanned documents, Document Parsing has emerged as a task to process document images and convert them into machine-readable structured representations, usually markup language. However, current models focus on scientific and business…
Taxonomy
Topics: Natural Language Processing Techniques · Web Data Mining and Analysis · Topic Modeling
