Leveraging GenAI for Segmenting and Labeling Centuries-old Technical Documents
Carlos Monroy, Benjamin Navarro

TL;DR
This paper explores using advanced AI tools to segment and label centuries-old technical documents, aiming to automate their curation and improve accessibility despite limited training data and domain specialization.
Contribution
It introduces a novel approach combining SAM2, Florence2, ChatGPT, and domain ontologies to enhance segmentation and labeling of historical nautical documents.
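As a rough illustration of where a domain ontology could fit into such a pipeline, the sketch below maps free-text labels (such as a captioning model might emit for a segmented region) onto canonical ontology terms using fuzzy string matching. This is not the authors' method: the mini-ontology, the function names `build_lookup` and `map_label`, and the `cutoff` value are all hypothetical and chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical mini-ontology (illustrative only, not from the paper):
# canonical English term -> variant spellings seen in historical sources.
ONTOLOGY = {
    "keel": ["quilla", "kiel"],
    "mast": ["mastil", "arbol"],
    "rudder": ["timon", "gouvernail"],
}


def build_lookup(ontology):
    """Flatten the ontology into a variant -> canonical-term dictionary."""
    lookup = {}
    for term, variants in ontology.items():
        lookup[term] = term
        for variant in variants:
            lookup[variant] = term
    return lookup


def map_label(raw_label, lookup, cutoff=0.8):
    """Return the canonical term closest to raw_label, or None if no
    known spelling is at least `cutoff` similar (assumed threshold)."""
    key = raw_label.strip().lower()
    matches = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


LOOKUP = build_lookup(ONTOLOGY)
```

Calling `map_label("Quilla", LOOKUP)` resolves the variant to its canonical term, while a label with no sufficiently close spelling returns `None` and could be routed to a human curator.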
Findings
Preliminary results show promising segmentation and labeling accuracy.
The integrated AI approach improves document curation and retrieval.
Challenges include limited training data and domain-specific terminology.
Abstract
Image segmentation and image recognition are well-established computational techniques in the broader discipline of image processing. Segmentation locates regions within an image, while recognition identifies specific objects within it. These techniques have shown remarkable accuracy on modern images, mainly because the amount of available training data is vast. Achieving similar accuracy on digitized images of centuries-old documents is more challenging, for two main reasons: first, the lack of sufficient training data, and second, the degree of specialization within a given domain. Despite these limitations, the ability to segment and recognize objects in these collections is important for automating the curation, cataloging, and dissemination of knowledge, making the contents of priceless collections accessible to scholars and the general public. In this…
Taxonomy
Topics: Artificial Intelligence Applications · Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques
