Prompt me a Dataset: An investigation of text-image prompting for historical image dataset creation using foundation models

Hassan El-Hajj; Matteo Valleriani

arXiv:2309.01674·cs.CV·September 6, 2023

TL;DR

This paper introduces a pipeline that uses the foundation models GroundDINO and SAM to extract visual elements from historical documents, and evaluates how different text prompts affect the resulting detections, in support of dataset creation for humanities research.

Contribution

It presents a novel sequential approach for extracting visual elements from historical texts using text-image prompts and foundation models, addressing data scarcity in humanities datasets.

Findings

1. Effective extraction of visual data from historical documents.
2. Impact of different prompts on detection accuracy.
3. Potential for improved dataset creation in humanities.

Abstract

In this paper, we present a pipeline for image extraction from historical documents using foundation models, and evaluate text-image prompts and their effectiveness on humanities datasets of varying levels of complexity. The motivation for this approach stems, on the one hand, from the high interest of historians in visual elements printed alongside historical texts and, on the other, from the relative lack of well-annotated datasets within the humanities when compared to other domains. We propose a sequential approach that relies on GroundDINO and Meta's Segment-Anything-Model (SAM) to retrieve a significant portion of visual data from historical documents that can then be used for downstream development tasks and dataset creation, and we evaluate the effect of different linguistic prompts on the resulting detections.
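
To make the prompt evaluation described above more concrete, the following is a minimal sketch of how detections produced by different text prompts could be scored against annotated boxes. It is not the authors' evaluation code: the abstract does not state which metric is used, so the IoU threshold of 0.5, the greedy matching, and the names `iou`, `score_prompt`, `compare_prompts`, and `fake_detect` are all assumptions made for illustration. The `detect` argument is a hypothetical stand-in for a text-prompted detector; no real model is called.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_prompt(
    predicted: Sequence[Box],
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> Tuple[float, float]:
    """Greedy one-to-one matching; returns (precision, recall)."""
    unmatched: List[Box] = list(ground_truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda gt: iou(box, gt), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def compare_prompts(
    detect: Callable[[object, str], Sequence[Box]],  # hypothetical detector interface
    image: object,
    prompts: Sequence[str],
    ground_truth: Sequence[Box],
) -> Dict[str, Tuple[float, float]]:
    """Run the same image through the detector once per prompt and score each run."""
    return {p: score_prompt(detect(image, p), ground_truth) for p in prompts}


# Toy data standing in for real detections (made up for this example).
canned = {
    "illustration": [(0, 0, 10, 10), (20, 20, 30, 30)],
    "diagram": [(0, 0, 10, 10), (50, 50, 60, 60)],
}


def fake_detect(image: object, prompt: str) -> Sequence[Box]:
    """Placeholder detector that returns canned boxes for a prompt."""
    return canned[prompt]


annotated = [(0, 0, 10, 10), (20, 20, 30, 30)]
results = compare_prompts(fake_detect, None, ["illustration", "diagram"], annotated)
for prompt, (precision, recall) in results.items():
    print(f"{prompt}: precision={precision:.2f} recall={recall:.2f}")
```

In a real run, `fake_detect` would be replaced by a wrapper around the text-prompted detector, and the resulting boxes could then be passed to a segmentation model as box prompts, following the sequential approach the abstract describes.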

Code & Models

Repositories

hassanhajj910/prompt-me-a-dataset (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Digital Humanities and Scholarship · Handwritten Text Recognition Techniques