Automatic Uncertainty-Aware Synthetic Data Bootstrapping for Historical Map Segmentation

Lukas Arzoumanidis; Julius Knechtel; Jan-Henrik Haunert; Youness Dehbi

arXiv:2511.15875·cs.CV·April 14, 2026

TL;DR

This paper introduces an automatic deep generative method for creating synthetic historical maps that incorporate visual uncertainty, enhancing training data for land-cover segmentation tasks.

Contribution

The paper presents a novel approach for generating realistic synthetic historical maps with modeled visual uncertainty, reducing the manual annotation effort and mitigating the scarcity of training data for deep learning applications.

Findings

01 Synthetic maps improve segmentation accuracy on historical map datasets.

02 Uncertainty-aware data augmentation enhances model robustness.

03 The approach enables scalable generation of training data for specialized domains.

Abstract

The automated analysis of historical documents, particularly maps, has drastically benefited from advances in deep learning and its success across various computer vision applications. However, most deep learning-based methods heavily rely on large amounts of annotated training data, which are typically unavailable for historical maps, especially for those belonging to specific, homogeneous cartographic domains, also known as corpora. Creating high-quality training data suitable for machine learning often takes a significant amount of time and involves extensive manual effort. While synthetic training data can alleviate the scarcity of real-world samples, it often lacks the affinity (realism) and diversity (variation) necessary for effective learning. By transferring the cartographic style of a historical map corpus onto modern vector data, we bootstrap an effectively unlimited number…
