Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models
Rafael Sterzinger, Marco Peer, Robert Sablatnig

TL;DR
This paper introduces a simple, parameter-efficient method for few-shot segmentation of historical maps using vision foundation models, achieving state-of-the-art results with little annotated data and few trainable parameters.
Contribution
It combines the semantic embeddings of large vision foundation models with parameter-efficient fine-tuning for few-shot map segmentation, outperforming the previous state of the art on the Siegfried benchmark.
Findings
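Outperforms the state of the art on the Siegfried benchmark in vineyard and railway segmentation, with relative mIoU improvements of +5% and +13% in 10-shot scenarios and around +20% in the 5-shot setting.
Achieves a mean PQ of 67.3% on the ICDAR 2021 competition dataset for building block segmentation, despite not being optimized for this shape-sensitive metric.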
Maintains high performance with only 689k trainable parameters in extremely low-data regimes.
Abstract
As rich sources of history, maps provide crucial insights into historical changes, yet their diverse visual representations and limited annotated data pose significant challenges for automated processing. We propose a simple yet effective approach for few-shot segmentation of historical maps, leveraging the rich semantic embeddings of large vision foundation models combined with parameter-efficient fine-tuning. Our method outperforms the state-of-the-art on the Siegfried benchmark dataset in vineyard and railway segmentation, achieving +5% and +13% relative improvements in mIoU in 10-shot scenarios and around +20% in the more challenging 5-shot setting. Additionally, it demonstrates strong performance on the ICDAR 2021 competition dataset, attaining a mean PQ of 67.3% for building block segmentation, despite not being optimized for this shape-sensitive metric, underscoring its…
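The abstract reports its Siegfried gains as relative improvements in mIoU. As a minimal sketch of what such a figure means, the snippet below computes IoU for binary masks, averages it, and expresses one score as a relative gain over a baseline. The function names, the toy masks, and the choice to average over (prediction, target) pairs are illustrative assumptions, not the authors' evaluation code; the paper may average over classes or images differently.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 2D binary mask: 1 = class pixel, 0 = background


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union for one binary class mask."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_iou(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average the per-mask IoU over paired predictions and targets."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


def relative_improvement(new: float, baseline: float) -> float:
    """Gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Toy masks (hypothetical data): 2 overlapping pixels, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = iou(pred, target)
```

Under this reading, a relative improvement of +5% means the new mIoU is 1.05 times the baseline mIoU, which is a smaller change than a gain of five absolute percentage points whenever the baseline is below 100%.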
