Effectively Searching Maps in Web Documents
Qingzhao Tan, Prasenjit Mitra, C. Lee Giles

TL;DR
This paper presents an automated system for identifying, indexing, and retrieving maps within digital documents, improving search accuracy over generic methods by using machine learning and metadata-based ranking.
Contribution
It introduces a novel map identification classifier, metadata extraction techniques, and a ranking algorithm, enhancing map retrieval in digital libraries.
Findings
A Support Vector Machine classifier effectively distinguishes maps from other figures.
Metadata weighting improves retrieval precision.
The system outperforms existing map search methods.
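The metadata-weighting idea in the findings above can be illustrated with a minimal sketch. It is not the paper's ranking algorithm: the field names (`caption`, `referring_text`, `document_title`), the weights, and the simple term-count scoring are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical field weights, not taken from the paper: a match in a map's
# caption counts more than a match in the enclosing document's title.
FIELD_WEIGHTS = {"caption": 3.0, "referring_text": 2.0, "document_title": 1.0}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or punctuation handling)."""
    return text.lower().split()


def score(query, metadata, weights=FIELD_WEIGHTS):
    """Weighted sum of query-term occurrences across the metadata fields."""
    terms = tokenize(query)
    total = 0.0
    for field, weight in weights.items():
        counts = Counter(tokenize(metadata.get(field, "")))
        total += weight * sum(counts[t] for t in terms)
    return total


def rank(query, maps, weights=FIELD_WEIGHTS):
    """Return the map records sorted by descending weighted score."""
    return sorted(maps, key=lambda m: score(query, m, weights), reverse=True)
```

Under these assumed weights, a map whose caption matches the query ranks above one that matches only in the title of its source document. In practice the weights would need to be tuned against relevance judgments.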
Abstract
Maps are an important source of information in archaeology and other sciences. Users want to search for historical maps to determine the recorded history of the political geography of regions in different eras, to find out where exactly archaeological artifacts were discovered, etc. Currently, they have to use a generic search engine and add the term "map" along with other keywords to search for maps. This crude method will generate a significant number of false positives that the user will need to sift through to get the desired results. To reduce their manual effort, we propose an automatic map identification, indexing, and retrieval system that enables users to search and retrieve maps appearing in a large corpus of digital documents using simple keyword queries. We identify features that can help in distinguishing maps from other figures in digital documents and show how a…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Geographic Information Systems Studies · Advanced Image and Video Retrieval Techniques
