Effectively Searching Maps in Web Documents

Qingzhao Tan; Prasenjit Mitra; C. Lee Giles

arXiv:0901.3939·cs.DL·January 27, 2009

Open Access

TL;DR

This paper presents an automated system for identifying, indexing, and retrieving maps within digital documents, improving search accuracy over generic methods by using machine learning and metadata-based ranking.

Contribution

It introduces a novel map identification classifier, metadata extraction techniques, and a ranking algorithm, enhancing map retrieval in digital libraries.

Findings

01. Support-Vector-Machine classifier effectively distinguishes maps from other figures.

02. Metadata weighting improves retrieval precision.

03. System outperforms existing map search methods.
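
The idea of metadata weighting can be illustrated with a minimal sketch. It is not the paper's ranking algorithm: the field names (`caption`, `reference_text`, `document_title`), the weights, and the sample records are all hypothetical, chosen only to show how matches in some metadata fields can count more than matches in others.

```python
from collections import Counter

# Hypothetical field weights; the values the paper uses are not stated here.
FIELD_WEIGHTS = {"caption": 3.0, "reference_text": 2.0, "document_title": 1.0}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?()\"'").lower() for word in text.split())
    return [t for t in tokens if t]


def score_map(query, metadata, weights=FIELD_WEIGHTS):
    """Sum weighted counts of query terms across one map's metadata fields."""
    query_terms = set(tokenize(query))
    score = 0.0
    for field, weight in weights.items():
        counts = Counter(tokenize(metadata.get(field, "")))
        score += weight * sum(counts[term] for term in query_terms)
    return score


def rank_maps(query, maps, weights=FIELD_WEIGHTS):
    """Return (map_id, score) pairs with a nonzero score, best first."""
    scored = [(map_id, score_map(query, meta, weights)) for map_id, meta in maps.items()]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)


# Illustrative records only; these are not data from the paper.
maps = {
    "fig3": {"caption": "Map of Roman settlements in Gaul", "document_title": "Trade routes"},
    "fig7": {"caption": "Pottery fragments", "reference_text": "found near Roman Gaul, see map"},
}
ranking = rank_maps("Roman Gaul", maps)
print(ranking)
```

Here a caption match outranks the same terms appearing only in referring text, because the caption weight is higher. A real system would presumably tune such weights against labeled queries rather than fix them by hand.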

Abstract

Maps are an important source of information in archaeology and other sciences. Users want to search for historical maps to determine the recorded history of the political geography of regions in different eras, to find out exactly where archaeological artifacts were discovered, and so on. Currently, they have to use a generic search engine and add the term "map" along with other keywords to search for maps. This crude method generates a significant number of false positives that the user must cull to get the desired results. To reduce this manual effort, we propose an automatic map identification, indexing, and retrieval system that enables users to search and retrieve maps appearing in a large corpus of digital documents using simple keyword queries. We identify features that can help in distinguishing maps from other figures in digital documents and show how a…

Taxonomy

Topics: Image Retrieval and Classification Techniques · Geographic Information Systems Studies · Advanced Image and Video Retrieval Techniques