Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations
Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox

TL;DR
This paper introduces a CRF-based model that combines visual and text features to improve automatic metadata extraction from scanned ETD cover pages, outperforming text-only methods.
Contribution
The study develops a novel CRF model integrating visual features for metadata extraction from scanned ETDs, validated on an extended corpus with human-verified data.
Findings
CRF with visual features outperforms text-only models.
Achieved F1 scores from 81.3% to 96% across seven metadata fields.
Model is robust across different scanned ETD cover pages.
Abstract
Electronic Theses and Dissertations (ETDs) contain domain knowledge that can be used for many digital library tasks, such as analyzing citation networks and predicting research trends. Automatic metadata extraction is important for building scalable digital library search engines. Most existing methods are designed for born-digital documents, so they often fail to extract metadata from scanned documents such as ETDs. Traditional sequence tagging methods mainly rely on text-based features. In this paper, we propose a conditional random field (CRF) model that combines text-based and visual features. To verify the robustness of our model, we extended an existing corpus and created a new ground truth corpus consisting of 500 ETD cover pages with human-validated metadata. Our experiments show that CRF with visual features outperformed both a heuristic and a CRF model with only text-based…
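The idea of feeding a CRF both text-based and visual features can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' feature set: the `Token` fields, the feature names, and the choice of normalized bounding-box position and height as the visual cues are all hypothetical. The sketch only builds one feature dictionary per token of an OCR'd cover page; it does not train a model.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Token:
    """One OCR token with its bounding box in pixels (hypothetical schema)."""
    text: str
    x0: float  # left edge
    y0: float  # top edge
    x1: float  # right edge
    y1: float  # bottom edge


def token_features(tokens: List[Token], i: int,
                   page_w: float, page_h: float) -> Dict[str, Any]:
    """Build a feature dict for token i, mixing text-based and visual cues."""
    tok = tokens[i]
    feats: Dict[str, Any] = {
        # Text-based features (illustrative choices).
        "lower": tok.text.lower(),
        "is_title": tok.text.istitle(),
        "is_upper": tok.text.isupper(),
        "is_digit": tok.text.isdigit(),
        # Visual features (illustrative choices), normalized by page size
        # so that pages scanned at different resolutions are comparable.
        "x_center": ((tok.x0 + tok.x1) / 2) / page_w,
        "y_top": tok.y0 / page_h,
        "height": (tok.y1 - tok.y0) / page_h,
    }
    # Neighboring-token context, a common choice for sequence taggers.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].text.lower()
    else:
        feats["BOS"] = True  # beginning of sequence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].text.lower()
    else:
        feats["EOS"] = True  # end of sequence
    return feats


def page_to_features(tokens: List[Token],
                     page_w: float, page_h: float) -> List[Dict[str, Any]]:
    """Convert one cover page (a token sequence) into per-token feature dicts."""
    return [token_features(tokens, i, page_w, page_h)
            for i in range(len(tokens))]
```

A list of per-token dictionaries like this is the input shape that CRF toolkits such as sklearn-crfsuite accept; dropping the three visual keys would give a text-only baseline to compare against.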
Taxonomy
Topics: Image Retrieval and Classification Techniques · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
Methods: Conditional Random Field
