Finding Person Relations in Image Data of the Internet Archive
Eric Müller-Budack, Kader Pustu-Iren, Sebastian Diering, Ralph Ewerth

TL;DR
This paper presents a deep learning-based system for recognizing persons in web news images from the Internet Archive, enhancing entity tracking by complementing text analysis with image-based person identification.
Contribution
The paper introduces a novel face recognition system tailored for large-scale web news images, enabling more precise tracking of persons in multimedia web archives.
Findings
High accuracy on standard face recognition benchmarks
Successful identification of persons in real-world web news images
Demonstrated system's utility with practical use cases
Abstract
The multimedia content in the World Wide Web is rapidly growing and contains valuable information for many applications in different domains. For this reason, the Internet Archive initiative has been gathering billions of time-versioned web pages since the mid-nineties. However, the huge amount of data is rarely labeled with appropriate metadata and automatic approaches are required to enable semantic search. Normally, the textual content of the Internet Archive is used to extract entities and their possible relations across domains such as politics and entertainment, whereas image and video content is usually neglected. In this paper, we introduce a system for person recognition in image content of web news stored in the Internet Archive. Thus, the system complements entity recognition in text and allows researchers and analysts to track media coverage and relations of persons more…
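As a rough illustration of how image-based person recognition could feed relation tracking, the sketch below counts how often pairs of recognized persons appear in the same image. It is not the authors' method: the input format (one list of recognized names per image), the function name `count_person_cooccurrences`, and the placeholder names are all assumptions made for this example, and the recognition step itself is taken as given.

```python
from collections import Counter
from itertools import combinations


def count_person_cooccurrences(image_detections):
    """Count how often each pair of persons appears in the same image.

    image_detections: iterable of lists of person names, one list per image.
    This is a hypothetical output format of a person recognition step.
    """
    pair_counts = Counter()
    for persons in image_detections:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique_persons = sorted(set(persons))
        for pair in combinations(unique_persons, 2):
            pair_counts[pair] += 1
    return pair_counts


# Placeholder data standing in for recognition results on three images.
detections = [
    ["Person A", "Person B"],
    ["Person A", "Person B", "Person C"],
    ["Person C"],
]
cooccurrences = count_person_cooccurrences(detections)
```

Pairs with high counts would be candidates for a relation between two persons. A real analysis would also need to handle recognition errors and the time-versioned nature of the archived pages, which this sketch ignores.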
