Web Image Context Extraction with Graph Neural Networks and Sentence Embeddings on the DOM tree
Chen Dang (QR), Hicham Randrianarivo (QR), Raphaël Fournier-S'Niehotta (CNAM, CEDRIC - VERTIGO), Nicolas Audebert (CNAM, CEDRIC - VERTIGO)

TL;DR
This paper introduces a novel method combining Graph Neural Networks and NLP to extract image context from web pages efficiently without rendering, addressing large-scale web indexing challenges.
Contribution
It presents a new GNN-based approach for web image context extraction that leverages HTML structure and text, trained on a proxy task due to lack of labeled data.
Findings
Promising results in extracting relevant textual context for images
Effective encoding of structural and semantic webpage information
Potential for large-scale web image indexing
Abstract
Web Image Context Extraction (WICE) consists in obtaining the textual information describing an image using the content of the surrounding webpage. A common preprocessing step before performing WICE is to render the content of the webpage. When done at a large scale (e.g., for search engine indexation), it may become very computationally costly (up to several seconds per page). To avoid this cost, we introduce a novel WICE approach that combines Graph Neural Networks (GNNs) and Natural Language Processing models. Our method relies on a graph model containing both node types and text as features. The model is fed through several blocks of GNNs to extract the textual context. Since no labeled WICE dataset with ground truth exists, we train and evaluate the GNNs on a proxy task that consists in finding the semantically closest text to the image caption. We then interpret importance weights…
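As an illustration of the ideas in the abstract, the sketch below builds a DOM graph directly from raw HTML (no rendering), stores a node type (the tag) and text on each node, and selects the proxy-task target as the node whose text is closest to the image caption. This is not the authors' implementation: the names `DomGraphBuilder`, `proxy_target`, `bow`, and `cosine` are hypothetical, the bag-of-words cosine similarity is only a stand-in for the sentence embeddings the paper uses, and the GNN itself is omitted.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class DomGraphBuilder(HTMLParser):
    """Builds nodes and parent-child edges from raw HTML, without rendering."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # each node: {"tag": node type, "text": own text}
        self.edges = []   # (parent_index, child_index) pairs
        self._stack = []  # indices of the currently open elements

    def handle_starttag(self, tag, attrs):
        idx = len(self.nodes)
        self.nodes.append({"tag": tag, "text": ""})
        if self._stack:
            self.edges.append((self._stack[-1], idx))
        if tag not in VOID_TAGS:
            self._stack.append(idx)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, tolerating sloppy HTML.
        for pos in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[pos]]["tag"] == tag:
                del self._stack[pos:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack:
            node = self.nodes[self._stack[-1]]
            node["text"] = (node["text"] + " " + text).strip()


def bow(text):
    """Bag-of-words vector (assumption: stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def proxy_target(nodes, caption):
    """Index of the node whose text is most similar to the caption."""
    cap = bow(caption)
    scores = [cosine(bow(n["text"]), cap) for n in nodes]
    return max(range(len(nodes)), key=scores.__getitem__)


html = """<html><body>
<div><h1>Travel blog</h1><p>Contact us for advertising.</p></div>
<div><img src="cat.jpg" alt=""><p>A grey cat sleeping on a sofa.</p></div>
</body></html>"""

builder = DomGraphBuilder()
builder.feed(html)
target = proxy_target(builder.nodes, "grey cat on the sofa")
print(builder.nodes[target])
```

In a fuller pipeline, the `nodes` and `edges` lists would be converted into feature tensors and an adjacency structure for a GNN, with the `target` index serving as the training label for the proxy task.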
Taxonomy
Topics: Multimodal Machine Learning Applications · Web Data Mining and Analysis · Topic Modeling
