Webpage Segmentation for Extracting Images and Their Surrounding Contextual Information

F. Fauzi; H. J. Long; M. Belkhatir

arXiv:2005.09639·cs.MM·May 21, 2020

TL;DR

This paper presents a webpage segmentation method designed to extract web images and their surrounding contextual information. The method is validated against a human-labeled dataset collected in a user study and is reported to give better results than an existing segmentation algorithm.

Contribution

The paper introduces a webpage segmentation algorithm that targets the extraction of web images and their contextual information, based on the characteristics of images as they appear on webpages, and validates it against a human-labeled dataset.

Findings

01

In the authors' experiments, the proposed method achieves better results than an existing segmentation algorithm.

02

A user study supplied the human-labeled dataset used to validate the effectiveness of the method.

03

The reported gains concern the extraction of web images together with their contextual information from webpages.

Abstract

Web images come hand in hand with valuable contextual information. Although this information has long been mined for various uses such as image annotation, clustering of images, and inference of image semantic content, insufficient attention has been given to the issues involved in mining it. In this paper, we propose a webpage segmentation algorithm targeting the extraction of web images and their contextual information based on their characteristics as they appear on webpages. We conducted a user study to obtain a human-labeled dataset to validate the effectiveness of our method. Experiments demonstrated that our method can achieve better results compared to an existing segmentation algorithm.
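
To make the task concrete, the following is a minimal Python sketch of the problem the abstract describes: pairing each image on a page with nearby text. It is not the authors' algorithm, whose details are not given here. It is a naive baseline that assumes the context of an image is the text sitting directly inside its innermost enclosing container element. The names `ImageContextParser`, `extract_image_contexts`, and the `BLOCK_TAGS` set are hypothetical choices made for illustration only.

```python
from html.parser import HTMLParser

# Hypothetical choice of container tags; not taken from the paper.
BLOCK_TAGS = {"div", "td", "li", "figure", "section", "article"}


class ImageContextParser(HTMLParser):
    """Pair each <img> with the text of its innermost enclosing container."""

    def __init__(self):
        super().__init__()
        # Each stack entry is one open container: [tag, images, text pieces].
        self.stack = [["#root", [], []]]
        self.segments = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.stack.append([tag, [], []])
        elif tag == "img":
            attrs = dict(attrs)
            self.stack[-1][1].append(
                {"src": attrs.get("src") or "", "alt": attrs.get("alt") or ""}
            )

    def handle_data(self, data):
        # Collapse whitespace and attach the text to the innermost container.
        text = " ".join(data.split())
        if text:
            self.stack[-1][2].append(text)

    def handle_endtag(self, tag):
        # Ignore stray end tags that do not match the open container.
        if tag in BLOCK_TAGS and len(self.stack) > 1 and self.stack[-1][0] == tag:
            self._emit(self.stack.pop())

    def close(self):
        super().close()
        # Flush containers left open, including the root.
        while self.stack:
            self._emit(self.stack.pop())

    def _emit(self, container):
        _, images, pieces = container
        context = " ".join(pieces)
        for image in images:
            self.segments.append({**image, "context": context})


def extract_image_contexts(html):
    """Return one dict per image with keys 'src', 'alt', and 'context'."""
    parser = ImageContextParser()
    parser.feed(html)
    parser.close()
    return parser.segments
```

Calling `extract_image_contexts(page_html)` on an HTML string returns one record per image. Because the container is chosen purely from tag names, this baseline can attach unrelated text to an image or miss a caption that lives in a sibling element, which is the kind of difficulty a dedicated segmentation algorithm is meant to address.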
