A Survey on Web Spam Detection Methods: Taxonomy

Shekoofeh Ghiam; Alireza Nemaney Pour

arXiv:1210.3131·cs.IR·October 12, 2012

TL;DR

This survey categorizes web spam techniques and detection methods, highlighting their effectiveness and the importance of accurate spam identification to maintain search engine trust and resource efficiency.

Contribution

It provides a comprehensive taxonomy of web spam techniques and reviews detection methods, offering insights into their effectiveness and areas for improvement.

Findings

01. Some detection techniques outperform others in accuracy.

02. Classification of spam techniques aids in targeted detection.

03. Effective methods help maintain search engine integrity.

Abstract

Web spam refers to techniques that try to manipulate search engine ranking algorithms in order to raise a web page's position in search engine results. In the best case, spammers encourage viewers to visit their sites and provide undeserved advertisement gains to the page owner. In the worst case, they put malicious content in their pages and try to install malware on the victim's machine. Spammers use three kinds of spamming techniques to get a higher ranking score: link-based techniques, hiding techniques, and content-based techniques. The existence of spam pages causes distrust of search engine results. This not only wastes visitors' time but also wastes a lot of search engine resources. Hence, spam detection methods have been proposed as a solution to web spam, in order to reduce the negative effects of spam pages. Experimental results show that some of these…
