Using Page Size for Controlling Duplicate Query Results in Semantic Web
Oumair Naseer, Ayesha Naseer, Atif Ali Khan, Humza Naseer

TL;DR
This paper proposes a new algorithm combining hash index and page size comparisons to effectively detect and eliminate duplicate query results in the semantic web, addressing limitations of existing hash-based methods.
Contribution
It introduces a novel duplicate detection algorithm that improves accuracy and can be integrated into existing SQL-based semantic web query systems.
Findings
Efficient removal of duplicate results demonstrated in experiments.
Overcomes hash index limitations with a combined approach.
Compatible with current SQL-based systems.
Abstract
The semantic web is the web of the future. The Resource Description Framework (RDF) is a language for representing resources on the World Wide Web. When these resources are queried, duplicate query results can occur. Existing techniques use hash index comparison to remove duplicate query results. The major drawback of this approach is that a slight change in formatting or word order changes the hash index, so the results are no longer considered duplicates even though they have the same content. We present an algorithm for detecting and eliminating duplicate query results from the semantic web using hash index and page size comparisons. Experimental results show that the proposed technique removes duplicate query results from the semantic web efficiently, solves the problems of using a hash index for duplicate handling, and could be…
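The abstract does not spell out how the hash index and page size comparisons are combined, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes each query result is available as a text string, that "page size" means the UTF-8 byte length of that text, and that two results count as duplicates when their hashes match or, failing that, when their sizes differ by no more than a tolerance. The names `content_hash`, `page_size`, `is_duplicate`, `deduplicate`, and the `size_tolerance` parameter are all hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash of the raw content; any change in word order or formatting changes it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def page_size(text: str) -> int:
    """Assumed definition of page size: length of the content in UTF-8 bytes."""
    return len(text.encode("utf-8"))


def is_duplicate(a: str, b: str, size_tolerance: int = 0) -> bool:
    """Assumed rule: identical hashes, or sizes within `size_tolerance` bytes."""
    if content_hash(a) == content_hash(b):
        return True
    # Fallback for results whose hash differs only because of reordering
    # or minor formatting changes (hypothetical combination rule).
    return abs(page_size(a) - page_size(b)) <= size_tolerance


def deduplicate(results: list[str], size_tolerance: int = 0) -> list[str]:
    """Keep the first occurrence of each result, dropping later duplicates."""
    kept: list[str] = []
    for result in results:
        if not any(is_duplicate(result, k, size_tolerance) for k in kept):
            kept.append(result)
    return kept


if __name__ == "__main__":
    original = "semantic web query results"
    reordered = "query results semantic web"  # same words, different order
    other = "a completely different and much longer page"

    # The hash alone treats the reordered text as a new result.
    print(content_hash(original) == content_hash(reordered))
    # The size fallback catches it.
    print(deduplicate([original, reordered, other]))
```

In this sketch a size match alone is enough to flag a duplicate, so unrelated results of equal length would also be dropped; a practical system would likely need a further check, and the abstract does not say how the paper handles that case.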
Taxonomy
Topics: Data Quality and Management · Web Data Mining and Analysis · Scientific Computing and Data Management
