Wiki Entity Summarization Benchmark
Saeedeh Javadi, Atefeh Moradan, Mohammad Sorkhpar, Klim Zaporojets, Davide Mottin, Ira Assent

TL;DR
WikES is a scalable, cost-effective benchmark for entity summarization in knowledge graphs, combining graph algorithms and NLP without human annotation, enabling comprehensive evaluation across domains.
Contribution
The paper introduces WikES, a novel benchmark with a dataset generator that integrates graph structure and NLP, eliminating the need for human-labeled summaries.
Findings
WikES effectively captures knowledge graph complexities.
Empirical results validate the benchmark's usefulness.
WikES enables cross-domain evaluation of summarization methods.
Abstract
Entity summarization aims to compute concise summaries for entities in knowledge graphs. Existing datasets and benchmarks are often limited to a few hundred entities and discard graph structure in source knowledge graphs. This limitation is particularly pronounced when it comes to ground-truth summaries, where there exist only a few labeled summaries for evaluation and training. We propose WikES, a comprehensive benchmark comprising entities, their summaries, and their connections. Additionally, WikES features a dataset generator to test entity summarization algorithms in different areas of the knowledge graph. Importantly, our approach combines graph algorithms and NLP models as well as different data sources such that WikES does not require human annotation, rendering the approach cost-effective and generalizable to multiple domains. Finally, WikES is scalable and capable of…
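As a rough illustration of how ground-truth summaries support evaluation, the sketch below scores a predicted summary against a reference summary using triple-level F1. The metric choice is an assumption: F1 overlap is common in entity summarization evaluation, but this page does not state which metrics WikES uses. The function name `summary_f1` and the example triples are hypothetical.

```python
from typing import Iterable, Tuple

# A knowledge-graph triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


def summary_f1(predicted: Iterable[Triple], gold: Iterable[Triple]) -> float:
    """Triple-level F1 between a predicted and a reference summary.

    Assumption: a summary is a set of triples, and a predicted triple
    counts as correct only if it appears verbatim in the reference.
    """
    pred_set, gold_set = set(predicted), set(gold)
    overlap = len(pred_set & gold_set)
    # Covers empty inputs as well as disjoint summaries.
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical triples, for illustration only.
gold_summary = [
    ("e1", "bornIn", "e2"),
    ("e1", "occupation", "e3"),
    ("e1", "memberOf", "e4"),
]
predicted_summary = [
    ("e1", "bornIn", "e2"),
    ("e1", "spouse", "e5"),
]
score = summary_f1(predicted_summary, gold_summary)
```

Here one of the two predicted triples matches one of the three reference triples, so precision is 1/2 and recall is 1/3.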
Peer Reviews
Decision: ICLR 2025 Conference Withdrawn Submission
The authors introduce a benchmark dataset for entity summarization over Wikidata.
It is unclear why the automated mapping from Wikipedia to Wikidata allows for better summarization of entities, as stated in "Wikidata to automatically map entities from Wikipedia to Wikidata. This automation allows us to efficiently generate summaries for any number of entities".
The motivation behind choosing these 4 algorithms for comparison is missing.
A thorough comparison with all the benchmark datasets given in the related work is missing.
The main contribution on why this benchmark da
1. The new benchmark WIKES is the first ES benchmark that does not require human annotation. The generation method could also be easily applied to generate other ES datasets with diverse topics and scales.
2. WIKES is the largest ES benchmark compared to existing benchmarks, which makes it possible to explore the effectiveness of ES methods on large-scale datasets.
3. Some results on the smallest datasets of WIKES are presented, giving baseline results for further research.
1. Relying on Wikipedia abstracts to generate the ES datasets is cost-efficient and novel. However, this means the entity summaries are derived from the abstract text rather than from the entities' triples in the knowledge graph. As a result, the summaries in WIKES might not be gold-standard summaries of the entities.
2. DistilBERT is used to annotate the properties that should be included in the summary. The correctness of the final properties is not evaluated, which is imp
S1: The WIKES benchmark is scalable, leveraging automatic summary generation from Wikipedia and Wikidata without relying on costly manual annotations, making it applicable to large datasets across various domains. The use of random walk-based subgraph extraction ensures that the structure of knowledge graphs is preserved, capturing both topological and semantic complexities of entities while maintaining computational efficiency.
S2: This paper provides a thorough evaluation of multiple graph-ba
W1: The summarization methods are limited to graph-based summarization techniques. The authors may need to evaluate some text generation methods. A broader comparison with recent NLP-based summarization techniques could add more depth.
W2: The paper focuses on scalability but evaluates only the small version of the dataset. Methods without efficiency concerns could be evaluated on the large version to show the effectiveness of the proposed dataset.
W3: The random walk-b
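The random walk-based subgraph extraction mentioned in the reviews can be pictured with a minimal sketch like the one below. It is not the authors' generator: the adjacency-list representation, the function name `random_walk_subgraph`, and the parameters `walk_length` and `walks_per_seed` are all assumptions chosen for illustration.

```python
import random
from typing import Dict, List, Set, Tuple

# Assumed representation: node id -> list of outgoing neighbor ids.
Graph = Dict[str, List[str]]


def random_walk_subgraph(
    graph: Graph,
    seeds: List[str],
    walk_length: int,
    walks_per_seed: int,
    rng_seed: int = 0,
) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Collect the nodes and edges visited by short random walks from seeds."""
    rng = random.Random(rng_seed)  # fixed seed for reproducible sampling
    nodes: Set[str] = set(seeds)
    edges: Set[Tuple[str, str]] = set()
    for seed in seeds:
        for _ in range(walks_per_seed):
            current = seed
            for _ in range(walk_length):
                neighbors = graph.get(current, [])
                if not neighbors:
                    break  # dead end: stop this walk early
                nxt = rng.choice(neighbors)
                edges.add((current, nxt))
                nodes.add(nxt)
                current = nxt
    return nodes, edges


# Hypothetical toy graph; the node ids are placeholders, not real entities.
toy_graph: Graph = {"a": ["b"], "b": ["c"], "c": []}
sub_nodes, sub_edges = random_walk_subgraph(
    toy_graph, seeds=["a"], walk_length=5, walks_per_seed=2
)
```

Because every visited edge is kept, the sampled subgraph retains the connectivity around the seed entities, which is the property the reviewers credit for preserving graph structure.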
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
