TailNLG: A Multilingual Benchmark Addressing Verbalization of Long-Tail Entities
Lia Draetta, Michael Oliverio, Virginia Ramón-Ferrer, Pier Felice Balestrucci, Flaviana Corallo, Carlos Badenes-Olmedo, Alessandro Mazzei, Marco Antonio Stranisci, Rossana Damiano

TL;DR
This paper introduces TailNLG, a multilingual benchmark for evaluating the verbalization of long-tail entities in Data-to-Text generation, revealing biases against rare entities across models and languages.
Contribution
It presents the first systematic study of long-tail entities in multilingual Data-to-Text generation and introduces a new benchmark built from Wikidata.
Findings
Models show a bias against long-tail entities, reflected in lower embedding-based scores for them.
Model uncertainty is higher for rare entities across languages.
Existing metrics do not reliably capture performance differences on long-tail entities.
Abstract
The automatic verbalization of structured knowledge is a key task for making knowledge graphs accessible to non-expert users and supporting retrieval-augmented generation systems. Although recent advances in Data-to-Text generation have improved multilingual coverage, little attention has been paid to potential biases in the verbalization of rare entities, frequently known as long-tail entities. In this work, we present the first systematic study of long-tail entities in Data-to-Text generation. We introduce TailNLG, a new multilingual benchmark in English, Italian, and Spanish, built from Wikidata and covering entities with varying levels of popularity. We evaluate three different families of large language models in zero-shot settings and compare their performance on rare versus common entities, as well as against the established WebNLG benchmark. Our results reveal a consistent bias…
