Lessons from Computational Modelling of Reference Production in Mandarin and English
Guanyi Chen, Kees van Deemter

TL;DR
This paper evaluates classic referring expression generation (REG) algorithms on an annotated Mandarin corpus and compares the results with earlier English evaluations, revealing higher under-specification rates than previously reported in both languages and highlighting issues that arise from the grammar of Mandarin.
Contribution
It annotates a previously introduced corpus of Mandarin referring expressions (REs), evaluates classic REG algorithms on it, and compares the findings with earlier results for English, uncovering new insights into under-specification.
Findings
Higher under-specification rates in Mandarin and English than previously reported
Issues arising from the grammar of Mandarin, identified through in-depth corpus analysis
Shortcomings in previous REG evaluations
Abstract
Referring expression generation (REG) algorithms offer computational models of the production of referring expressions. In earlier work, a corpus of referring expressions (REs) in Mandarin was introduced. In the present paper, we annotate this corpus, evaluate classic REG algorithms on it, and compare the results with earlier results on the evaluation of REG for English referring expressions. Next, we offer an in-depth analysis of the corpus, focusing on issues that arise from the grammar of Mandarin. We discuss shortcomings of previous REG evaluations that came to light during our investigation and we highlight some surprising results. Perhaps most strikingly, we found a much higher proportion of under-specified expressions than previous studies had suggested, not just in Mandarin but in English as well.
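To make the notion of under-specification concrete, here is a minimal sketch in Python. It assumes the standard definition: an expression is under-specified when the properties it expresses are also true of at least one distractor, so it does not single out the target. The attribute names, the toy scene, and the helper functions `matches` and `is_under_specified` are hypothetical illustrations, not the paper's annotation scheme or evaluation code.

```python
from typing import Dict, List

# An entity and a description are both attribute-value maps (hypothetical schema).
Entity = Dict[str, str]
Description = Dict[str, str]


def matches(entity: Entity, description: Description) -> bool:
    """Return True if every property in the description holds of the entity."""
    return all(entity.get(attr) == value for attr, value in description.items())


def is_under_specified(description: Description, distractors: List[Entity]) -> bool:
    """Return True if at least one distractor also fits the description."""
    return any(matches(d, description) for d in distractors)


# Toy scene (assumed for illustration): one target and two distractors.
target = {"type": "chair", "colour": "red", "size": "large"}
distractors = [
    {"type": "chair", "colour": "red", "size": "small"},
    {"type": "table", "colour": "red", "size": "large"},
]

# "the red chair" also fits the small red chair, so it is under-specified.
red_chair = {"type": "chair", "colour": "red"}

# "the large chair" fits only the target, so it is not under-specified.
large_chair = {"type": "chair", "size": "large"}
```

Applied to a corpus, a check of this kind counts how many human-produced expressions fail to rule out every distractor, which is the proportion the abstract reports as unexpectedly high.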
