Semi-automatic definite description annotation: a first report
Danillo da Silva Rocha, Alex Gwo Jen Lan, Ivandre Paraboni

TL;DR
This paper introduces a semi-automatic method to improve the quality and efficiency of definite description annotation in Referring Expression Generation (REG) corpora, addressing the problems of noisy data and high annotation costs.
Contribution
It presents a novel semi-automatic annotation approach using simple rules to link words with meanings, aiding REG experiment design.
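As a rough illustration of what "simple rules to link words with meanings" could look like, the sketch below maps the words of a description to attribute-value pairs and sets aside any word it cannot resolve for a human annotator. The lexicon, the stopword list, the attribute names and the `annotate` function are all hypothetical and are not taken from the paper; the authors' actual rules may differ considerably.

```python
import re

# Hypothetical lexicon: surface word -> (attribute, value). Not taken from the paper.
LEXICON = {
    "red": ("colour", "red"),
    "blue": ("colour", "blue"),
    "large": ("size", "large"),
    "big": ("size", "large"),
    "small": ("size", "small"),
    "ball": ("type", "ball"),
    "cube": ("type", "cube"),
}

# Function words that carry no semantic property in this sketch (assumed list).
STOPWORDS = {"the", "a", "an", "of", "on", "to"}


def annotate(description):
    """Return (properties, unresolved) for one description.

    properties: dict attribute -> value for every word matched by a rule.
    unresolved: words with no rule, left for a human annotator to decide.
    """
    properties = {}
    unresolved = []
    for word in re.findall(r"[a-z]+", description.lower()):
        if word in STOPWORDS:
            continue
        if word in LEXICON:
            attribute, value = LEXICON[word]
            properties[attribute] = value
        else:
            unresolved.append(word)
    return properties, unresolved


props, todo = annotate("The big red ball")
print(props)  # {'size': 'large', 'colour': 'red', 'type': 'ball'}
print(todo)   # []
```

In this sketch the automatic step handles the words covered by a rule, and a non-empty `unresolved` list marks the descriptions that still need manual attention, which is one plausible reading of "semi-automatic".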
Findings
Reduces noise in REG data collections.
Speeds up the annotation process.
Facilitates semantic annotation of definite descriptions.
Abstract
Studies in Referring Expression Generation (REG) often make use of corpora of definite descriptions produced by human subjects in controlled experiments. Experiments of this kind, which are essential for the study of reference phenomena and many others, may however include a considerable amount of noise. Human subjects may easily lack attention, or may simply misunderstand the task at hand and, as a result, the elicited data may include large proportions of ambiguous or ill-formed descriptions. In addition to that, REG corpora are usually collected for the study of semantics-related phenomena, and it is often the case that the elicited descriptions (and their input contexts) need to be annotated with their corresponding semantic properties. This, as in many other fields, may require considerable time and skilled annotators. As a means to tackle both kinds of difficulties - poor data…
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Topic Modeling
