Can large language models generate salient negative statements?
Hiba Arnaout, Simon Razniewski

TL;DR
This paper investigates whether large language models can generate salient negative statements about real-world entities. It compares zero-shot and k-shot prompting against traditional methods and finds that the models often produce factually incorrect or ambiguous negatives.
Contribution
It introduces a systematic evaluation of LLMs for negative statement generation, highlighting the impact of guided prompting and identifying challenges in factual accuracy.
Findings
Guided (k-shot) prompts improve the quality of generated negatives compared to zero-shot prompts.
LLMs often generate ambiguous or factually incorrect negatives.
Traditional methods outperform LLMs in factual accuracy.
Abstract
We examine the ability of large language models (LLMs) to generate salient (interesting) negative statements about real-world entities, an emerging research topic of the last few years. We probe the LLMs using zero- and k-shot unconstrained probes, and compare with traditional methods for negation generation, i.e., pattern-based textual extractions and knowledge-graph-based inferences, as well as crowdsourced gold statements. We measure the correctness and salience of the generated lists about subjects from different domains. Our evaluation shows that guided probes do in fact improve the quality of generated negatives, compared to the zero-shot variant. Nevertheless, with either kind of probe, LLMs still struggle with the notion of factuality of negatives, frequently generating ambiguous statements, or statements with negative keywords but a positive meaning.
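The difference between the zero-shot and k-shot probes described in the abstract can be illustrated with a minimal Python sketch. The prompt wording, the function names `zero_shot_probe` and `k_shot_probe`, and the demonstration statements are all assumptions made for illustration; they are not the prompts used in the paper. The sketch only builds the prompt strings and does not call any model.

```python
from typing import Dict, List


def zero_shot_probe(subject: str, n: int = 5) -> str:
    """Unguided probe: the task instruction alone, with no examples.

    The instruction wording is hypothetical, not taken from the paper.
    """
    return f"List {n} salient negative statements about {subject}."


def k_shot_probe(subject: str, examples: Dict[str, List[str]], n: int = 5) -> str:
    """Guided probe: k demonstrations followed by the same instruction.

    `examples` maps each demonstration subject to negatives assumed correct.
    """
    blocks = []
    for example_subject, negatives in examples.items():
        # Each demonstration repeats the instruction, then lists its answers.
        lines = [zero_shot_probe(example_subject, len(negatives))]
        lines += [f"- {statement}" for statement in negatives]
        blocks.append("\n".join(lines))
    # The final block is the open question for the target subject.
    blocks.append(zero_shot_probe(subject, n))
    return "\n\n".join(blocks)


# Illustrative demonstrations (an assumption, not the paper's examples).
DEMONSTRATIONS = {
    "Switzerland": [
        "Switzerland is not a member of the European Union.",
        "Switzerland has no coastline.",
    ],
}

if __name__ == "__main__":
    print(zero_shot_probe("Stephen Hawking", n=3))
    print("---")
    print(k_shot_probe("Stephen Hawking", DEMONSTRATIONS, n=3))
```

Here k equals the number of entries in `examples`, so the one-entry dictionary above yields a 1-shot probe. Keeping the instruction identical in both variants means the demonstrations are the only difference between the two probe types.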
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
