TL;DR
This study shows that people often cannot distinguish AI-generated poetry from human-written poetry, particularly when the AI's best output is selected, which underscores how human-like the text produced by modern NLG algorithms has become.
Contribution
It provides empirical evidence that state-of-the-art NLG models such as GPT-2 can produce poetry that is indistinguishable from human work under certain conditions, and it introduces an incentivized version of the Turing Test.
Findings
Participants failed to detect AI poetry when the best samples were selected.
People showed slight aversion to AI-generated poetry regardless of transparency.
AI-generated poetry was often judged as comparable to human poetry in quality.
Abstract
The release of openly available, robust natural language generation algorithms (NLG) has spurred much public attention and debate. One reason lies in the algorithms' purported ability to generate human-like text across various domains. Empirical evidence using incentivized tasks to assess whether people (a) can distinguish and (b) prefer algorithm-generated versus human-written text is lacking. We conducted two experiments assessing behavioral reactions to the state-of-the-art Natural Language Generation algorithm GPT-2 (N_total = 830). Using the identical starting lines of human poems, GPT-2 produced samples of poems. From these samples, either a random poem was chosen (Human-out-of-the-loop) or the best one was selected (Human-in-the-loop) and in turn matched with a human-written poem. In a new incentivized version of the Turing Test, participants failed to reliably detect the…
Taxonomy
Methods: Linear Layer · Cosine Annealing · Weight Decay · Softmax · Adam · Dropout · Attention Dropout · Byte Pair Encoding · Dense Connections
