Machine Generated Product Advertisements: Benchmarking LLMs Against Human Performance
Sanjukta Ghosh

TL;DR
This paper benchmarks AI-generated product descriptions against human-written ones across multiple quality metrics, revealing ChatGPT 4's superior performance and highlighting limitations of other models in coherence and relevance.
Contribution
It introduces a comprehensive evaluation framework for AI-generated product descriptions and compares multiple models against human performance.
Findings
ChatGPT 4 outperforms other models in quality metrics.
Most models produce incoherent and illogical descriptions.
AI models struggle with maintaining focus and relevance.
Abstract
This study compares the performance of AI-generated and human-written product descriptions using a multifaceted evaluation model. We analyze descriptions for 100 products generated by four AI models (Gemma 2B, LLAMA, GPT2, and ChatGPT 4) with and without sample descriptions, against human-written descriptions. Our evaluation metrics include sentiment, readability, persuasiveness, Search Engine Optimization (SEO), clarity, emotional appeal, and call-to-action effectiveness. The results indicate that ChatGPT 4 performs best. In contrast, the other models demonstrate significant shortcomings, producing incoherent output that lacks logical structure and contextual relevance. These models struggle to maintain focus on the product being described, resulting in disjointed sentences that do not convey meaningful information. This research provides insights into the current…
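To make two of the listed metrics concrete, here is a minimal sketch of how readability and call-to-action presence could be scored for a single description. It is an illustration only, not the paper's evaluation code: the abstract does not say how each metric is computed, so the use of the Flesch Reading Ease formula, the syllable heuristic, the phrase list `CTA_PHRASES`, and the function names are all assumptions.

```python
import re

# Hypothetical call-to-action phrases; the paper does not publish its own list.
CTA_PHRASES = ("buy now", "order today", "shop now", "add to cart")


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def has_call_to_action(text: str) -> bool:
    """True if any assumed call-to-action phrase appears in the text."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in CTA_PHRASES)


def score_description(text: str) -> dict:
    """Combine the two illustrative metrics into one record."""
    return {
        "readability": flesch_reading_ease(text),
        "has_cta": has_call_to_action(text),
    }
```

Running `score_description` over both the human-written and the model-generated text for the same product would give comparable per-product records. The remaining metrics (sentiment, persuasiveness, SEO, clarity, emotional appeal) would need their own scorers, which the abstract does not specify.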
Taxonomy
Topics: Wikis in Education and Collaboration · AI in Service Interactions · Open Source Software Innovations
