AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts

Soichiro Murakami; Peinan Zhang; Hidetaka Kamigaito; Hiroya Takamura; Manabu Okumura

arXiv:2502.04674·cs.CL·February 12, 2025

TL;DR

This paper introduces AdParaphrase, a dataset of paraphrased ad texts with human preferences, enabling analysis of linguistic features that enhance ad attractiveness and improving text generation models accordingly.

Contribution

The study provides a novel dataset for preference analysis and demonstrates how linguistic features influence ad attractiveness, advancing research in advertising text generation.

Findings

01. Preferred ad texts are more fluent and longer, contain more nouns, and use brackets.

02. A generation model that takes these features into account produces more attractive ads.

03. The dataset facilitates future research on linguistic factors in advertising.
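
Two of the surface features named in the findings, text length and bracket use, can be measured without any external tooling. The following is a minimal sketch under stated assumptions, not the authors' analysis code: the function names, the set of bracket characters, and the example pair are all hypothetical. Fluency and noun counts are left out because they would need a language model and a part-of-speech tagger, respectively.

```python
# Hypothetical helper names; this is an illustration, not the paper's implementation.

# Assumed set of bracket characters (ASCII plus common full-width and CJK forms).
BRACKETS = "[]()【】「」『』（）"


def surface_features(text: str) -> dict:
    """Return simple surface features of a single ad text."""
    return {
        "char_length": len(text),
        "has_brackets": any(ch in BRACKETS for ch in text),
    }


def compare_pair(text_a: str, text_b: str) -> dict:
    """Compare two paraphrased ad texts on length and bracket use."""
    feats_a = surface_features(text_a)
    feats_b = surface_features(text_b)
    return {
        # Positive when text_b is longer than text_a.
        "length_diff": feats_b["char_length"] - feats_a["char_length"],
        # True when only text_b uses brackets.
        "brackets_added": feats_b["has_brackets"] and not feats_a["has_brackets"],
    }


if __name__ == "__main__":
    # Made-up example pair, not taken from the dataset.
    print(compare_pair("Sale now", "[Sale] now on"))
```

Comparing such features across the preferred and non-preferred text in each paraphrase pair is one plausible way to check whether a pattern like "longer texts with brackets are preferred" holds in a given sample.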

Abstract

Effective linguistic choices that attract potential customers play a crucial role in advertising success. This study explores the linguistic features of ad texts that influence human preferences. Although the creation of attractive ad texts is an active area of research, progress in understanding the specific linguistic features that affect attractiveness is hindered by several obstacles. First, human preferences are complex and influenced by multiple factors, including the content of ad texts, such as brand names, and their linguistic styles, making analysis challenging. Second, publicly available ad text datasets that include human preferences, such as ad performance metrics and human feedback, which reflect people's interests, are lacking. To address these problems, we present AdParaphrase, a paraphrase dataset that contains human preferences for pairs of ad texts that are semantically…

Code & Models

Repositories

cyberagentailab/adparaphrase
Official

Datasets

cyberagent/AdParaphrase
dataset · 49 downloads

Videos

AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts · Underline

Taxonomy

Topics: Advanced Text Analysis Techniques · Topic Modeling · Misinformation and Its Impacts