Searching for PETs: Using Distributional and Sentiment-Based Methods to Find Potentially Euphemistic Terms

Patrick Lee; Martha Gavidia; Anna Feldman; Jing Peng

arXiv:2205.10451·cs.CL·May 24, 2022

TL;DR

This paper introduces a linguistically driven method combining distributional similarity and sentiment analysis to identify potentially euphemistic terms across various sensitive topics, demonstrating promising results in detecting PETs.

Contribution

It presents a proof-of-concept approach that uses distributional similarity to select and filter candidate phrases from a sentence, then ranks them with simple sentiment-based metrics to surface potentially euphemistic terms.

Findings

01 Effective detection of single and multi-word PETs

02 Demonstrated the approach's efficacy on a corpus of sentences containing euphemisms

03 Potential for sentiment-based methods in euphemism detection

Abstract

This paper presents a linguistically driven proof of concept for finding potentially euphemistic terms, or PETs. Acknowledging that PETs tend to be commonly used expressions for a certain range of sensitive topics, we make use of distributional similarities to select and filter phrase candidates from a sentence and rank them using a set of simple sentiment-based metrics. We present the results of our approach tested on a corpus of sentences containing euphemisms, demonstrating its efficacy for detecting single and multi-word PETs from a broad range of topics. We also discuss future potential for sentiment-based methods on this task.
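The two-stage idea described in the abstract (filter phrase candidates by distributional similarity to sensitive topics, then rank them with a sentiment-based score) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the three-dimensional toy vectors, the topic word list, the negativity lexicon, the 0.8 threshold, and the ranking score (topic similarity multiplied by mildness) are all hypothetical stand-ins for trained embeddings and real sentiment models.

```python
import math

# Hypothetical toy embeddings; a real system would load trained word/phrase vectors.
TOY_VECTORS = {
    "death": (0.9, 0.1, 0.0),
    "fired": (0.1, 0.9, 0.0),
    "passed away": (0.8, 0.2, 0.1),
    "let go": (0.2, 0.8, 0.1),
    "yesterday": (0.0, 0.1, 0.9),
    "grandfather": (0.3, 0.0, 0.6),
}

# Hypothetical list of words standing in for sensitive topics.
SENSITIVE_TOPICS = ("death", "fired")

# Hypothetical negativity scores in [0, 1]; a real system would use a sentiment model.
TOY_NEGATIVITY = {
    "death": 0.9,
    "fired": 0.8,
    "passed away": 0.3,
    "let go": 0.4,
    "yesterday": 0.1,
    "grandfather": 0.1,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ngrams(tokens, max_len=2):
    """Yield every contiguous phrase of 1..max_len tokens."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def topic_similarity(phrase):
    """Highest similarity between the phrase and any sensitive topic word."""
    vec = TOY_VECTORS.get(phrase)
    if vec is None:
        return 0.0  # phrases without a vector are never selected
    return max(cosine(vec, TOY_VECTORS[t]) for t in SENSITIVE_TOPICS)


def find_pets(sentence, threshold=0.8):
    """Filter candidates by topic similarity, then rank by an assumed score."""
    tokens = sentence.lower().split()
    ranked = []
    for phrase in set(ngrams(tokens)):
        sim = topic_similarity(phrase)
        if sim >= threshold:
            # Assumed metric: close to a sensitive topic but mild in tone.
            mildness = 1.0 - TOY_NEGATIVITY.get(phrase, 0.5)
            ranked.append((phrase, sim * mildness))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for phrase, score in find_pets("My grandfather passed away yesterday"):
        print(phrase, round(score, 3))
```

Separating the filter from the ranker keeps the two signals independent: the similarity threshold controls which phrases are considered at all, while the sentiment score only orders the survivors, so either component can be swapped out without touching the other.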

Code & Models

Repositories

marsgav/petdetection (official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Swearing, Euphemism, Multilingualism · Natural Language Processing Techniques