Token-free Models for Sarcasm Detection

Sumit Mamtani; Maitreya Sonawane; Kanika Agarwal; Nishanth Sanjeev

arXiv:2505.01006 · cs.CL · May 5, 2025

Open Access

TL;DR

This paper shows that the token-free models ByT5-small and CANINE outperform token-based models at sarcasm detection on both social media (Twitter) and news headlines, achieving new state-of-the-art accuracy.

Contribution

The paper fine-tunes two token-free models, ByT5 and CANINE, for sarcasm detection and benchmarks them against token-based baselines and prior state-of-the-art approaches, reporting higher accuracy for the token-free models.

Findings

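01. ByT5-small and CANINE outperform their token-based counterparts in accuracy.

02. The token-free models set a new state of the art, improving accuracy by 0.77% on the News Headlines dataset and by 0.49% on the Twitter Sarcasm dataset.

03. The results point to the potential of token-free models for robust NLP in noisy, informal domains.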

Abstract

Tokenization is a foundational step in most natural language processing (NLP) pipelines, yet it introduces challenges such as vocabulary mismatch and out-of-vocabulary issues. Recent work has shown that models operating directly on raw text at the byte or character level can mitigate these limitations. In this paper, we evaluate two token-free models, ByT5 and CANINE, on the task of sarcasm detection in both social media (Twitter) and non-social media (news headlines) domains. We fine-tune and benchmark these models against token-based baselines and state-of-the-art approaches. Our results show that ByT5-small and CANINE outperform token-based counterparts and achieve new state-of-the-art performance, improving accuracy by 0.77% and 0.49% on the News Headlines and Twitter Sarcasm datasets, respectively. These findings underscore the potential of token-free models for robust NLP in noisy…
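
The out-of-vocabulary problem the abstract describes can be illustrated with a minimal sketch. It is not the paper's code and does not use the actual ByT5 or CANINE tokenizers: the toy word vocabulary, the example sentence, the function names, and the reserved-id offset of 3 are all assumptions made for illustration. It contrasts a word-level lookup, which collapses unseen spellings to an unknown id, with byte-level and character-level encodings, which always produce a distinct id sequence.

```python
from typing import List

# Assumption: ids 0-2 are reserved for special tokens (pad/eos/unk),
# so raw byte values are shifted by 3. Illustrative only.
SPECIAL_OFFSET = 3

# Hypothetical toy vocabulary standing in for a token-based model.
TOY_VOCAB = {"<unk>": 0, "oh": 1, "great": 2, "another": 3, "monday": 4}


def word_ids(text: str) -> List[int]:
    """Word-level lookup: any word missing from the vocabulary becomes <unk>."""
    return [TOY_VOCAB.get(w, TOY_VOCAB["<unk>"]) for w in text.lower().split()]


def byte_ids(text: str) -> List[int]:
    """Byte-level encoding: one id per UTF-8 byte, shifted past reserved ids."""
    return [b + SPECIAL_OFFSET for b in text.encode("utf-8")]


def codepoint_ids(text: str) -> List[int]:
    """Character-level encoding: one id per Unicode code point."""
    return [ord(ch) for ch in text]


clean = "oh great another monday"
noisy = "oh greaaat another mondayyy"  # elongated spellings typical of informal text

print(word_ids(clean))       # every word is in the toy vocabulary
print(word_ids(noisy))       # elongated words collapse to the <unk> id
print(len(byte_ids(noisy)))  # one id per byte; no information is dropped
print(codepoint_ids("é"), byte_ids("é"))  # one code point, two UTF-8 bytes
```

In this sketch the elongated words lose their identity under the word-level lookup, while the byte and character sequences still preserve the stretched spelling, which is often a cue for sarcasm in informal text.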

Taxonomy

Topics: Forensic Entomology and Diptera Studies · Identification and Quantification in Food · Forensic and Genetic Research

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dense Connections · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Absolute Position Encodings