Multi-lingual and Multi-cultural Figurative Language Understanding
Anubha Kabra, Emmy Liu, Simran Khanuja, Alham Fikri Aji, Genta Indra Winata, Samuel Cahyawijaya, Anuoluwapo Aremu, Perez Ogayo, Graham Neubig

TL;DR
This paper introduces a multilingual figurative language inference dataset covering seven diverse languages, revealing cultural influences on figurative expressions and highlighting the limitations of current multilingual language models in understanding them.
Contribution
It creates a new dataset for figurative language in seven languages and evaluates multilingual LMs, exposing their deficiencies and cultural dependencies in figurative language understanding.
Findings
Languages rely on cultural concepts for figurative expressions
Multilingual LMs perform worse on figurative language tasks in these languages than in English
Performance varies with training data availability
Abstract
Figurative language permeates human communication, but at the same time is relatively understudied in NLP. Datasets have been created in English to accelerate progress towards measuring and improving figurative language processing in language models (LMs). However, the use of figurative language is an expression of our cultural and societal experiences, making it difficult for these phrases to be universally applicable. In this work, we create a figurative language inference dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. We assess multilingual LMs' abilities to interpret figurative language in…
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Language, Metaphor, and Cognition
