MICE: A Crosslinguistic Emotion Corpus in Malay, Indonesian, Chinese and English
Ng Bee Chin, Yosephine Susanto, Erik Cambria

TL;DR
This paper introduces MICE, a multilingual emotion corpus in Malay, Indonesian, Chinese, and English, detailing data collection methods and preliminary findings, with ongoing validation and analysis.
Contribution
It presents the creation of a crosslinguistic emotion corpus and describes the methodology and initial data collection for four languages.
Findings
Identified 3,750 emotion expressions in Malay, 6,657 in Indonesian, 3,347 in Mandarin Chinese, and 8,683 in English.
Collected survey data on emotion categorization, valence, and intensity.
Preliminary data and ongoing validation processes are described.
Abstract
MICE is a corpus of emotion words in four languages and is currently a work in progress. The study has two parts: Part I, the emotion word corpus, and Part II, the emotion word survey. Part I describes how the emotion data were culled for each of the four languages and presents very preliminary data. In total, we identified 3,750 emotion expressions in Malay, 6,657 in Indonesian, 3,347 in Mandarin Chinese and 8,683 in English. We are currently evaluating and double-checking the corpus and further analysing the distribution of these emotion expressions. Part II, the emotion word survey, was an online language survey that collected information on how speakers assigned the emotion words to basic emotion categories, their ratings of valence and intensity, and biographical information on all respondents.
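To put the per-language counts in the abstract side by side, the short sketch below totals them and prints each language's share. The counts are copied from the abstract and are described there as preliminary. The `summarize` helper and the dictionary layout are illustrative assumptions and are not part of the MICE project. The shares are descriptive only, since the abstract does not say whether the four word lists were compiled to comparable depth.

```python
# Counts of emotion expressions per language, as reported in the abstract.
# These are preliminary figures; the corpus is still being validated.
counts = {
    "Malay": 3750,
    "Indonesian": 6657,
    "Mandarin Chinese": 3347,
    "English": 8683,
}


def summarize(counts):
    """Return the overall total and each language's share of that total.

    Hypothetical helper for illustration; not taken from the paper.
    """
    total = sum(counts.values())
    shares = {lang: n / total for lang, n in counts.items()}
    return total, shares


total, shares = summarize(counts)
print(f"Total expressions: {total}")
# List languages from largest to smallest share.
for lang, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: {counts[lang]} ({share:.1%})")
```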
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Categorization, perception, and language · Language, Metaphor, and Cognition
