Persona-aware Generative Model for Code-mixed Language
Ayan Sengupta, Md Shad Akhtar, Tanmoy Chakraborty

TL;DR
This paper introduces PARADOX, a persona-aware Transformer encoder-decoder model that generates realistic code-mixed texts by conditioning on user-specific attributes, improving semantic coherence and linguistic validity without relying on monolingual reference data.
Contribution
It presents a new persona-aware generative model for code-mixed language, incorporating an alignment module and novel metrics for evaluation, advancing personalized multilingual text generation.
Findings
PARADOX outperforms non-persona models in BLEU and perplexity.
It generates more semantically coherent code-mixed texts.
The model does not require monolingual reference data.
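For readers unfamiliar with the perplexity metric named in the findings, here is a minimal sketch of the standard definition: the exponential of the mean negative log-probability a model assigns to each token, where lower is better. The function name `perplexity` and the probability values are illustrative assumptions. They are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities from two models on the same sentence
# (made-up numbers, for illustration only).
confident = [0.5, 0.5, 0.5, 0.5]
uncertain = [0.1, 0.1, 0.1, 0.1]

# The model that assigns higher probability to the observed tokens
# gets the lower (better) perplexity.
print(perplexity(confident))
print(perplexity(uncertain))
```

A model that spreads probability uniformly over k choices at every step has perplexity k, which is why the metric is often read as an effective branching factor.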
Abstract
Code-mixing and script-mixing are prevalent across online social networks and multilingual societies. However, a user's preference toward code-mixing depends on the socioeconomic status, demographics of the user, and the local context, which existing generative models mostly ignore while generating code-mixed texts. In this work, we make a pioneering attempt to develop a persona-aware generative model to generate texts resembling real-life code-mixed texts of individuals. We propose a Persona-aware Generative Model for Code-mixed Generation, PARADOX, a novel Transformer-based encoder-decoder model that encodes an utterance conditioned on a user's persona and generates code-mixed texts without monolingual reference data. We propose an alignment module that re-calibrates the generated sequence to resemble real-life code-mixed texts. PARADOX generates code-mixed texts that are semantically…
Taxonomy
Topics: Persona Design and Applications · Digital Communication and Language · Natural Language Processing Techniques
