TriLex: A Framework for Multilingual Sentiment Analysis in Low-Resource South African Languages
Mike Nkongolo, Hilton Vorster, Josh Warren, Trevor Naick, Deandre Vanmali, Masana Mashapha, Luke Brand, Alyssa Fernandes, Janco Calitz, Sibusiso Makhoba

TL;DR
TriLex is a novel three-stage framework that systematically expands sentiment lexicons for low-resource South African languages, significantly improving the performance of multilingual sentiment analysis models like AfroXLMR and AfriBERTa.
Contribution
The paper introduces TriLex, a scalable framework that combines corpus-based extraction, cross-lingual mapping, and retrieval-augmented generation (RAG) to expand sentiment lexicons for low-resource languages.
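To make the three stages concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not the authors' implementation: every function name, the toy corpus, the placeholder target-language tokens (`tgt_good`, `tgt_bad`), and the `accept` rule are assumptions made for illustration. In particular, the third stage uses a plain lookup and an agreement check in place of a real retriever and generative model.

```python
from collections import Counter, defaultdict

def extract_candidates(labeled_corpus, min_count=2):
    """Stage 1 (corpus-based extraction): keep words seen only under one label."""
    counts = defaultdict(Counter)
    for text, label in labeled_corpus:
        for token in set(text.lower().split()):
            counts[token][label] += 1
    lexicon = {}
    for token, by_label in counts.items():
        label, n = by_label.most_common(1)[0]
        # Require enough evidence and no disagreement between labels.
        if n >= min_count and n == sum(by_label.values()):
            lexicon[token] = label
    return lexicon

def map_cross_lingual(source_lexicon, bilingual_dict):
    """Stage 2 (cross-lingual mapping): project polarity through a translation table."""
    return {bilingual_dict[w]: pol
            for w, pol in source_lexicon.items() if w in bilingual_dict}

def refine(candidates, retrieve, accept):
    """Stage 3 (retrieval-based refinement): keep entries supported by retrieved context."""
    return {w: pol for w, pol in candidates.items()
            if accept(w, pol, retrieve(w))}

# Hypothetical toy data; a real run would use target-language corpora.
corpus = [("good service", "positive"), ("good food", "positive"),
          ("bad service", "negative"), ("bad food", "negative")]
bilingual = {"good": "tgt_good", "bad": "tgt_bad"}
contexts = {"tgt_good": ["positive", "positive"], "tgt_bad": ["positive"]}

seed = extract_candidates(corpus)
mapped = map_cross_lingual(seed, bilingual)
refined = refine(
    mapped,
    retrieve=lambda w: contexts.get(w, []),
    # Stand-in for a generation step: accept only if all evidence agrees.
    accept=lambda w, pol, ctx: bool(ctx) and all(c == pol for c in ctx),
)
print(refined)
```

In this toy run the ambiguous words "service" and "food" are dropped in stage 1, both polar words are projected in stage 2, and stage 3 rejects `tgt_bad` because its retrieved context contradicts the projected polarity.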
Findings
AfroXLMR achieves F1-scores above 80% for isiXhosa and isiZulu.
AfriBERTa attains an F1-score of around 64% despite not being pre-trained on the target languages.
Both models outperform traditional machine learning baselines.
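For readers unfamiliar with the metric behind these figures, the sketch below computes per-class and macro-averaged F1 from gold and predicted labels. The summary does not say which averaging scheme the paper reports, so macro averaging is an assumption here, and the labels and predictions are made-up toy values, not results from the paper.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Toy example (hypothetical labels, not data from the paper).
y_true = ["pos", "pos", "neg", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu"]

scores = f1_per_class(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(scores, round(macro, 4))
```

Macro averaging weights every class equally, which matters when sentiment classes are imbalanced; a weighted or micro average could give a noticeably different number on the same predictions.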
Abstract
Low-resource African languages remain underrepresented in sentiment analysis, limiting both lexical coverage and the performance of multilingual Natural Language Processing (NLP) systems. This study proposes TriLex, a three-stage retrieval-augmented framework that unifies corpus-based extraction, cross-lingual mapping, and retrieval-augmented generation (RAG)-driven lexical refinement to systematically expand sentiment lexicons for low-resource languages. Using the enriched lexicon, the performance of two prominent African pretrained language models (AfroXLMR and AfriBERTa) is evaluated across multiple case studies. Results demonstrate that AfroXLMR delivers superior performance, achieving F1-scores above 80% for isiXhosa and isiZulu and exhibiting strong cross-lingual stability. Although AfriBERTa lacks pre-training on these target languages, it still achieves reliable F1-scores around…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection · Topic Modeling
