A Multilingual Sentiment Lexicon for Low-Resource Language Translation using Large Languages Models and Explainable AI
Melusi Malinga, Isaac Lupanda, Mike Wa Nkongolo, and Phil van Deventer

TL;DR
This paper develops a multilingual sentiment lexicon and applies AI models, including BERT paired with explainability techniques, to improve sentiment analysis and translation for low-resource languages in South Africa and the DRC.
Contribution
It introduces a culturally relevant multilingual sentiment lexicon and demonstrates the effectiveness of BERT and traditional ML models in low-resource language sentiment analysis.
Findings
BERT achieved 99% accuracy and 98% precision in sentiment prediction.
Random Forest outperformed other ML models in handling language nuances.
Explainable AI improved transparency in sentiment classification.
Abstract
South Africa and the Democratic Republic of Congo (DRC) present a complex linguistic landscape with languages such as Zulu, Sepedi, Afrikaans, French, English, and Tshiluba (Ciluba), which creates unique challenges for AI-driven translation and sentiment analysis systems due to a lack of accurately labeled data. This study seeks to address these challenges by developing a multilingual lexicon designed for French and Tshiluba, now expanded to include translations in English, Afrikaans, Sepedi, and Zulu. The lexicon enhances cultural relevance in sentiment classification by integrating language-specific sentiment scores. A comprehensive testing corpus is created to support translation and sentiment analysis tasks, with machine learning models such as Random Forest, Support Vector Machine (SVM), Decision Trees, and Gaussian Naive Bayes (GNB) trained to predict sentiment across low resource…
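To make the idea of language-specific sentiment scores concrete, here is a minimal Python sketch of how such a lexicon could be stored and queried. It is an illustration under stated assumptions, not the authors' implementation: the `LexiconEntry` layout, the helper names `build_index`, `score_text`, and `label`, the score range of -1.0 to 1.0, and the example scores are all hypothetical. The words are common translations chosen for illustration and are not taken from the paper's lexicon.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LexiconEntry:
    """One concept with a translation and a sentiment score per language."""
    translations: Dict[str, str]  # language code -> word
    scores: Dict[str, float]      # language code -> score in [-1.0, 1.0] (assumed range)


# Illustrative entries only: the scores are invented for this sketch.
LEXICON: List[LexiconEntry] = [
    LexiconEntry(
        translations={"en": "good", "fr": "bon", "af": "goed"},
        scores={"en": 0.7, "fr": 0.7, "af": 0.6},
    ),
    LexiconEntry(
        translations={"en": "bad", "fr": "mauvais", "af": "sleg"},
        scores={"en": -0.7, "fr": -0.6, "af": -0.8},
    ),
]


def build_index(lexicon: List[LexiconEntry], lang: str) -> Dict[str, float]:
    """Map each word available in `lang` to its language-specific score."""
    return {
        entry.translations[lang].lower(): entry.scores[lang]
        for entry in lexicon
        if lang in entry.translations and lang in entry.scores
    }


def score_text(text: str, lang: str, lexicon: List[LexiconEntry] = LEXICON) -> float:
    """Average the scores of the words found in the lexicon; 0.0 if none match."""
    index = build_index(lexicon, lang)
    hits = [index[token] for token in text.lower().split() if token in index]
    return sum(hits) / len(hits) if hits else 0.0


def label(score: float, threshold: float = 0.1) -> str:
    """Turn a numeric score into a coarse sentiment label (threshold is arbitrary)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for lang, sentence in [("fr", "le film est bon"), ("af", "die kos is sleg")]:
        print(lang, sentence, label(score_text(sentence, lang)))
```

Keeping a separate score per language, rather than one score per concept, is what lets the same concept carry a different sentiment weight in each language. In a pipeline like the one the abstract describes, scores of this kind could also serve as input features for the trained classifiers.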
Taxonomy
Topics: Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Softmax · Dropout · Dense Connections · Layer Normalization · Linear Warmup With Linear Decay · WordPiece · Adam
