Non-Contextual BERT or FastText? A Comparative Analysis
Abhay Shanbhag, Suramya Jadhav, Amogh Thakurdesai, Ridhima Sinare, Raviraj Joshi

TL;DR
This paper compares non-contextual BERT embeddings with FastText embeddings on low-resource Marathi NLP tasks and finds that non-contextual BERT embeddings taken from the first layer outperform FastText, making them a promising alternative.
Contribution
It provides a comprehensive analysis of non-contextual BERT embeddings in low-resource NLP, highlighting their effectiveness over FastText in various tasks.
Findings
Non-contextual BERT embeddings outperform FastText in Marathi NLP tasks.
First-layer BERT embeddings are more effective than deeper layers.
Non-contextual BERT embeddings are a promising low-resource NLP alternative.
Abstract
Natural Language Processing (NLP) for low-resource languages, which lack large annotated datasets, faces significant challenges due to limited high-quality data and linguistic resources. The selection of embeddings plays a critical role in achieving strong performance in NLP tasks. While contextual BERT embeddings require a full forward pass, non-contextual BERT embeddings rely only on table lookup. Existing research has primarily focused on contextual BERT embeddings, leaving non-contextual embeddings largely unexplored. In this study, we analyze the effectiveness of non-contextual embeddings from BERT models (MuRIL and MahaBERT) and FastText models (IndicFT and MahaFT) for tasks such as news classification, sentiment analysis, and hate speech detection in one such low-resource language, Marathi. We compare these embeddings with their contextual and compressed variants. Our findings…
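The "table lookup" idea in the abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the vocabulary, the 3-dimensional vectors, the whitespace tokenizer, and the mean pooling are assumptions made for illustration and are not taken from MuRIL, MahaBERT, or the paper's pipeline. In a real setting the table would be a pretrained model's input embedding matrix (in Hugging Face Transformers it can be obtained with `model.get_input_embeddings()`), and tokens would come from the model's own subword tokenizer.

```python
from typing import Dict, List

# Hypothetical toy embedding table (made-up values, not real model weights).
EMBEDDING_TABLE: Dict[str, List[float]] = {
    "[UNK]": [0.0, 0.0, 0.0],
    "news": [1.0, 0.0, 2.0],
    "good": [0.0, 3.0, 1.0],
    "bad": [2.0, 0.0, 0.0],
}


def lookup(tokens: List[str]) -> List[List[float]]:
    """Non-contextual embedding: each token maps to one fixed vector.

    No forward pass is run, so a token's vector is the same in every sentence.
    """
    unk = EMBEDDING_TABLE["[UNK]"]
    return [EMBEDDING_TABLE.get(token, unk) for token in tokens]


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Average token vectors into one sentence vector (an assumed pooling choice)."""
    if not vectors:
        raise ValueError("cannot pool an empty token sequence")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def sentence_embedding(text: str) -> List[float]:
    """Whitespace tokenization stands in for a real subword tokenizer."""
    tokens = text.lower().split()
    return mean_pool(lookup(tokens))


if __name__ == "__main__":
    print(sentence_embedding("good news"))
```

The resulting sentence vector could then be fed to any lightweight classifier. A contextual embedding, by contrast, would pass the whole token sequence through the transformer layers, so the vector for "good" would change with its neighbors.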
Taxonomy
Topics: Semantic Web and Ontologies · Business Process Modeling and Analysis · Multi-Agent Systems and Negotiation
Methods: Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Adam · Residual Connection · Weight Decay · Logistic Regression · Softmax
