Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine Tuned Multilingual Embeddings
Arkadipta De, Venkatesh E, Kaushal Kumar Maurya, Maunendra Sankar Desarkar

TL;DR
This paper presents a neural network approach using multilingual BERT to detect various types of hostility in Hindi social media posts, achieving state-of-the-art results despite resource constraints.
Contribution
It introduces a multi-label classification framework for hostility detection in Hindi using fine-tuned multilingual embeddings, outperforming existing baselines.
Findings
Achieved high F1 scores for multiple hostility categories.
Outperformed baseline models with a novel One-vs-the-Rest approach.
Established a new state-of-the-art for Hindi hostility detection.
Abstract
Due to the wide adoption of social media platforms such as Facebook and Twitter, there is an emerging need to detect online posts that violate community acceptance standards. The hostility detection task has been well explored for resource-rich languages like English, but remains unexplored for resource-constrained languages like Hindi due to the unavailability of suitably large datasets. We view hostility detection as a multi-label multi-class classification problem. We propose an effective neural network-based technique for hostility detection in Hindi posts. We leverage pre-trained multilingual Bidirectional Encoder Representations from Transformers (mBERT) to obtain contextual representations of Hindi posts. We performed extensive experiments covering different pre-processing techniques, pre-trained models, neural architectures, hybrid strategies, etc. Our best…
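The multi-label framing in the abstract, combined with the One-vs-the-Rest approach noted in the findings, can be illustrated with a minimal sketch: one independent binary classifier is trained per label, and a post receives every label whose classifier fires. This is not the authors' implementation. The two-dimensional feature vectors stand in for mBERT sentence embeddings, the label names `hate` and `offensive` are placeholders, and the plain logistic-regression trainer is an assumption chosen to keep the example dependency-free.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x):
    """Probability that the label applies to feature vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_binary(features, targets, lr=0.5, epochs=200):
    """Fit one logistic-regression classifier with batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, targets):
            err = score(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def train_one_vs_rest(features, label_sets, labels):
    """Train one binary model per label: that label vs. everything else."""
    return {
        label: train_binary(
            features, [1.0 if label in s else 0.0 for s in label_sets]
        )
        for label in labels
    }


def predict(models, x, threshold=0.5):
    """Return every label whose binary model scores at or above threshold."""
    return {
        label for label, (w, b) in models.items() if score(w, b, x) >= threshold
    }


# Toy data (hypothetical): in practice each vector would be an embedding
# of a Hindi post, and each set would hold that post's annotated labels.
FEATURES = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
LABEL_SETS = [{"hate", "offensive"}, {"hate"}, {"offensive"}, set()]
MODELS = train_one_vs_rest(FEATURES, LABEL_SETS, ["hate", "offensive"])
```

Because each label has its own classifier, a post can receive several labels at once or none at all, which is what separates this setup from ordinary single-label multi-class classification.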
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Advanced Malware Detection Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Attention Is All You Need · Byte Pair Encoding · Multi-Head Attention · Dropout · Layer Normalization
