TL;DR
This paper presents HeBERT, a Hebrew BERT model, and HebEMO, a tool for polarity and emotion detection in Hebrew user-generated content, achieving state-of-the-art performance on multiple language understanding tasks.
Contribution
Introduces HeBERT, a new Hebrew BERT model, and HebEMO, an emotion and polarity detection tool trained on a novel Covid-19 Hebrew dataset.
Findings
HeBERT outperforms existing Hebrew language models on common tasks.
HebEMO achieves an F1-score of 0.96 for polarity detection.
Emotion detection F1-scores range from 0.78 to 0.97, except for surprise.
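For readers unfamiliar with the metric, the following is a minimal sketch of how per-class and support-weighted F1-scores are commonly computed. It is not the authors' evaluation code: the summary above does not say how the reported scores were averaged, and the function names and toy labels below are hypothetical, chosen only for illustration.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1} for paired lists of true and predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 values, weighting each class by its true count."""
    support = Counter(y_true)
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class[label] * n for label, n in support.items()) / len(y_true)


# Toy labels, invented for illustration (not taken from the paper's dataset).
y_true = ["pos", "pos", "neg", "neg", "neu", "neg"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "neg"]

per_class = f1_per_class(y_true, y_pred)
overall = weighted_f1(y_true, y_pred)
print(per_class)
print(round(overall, 4))
```

A per-class breakdown like this is what allows a statement such as "except for surprise": one weak class can be reported separately without being hidden by a single averaged number.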
Abstract
This paper introduces HeBERT and HebEMO. HeBERT is a Transformer-based model for modern Hebrew text, which relies on a BERT (Bidirectional Encoder Representations from Transformers) architecture. BERT has been shown to outperform alternative architectures in sentiment analysis, and is suggested to be particularly appropriate for morphologically rich languages (MRLs). Analyzing multiple BERT specifications, we find that while model complexity correlates with high performance on language tasks that aim to understand terms in a sentence, a more parsimonious model better captures the sentiment of an entire sentence. Either way, our BERT-based language model outperforms all existing Hebrew alternatives on all common language tasks. HebEMO is a tool that uses HeBERT to detect polarity and extract emotions from Hebrew user-generated content (UGC). HebEMO is trained on a unique Covid-19-related UGC dataset that we collected and annotated for this study.…
Taxonomy
Methods: Linear Layer · Attention Is All You Need · Softmax · Dense Connections · Dropout · Layer Normalization · Residual Connection · WordPiece · Attention Dropout
