The Inception Team at NSURL-2019 Task 8: Semantic Question Similarity in Arabic
Hana Al-Theiabat, Aisha Al-Sadi

TL;DR
This paper presents an ensemble approach using multilingual BERT for Arabic semantic question similarity, achieving top performance in the NSURL-2019 task with F1-scores up to 96%.
Contribution
It introduces an effective ensemble model leveraging multilingual BERT for Arabic question similarity, which ranked first among nine participating teams in the NSURL-2019 challenge.
Findings
Achieved up to 96% F1-score on the dataset.
Ensemble of multilingual BERT models ranked first.
Demonstrated effectiveness of pre-trained multilingual models for Arabic NLP.
Abstract
This paper describes our method for the Semantic Question Similarity in Arabic task at the workshop on NLP Solutions for Under-Resourced Languages (NSURL). The aim is to build a model that detects semantically similar questions in Arabic on the provided dataset. We explore several methods for determining question similarity. The proposed models achieve high F1-scores, ranging from 88% to 96%. Our best official result, a 95.924% F1-score, comes from an ensemble of pre-trained multilingual BERT models using different random seeds, and ranks first among nine participating teams.
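As a rough illustration of the seed-ensemble idea, the sketch below combines the predictions of several runs that differ only in random seed and scores the result with F1. The abstract does not say how the runs are combined, so averaging the per-run probabilities and thresholding at 0.5 is an assumption here, and the function names `ensemble_average` and `f1_score` and the toy numbers are hypothetical. No BERT model is loaded; the probabilities stand in for the outputs of fine-tuned models.

```python
from typing import List, Sequence


def ensemble_average(seed_probs: Sequence[Sequence[float]],
                     threshold: float = 0.5) -> List[int]:
    """Average the 'similar' probability from each seed run, then threshold.

    seed_probs[k][i] is the probability that question pair i is similar
    according to the run with seed k. Averaging is an assumed combination
    rule, not one confirmed by the paper.
    """
    n_models = len(seed_probs)
    n_pairs = len(seed_probs[0])
    if any(len(p) != n_pairs for p in seed_probs):
        raise ValueError("every seed run must score the same question pairs")
    preds = []
    for i in range(n_pairs):
        mean_p = sum(p[i] for p in seed_probs) / n_models
        preds.append(1 if mean_p >= threshold else 0)
    return preds


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for the positive ('similar') class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: three seed runs scoring three question pairs.
toy_probs = [
    [0.9, 0.4, 0.2],  # run with seed A
    [0.8, 0.6, 0.1],  # run with seed B
    [0.7, 0.7, 0.3],  # run with seed C
]
toy_gold = [1, 0, 0]
toy_pred = ensemble_average(toy_probs)
toy_f1 = f1_score(toy_gold, toy_pred)
```

Majority voting over per-run labels would be an equally plausible combination rule; averaging is shown only because it uses each run's confidence.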
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
