Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021
Usama Yaseen, Stefan Langer

TL;DR
This paper applies neural and classical machine learning methods to named entity recognition and text classification on health-related social media text in English and Spanish, reporting competitive results in the SMM4H 2021 Shared Task.
Contribution
It introduces a BiLSTM-CRF model with stacked heterogeneous embeddings and linguistic features for NER, and compares logistic regression, SVM, and neural network classifiers for text classification, evaluating both on English and Spanish data.
Findings
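F1-scores of 0.50 and 0.82 on the NER tasks: ADE Span Detection (Task 1b) and Profession Span Detection (Task 7b)
F1-scores of 0.46 and 0.90 on the text classification tasks: ADE Classification (Task 1a) and Profession Classification (Task 7a)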
Demonstrated cross-lingual applicability for English and Spanish
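For readers unfamiliar with the metric, the F1-scores above are the harmonic mean of precision and recall. The sketch below computes it from raw counts. The function name and the example counts are illustrative assumptions, not figures from the paper, and this summary does not state which matching criterion (for example strict or relaxed span overlap) the shared task used.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Uses the equivalent closed form 2*tp / (2*tp + fp + fn).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No predictions and no gold items: define F1 as 0.0 by convention.
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts, for illustration only (not results from the paper).
example = f1_score(tp=8, fp=2, fn=2)
```

With these made-up counts, precision and recall are both 8/10, so the harmonic mean is also 8/10.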
Abstract
This paper presents our findings from participating in the SMM4H Shared Task 2021. We addressed Named Entity Recognition (NER) and Text Classification. To address NER we explored BiLSTM-CRF with Stacked Heterogeneous Embeddings and linguistic features. We investigated various machine learning algorithms (logistic regression, Support Vector Machine (SVM) and Neural Networks) to address text classification. Our proposed approaches can be generalized to different languages and we have shown their effectiveness for English and Spanish. Our text classification submissions (team: MIC-NLP) have achieved competitive performance with F1-scores of 0.46 and 0.90 on ADE Classification (Task 1a) and Profession Classification (Task 7a) respectively. In the case of NER, our submissions scored F1-scores of 0.50 and 0.82 on ADE Span Detection (Task 1b) and Profession Span Detection (Task 7b)…
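As a rough illustration of what "stacked heterogeneous embeddings" can mean, the sketch below concatenates per-token vectors from two different embedding sources into one feature vector, which would then feed a sequence tagger such as a BiLSTM-CRF. It assumes that stacking means concatenation. The two embedders, their toy values, and all function names are hypothetical and are not the embeddings used in the paper.

```python
from typing import Callable, Dict, List, Sequence

# An embedder maps one token to a fixed-length vector.
Embedder = Callable[[str], List[float]]

# Hypothetical lookup table standing in for pretrained word vectors.
_WORD_VECTORS: Dict[str, List[float]] = {
    "headache": [0.9, 0.1],
    "nurse": [0.2, 0.8],
}


def word_embedding(token: str) -> List[float]:
    """Toy word-level embedding; unknown tokens map to a zero vector."""
    return _WORD_VECTORS.get(token.lower(), [0.0, 0.0])


def char_embedding(token: str) -> List[float]:
    """Toy character-level features: token length and uppercase ratio."""
    n = len(token)
    upper = sum(ch.isupper() for ch in token)
    return [float(n), upper / n if n else 0.0]


def stack_embeddings(token: str, embedders: Sequence[Embedder]) -> List[float]:
    """Concatenate the vectors from every embedder for a single token."""
    stacked: List[float] = []
    for embed in embedders:
        stacked.extend(embed(token))
    return stacked


def embed_sentence(tokens: Sequence[str], embedders: Sequence[Embedder]) -> List[List[float]]:
    """Produce one stacked vector per token, in sentence order."""
    return [stack_embeddings(token, embedders) for token in tokens]


sentence_vectors = embed_sentence(["Nurse", "reports", "headache"],
                                  [word_embedding, char_embedding])
```

Each token ends up with a vector whose length is the sum of the individual embedding sizes (here 2 + 2 = 4), so sources of different kinds can be combined without changing the tagger that consumes them.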
