Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation
Ahmed Akib Jawad Karim, Kazi Hafiz Md. Asad, Md. Golam Rabiul Alam

TL;DR
This paper presents LastBERT, a lightweight BERT model created through knowledge distillation, which classifies ADHD severity from social media text using fewer computational resources while maintaining high performance.
Contribution
The study introduces LastBERT, a significantly smaller BERT-based model that retains strong NLP performance, demonstrating effective knowledge distillation for resource-efficient mental health classification.
Findings
LastBERT is 73.64% smaller than BERT base.
It achieves 85% accuracy and an 85% F1 score on ADHD severity classification.
Its performance is comparable to that of larger models such as DistilBERT and ClinicalBERT.
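The size-reduction figure can be checked from the parameter counts given in the abstract (110 million for BERT base, 29 million for LastBERT). A minimal sketch; the function name is illustrative and not from the paper:

```python
def size_reduction_percent(teacher_params: int, student_params: int) -> float:
    """Percentage by which the student is smaller than the teacher."""
    return 100.0 * (teacher_params - student_params) / teacher_params


# Parameter counts as stated in the abstract (rounded to the nearest million).
BERT_BASE_PARAMS = 110_000_000
LASTBERT_PARAMS = 29_000_000

reduction = size_reduction_percent(BERT_BASE_PARAMS, LASTBERT_PARAMS)
print(f"{reduction:.2f}%")  # 73.64%
```

Since (110 - 29) / 110 = 81 / 110 ≈ 0.7364, the reported 73.64% is consistent with the rounded parameter counts.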
Abstract
This work examines the efficiency of knowledge distillation in producing a lightweight yet powerful BERT-based model for natural language processing applications. We then applied the resulting model, LastBERT, to a real-world task: classifying severity levels of Attention Deficit Hyperactivity Disorder (ADHD)-related concerns from social media text data. With LastBERT, a customized student BERT model, we reduced the parameter count from 110 million (BERT base) to 29 million, resulting in a model approximately 73.64% smaller. On the GLUE benchmark, comprising paraphrase identification, sentiment analysis, and text classification, the student model maintained strong performance across many tasks despite this reduction. The model was also applied to a real-world ADHD dataset, achieving an accuracy and F1 score of 85%. When compared to DistilBERT…
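The abstract does not state which distillation objective was used. As an illustration only, the sketch below shows the commonly used soft-target formulation, in which the student is trained on a weighted sum of a temperature-softened KL term against the teacher and an ordinary cross-entropy term against the true label. The `temperature` and `alpha` values, the three-class setup, and the example logits are all assumptions, not details from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-target distillation loss (assumed, not the paper's exact loss).

    soft term: T^2 * KL(teacher || student) on temperature-softened outputs
    hard term: cross-entropy of the student against the true label
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(p_teacher, p_student) if p > 0)
    soft = (temperature ** 2) * kl
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard


# Hypothetical logits for one post over three assumed severity classes.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.0, 1.0, 0.0]
loss = distillation_loss(student_logits, teacher_logits, label=0)
print(loss)
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the weighted cross-entropy remains, which is why the teacher's output distribution acts as a training target alongside the labels.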
Taxonomy
Topics: Attention Deficit Hyperactivity Disorder · EEG and Brain-Computer Interfaces
