BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks
Tuan Lai, Heng Ji, and ChengXiang Zhai

TL;DR
This paper introduces a lightweight convolutional neural network for biomedical entity linking that rivals BERT-based models in accuracy but is significantly more efficient and has far fewer parameters.
Contribution
The authors propose a residual convolutional neural network that achieves performance comparable to or better than that of BERT-based models while having 60 times fewer parameters.
Findings
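The proposed model matches or exceeds the accuracy of BERT-based models on five datasets.
The entity linking performance of BERT-based models changes only slightly when the input word order is shuffled or when the attention scope is limited to a fixed window size.
The proposed model is highly efficient, with 60 times fewer parameters than the roughly 110M of BERT-based models.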
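The two probing conditions mentioned in the findings, shuffled word order and a fixed attention window, can be illustrated with a minimal sketch. The function names `shuffle_words` and `local_attention_mask` are hypothetical and are not taken from the paper; the sketch only shows how such perturbed inputs and masks might be constructed, not how the authors ran their experiments.

```python
import random


def shuffle_words(text, seed=0):
    """Return `text` with its whitespace-separated tokens in random order.

    Hypothetical helper: a seeded RNG keeps the perturbation reproducible.
    """
    tokens = text.split()
    rng = random.Random(seed)
    rng.shuffle(tokens)
    return " ".join(tokens)


def local_attention_mask(seq_len, window):
    """Build a boolean mask where mask[i][j] is True if token i may attend to token j.

    Attention is allowed only when |i - j| <= window, which limits the
    attention scope to a fixed-size neighborhood around each token.
    """
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    print(shuffle_words("type 2 diabetes mellitus"))
    for row in local_attention_mask(seq_len=5, window=1):
        print(["x" if allowed else "." for allowed in row])
```

In a probe of this kind, one would feed the shuffled text or apply the mask to a trained model and compare linking accuracy against the unperturbed setting.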
Abstract
Biomedical entity linking is the task of linking entity mentions in a biomedical document to referent entities in a knowledge base. Recently, many BERT-based models have been introduced for the task. While these models have achieved competitive results on many datasets, they are computationally expensive and contain about 110M parameters. Little is known about the factors contributing to their impressive performance and whether the over-parameterization is needed. In this work, we shed some light on the inner working mechanisms of these large BERT-based models. Through a set of probing experiments, we have found that the entity linking performance only changes slightly when the input word order is shuffled or when the attention scope is limited to a fixed window size. From these observations, we propose an efficient convolutional neural network with residual connections for biomedical…
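To make the idea of a convolutional layer with a residual connection concrete, here is a minimal pure-Python sketch. It is an illustration of the general technique only: the kernel width, the ReLU activation, the max pooling step, and the names `conv1d_same`, `residual_block`, and `max_pool` are all assumptions, since the summary above does not specify the architecture's details.

```python
def conv1d_same(x, weight, bias):
    """1D convolution over a token sequence with zero padding.

    x:      list of seq_len vectors, each of length d_in
    weight: nested list shaped [d_out][kernel_width][d_in] (odd kernel_width)
    bias:   list of length d_out
    Returns a list of seq_len vectors, each of length d_out.
    """
    seq_len = len(x)
    kernel_width = len(weight[0])
    pad = kernel_width // 2
    out = []
    for t in range(seq_len):
        row = []
        for o in range(len(weight)):
            total = bias[o]
            for j in range(kernel_width):
                pos = t + j - pad
                if 0 <= pos < seq_len:  # positions outside the sequence count as zeros
                    total += sum(w * v for w, v in zip(weight[o][j], x[pos]))
            row.append(total)
        out.append(row)
    return out


def residual_block(x, weight, bias):
    """Compute x + ReLU(conv(x)); requires d_out == d_in so the shapes match."""
    conv = conv1d_same(x, weight, bias)
    return [
        [xi + max(0.0, ci) for xi, ci in zip(x_row, c_row)]
        for x_row, c_row in zip(x, conv)
    ]


def max_pool(x):
    """Collapse a sequence of vectors into one vector by taking the max per dimension."""
    return [max(column) for column in zip(*x)]


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy embeddings, d_in = 2
    # Kernel of width 3 whose center tap copies each input dimension.
    identity = [
        [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    ]
    hidden = residual_block(tokens, identity, [0.0, 0.0])
    print(max_pool(hidden))
```

Because the same small kernel is applied at every position, the parameter count depends on the kernel width and hidden size rather than on sequence length, which is the usual reason convolutional encoders stay compact.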
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
