An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text
Yova Kementchedjhieva, Ilias Chalkidis

TL;DR
This paper compares encoder-only and encoder-decoder models for multi-label classification of legal and biomedical text. Encoder-decoder models, especially when used non-autoregressively, outperform encoder-only models, and their advantage grows on more complex datasets.
Contribution
It compares two encoder-only and two encoder-decoder methods for multi-label classification on four datasets (two legal, two biomedical), each at two levels of label granularity, with every method starting from the same pre-trained T5 model.
Findings
Encoder-decoder models outperform encoder-only models.
Non-autoregressive encoder-decoder models achieve the best results.
The advantage of encoder-decoder models grows on more complex datasets and on labeling schemes of finer granularity.
Abstract
Standard methods for multi-label text classification largely rely on encoder-only pre-trained language models, whereas encoder-decoder models have proven more effective in other classification tasks. In this study, we compare four methods for multi-label classification, two based on an encoder only, and two based on an encoder-decoder. We carry out experiments on four datasets -- two in the legal domain and two in the biomedical domain, each with two levels of label granularity -- and always depart from the same pre-trained model, T5. Our results show that encoder-decoder methods outperform encoder-only methods, with a growing advantage on more complex datasets and labeling schemes of finer granularity. Using encoder-decoder models in a non-autoregressive fashion, in particular, yields the best performance overall, so we further study this approach through ablations to better understand…
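For readers new to the task, the sketch below illustrates two generic ways a multi-label prediction can be expressed: as independent per-label decisions from a classification head, and as a label sequence that a decoder could generate. It is an illustration only, not the paper's implementation; the label names, logits, threshold, separator, and all function names are hypothetical assumptions.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Classification-head view: score every label independently
    and keep each one whose probability reaches the threshold."""
    return sorted(label for label, z in logits.items() if sigmoid(z) >= threshold)


def labels_to_target(labels: List[str], sep: str = "; ") -> str:
    """Sequence view: serialize the label set into one target string
    that a text-to-text model could be trained to generate."""
    return sep.join(sorted(labels))


def target_to_labels(target: str, sep: str = "; ") -> List[str]:
    """Parse a generated string back into a sorted list of labels."""
    return sorted(part.strip() for part in target.split(sep) if part.strip())


# Hypothetical logits for one document over a toy legal label set.
example_logits = {"contract law": 2.1, "tax law": -0.7, "data protection": 0.4}

predicted = predict_labels(example_logits)
target = labels_to_target(predicted)
round_trip = target_to_labels(target)
```

In the first view, the number of predicted labels falls out of the threshold; in the second, it is determined by how many labels the generated string contains.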
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Natural Language Processing Techniques
Methods: Gated Linear Unit · Attention Is All You Need · Adafactor · Softmax · Inverse Square Root Schedule · Layer Normalization · Linear Layer · Dropout · Byte Pair Encoding
