Comparison of pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction: experiments with the rare disease use-case
Shashank Gupta, Xuguang Ai, Ramakanth Kavuluru

TL;DR
This study compares pipeline, sequence-to-sequence, and GPT models for end-to-end relation extraction in biomedical text. Pipeline models currently outperform GPT models, especially when training data is available, and the results highlight the need for hybrid approaches.
Contribution
First evaluation of end-to-end relation extraction (E2ERE) on the RareDis dataset, comparing three major paradigms on a complex biomedical relation extraction task.
Findings
Pipeline models outperform GPT models in E2ERE tasks.
Sequence-to-sequence models are competitive but slightly behind pipelines.
GPT models with eight times as many parameters perform worse than the smaller sequence-to-sequence models.
Abstract
End-to-end relation extraction (E2ERE) is an important and realistic application of natural language processing (NLP) in biomedicine. In this paper, we aim to compare three prevailing paradigms for E2ERE using a complex dataset focused on rare diseases involving discontinuous and nested entities. We use the RareDis information extraction dataset to evaluate three competing approaches for E2ERE: NER → RE pipelines, joint sequence-to-sequence models, and generative pre-trained transformer (GPT) models. We use comparable state-of-the-art models and best practices for each of these approaches and conduct error analyses to assess their failure modes. Our findings reveal that pipeline models are still the best, while sequence-to-sequence models are not far behind; GPT models with eight times as many parameters are worse than even sequence-to-sequence models and lose to pipeline…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Weight Decay · Cosine Annealing · Residual Connection · Byte Pair Encoding · Dense Connections
