Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder

John Pougue Biyong; Bo Wang; Terry Lyons; Alejo J Nevado-Holgado

arXiv:2010.04897·cs.CL·October 13, 2020

TL;DR

This paper introduces the Sig-Transformer encoder, which incorporates the signature transform into the self-attention model of the Transformer architecture. On a new Swedish prescription dataset, it outperforms baseline models in two of three information extraction tasks.

Contribution

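The work presents an extension of the Transformer that combines the signature transform with self-attention and sits between the embedding and prediction layers. It improves on baseline models in two of three Swedish prescription information extraction tasks.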

Findings

01

Superior performance in two of three tasks compared to baselines

02

Signature transform combined with self-attention, placed between the embedding and prediction layers

03

Comparison of Multilingual BERT embeddings against translating the Swedish text to English and encoding it with a BERT model pretrained on clinical notes

Abstract

Relying on large pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT) for encoding, and adding a simple prediction layer, has led to impressive performance in many clinical natural language processing (NLP) tasks. In this work, we present a novel extension to the Transformer architecture that incorporates the signature transform into the self-attention model. This architecture is added between the embedding and prediction layers. Experiments on a new Swedish prescription dataset show the proposed architecture to be superior to baseline models in two of the three information extraction tasks. Finally, we compare two embedding approaches: applying Multilingual BERT directly, and translating the Swedish text to English and then encoding it with a BERT model pretrained on clinical notes.
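
The abstract does not spell out how the signature transform is computed or how it is wired into self-attention. As a minimal sketch of the transform on its own, assuming a sequence of token embeddings is treated as a piecewise-linear path, the following NumPy code computes the signature truncated at depth 2. The function name `signature_depth2` and the toy path are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    path: array-like of shape (T, d), e.g. T token embeddings of size d
          (hypothetical input; the paper's actual inputs are not given here).
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)  # (T-1, d) step vectors
    d = path.shape[1]

    level1 = np.zeros(d)        # running total displacement
    level2 = np.zeros((d, d))   # running second-order iterated integrals

    for delta in increments:
        # Chen's identity for appending one linear segment:
        # cross term with the path so far, plus the segment's own term.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta

    return level1, level2


# Toy 2-D path: one step right, then one step up.
level1, level2 = signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
print(level1)
print(level2)
```

The first level is the total displacement of the path. The antisymmetric part of the second level depends on the order in which the steps occur, which is the property that makes signature features attractive for sequential data such as token sequences.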

Taxonomy

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · WordPiece · Adam · Byte Pair Encoding · Softmax · Multi-Head Attention · Layer Normalization · Dense Connections