# MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation

**Authors:** Thomas Searle, Zeljko Kraljevic, Rebecca Bendayan, Daniel Bean, Richard Dobson

arXiv: 1907.07322 · 2023-02-28

## TL;DR

MedCATTrainer is an interactive web tool designed for customizing biomedical NER+L models, enabling efficient data annotation and model improvement tailored to specific clinical research needs.

## Contribution

The paper introduces an interface that combines active learning with research use case specific customisation for biomedical text annotation and model training.

## Key findings

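- Initial results suggest the approach allows efficient collection of research use case specific training data.
- Initial results also suggest that the training data collected this way is accurate.
- An interactive web interface lets users inspect and correct the entities recognised by the underlying NER+L model, refining it via active learning.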

## Abstract

We present MedCATTrainer, an interface for building, improving and customising a given Named Entity Recognition and Linking (NER+L) model for biomedical domain text. NER+L is often used as a first step in deriving value from clinical text. Collecting labelled data for training models is difficult due to the need for specialist domain knowledge. MedCATTrainer offers an interactive web-interface to inspect and improve recognised entities from an underlying NER+L model via active learning. Secondary use of data for clinical research often has task and context specific criteria. MedCATTrainer provides a further interface to define and collect supervised learning training data for researcher specific use cases. Initial results suggest our approach allows for efficient and accurate collection of research use case specific training data.
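The active learning idea in the abstract can be illustrated with a minimal sketch. It is not MedCATTrainer's or MedCAT's actual implementation: `ToyLinker`, `select_for_review`, the confidence threshold, the update rule, and the concept identifiers are all hypothetical and chosen only for illustration. The sketch assumes a model that assigns a confidence to each (mention, concept) link, routes low-confidence mentions to a human annotator, and nudges the confidence according to the annotator's feedback.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLinker:
    """Hypothetical stand-in for an NER+L model (not the MedCAT API)."""
    # Maps (mention, concept_id) -> confidence in [0, 1].
    scores: dict = field(default_factory=dict)
    # Fraction of the gap to the target closed by one piece of feedback.
    step: float = 0.2

    def predict(self, mention):
        """Return the best (concept_id, confidence) for a mention."""
        candidates = {c: s for (m, c), s in self.scores.items() if m == mention}
        if not candidates:
            return None, 0.0
        concept = max(candidates, key=candidates.get)
        return concept, candidates[concept]

    def feedback(self, mention, concept, correct):
        """Move the link confidence towards 1 (correct) or 0 (incorrect)."""
        key = (mention, concept)
        current = self.scores.get(key, 0.5)
        target = 1.0 if correct else 0.0
        self.scores[key] = current + self.step * (target - current)


def select_for_review(model, mentions, threshold=0.7):
    """Pick mentions whose best prediction is below the (assumed) threshold."""
    return [m for m in mentions if model.predict(m)[1] < threshold]


# Illustrative, made-up concept identifiers and confidences.
model = ToyLinker(scores={
    ("cold", "C_common_cold"): 0.55,
    ("cold", "C_low_temperature"): 0.45,
    ("asthma", "C_asthma"): 0.95,
})

# Only the ambiguous mention is queued for the annotator.
queue = select_for_review(model, ["cold", "asthma"])

# The annotator confirms the suggested concept; its confidence rises.
suggested, _ = model.predict("cold")
model.feedback("cold", suggested, correct=True)
```

In this sketch, only uncertain predictions reach the annotator, which is the general motivation for active learning when specialist annotation time is scarce. How the real tool selects examples and updates its model is described in the full paper.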

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.07322/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1907.07322/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1907.07322/full.md

---
Source: https://tomesphere.com/paper/1907.07322