Biomedical Data-to-Text Generation via Fine-Tuning Transformers
Ruslan Yermakov, Nicholas Drago, Angelo Ziletti

TL;DR
This paper demonstrates that fine-tuned transformer models can generate realistic biomedical package leaflets from data, introduces a new dataset for benchmarking, and discusses current limitations in biomedical data-to-text generation.
Contribution
The paper applies fine-tuned transformers to biomedical data-to-text generation and releases a new dataset, BioLeaflets, for future research in this area.
Findings
Transformers can generate realistic biomedical texts from data.
The models have notable limitations in the biomedical domain.
A new dataset, BioLeaflets, is introduced for benchmarking.
Abstract
Data-to-text (D2T) generation in the biomedical domain is a promising, yet mostly unexplored, field of research. Here, we apply neural models for D2T generation to a real-world dataset consisting of package leaflets of European medicines. We show that fine-tuned transformers are able to generate realistic, multi-sentence text from data in the biomedical domain, yet have important limitations. We also release a new dataset (BioLeaflets) for benchmarking D2T generation models in the biomedical domain.
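The abstract does not say how structured data is presented to the model. As a rough illustration only, a common step in D2T pipelines is to flatten the structured input into a single tagged string that a sequence-to-sequence transformer can consume. The sketch below assumes the input is a list of (entity type, value) pairs plus a section name; the function name `linearize`, the tag format, and the example values are all hypothetical and are not taken from the paper or from BioLeaflets.

```python
from typing import List, Tuple


def linearize(records: List[Tuple[str, str]], section: str) -> str:
    """Flatten (entity_type, value) pairs into one tagged input string.

    Assumption: the tag layout below is illustrative, not the paper's format.
    """
    # Wrap each value in opening/closing tags named after its entity type.
    parts = [f"<{etype}> {value} </{etype}>" for etype, value in records]
    # Prefix the target section so the model knows which text to produce.
    return f"section: {section} | " + " ".join(parts)


# Hypothetical input; real leaflet data would supply the entities.
example = linearize(
    [("PRODUCT_NAME", "ExampleDrug"), ("PROBLEM", "high blood pressure")],
    section="what the medicine is used for",
)
print(example)
```

A string like this would serve as the source side of a fine-tuning pair, with the corresponding leaflet section as the target text.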
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
