A framework for information extraction from tables in biomedical literature

Nikola Milosevic; Cassie Gregson; Robert Hernandez; Goran Nenadic

arXiv:1902.10031·cs.CL·February 27, 2019

TL;DR

This paper introduces an integrated framework for extracting both numerical and textual information from biomedical tables, addressing a gap in text mining approaches that often ignore tabular data.

Contribution

It presents a comprehensive 7-step methodology for extracting structured data from clinical literature tables, which improves upon previous isolated or less systematic methods.

Findings

1. F-measure ranged between 82% and 92% depending on task

2. Effective extraction of numerical and textual data from biomedical tables

3. Addresses complexities and challenges in table data mining
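
For readers unfamiliar with the metric in the first finding, the sketch below shows how an F-measure is computed from extraction counts. It assumes the balanced F1 variant (beta = 1), which this page does not confirm; the function name `f_measure` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """Return the F-beta score for a set of extraction decisions.

    true_pos:  extracted values that match the gold standard
    false_pos: extracted values that are wrong or spurious
    false_neg: gold-standard values the system missed
    """
    # Precision: share of extracted values that are correct.
    extracted = true_pos + false_pos
    precision = true_pos / extracted if extracted else 0.0

    # Recall: share of gold-standard values that were found.
    expected = true_pos + false_neg
    recall = true_pos / expected if expected else 0.0

    # Avoid dividing by zero when nothing was extracted correctly.
    if precision == 0.0 and recall == 0.0:
        return 0.0

    # Weighted harmonic mean of precision and recall (beta = 1 gives F1).
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only.
balanced = f_measure(82, 18, 18)   # precision and recall are both 0.82
skewed = f_measure(90, 10, 30)     # precision 0.90, recall 0.75
```

Because the score is a harmonic mean, it sits closer to the lower of precision and recall, so a system cannot reach a high F-measure by excelling at only one of the two.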

Abstract

The scientific literature is growing exponentially, and professionals are no longer able to cope with the current volume of publications. Text mining has provided methods to retrieve and extract information from text; however, most of these approaches ignored tables and figures. Research on mining table data still lacks an integrated approach that considers all the complexities and challenges of a table. Our research examines methods for extracting numerical (number of patients, age, gender distribution) and textual (adverse reactions) information from tables in the clinical literature. We present a requirement analysis template and an integral methodology for information extraction from tables in the clinical domain that contains 7 steps: (1) table detection, (2) functional processing, (3) structural processing, (4) semantic tagging, (5) pragmatic…

Code & Models

Repositories

nikolamilosevic86/TabInOut (official)
