ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema

Fei Wang; Yuewen Zheng; Qin Li; Jingyi Wu; Pengfei Li; Luxia Zhang

arXiv:2407.18716·cs.CL·July 29, 2024

Open Access

TL;DR

ChatSchema leverages Large Multimodal Models and schema-based OCR to accurately extract and structure information from unstructured medical reports, improving data standardization and entry efficiency.

Contribution

This paper introduces ChatSchema, a novel two-stage method combining LMMs and schema-guided OCR for structured information extraction from medical reports.
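The two-stage idea (classify the report scenario, then extract fields according to that scenario's schema) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `SCHEMAS` table, the field names, the prompts, and the `ask_model` callable are all hypothetical, and `fake_model` is a stand-in for a real LMM call with OCR input.

```python
import json
from typing import Callable, Dict

# Hypothetical schemas: scenario name -> {field name: expected Python type}.
SCHEMAS: Dict[str, Dict[str, type]] = {
    "blood_test": {"hemoglobin": float, "unit": str},
    "urine_test": {"protein": str},
}


def classify(report_text: str, ask_model: Callable[[str], str]) -> str:
    """Stage 1: ask the model which known scenario the report belongs to."""
    prompt = f"Choose one of {sorted(SCHEMAS)} for this report:\n{report_text}"
    answer = ask_model(prompt).strip()
    if answer not in SCHEMAS:
        raise ValueError(f"unknown scenario: {answer!r}")
    return answer


def extract(report_text: str, scenario: str, ask_model: Callable[[str], str]) -> dict:
    """Stage 2: request JSON for the scenario's schema, then standardize it."""
    schema = SCHEMAS[scenario]
    prompt = f"Return JSON with keys {sorted(schema)} from this report:\n{report_text}"
    raw = json.loads(ask_model(prompt))
    result = {}
    for key, expected_type in schema.items():
        value = raw.get(key)
        # Keep only schema keys and coerce each value to the schema's type.
        result[key] = expected_type(value) if value is not None else None
    return result


def run_pipeline(report_text: str, ask_model: Callable[[str], str]) -> dict:
    scenario = classify(report_text, ask_model)
    return {"scenario": scenario, "fields": extract(report_text, scenario, ask_model)}


def fake_model(prompt: str) -> str:
    """Stand-in for an LMM; returns canned answers so the sketch is runnable."""
    if prompt.startswith("Choose"):
        return "blood_test"
    return json.dumps({"hemoglobin": "13.5", "unit": "g/dL", "extra": "ignored"})


result = run_pipeline("Hb 13.5 g/dL", fake_model)
```

Passing the model as a callable keeps the sketch independent of any particular LMM API; keys outside the schema (such as `extra` here) are dropped during standardization.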

Findings

01 GPT-4o outperformed Gemini 1.5 Pro in extraction tasks.

02 Key extraction achieved 98.6% F1-score.

03 Significant accuracy improvements over baseline methods.

Abstract

Objective: This study introduces ChatSchema, an effective method for extracting and structuring information from unstructured data in medical paper reports using a combination of Large Multimodal Models (LMMs) and Optical Character Recognition (OCR), guided by a schema. By integrating a predefined schema, we intend to enable LMMs to directly extract and standardize information according to the schema specifications, facilitating further data entry.

Method: Our approach involves a two-stage process: classification, which categorizes report scenarios, and extraction, which structures the information. We established and annotated a dataset to verify the effectiveness of ChatSchema, and evaluated key extraction using precision, recall, F1-score, and accuracy metrics. Based on key extraction, we further assessed value extraction. We conducted ablation studies on two LMMs to illustrate the…
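The key-extraction metrics named in the abstract can be illustrated with a short sketch. The abstract does not give the exact definitions used in the paper, so the formulas below are the standard set-based precision, recall, and F1 over extracted keys; treating value accuracy as the share of correctly matched keys whose values equal the annotation is an assumption made here for illustration, and all function names are hypothetical.

```python
def key_metrics(predicted: dict, gold: dict) -> dict:
    """Set-based precision, recall, and F1 over extracted keys."""
    pred_keys, gold_keys = set(predicted), set(gold)
    true_positives = len(pred_keys & gold_keys)
    precision = true_positives / len(pred_keys) if pred_keys else 0.0
    recall = true_positives / len(gold_keys) if gold_keys else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def value_accuracy(predicted: dict, gold: dict) -> float:
    """Assumed definition: among keys found in both, the share with equal values."""
    shared = set(predicted) & set(gold)
    if not shared:
        return 0.0
    correct = sum(predicted[key] == gold[key] for key in shared)
    return correct / len(shared)


# Toy example: two of three predicted keys are annotated; one shared value differs.
predicted = {"a": 1, "b": 2, "c": 3}
gold = {"a": 1, "b": 5, "d": 4}
metrics = key_metrics(predicted, gold)
accuracy = value_accuracy(predicted, gold)
```

In this toy example, two of three predicted keys match the annotation, so precision, recall, and F1 are each 2/3, and one of the two shared keys has the correct value, giving a value accuracy of 0.5.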

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Topic Modeling