OIDA-QA: A Multimodal Benchmark for Analyzing the Opioid Industry Documents Archive
Xuan Shen, Brian Wingenroth, Zichao Wang, Jason Kuen, Wanrong Zhu, Ruiyi Zhang, Yiwei Wang, Lichun Ma, Anqi Liu, Hongfu Liu, Tong Sun, Kevin S. Hawkins, Kate Tasker, G. Caleb Alexander, Jiuxiang Gu

TL;DR
This paper introduces OIDA-QA, a multimodal benchmark dataset with accompanying AI models for analyzing complex opioid industry documents, aimed at improving document understanding and question-answering accuracy.
Contribution
The paper builds a large-scale multimodal dataset and develops domain-specific multimodal LLMs tailored to analyzing opioid industry documents.
Findings
Improved document information extraction accuracy
Enhanced question-answering performance with multimodal inputs
Effective use of historical QA pairs for contextual understanding
Abstract
The opioid crisis represents a significant moment in public health that reveals systemic shortcomings across regulatory systems, healthcare practices, corporate governance, and public policy. Analyzing how these interconnected systems simultaneously failed to protect public health requires innovative analytic approaches for exploring the vast amounts of data and documents disclosed in the UCSF-JHU Opioid Industry Documents Archive (OIDA). The complexity, multimodal nature, and specialized characteristics of these healthcare-related legal and corporate documents necessitate more advanced methods and models tailored to specific data types and detailed annotations, ensuring precision and professionalism in the analysis. In this paper, we tackle this challenge by organizing the original dataset according to document attributes and constructing a benchmark with 400k training documents…
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Mental Health via Writing
