Material Database Agent: A Multimodal Agentic Framework for Scientific Literature Mining
Achuth Chandrasekhar, Omid Barati Farimani, Radheesh Sharma Meda, Amir Barati Farimani

TL;DR
Material Database Agent (MDA) is a modular, multi-agent system that leverages multimodal large language models to efficiently convert scientific literature into structured material databases.
Contribution
MDA introduces a multi-agent architecture that converts research articles into structured databases, aiming to improve scalability and accuracy over manual database construction.
Findings
MDA can process PDFs into structured databases with high accuracy.
The system operates in parallel, assembling sub-databases from text and figures.
MDA demonstrates potential for automating scientific literature mining in materials science.
Abstract
Materials science workflows rely on structured and unstructured data from the vast body of available scientific literature. However, most of the experimental details remain buried in text, tables, graphs and figures. Thus, constructing databases that incorporate this data is a manual, time-consuming, and hard-to-scale process. Multimodal large language models have made it feasible to extract information from text and scientific figures with high speed and accuracy. This opens the possibility of an AI system that can create production-scale material databases. Material Database Agent (MDA) is a modular, multi-agent system architecture for converting research literature into structured databases. MDA accepts article PDFs as input, which are subsequently processed in parallel into markdown files and figures. Multiple sub-agents read these markdown files and figures in parallel to assemble…
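The parallel flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `ParsedArticle`, `text_agent`, `figure_agent`, and `build_database` are hypothetical names, the PDF-to-markdown step is assumed to have already run, and the multimodal LLM calls are replaced by trivial placeholder logic so that the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParsedArticle:
    """Hypothetical stand-in for one converted PDF: markdown text plus figure captions."""
    article_id: str
    markdown: str
    figures: List[str] = field(default_factory=list)


def text_agent(article: ParsedArticle) -> List[Dict[str, str]]:
    """Placeholder for an LLM sub-agent that reads markdown.

    Assumption: records appear as lines of the form 'material: property = value'.
    A real agent would prompt a language model instead of splitting strings.
    """
    records = []
    for line in article.markdown.splitlines():
        if ":" in line and "=" in line:
            material, rest = line.split(":", 1)
            prop, value = rest.split("=", 1)
            records.append({
                "source": article.article_id,
                "modality": "text",
                "material": material.strip(),
                "property": prop.strip(),
                "value": value.strip(),
            })
    return records


def figure_agent(article: ParsedArticle) -> List[Dict[str, str]]:
    """Placeholder for a multimodal sub-agent that reads figures.

    Here it only records which figure each row would come from.
    """
    return [
        {"source": article.article_id, "modality": "figure", "figure": caption}
        for caption in article.figures
    ]


def build_database(articles: List[ParsedArticle], max_workers: int = 4) -> List[Dict[str, str]]:
    """Run both sub-agents over all articles in parallel and merge the sub-databases."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit both kinds of work before collecting, so they overlap.
        text_results = pool.map(text_agent, articles)
        figure_results = pool.map(figure_agent, articles)
        text_subdbs = list(text_results)
        figure_subdbs = list(figure_results)

    database: List[Dict[str, str]] = []
    for text_rows, figure_rows in zip(text_subdbs, figure_subdbs):
        database.extend(text_rows)
        database.extend(figure_rows)
    return database
```

A thread pool is used here on the assumption that the real sub-agents would spend most of their time waiting on model API calls; `pool.map` returns results in input order, so the merged rows stay grouped by article.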
