KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction
Shao Zhang, Yuting Jia, Hui Xu, Dakuo Wang, Toby Jia-jun Li, Ying Wen, Xinbing Wang, Chenghu Zhou

TL;DR
KnowledgeShovel is an AI-assisted document annotation system that facilitates the construction of scientific knowledge bases by integrating human expertise with AI in a multi-modal, multi-step workflow, improving accuracy and reducing effort.
Contribution
It introduces a novel multi-modal, human-AI collaborative pipeline tailored for scientific knowledge base construction, addressing challenges of diverse data and domain complexity.
Findings
Enables efficient knowledge base construction with satisfactory accuracy.
Reduces human effort in annotating complex scientific literature.
Supports multi-modal data integration in scientific annotation workflows.
Abstract
Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling AI-driven scientific discovery. However, the current process is difficult, error-prone, and laborious due to (1) the enormous amount of scientific literature available; (2) the highly specialized scientific domains; (3) the diverse modalities of information (text, figure, table); and (4) the silos of scientific knowledge in different publications with inconsistent formats and structures. Informed by a formative study and iterated with participatory design workshops, we designed and developed KnowledgeShovel, an AI-in-the-Loop document annotation system for researchers to construct scientific knowledge bases. The design of KnowledgeShovel introduces a multi-step multi-modal human-AI collaboration pipeline that aligns with users'…
Taxonomy
Topics: Scientific Computing and Data Management · Semantic Web and Ontologies · Research Data Management Practices
Methods: Balanced Selection
