ScienceDB AI: An LLM-Driven Agentic Recommender System for Large-Scale Scientific Data Sharing Services
Qingqing Long, Haotian Chen, Chenyang Zhao, Xiaolei Du, Xuezhi Wang, Pengyao Wang, Chengzan Li, Yuanchun Zhou, Hengshu Zhu

TL;DR
ScienceDB AI is an LLM-based conversational recommender system designed to improve scientific dataset sharing by understanding complex queries, extracting research intentions, and providing trustworthy, personalized dataset recommendations on a large-scale platform.
Contribution
The paper introduces ScienceDB AI, the first LLM-driven agentic recommender system tailored for scientific data sharing, with novel modules for intention extraction, dialogue management, and trustworthy retrieval.
Findings
Demonstrated significant effectiveness in offline and online experiments.
Successfully recommended datasets aligned with researchers' scientific intents.
Enhanced trustworthiness and reproducibility through CSTR identifiers.
Abstract
The rapid growth of AI for Science (AI4S) has underscored the significance of scientific datasets, leading to the establishment of numerous national scientific data centers and sharing platforms. Despite this progress, efficiently promoting dataset sharing and utilization for scientific research remains challenging. Scientific datasets contain intricate domain-specific knowledge and contexts, rendering traditional collaborative filtering-based recommenders inadequate. Recent advances in Large Language Models (LLMs) offer unprecedented opportunities to build conversational agents capable of deep semantic understanding and personalized recommendations. In response, we present ScienceDB AI, a novel LLM-driven agentic recommender system developed on Science Data Bank (ScienceDB), one of the largest global scientific data-sharing platforms. ScienceDB AI leverages natural language…
Taxonomy
Topics: Machine Learning in Materials Science · Scientific Computing and Data Management · Topic Modeling
