KALE-LM-Chem: Vision and Practice Toward an AI Brain for Chemistry
Weichen Dai, Yezeng Chen, Zijie Dai, Yubo Liu, Zhijie Huang, Yixuan Pan, Baiyang Song, Chengli Zhong, Xinhe Li, Zeyu Wang, Zhuoying Feng, Yi Zhou

TL;DR
This paper proposes a vision for an AI chemical brain leveraging large language models, focusing on core capabilities like information extraction, semantic parsing, knowledge-based QA, and reasoning to accelerate scientific discovery.
Contribution
Introduction of KALE-LM-Chem models, the first large language models for chemistry, demonstrating outstanding performance in chemical tasks and advancing domain-specific AI.
Findings
KALE-LM-Chem models achieve high accuracy in chemical tasks
The system demonstrates effective knowledge extraction and reasoning
Promotes AI-driven scientific discovery in chemistry
Abstract
Recent advancements in large language models (LLMs) have demonstrated strong potential for enabling domain-specific intelligence. In this work, we present our vision for building an AI-powered chemical brain, which frames chemical intelligence around four core capabilities: information extraction, semantic parsing, knowledge-based QA, and reasoning & planning. We argue that domain knowledge and logic are essential pillars for enabling such a system to assist and accelerate scientific discovery. To initiate this effort, we introduce our first generation of large language models for chemistry: KALE-LM-Chem and KALE-LM-Chem-1.5, which have achieved outstanding performance in tasks related to the field of chemistry. We hope that our work serves as a strong starting point, helping to realize more intelligent AI and promoting the advancement of human science and technology, as well as…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
- 🤗USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-8Bmodel· 11 dl· ♡ 611 dl♡ 6
- 🤗USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-1.5-8Bmodel· 579 dl· ♡ 2579 dl♡ 2
- 🤗RichardErkhov/USTC-KnowledgeComputingLab_-_Llama3-KALE-LM-Chem-1.5-8B-ggufmodel· 217 dl217 dl
- 🤗RichardErkhov/USTC-KnowledgeComputingLab_-_Llama3-KALE-LM-Chem-1.5-8B-8bitsmodel· 5 dl5 dl
- 🤗RichardErkhov/USTC-KnowledgeComputingLab_-_Llama3-KALE-LM-Chem-1.5-8B-awqmodel· 1 dl1 dl
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsComputational Physics and Python Applications
MethodsSoftmax · Attention Is All You Need
