SciDFM: A Large Language Model with Mixture-of-Experts for Science
Liangtai Sun, Danyu Luo, Da Ma, Zihan Zhao, Baocai Chen, Zhennan Shen, Su Zhu, Lu Chen, Xin Chen, Kai Yu

TL;DR
SciDFM is a large language model with a mixture-of-experts architecture, trained from scratch on extensive scientific data to improve domain-specific scientific reasoning and understanding of molecules and amino acid sequences. It achieves state-of-the-art results on domain-specific benchmarks.
Contribution
Introduces SciDFM, a mixture-of-experts LLM that is trained from scratch on a large scientific corpus, fine-tuned on instruction data, and capable of domain-specific scientific reasoning.
Findings
Strong performance on scientific benchmarks like SciEval and SciQ
Achieves state-of-the-art results on domain-specific benchmarks
Expert layer analysis shows discipline-dependent expert selection
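To make the last finding concrete, here is a minimal sketch of how discipline-dependent expert selection could be measured. It assumes a generic softmax top-k router, which is a common mixture-of-experts design and is not confirmed by this summary as SciDFM's actual routing scheme. The function names, the choice of k=2, the four experts, and the toy logits are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts.
    This is a generic top-k gate, assumed here for illustration only.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def expert_usage_by_discipline(tokens, k=2):
    """Count how often each expert is selected, grouped by discipline.

    tokens: iterable of (discipline, router_logits) pairs.
    """
    usage = defaultdict(Counter)
    for discipline, logits in tokens:
        for expert, _weight in top_k_route(logits, k):
            usage[discipline][expert] += 1
    return usage


# Made-up router logits for 4 hypothetical experts (not real model outputs).
toy_tokens = [
    ("chemistry", [2.0, 0.1, 1.5, -1.0]),
    ("chemistry", [1.8, 0.0, 1.2, -0.5]),
    ("biology", [-1.0, 2.2, 0.3, 1.9]),
    ("biology", [-0.5, 1.7, 0.0, 2.1]),
]
usage = expert_usage_by_discipline(toy_tokens)
for discipline, counts in usage.items():
    print(discipline, dict(counts))
```

In this toy data the chemistry tokens land only on experts 0 and 2 and the biology tokens only on experts 1 and 3. A per-discipline tally like this is the kind of pattern an expert layer analysis looks for.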
Abstract
Recently, there has been a significant upsurge of interest in leveraging large language models (LLMs) to assist scientific discovery. However, most LLMs focus only on general science and lack domain-specific knowledge, such as chemical molecules and amino acid sequences. To bridge these gaps, we introduce SciDFM, a mixture-of-experts LLM that is trained from scratch and is able to conduct college-level scientific reasoning and understand molecules and amino acid sequences. We collect a large-scale training corpus containing numerous scientific papers and books from different disciplines as well as data from domain-specific databases. We further fine-tune the pre-trained model on a large amount of instruction data to improve performance on downstream benchmarks. From experimental results, we show that SciDFM achieves strong performance on general scientific benchmarks such as SciEval and…
Taxonomy
Topics: Topic Modeling
Methods: Focus
