Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding
Balaji Muralidharan, Hayden Beadles, Reza Marzban, Kalyan Sashank Mupparaju

TL;DR
This paper presents Knowledge AI, a framework that fine-tunes large language models on scientific data to improve their ability to perform NLP tasks like summarization, question answering, and entity recognition for scientific knowledge extraction.
Contribution
It introduces a deep learning framework that adapts pre-trained LLMs for scientific NLP tasks, enhancing knowledge extraction and understanding in scientific domains.
Findings
Domain-specific fine-tuning improves model performance across tasks.
Fine-tuned models enable non-experts to extract scientific information efficiently.
The framework demonstrates potential for scientific knowledge discovery.
Abstract
This project investigates the efficacy of Large Language Models (LLMs) in understanding and extracting scientific knowledge across specific domains and introduces a deep learning framework, Knowledge AI. As part of this framework, we employ pre-trained models and fine-tune them on datasets in the scientific domain. The models are adapted for four key Natural Language Processing (NLP) tasks: summarization, text generation, question answering, and named entity recognition. Our results indicate that domain-specific fine-tuning significantly enhances model performance in each of these tasks, thereby improving their applicability to scientific contexts. This adaptation enables non-experts to efficiently query and extract information within targeted scientific fields, demonstrating the potential of fine-tuned LLMs as a tool for knowledge discovery in the sciences.
Taxonomy
Topics: Semantic Web and Ontologies · Topic Modeling · Scientific Computing and Data Management
