OntoTune: Ontology-Driven Self-training for Aligning Large Language Models

Zhiqiang Liu; Chengtao Gan; Junjie Wang; Yichi Zhang; Zhongpu Bo; Mengshu Sun; Huajun Chen; Wen Zhang

arXiv:2502.05478·cs.CL·February 11, 2025·2 cites

TL;DR

OntoTune is a novel ontology-driven self-training framework that aligns large language models with hierarchical domain knowledge, improving their understanding and response quality in specialized fields like medicine.

Contribution

It introduces an ontology-based self-training method leveraging in-context learning to enhance LLM domain knowledge with reduced data costs.

Findings

01. Achieves state-of-the-art results in medical-domain QA.

02. Improves hypernym discovery within the ontology.

03. Better preserves original LLM knowledge compared to existing methods.

Abstract

Existing domain-specific Large Language Models (LLMs) are typically developed by fine-tuning general-purpose LLMs on large-scale domain-specific corpora. However, training on large-scale corpora often fails to organize the LLM's domain knowledge effectively, leading to fragmented understanding. Inspired by how humans connect concepts and organize knowledge through mind maps, we aim to emulate this approach by using an ontology with hierarchical conceptual knowledge to reorganize the LLM's domain knowledge. From this perspective, we propose an ontology-driven self-training framework called OntoTune, which aims to align LLMs with an ontology through in-context learning, enabling the generation of responses guided by the ontology. We leverage in-context learning to identify whether the LLM has acquired a specific concept's ontology knowledge, and select the entries not yet mastered by the LLM as the…
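The selection step described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here and does not state the actual criterion, so everything below is an assumption made for illustration: the function names (`build_prompt`, `select_unmastered`), the string-similarity test, the 0.8 threshold, the toy ontology entries, and the stub model `toy_generate` are all hypothetical and are not taken from the paper or its repository. The idea shown is only the general one: answer each concept question with and without the ontology entry in context, and treat a large gap between the two answers as a sign that the concept is not yet mastered.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List


def build_prompt(question: str, ontology_entry: str = "") -> str:
    """Optionally prepend an ontology entry as in-context knowledge."""
    if ontology_entry:
        return f"Ontology knowledge:\n{ontology_entry}\n\nQuestion: {question}"
    return f"Question: {question}"


def select_unmastered(
    concepts: Dict[str, str],
    generate: Callable[[str], str],
    threshold: float = 0.8,  # hypothetical cut-off, not from the paper
) -> List[dict]:
    """Return self-training pairs for concepts whose answer changes
    substantially once the ontology entry is placed in context."""
    selected = []
    for name, entry in concepts.items():
        question = f"Describe the concept '{name}'."
        plain_prompt = build_prompt(question)
        plain = generate(plain_prompt)
        guided = generate(build_prompt(question, entry))
        # Assumption: character-level similarity stands in for whatever
        # consistency measure the real method uses.
        similarity = SequenceMatcher(None, plain, guided).ratio()
        if similarity < threshold:
            # Pair the plain prompt with the ontology-guided answer.
            selected.append(
                {"concept": name, "prompt": plain_prompt,
                 "target": guided, "similarity": similarity}
            )
    return selected


def toy_generate(prompt: str) -> str:
    """Stub standing in for an LLM call; it 'knows' aspirin only."""
    if "aspirin" in prompt:
        return "Aspirin is a nonsteroidal anti-inflammatory drug."
    if prompt.startswith("Ontology knowledge:"):
        return "Metformin is a biguanide, a kind of hypoglycemic agent."
    return "Metformin is a medicine."


# Toy ontology entries, invented for this example.
toy_concepts = {
    "aspirin": "aspirin is-a nonsteroidal anti-inflammatory drug",
    "metformin": "metformin is-a biguanide; biguanide is-a hypoglycemic agent",
}

pairs = select_unmastered(toy_concepts, toy_generate)
for pair in pairs:
    print(pair["concept"], "->", pair["target"])
```

In this toy run only the concept whose answer shifts when the ontology entry is supplied is kept, and its ontology-guided answer is paired with the plain prompt as a candidate self-training example. A real setup would replace `toy_generate` with an actual model call and would likely use a stronger consistency measure than character overlap.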

Code & Models

Repositories

zjukg/ontotune
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Ontology · ALIGN