Efficient Learning Content Retrieval with Knowledge Injection

Batuhan Sariturk; Rabia Bayraktar; Merve Elmas Erdem

arXiv:2412.00125·cs.CL·December 3, 2024

Open Access

TL;DR

This paper presents a resource-efficient domain-specific chatbot for educational content retrieval, combining fine-tuned Phi language models with a RAG system to improve accuracy in ICT learning environments.

Contribution

It introduces a novel approach integrating Phi models fine-tuned with QLoRA and a RAG system for effective educational content retrieval in a resource-limited setting.

Findings

01

Phi-2 model with RAG achieved 0.84 precision

02

F1 score of 0.82 for Phi-2 with RAG

03

Multiple evaluation metrics used to compare models

Abstract

With the rise of online education platforms, there is a growing abundance of educational content across various domains. It can be difficult to navigate the numerous available resources to find the most suitable training, especially in domains that include many interconnected areas, such as ICT. In this study, we propose a domain-specific chatbot application that requires limited resources, utilizing versions of the Phi language model to help learners with educational content. In the proposed method, Phi-2 and Phi-3 models were fine-tuned using QLoRA. The data required for fine-tuning was obtained from the Huawei Talent Platform, where courses are available at different levels of expertise in the field of computer science. A RAG system was used to support the model, which was fine-tuned on 500 Q&A pairs. Additionally, a total of 420 Q&A pairs of content were extracted from different…
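The retrieval step of a RAG system like the one the abstract mentions can be illustrated with a minimal sketch. This is not the authors' implementation: the toy knowledge base, the bag-of-words cosine similarity, and the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are all assumptions made for illustration. A real system would more likely use dense embeddings and pass the resulting prompt to the fine-tuned Phi model.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; the paper's actual Q&A pairs are not reproduced here.
KNOWLEDGE_BASE = [
    ("What is cloud computing?",
     "Cloud computing delivers computing resources over the internet on demand."),
    ("What is a neural network?",
     "A neural network is a model built from layers of connected units that learn from data."),
    ("What is 5G?",
     "5G is the fifth generation of mobile network technology."),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=1):
    """Return the k Q&A pairs most similar to the query."""
    q_vec = Counter(tokenize(query))
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda qa: cosine(q_vec, Counter(tokenize(qa[0] + " " + qa[1]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, k=1):
    """Prepend the retrieved Q&A pairs as context for a language model."""
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("Explain cloud computing"))
```

In this sketch the retrieved pairs are injected into the prompt as plain text, which is the simplest form of the "knowledge injection" the title refers to. How the authors index, rank, and format retrieved content is not stated in the truncated abstract.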

Taxonomy

Topics: Educational Technology and Assessment · Natural Language Processing Techniques

Methods: Linear Layer · Softmax · Linear Warmup With Linear Decay · Multi-Head Attention · Byte Pair Encoding · WordPiece · Dropout · Dense Connections · Layer Normalization