Efficient Learning Content Retrieval with Knowledge Injection
Batuhan Sariturk, Rabia Bayraktar, Merve Elmas Erdem

TL;DR
This paper presents a resource-efficient domain-specific chatbot for educational content retrieval, combining fine-tuned Phi language models with a RAG system to improve accuracy in ICT learning environments.
Contribution
It introduces a novel approach integrating Phi models fine-tuned with QLoRA and a RAG system for effective educational content retrieval in a resource-limited setting.
Findings
Phi-2 model with RAG achieved 0.84 precision
F1 score of 0.82 for Phi-2 with RAG
Multiple evaluation metrics used to compare models
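As a reminder of how the precision and F1 figures above relate, here is a minimal sketch of the standard count-based definitions. It assumes the paper's scores follow these definitions, which this summary does not confirm. The function name and the counts are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not the paper's data).
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall, so an F1 slightly below the precision suggests a recall that is lower still.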
Abstract
With the rise of online education platforms, there is a growing abundance of educational content across various domains. It can be difficult to navigate the numerous available resources to find the most suitable training, especially in domains that include many interconnected areas, such as ICT. In this study, we propose a domain-specific chatbot application that requires limited resources, utilizing versions of the Phi language model to help learners with educational content. In the proposed method, Phi-2 and Phi-3 models were fine-tuned using QLoRA. The data required for fine-tuning was obtained from the Huawei Talent Platform, where courses are available at different levels of expertise in the field of computer science. A RAG system was used to support the model, which was fine-tuned on 500 Q&A pairs. Additionally, a total of 420 Q&A pairs of content were extracted from different…
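The retrieval step of a RAG pipeline like the one described in the abstract can be sketched as follows. This is a toy illustration, not the authors' implementation: it uses a bag-of-words cosine similarity in place of whatever embedding model and vector store the paper used, and the passages, function names, and prompt template are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    q = tokenize(query)
    ranked = sorted(passages, key=lambda p: cosine(q, tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str], k: int = 2) -> str:
    """Prepend the retrieved passages to the question before calling the model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical course snippets standing in for the extracted content.
PASSAGES = [
    "The cloud computing basics course introduces virtual machines and storage.",
    "The networking fundamentals course covers routing and switching.",
    "The machine learning course explains supervised learning and neural networks.",
]

prompt = build_prompt("Which course covers routing?", PASSAGES, k=1)
print(prompt)
```

In a full system, the resulting prompt would be passed to the fine-tuned language model, so that the answer is grounded in retrieved course content rather than in the model's weights alone.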
Taxonomy
Topics: Educational Technology and Assessment · Natural Language Processing Techniques
Methods: Linear Layer · Softmax · Linear Warmup With Linear Decay · Multi-Head Attention · Byte Pair Encoding · WordPiece · Dropout · Dense Connections · Layer Normalization
