Evaluating the Impact of Advanced LLM Techniques on AI-Lecture Tutors for a Robotics Course
Sebastian Kahl, Felix Löffler, Martin Maciol, Fabian Ridder, Marius Schmitz, Jennifer Spanagel, Jens Wienkamp, Christopher Burgahn, Malte Schilling

TL;DR
This paper evaluates how advanced techniques like prompt engineering, RAG, and fine-tuning improve LLM-based AI tutors for university courses, highlighting benefits and challenges in educational applications.
Contribution
It demonstrates that RAG combined with prompt engineering significantly improves LLM responses and discusses the limitations of current evaluation metrics in educational contexts.
Findings
RAG with prompt engineering enhances factual accuracy
Fine-tuning produces strong but potentially overfitted models
Similarity metrics correlate with performance but favor shorter responses
Abstract
This study evaluates the performance of Large Language Models (LLMs) as Artificial Intelligence-based tutors for a university course. In particular, it applies several advanced techniques: prompt engineering, Retrieval-Augmented Generation (RAG), and fine-tuning. We assessed the different models and applied techniques using common similarity metrics like BLEU-4, ROUGE, and BERTScore, complemented by a small human evaluation of helpfulness and trustworthiness. Our findings indicate that RAG combined with prompt engineering significantly enhances model responses and produces better factual answers. In the context of education, RAG appears to be an ideal technique because it enriches the model's input with additional information and material that is usually already available for a university course. Fine-tuning, on the other hand, can produce quite small, still…
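The abstract describes RAG as enriching the model's input with existing course material. The following is a minimal sketch of that idea only, not the authors' pipeline: the lecture snippets, the prompt wording, and the helper names (`overlap_score`, `retrieve`, `build_prompt`) are all hypothetical, and simple token overlap stands in for whatever retriever the paper actually used.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, passage):
    """Count tokens shared by query and passage (assumed stand-in for a real retriever)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(passage))
    return sum(shared.values())

def retrieve(query, passages, k=2):
    """Return the k passages with the highest token overlap with the query."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=2):
    """Prepend the retrieved course material to the question (hypothetical prompt wording)."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "You are a tutor for a robotics course. "
        "Answer using only the course material below.\n"
        f"Course material:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )

# Invented lecture snippets, for illustration only.
LECTURE_NOTES = [
    "Forward kinematics maps joint angles to the pose of the end effector.",
    "Inverse kinematics computes joint angles that reach a desired end effector pose.",
    "A PID controller combines proportional, integral, and derivative terms.",
]

prompt = build_prompt("What does inverse kinematics compute?", LECTURE_NOTES, k=1)
print(prompt)
```

The enriched prompt would then be passed to the LLM in place of the bare question. A real system would likely swap the overlap score for embedding-based retrieval over the full course material.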
Taxonomy
Topics: AI in Service Interactions · Robotic Process Automation Applications
Methods: Linear Layer · Attention Dropout · WordPiece · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay · Attention Is All You Need · Weight Decay · Adam
