Towards AI Evaluation in Domain-Specific RAG Systems: The AgriHubi Case Study

Md. Toufique Hasan; Ayman Asad Khan; Mika Saari; Vaishnavi Bankhele; Pekka Abrahamsson

arXiv:2602.02208 · cs.CL · February 3, 2026

Open Access

TL;DR

This paper introduces AgriHubi, a Finnish-language agricultural RAG system that improves answer quality and reliability through domain adaptation, explicit source grounding, and iterative refinement, addressing challenges in low-resource language settings.

Contribution

The paper presents a domain-specific RAG system for Finnish-language agriculture that integrates source grounding and user feedback, together with an empirical evaluation and practical insights for low-resource languages.

Findings

01. AgriHubi improves answer completeness and linguistic accuracy.

02. User feedback enhances system reliability and relevance.

03. Trade-offs between response quality and latency are identified.

Abstract

Large language models show promise for knowledge-intensive domains, yet their use in agriculture is constrained by weak grounding, English-centric training data, and limited real-world evaluation. These issues are amplified for low-resource languages, where high-quality domain documentation exists but remains difficult to access through general-purpose models. This paper presents AgriHubi, a domain-adapted retrieval-augmented generation (RAG) system for Finnish-language agricultural decision support. AgriHubi integrates Finnish agricultural documents with open PORO family models and combines explicit source grounding with user feedback to support iterative refinement. Developed over eight iterations and evaluated through two user studies, the system shows clear gains in answer completeness, linguistic accuracy, and perceived reliability. The results also reveal practical trade-offs…
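The abstract does not give implementation details, so the following is only a minimal sketch of what retrieval with explicit source grounding might look like. Everything in it is an assumption for illustration: the `Doc` type, the bag-of-words cosine scoring (a stand-in for whatever retriever the system actually uses), the prompt wording, and the function names are hypothetical, and the call to a PORO model is left out entirely.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    source: str  # hypothetical source label, e.g. a document file name
    text: str


def tokenize(text: str) -> list[str]:
    # Lowercased word tokens; \w also matches letters such as ä and ö.
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[Doc], k: int = 2) -> list[Doc]:
    # Rank documents by similarity to the query; drop those with no overlap.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d.text))), d) for d in docs]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_grounded_prompt(query: str, passages: list[Doc]) -> str:
    # Number each passage so the generated answer can cite it as [n].
    lines = ["Answer using only the numbered sources and cite them as [n]."]
    for i, d in enumerate(passages, start=1):
        lines.append(f"[{i}] ({d.source}) {d.text}")
    lines.append(f"Question: {query}")
    return "\n".join(lines)
```

In a sketch like this, the numbered prompt would be passed to the language model, and user ratings of the cited answer could be logged alongside the retrieved sources to guide later iterations.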

Taxonomy

Topics: Topic Modeling · Information Retrieval and Search Behavior · Multimodal Machine Learning Applications