WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain
Rounak Meyur, Hung Phan, Sridevi Wagle, Jan Strube, Mahantesh Halappanavar, Sameera Horawalavithana, Anurag Acharya, Sai Munikoti

TL;DR
This paper introduces WeQA, a benchmark for evaluating retrieval-augmented generation systems in the wind energy domain, supporting rigorous assessment of how LLMs handle complex scientific documents.
Contribution
It presents a comprehensive framework for creating domain-specific RAG benchmarks using automatic question-answer generation with expert-AI collaboration, demonstrated through the wind energy case study.
Findings
First domain-specific RAG benchmark for wind energy.
Framework enables systematic evaluation of RAG performance.
Supports identification of improvement areas in scientific NLP applications.
Abstract
Wind energy project assessments present significant challenges for decision-makers, who must navigate and synthesize hundreds of pages of environmental and scientific documentation. These documents often span different regions and project scales, covering multiple domains of expertise. This process traditionally demands immense time and specialized knowledge from decision-makers. The advent of Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) approaches offers a transformative solution, enabling rapid, accurate cross-document information retrieval and synthesis. As the landscape of Natural Language Processing (NLP) and text generation continues to evolve, benchmarking becomes essential to evaluate and compare the performance of different RAG-based LLMs. In this paper, we present a comprehensive framework to generate a domain-relevant RAG benchmark. Our framework is…
Taxonomy
Topics: Power Systems and Technologies
Methods: Linear Layer · WordPiece · Residual Connection · Multi-Head Attention · Linear Warmup With Linear Decay · Attention Dropout · Adam · Layer Normalization · Weight Decay
