GPT-4 as Evaluator: Evaluating Large Language Models on Pest Management in Agriculture
Shanglong Yang, Zhipeng Yuan, Shunbao Li, Ruoling Peng, Kang Liu, Po Yang

TL;DR
This study shows that GPT-4 can both generate and evaluate pest management advice in agriculture, outperforming the FLAN models in most evaluation categories and reaching 72% accuracy in suggested pest control actions.
Contribution
The paper introduces a novel approach using GPT-4 as an evaluator for agricultural LLM outputs and assesses their effectiveness in pest management advice.
Findings
GPT-4 outperforms FLAN models in most evaluation categories.
Instruction-based prompts with domain knowledge improve LLM performance.
Achieved 72% accuracy in pest management suggestions.
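One way to read the accuracy figure is as the share of cases in which the LLM's recommended action agrees with a reference decision. The sketch below is illustrative only: the `Case` fields, the simple "act when the count reaches the threshold" rule, and the sample numbers are assumptions made for this example, not the paper's expert system or data.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One pest-management scenario (all fields are hypothetical)."""
    pest_count: float   # observed pests per sampling unit
    threshold: float    # action threshold for this crop/pest pair
    llm_action: bool    # True if the LLM advised taking control action


def reference_action(case: Case) -> bool:
    """Assumed baseline rule: act once the count reaches the threshold."""
    return case.pest_count >= case.threshold


def accuracy(cases: list[Case]) -> float:
    """Fraction of cases where the LLM agrees with the reference rule."""
    if not cases:
        raise ValueError("at least one case is required")
    hits = sum(reference_action(c) == c.llm_action for c in cases)
    return hits / len(cases)


# Made-up cases for illustration; not taken from the paper.
cases = [
    Case(pest_count=12, threshold=10, llm_action=True),
    Case(pest_count=3, threshold=10, llm_action=False),
    Case(pest_count=15, threshold=10, llm_action=False),
    Case(pest_count=8, threshold=10, llm_action=True),
]
print(f"accuracy = {accuracy(cases):.2f}")
```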
Abstract
In the rapidly evolving field of artificial intelligence (AI), the application of large language models (LLMs) in agriculture, particularly in pest management, remains nascent. We aimed to prove the feasibility by evaluating the content of the pest management advice generated by LLMs, including the Generative Pre-trained Transformer (GPT) series from OpenAI and the FLAN series from Google. Considering the context-specific properties of agricultural advice, automatically measuring or quantifying the quality of text generated by LLMs becomes a significant challenge. We proposed an innovative approach, using GPT-4 as an evaluator, to score the generated content on Coherence, Logical Consistency, Fluency, Relevance, Comprehensibility, and Exhaustiveness. Additionally, we integrated an expert system based on crop threshold data as a baseline to obtain scores for Factual Accuracy on whether…
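The LLM-as-evaluator idea described in the abstract can be sketched as two steps: build a rubric prompt that names the six criteria, then parse the evaluator's reply into numeric scores. This is a minimal sketch under stated assumptions: the prompt wording, the 1 to 5 scale, the `Criterion: score` reply format, and the function names are all hypothetical and are not taken from the paper. No model is called here; the reply string would come from whichever evaluator model is used.

```python
import re

# The six criteria named in the abstract.
CRITERIA = [
    "Coherence",
    "Logical Consistency",
    "Fluency",
    "Relevance",
    "Comprehensibility",
    "Exhaustiveness",
]


def build_eval_prompt(question: str, answer: str, scale_max: int = 5) -> str:
    """Build a rubric prompt (wording and scale are assumptions)."""
    rubric = "\n".join(f"- {name}" for name in CRITERIA)
    return (
        "You are evaluating pest management advice.\n"
        f"Score the answer from 1 to {scale_max} on each criterion:\n"
        f"{rubric}\n"
        "Reply with one line per criterion in the form 'Criterion: score'.\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
    )


def parse_scores(reply: str, scale_max: int = 5) -> dict[str, int]:
    """Extract one integer score per criterion from the evaluator's reply."""
    scores = {}
    for name in CRITERIA:
        pattern = rf"^{re.escape(name)}[ \t]*:[ \t]*(\d+)[ \t]*$"
        match = re.search(pattern, reply, flags=re.MULTILINE | re.IGNORECASE)
        if match is None:
            raise ValueError(f"missing score for {name!r}")
        value = int(match.group(1))
        if not 1 <= value <= scale_max:
            raise ValueError(f"score for {name!r} out of range: {value}")
        scores[name] = value
    return scores
```

Parsing strictly and raising on a missing or out-of-range score keeps malformed evaluator replies from being counted silently as valid scores.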
Taxonomy
Topics: Computational and Text Analysis Methods · Sustainable Agricultural Systems Analysis
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Softmax · Layer Normalization · Multi-Head Attention · Cosine Annealing · Dropout
