Evaluating Digital Inclusiveness of Digital Agri-Food Tools Using Large Language Models: A Comparative Analysis Between Human and AI-Based Evaluations

Githma Pewinya; Carolina Martins; Garcia Mariangel

arXiv:2604.03252 · cs.CY · April 7, 2026

TL;DR

This paper investigates the potential of large language models to rapidly and effectively evaluate the digital inclusiveness of agricultural tools, comparing their performance to traditional human assessments.

Contribution

The paper presents a comparative analysis of four LLMs (Grok, Gemini, GPT-4o, and GPT-5) against prior expert-led evaluations of digital inclusiveness in agri-food tools, highlighting both their potential and their limitations.

Findings

01. LLMs can approximate expert judgments in some evaluation dimensions.

02. Model performance varies across different LLMs and contexts.

03. AI-based assessments could complement traditional evaluation workflows.

Abstract

Ensuring digital inclusiveness is a critical priority in agri-food systems, particularly in the Global South, where digital divides persist. The Multidimensional Digital Inclusiveness Index (MDII) offers a comprehensive, human-led framework to assess how inclusive digital agricultural tools (agritools) are. However, the current evaluation process is resource-intensive, often requiring months to complete. This study explores whether large language models (LLMs) can support a rapid, AI-enabled assessment of digital inclusiveness, complementing the MDII's existing workflow. Using a comparative analysis, the research benchmarks the performance of four LLMs (Grok, Gemini, GPT-4o, and GPT-5) against prior expert-led evaluations. The study investigates model alignment with human scores, sensitivity to temperature settings, and potential sources of bias. Findings suggest that LLMs can generate…
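
The abstract does not say which agreement statistics the study uses. As a minimal sketch, assuming expert and LLM scores for the same dimensions sit on a common numeric scale, alignment could be summarized with mean absolute error and Pearson correlation. The function names and the sample values below are hypothetical placeholders, not data or methods taken from the paper.

```python
from math import sqrt


def mean_absolute_error(human, model):
    """Average absolute gap between paired human and model scores."""
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(h - m) for h, m in zip(human, model)) / len(human)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical placeholder scores, one entry per evaluation dimension.
    expert_scores = [0.2, 0.4, 0.6, 0.8]
    llm_scores = [0.3, 0.4, 0.5, 0.9]
    print("MAE:", mean_absolute_error(expert_scores, llm_scores))
    print("Pearson r:", pearson_correlation(expert_scores, llm_scores))
```

Repeating the same comparison for scores collected at several temperature settings would give a rough view of the temperature sensitivity the abstract mentions, again under the assumption that scores are directly comparable across runs.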