Evaluating the Performance of Large Language Models for SDG Mapping (Technical Report)
Hui Yin, Amir Aryani, Nakul Nambiar

TL;DR
This study evaluates the performance of various open-source large language models on SDG mapping, comparing them to GPT-4o, and analyzes their effectiveness using multiple metrics and threshold-based performance curves.
Contribution
It provides a comparative analysis of open-source LLMs for SDG mapping, highlighting their strengths and weaknesses relative to GPT-4o, and offers insights into their current performance levels.
Findings
LLaMA 2 and Gemma show significant room for improvement when measured against the GPT-4o baseline.
The other models (Mixtral, LLaMA 3, Qwen2, and GPT-4o-mini) perform at similar levels, with no large differences among them.
All models' outputs are publicly available for further analysis.
Abstract
The use of large language models (LLMs) is expanding rapidly, and open-source versions are becoming available, offering users safer and more adaptable options. These models enable users to protect data privacy by eliminating the need to provide data to third parties and can be customized for specific tasks. In this study, we compare the performance of various language models on the Sustainable Development Goal (SDG) mapping task, using the output of GPT-4o as the baseline. The selected open-source models for comparison include Mixtral, LLaMA 2, LLaMA 3, Gemma, and Qwen2. Additionally, GPT-4o-mini, a more specialized version of GPT-4o, was included to extend the comparison. Given the multi-label nature of the SDG mapping task, we employed metrics such as F1 score, precision, and recall with micro-averaging to evaluate different aspects of the models' performance. These metrics are…
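To illustrate the micro-averaged metrics mentioned in the abstract, here is a minimal sketch in Python. It is not the authors' evaluation code. It assumes each document's SDG labels are represented as a set of goal numbers, treats the GPT-4o output as the reference, and uses made-up label sets; the function name `micro_prf` and the sample data are hypothetical.

```python
from typing import List, Set, Tuple


def micro_prf(
    baseline: List[Set[int]], predicted: List[Set[int]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 for multi-label output.

    Micro-averaging pools true positives, false positives, and false
    negatives over every (document, label) pair before computing the ratios.
    """
    tp = fp = fn = 0
    for ref, pred in zip(baseline, predicted):
        tp += len(ref & pred)   # labels both the baseline and the model assigned
        fp += len(pred - ref)   # labels only the model assigned
        fn += len(ref - pred)   # labels only the baseline assigned
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical SDG label sets for three documents (not data from the paper).
baseline_labels = [{3, 13}, {7}, {1, 2, 10}]    # stand-in for GPT-4o output
model_labels = [{3}, {7, 13}, {1, 10, 16}]      # stand-in for an open-source model

precision, recall, f1 = micro_prf(baseline_labels, model_labels)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the counts are pooled before the ratios are taken, documents with many SDG labels contribute more to the score than documents with one label, which is the usual reason to prefer micro-averaging when label frequencies are uneven.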
Taxonomy
Topics: E-Government and Public Services · Geographic Information Systems Studies · FinTech, Crowdfunding, Digital Finance
Methods: LLaMA
