Assessing an evolutionary search engine for small language models, prompts, and evaluation metrics

Cláudio Lúcio do Val Lopes; Lucca Machado

arXiv:2506.21512·cs.NE·February 26, 2026

TL;DR

This paper presents a bi-objective evolutionary search engine that optimizes small language models and prompts for accuracy and token efficiency, revealing critical trade-offs and task-specific affinities.

Contribution

It introduces an automated framework that uses NSGA-II to optimize small language models (SLMs) and prompts together, moving beyond manual tuning to discover effective human-AI interaction strategies.

Findings

01 Identifies diverse high-performing model-prompt combinations

02 Quantifies the trade-off between accuracy and token efficiency

03 Provides Pareto fronts for decision-makers
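
To make the idea of a Pareto front over accuracy and token usage concrete, here is a minimal sketch in Python. It is not the paper's implementation: the `Candidate` type, the candidate names, and all numbers are hypothetical and chosen only for illustration. It shows the non-dominated filtering that underlies NSGA-II-style selection, assuming accuracy is maximized and token count is minimized.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one model-prompt combination."""
    name: str        # illustrative label, not from the paper
    accuracy: float  # task accuracy, higher is better
    tokens: float    # tokens consumed, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.tokens <= b.tokens
    strictly_better = a.accuracy > b.accuracy or a.tokens < b.tokens
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up values purely for illustration.
candidates = [
    Candidate("A", 0.90, 400),
    Candidate("B", 0.80, 150),
    Candidate("C", 0.75, 300),  # dominated by B
    Candidate("D", 0.60, 60),
    Candidate("E", 0.90, 500),  # dominated by A
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.accuracy, c.tokens)
```

Each surviving candidate represents a different compromise: none can be improved on one objective without giving something up on the other, which is what lets a decision-maker pick a point on the front that matches a given token budget.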

Abstract

The concurrent optimization of language models and instructional prompts presents a significant challenge for deploying efficient and effective AI systems, particularly when balancing performance against computational costs such as token usage. This paper introduces and assesses a bi-objective evolutionary search engine designed to navigate this complex space, focusing specifically on Small Language Models (SLMs). We employ the NSGA-II algorithm and a prompt grammar to simultaneously optimize for task accuracy and token efficiency across several reasoning tasks. Our results identify diverse, high-performing model-prompt combinations and quantitatively reveal the critical trade-off between the two objectives. This research highlights task-specific affinities between particular SLMs and prompt structures (e.g., instructions, context, chain of thought). The generated practical Pareto…