Assessing an evolutionary search engine for small language models, prompts, and evaluation metrics
Cláudio Lúcio do Val Lopes, Lucca Machado

TL;DR
This paper presents a bi-objective evolutionary search engine that optimizes small language models and prompts for accuracy and token efficiency, revealing critical trade-offs and task-specific affinities.
Contribution
It introduces a novel automated framework using NSGA-II for optimizing SLMs and prompts, moving beyond manual tuning to discover effective human-AI interaction strategies.
Findings
Identifies diverse high-performing model-prompt combinations
Quantifies the trade-off between accuracy and token efficiency
Provides Pareto fronts for decision-makers
Abstract
The concurrent optimization of language models and instructional prompts presents a significant challenge for deploying efficient and effective AI systems, particularly when balancing performance against computational costs such as token usage. This paper introduces and assesses a bi-objective evolutionary search engine designed to navigate this complex space, focusing specifically on Small Language Models (SLMs). We employ the NSGA-II algorithm and a prompt grammar to simultaneously optimize for task accuracy and token efficiency across several reasoning tasks. Our results identify diverse, high-performing model-prompt combinations and quantify the critical trade-off between the two objectives. This research highlights task-specific affinities between particular SLMs and prompt structures (e.g., instructions, context, chain of thought). The generated practical Pareto…
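To make the bi-objective idea concrete, here is a minimal sketch of the non-dominated filter that Pareto-based methods such as NSGA-II build on. It is not the authors' implementation and not the full NSGA-II algorithm (no crowding distance, crossover, or mutation). The candidate labels, accuracy values, and token counts are hypothetical, and the sketch assumes token efficiency is measured as a token count to be minimized.

```python
from typing import List, Tuple

# (label, accuracy, tokens). All values are hypothetical, for illustration only.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly
    better on at least one. Accuracy is maximized; token count is minimized."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical model-prompt combinations (assumed names, not from the paper).
candidates: List[Candidate] = [
    ("model_a+plain", 0.60, 120),
    ("model_a+cot", 0.78, 410),
    ("model_b+plain", 0.55, 150),
    ("model_b+context", 0.70, 260),
    ("model_b+cot", 0.75, 450),
]

front = pareto_front(candidates)
for label, accuracy, tokens in front:
    print(f"{label}: accuracy={accuracy:.2f}, tokens={tokens}")
```

Each surviving combination represents a different compromise: no other candidate in the set is both at least as accurate and at least as cheap in tokens. A decision-maker would pick among these points according to their own budget.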
