CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction
Peter J. Bentley, Soo Ling Lim, Fuyuki Ishikawa

TL;DR
CLEAR combines LLMs with evolutionary algorithms to automatically generate and optimize prompt cues, significantly improving LLM image recognition accuracy when extracting sustainability data from building images, beyond what human recognition and prompt-based methods achieve.
Contribution
This paper introduces a novel method that uses evolutionary computation to optimize cues for LLM-based image recognition, enhancing accuracy in specialized tasks.
Findings
Error rates improved by up to two orders of magnitude.
CLEAR is more accurate than both human recognition and prompt-based methods.
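LLM consistency improves when outputs are refactored away from categorical form; variable-length representations are also compared against fixed-length ones.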
Abstract
Large Language Model (LLM) image recognition is a powerful tool for extracting data from images, but accuracy depends on providing sufficient cues in the prompt - requiring a domain expert for specialized tasks. We introduce Cue Learning using Evolution for Accurate Recognition (CLEAR), which uses a combination of LLMs and evolutionary computation to generate and optimize cues such that recognition of specialized features in images is improved. It achieves this by auto-generating a novel domain-specific representation and then using it to optimize suitable textual cues with a genetic algorithm. We apply CLEAR to the real-world task of identifying sustainability data from interior and exterior images of buildings. We investigate the effects of using a variable-length representation compared to fixed-length and show how LLM consistency can be improved by refactoring from categorical to…
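The abstract describes optimizing textual cues with a genetic algorithm over a variable-length representation. The following is a minimal sketch of that general idea, not the authors' implementation. The cue pool, the `error` function (a toy stand-in for measuring LLM recognition error on labelled images), and all parameter values are assumptions made for illustration; in CLEAR the cues and the representation are generated by an LLM.

```python
import random

# Hypothetical cue pool; in CLEAR the cues come from an LLM, not a hand-written list.
CUE_POOL = [
    "look for double glazing",
    "check for LED light fittings",
    "note visible solar panels",
    "look for radiator thermostats",
    "check wall insulation thickness",
    "note the colour of the carpet",
]

# Hypothetical ground truth for the toy fitness: pretend only these cues help.
USEFUL = {CUE_POOL[0], CUE_POOL[1], CUE_POOL[2]}


def error(cues):
    """Toy error: each missing useful cue costs 1.0, each cue costs 0.1 (favours short lists)."""
    return len(USEFUL - set(cues)) + 0.1 * len(cues)


def mutate(cues, rng):
    """Variable-length mutation: add, drop, or swap a single cue."""
    cues = list(cues)
    op = rng.choice(["add", "drop", "swap"])
    if op == "add" or not cues:
        cues.append(rng.choice(CUE_POOL))
    elif op == "drop" and len(cues) > 1:
        cues.pop(rng.randrange(len(cues)))
    else:
        cues[rng.randrange(len(cues))] = rng.choice(CUE_POOL)
    return cues


def crossover(a, b, rng):
    """One-point crossover with independent cut points, so the child length can differ."""
    child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child or [rng.choice(CUE_POOL)]


def evolve(generations=40, pop_size=20, seed=0):
    """Evolve a cue list with truncation selection; the best individuals always survive."""
    rng = random.Random(seed)
    pop = [[rng.choice(CUE_POOL)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=error)
        parents = pop[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=error)


if __name__ == "__main__":
    best = evolve()
    print(best, error(best))
```

In a real setting, `error` would be replaced by a call that prompts the LLM with the candidate cues and scores its answers against labelled images, which is far more expensive than this toy function and would likely call for caching and smaller populations.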
