CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction

Peter J. Bentley; Soo Ling Lim; Fuyuki Ishikawa

arXiv:2501.18504·cs.CV·May 8, 2025

TL;DR

CLEAR combines LLMs and evolutionary computation to automatically generate and optimize textual cues. Applied to extracting sustainability data from images of buildings, it improves LLM image recognition accuracy beyond both human recognition and prompt-based methods.

Contribution

This paper introduces a novel method that uses evolutionary computation to optimize cues for LLM-based image recognition, enhancing accuracy in specialized tasks.

Findings

01

Error rates improved by up to two orders of magnitude.

02

CLEAR achieves higher accuracy than both human recognition and prompt-based methods.

03

Variable-length representations are compared against fixed-length ones, and refactoring away from categorical outputs improves LLM consistency.

Abstract

Large Language Model (LLM) image recognition is a powerful tool for extracting data from images, but accuracy depends on providing sufficient cues in the prompt - requiring a domain expert for specialized tasks. We introduce Cue Learning using Evolution for Accurate Recognition (CLEAR), which uses a combination of LLMs and evolutionary computation to generate and optimize cues such that recognition of specialized features in images is improved. It achieves this by auto-generating a novel domain-specific representation and then using it to optimize suitable textual cues with a genetic algorithm. We apply CLEAR to the real-world task of identifying sustainability data from interior and exterior images of buildings. We investigate the effects of using a variable-length representation compared to fixed-length and show how LLM consistency can be improved by refactoring from categorical to…
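The cue-optimization step described in the abstract can be illustrated with a minimal sketch of a genetic algorithm over a variable-length list of textual cues. This is not the authors' implementation. The cue texts, the `HELPFUL` weights, the operators, and every function name below are hypothetical placeholders, and the `error_rate` function is a toy stand-in for what would in practice be an LLM evaluated on labelled building images.

```python
import random

# Hypothetical cue pool. In CLEAR the cues are generated with an LLM;
# these strings are made-up placeholders for illustration only.
CUE_POOL = [
    "Radiators are usually mounted under windows.",
    "Double glazing can show two reflections of a light source.",
    "Solar panels appear as dark rectangular grids on roofs.",
    "Ignore furniture when judging the type of lighting.",
    "Heat pumps look like boxed outdoor fan units.",
    "Count every visible window, including partial ones.",
]

# Toy assumption: how much each cue (by index) reduces the error rate.
HELPFUL = {0: 0.30, 1: 0.25, 2: 0.20, 4: 0.15}


def error_rate(genome):
    """Toy fitness, lower is better. A real system would build a prompt
    from the selected cues, query an LLM, and score it on labelled images."""
    err = 1.0 - sum(HELPFUL.get(i, 0.0) for i in set(genome))
    return err + 0.02 * len(genome)  # small penalty that favours concise cue sets


def random_genome(rng):
    """A genome is a variable-length list of distinct cue indices."""
    return rng.sample(range(len(CUE_POOL)), rng.randint(1, len(CUE_POOL)))


def mutate(genome, rng):
    """Add, drop, or swap one cue, so the genome length can change."""
    child = list(genome)
    unused = [i for i in range(len(CUE_POOL)) if i not in child]
    op = rng.choice(["add", "drop", "swap"])
    if op == "add" and unused:
        child.append(rng.choice(unused))
    elif op == "drop" and len(child) > 1:
        child.remove(rng.choice(child))
    elif op == "swap" and unused:
        child[rng.randrange(len(child))] = rng.choice(unused)
    return child


def crossover(a, b, rng):
    """Pool both parents' cues (without duplicates) and sample a subset."""
    merged = list(dict.fromkeys(a + b))
    return rng.sample(merged, rng.randint(1, len(merged)))


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    history = []  # best error rate seen at each generation
    for _ in range(generations):
        pop.sort(key=error_rate)
        history.append(error_rate(pop[0]))
        elite = pop[: pop_size // 4]  # elitism: the best genomes survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    best = min(pop, key=error_rate)
    history.append(error_rate(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("selected cues:", [CUE_POOL[i] for i in sorted(best)])
    print("toy error rate:", round(error_rate(best), 3))
```

Because the elite genomes are carried over unchanged, the best error rate in this sketch never gets worse from one generation to the next. The length penalty is one simple way to favour short cue sets; whether CLEAR uses anything similar is not stated in the text above.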
