LLM Context Conditioning and PWP Prompting for Multimodal Validation of Chemical Formulas
Evgeny Markhasin

TL;DR
This study explores structured context conditioning with Persistent Workflow Prompting (PWP) to improve large language models' ability to detect subtle errors in complex scientific documents, especially those involving multimodal data such as formulas in images.
Contribution
It introduces a novel PWP-based prompting strategy to enhance LLMs' error detection capabilities in scientific validation tasks without requiring model modifications.
Findings
PWP prompting improved textual error detection in LLMs.
Gemini 2.5 Pro identified an image-based formula error that had been overlooked during manual review.
Basic prompts were unreliable for detailed validation.
Abstract
Identifying subtle technical errors within complex scientific and technical documents, especially those requiring multimodal interpretation (e.g., formulas in images), presents a significant hurdle for Large Language Models (LLMs) whose inherent error-correction tendencies can mask inaccuracies. This exploratory proof-of-concept (PoC) study investigates structured LLM context conditioning, informed by Persistent Workflow Prompting (PWP) principles, as a methodological strategy to modulate this LLM behavior at inference time. The approach is designed to enhance the reliability of readily available, general-purpose LLMs (specifically Gemini 2.5 Pro and ChatGPT Plus o3) for precise validation tasks, crucially relying only on their standard chat interfaces without API access or model modifications. To explore this methodology, we focused on validating chemical formulas within a single,…
Taxonomy
Topics: Machine Learning in Materials Science · Scientific Computing and Data Management · Biomedical Text Mining and Ontologies
