TL;DR
ContraPrompt optimizes prompts by contrasting a model's failed and successful chain-of-thought traces on the same input (dyadic trace analysis), improving performance on reasoning and compliance benchmarks.
Contribution
ContraPrompt introduces dyadic reasoning trace analysis and an automated method for generating contrastive data for prompt optimization, and it outperforms prior contrastive approaches.
Findings
Outperforms GEPA on four reasoning benchmarks with significant gains.
Ablation studies show that contrasting the two traces of a dyad is critical to the gains.
Achieves notable improvements in financial named entity recognition.
Abstract
Prompt optimization methods either analyze individual failures in isolation or compare prompt variants across examples, operating on single execution traces with no access to the reasoning process distinguishing success from failure on the same input. We introduce ContraPrompt, built on the observation that when a model fails but succeeds on a retry with feedback, the difference between its two chain-of-thought traces constitutes an optimization signal not captured by prior methods. Unlike prior contrastive methods, we compare complete intermediate reasoning processes: the two traces share model, input, and base prompt, so remaining differences reflect reasoning strategy and appended error feedback -- we call this dyadic reasoning trace analysis. The multi-attempt solving phase is an instrumented agentic retry loop that generates contrastive data automatically without human annotation.…
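The retry loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `TraceDyad`, `collect_dyads`, `toy_solve`, the exact-match correctness check, and the feedback wording are all hypothetical names and choices introduced here for illustration. The sketch assumes a solver that takes a base prompt, an input, and optional error feedback, and returns a reasoning trace plus an answer. Whenever a failure is followed by a success on the same input, the two traces are stored as a pair.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class TraceDyad:
    """A failed and a successful trace on the same input (hypothetical container)."""
    question: str
    failed_trace: str
    succeeded_trace: str
    feedback: str


# Assumed solver signature: (base_prompt, question, feedback) -> (trace, answer)
Solver = Callable[[str, str, Optional[str]], Tuple[str, str]]


def collect_dyads(
    solve: Solver,
    base_prompt: str,
    examples: List[Tuple[str, str]],
    max_attempts: int = 3,
) -> List[TraceDyad]:
    """Retry each example with feedback; keep (failure, success) trace pairs."""
    dyads: List[TraceDyad] = []
    for question, gold in examples:
        feedback: Optional[str] = None
        failed_trace: Optional[str] = None
        for _ in range(max_attempts):
            trace, answer = solve(base_prompt, question, feedback)
            if answer == gold:  # assumption: exact-match correctness check
                # Only a success that follows a failure yields a contrastive pair.
                if failed_trace is not None and feedback is not None:
                    dyads.append(TraceDyad(question, failed_trace, trace, feedback))
                break
            failed_trace = trace
            feedback = f"Previous answer {answer!r} was incorrect."
    return dyads


def toy_solve(prompt: str, question: str, feedback: Optional[str]) -> Tuple[str, str]:
    """Deterministic stand-in for a model call, used only to exercise the loop."""
    if question == "2 + 2":
        return ("add the two numbers", "4")
    if feedback is None:
        return ("add the numbers left to right", "9")
    return ("multiply before adding", "7")


examples = [("1 + 2 * 3", "7"), ("2 + 2", "4")]
dyads = collect_dyads(toy_solve, "Solve step by step.", examples)
```

In this toy run, only the first example produces a pair, because the second is solved on the first attempt and leaves nothing to contrast. Both traces in a pair share the solver, input, and base prompt, so what differs is the reasoning and the appended feedback, which matches the setup the abstract describes.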
