Inline Critic Steers Image Editing

Weitai Kang; Xiaohang Zhan; Yizhou Wang; Mang Tik Chiu; Jason Kuen; Kangning Liu; Yan Yan

arXiv:2605.12724·cs.CV·May 14, 2026

TL;DR

This paper introduces Inline Critic, a learnable token that critiques a frozen image-editing model's predictions at intermediate layers and steers its hidden states during the forward pass, leading to state-of-the-art results on multiple benchmarks.

Contribution

The paper proposes an inline critiquing mechanism that operates within the model's forward pass, improving image-editing quality without additional inference steps.

Findings

01

Achieved state-of-the-art results on GEdit-Bench with a score of 7.89.

02

Improved RISEBench performance by +9.4 over the backbone.

03

Surpassed GPT-4o on KRIS-Bench with a score of 81.92.

Abstract

Instruction-based image editing exhibits heterogeneous difficulty not only across cases but also across regions of an image, motivating refinement approaches that allocate correction to where the model struggles. Existing refinement signals arrive late, after a fully generated image or a completed denoising step. We ask whether such a signal can act within an ongoing forward pass. To investigate this, we probe a frozen image-editing model and find that although generation capability emerges only in the last few layers, the error pattern is already set in early layers (rank correlation ρ = 0.83 with the final-layer error map). Based on this, we introduce Inline Critic, a learnable token that critiques a frozen model's predictions at its intermediate layers and steers its hidden states to refine generation during the forward pass. A three-stage recipe is proposed to stabilize the…
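The rank correlation quoted in the abstract can be illustrated with a minimal sketch. It assumes the statistic is Spearman's ρ computed over flattened per-region error maps; the abstract does not state which rank correlation or which probing protocol the authors use. The function names and the toy error values below are hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run [start, end] over all entries equal to the current value.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(errors_a: Sequence[float], errors_b: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(errors_a) != len(errors_b):
        raise ValueError("error maps must have the same number of regions")
    if len(errors_a) < 2:
        raise ValueError("need at least two regions")
    ranks_a, ranks_b = average_ranks(errors_a), average_ranks(errors_b)
    n = len(ranks_a)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a constant map has no defined rank correlation
    return cov / sqrt(var_a * var_b)


# Hypothetical per-region errors (flattened maps), for illustration only.
early_layer_errors = [0.9, 0.1, 0.4, 0.7]
final_layer_errors = [0.8, 0.2, 0.3, 0.5]
rho = spearman_rho(early_layer_errors, final_layer_errors)
```

Because only the ordering of regions matters, a high ρ here means that the regions with the largest early-layer error are also the regions with the largest final-layer error, even if the error magnitudes differ between layers.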
