Edit Fidelity Field: Semantics-Aware Region Isolation for Training-Free Scene Text Editing
Guandong Li, Mengxia Ye

TL;DR
This paper introduces the Edit Fidelity Field (EFF), a semantics-aware post-processing module for diffusion-based scene text editing. It cuts the spillover rate from 94% to 25% and improves preservation of non-target regions.
Contribution
The paper proposes EFF, a training-free, model-agnostic method that controls pixel-level editing fidelity using OCR-detected regions, addressing spillover in scene text editing.
Findings
EFF reduces spillover rate from 94% to 25%.
Non-target region preservation improves by +91.4 dB PSNR.
The method is applicable to any diffusion-based scene text editing model.
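A preservation figure of this kind can be illustrated by computing PSNR only over the pixels outside the edit target. The sketch below is not taken from the paper: it uses the standard PSNR formula, and it assumes that images are 2D lists of grayscale values and that the non-target region is given as a mask. The function name `masked_psnr` is hypothetical.

```python
import math


def masked_psnr(original, edited, mask, max_value=255.0):
    """PSNR in dB over the pixels where `mask` is truthy.

    `original`, `edited`, and `mask` are 2D lists of equal shape.
    The mask layout is an assumption made for this sketch.
    """
    squared_error = 0.0
    count = 0
    for row_o, row_e, row_m in zip(original, edited, mask):
        for o, e, m in zip(row_o, row_e, row_m):
            if m:
                squared_error += (o - e) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # selected pixels are identical
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under this reading, a PSNR gain would be the difference between `masked_psnr` for the edited image with and without the post-processing step, measured against the same original and the same non-target mask.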
Abstract
Scene text editing (STE) has achieved remarkable progress in accurately rendering target text through diffusion-based methods. However, we identify a critical yet overlooked problem: edit spillover -- when editing a target text region, existing methods inadvertently modify non-target regions, particularly neighboring text. Through systematic evaluation on 50 real-world scenes across four categories, we reveal that state-of-the-art diffusion editing models exhibit a spillover rate of 94%, meaning nearly all non-target text regions are altered during editing. To address this, we propose the Edit Fidelity Field (EFF), a semantics-aware continuous field that controls per-pixel editing fidelity. Unlike binary masks, EFF leverages OCR-detected text regions to construct a four-zone field: Edit Core (fully editable), Transition Zone (smooth decay), Protected Zone (non-target text, explicitly…
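The zones named above can be sketched as a per-pixel weight map, with the caveat that the abstract is cut off here and the paper's actual construction is not shown. Everything in this example is an assumption made for illustration: the function names `build_field` and `composite`, the linear decay, the Chebyshev distance, the `background` value for pixels outside the named zones, and the use of the field as a blend weight between the original and the edited image. Boxes stand in for OCR detections and are given as half-open `(x0, y0, x1, y1)` rectangles.

```python
def build_field(height, width, target_box, protected_boxes,
                transition=4, background=0.0):
    """Return a 2D list of weights in [0, 1]; 1 means fully editable.

    Assumed layout: weight 1 inside the target box, linear decay over
    `transition` pixels around it, 0 inside every protected box.
    """
    x0, y0, x1, y1 = target_box
    field = [[background] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Chebyshev distance to the target box, 0 inside it.
            dx = max(x0 - x, 0, x - (x1 - 1))
            dy = max(y0 - y, 0, y - (y1 - 1))
            d = max(dx, dy)
            if d == 0:
                field[y][x] = 1.0
            elif d <= transition:
                field[y][x] = max(background, 1.0 - d / (transition + 1))
    # Protected boxes are applied last so they override the transition.
    for px0, py0, px1, py1 in protected_boxes:
        for y in range(max(py0, 0), min(py1, height)):
            for x in range(max(px0, 0), min(px1, width)):
                field[y][x] = 0.0
    return field


def composite(original, edited, field):
    """Blend per pixel: weight f takes the edit, 1 - f keeps the original."""
    return [
        [f * e + (1.0 - f) * o for o, e, f in zip(row_o, row_e, row_f)]
        for row_o, row_e, row_f in zip(original, edited, field)
    ]
```

Because the blend runs on the model's output and the untouched input, a step like this needs no access to the diffusion model itself, which fits the training-free, model-agnostic framing. Applying the protected boxes last means a neighboring text region stays at weight 0 even where it falls inside the transition band of the target.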
