LogicLens: Visual-Logical Co-Reasoning for Text-Centric Forgery Analysis
Fanwei Zeng, Changtao Miao, Jing Huang, Zhiya Tan, Shutao Gong, Xiaoming Yu, Yang Wang, Huazhe Tan, Weibin Yao, Jianshu Li

TL;DR
LogicLens is a unified framework that uses a novel visual-textual co-reasoning mechanism to detect and analyze sophisticated text-centric forgeries, and it outperforms existing methods on multiple benchmarks.
Contribution
The paper introduces LogicLens, a joint visual-textual reasoning framework built on a novel Cross-Cues-aware Chain of Thought (CCT) mechanism, together with a hierarchical annotation pipeline and a new dataset for forgery analysis.
Findings
Outperforms specialized frameworks by 41.4% in macro F1 on T-IC13.
Achieves significant improvements on the dense-text T-SROIE dataset.
Demonstrates the effectiveness of joint reasoning and of the multi-agent annotation pipeline.
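The headline finding is reported in macro F1, which averages the per-class F1 scores without weighting by class frequency, so a rare class counts as much as a common one. The following is a minimal sketch of that metric in plain Python; it is not the paper's evaluation code, and the function name `macro_f1` and the labels "forged" and "real" are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Every label seen in either the ground truth or the predictions.
    labels = set(y_true) | set(y_pred)

    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)

    return sum(per_class_f1) / len(per_class_f1)


# Hypothetical labels: two forged images, two real ones, one forgery missed.
truth = ["forged", "forged", "real", "real"]
preds = ["forged", "real", "real", "real"]
score = macro_f1(truth, preds)
print(f"macro F1 = {score:.4f}")
```

In this toy case the "forged" class scores 2/3 and the "real" class scores 4/5, so the macro average is 11/15. Because every class contributes equally, one weak class pulls the score down noticeably even when overall accuracy looks good.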
Abstract
Sophisticated text-centric forgeries, fueled by rapid AIGC advancements, pose a significant threat to societal security and information authenticity. Current methods for text-centric forgery analysis are often limited to coarse-grained visual analysis and lack the capacity for sophisticated reasoning. Moreover, they typically treat detection, grounding, and explanation as discrete sub-tasks, overlooking their intrinsic relationships for holistic performance enhancement. To address these challenges, we introduce LogicLens, a unified framework for Visual-Textual Co-reasoning that reformulates these objectives into a joint task. The deep reasoning of LogicLens is powered by our novel Cross-Cues-aware Chain of Thought (CCT) mechanism, which iteratively cross-validates visual cues against textual logic. To ensure robust alignment across all tasks, we further propose a weighted multi-task…
Taxonomy
Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
