No Pixel Left Behind: A Detail-Preserving Architecture for Robust High-Resolution AI-Generated Image Detection

Lianrui Mu; Zou Xingze; Jianhong Bai; Jiaqi Hu; Wenjie Zheng; Jiangnan Ye; Jiedong Zhuang; Mudassar Ali; Jing Wang; Haoji Hu

arXiv:2508.17346·cs.CV·August 26, 2025

TL;DR

This paper presents HiDA-Net, an architecture for detecting AI-generated images at high resolution. It preserves pixel-level detail by aggregating features from full-resolution local tiles and a down-sampled global view, improving both robustness and accuracy.

Contribution

The paper introduces HiDA-Net, a novel detail-preserving framework with feature aggregation, and new modules for forgery localization and compression noise disentanglement, along with a large high-resolution benchmark.

Findings

01 Achieves an accuracy improvement of over 13% on the Chameleon dataset.

02 Improves detection accuracy by 10% on the HiRes-50K benchmark.

03 Demonstrates robustness against localized manipulations and compression artifacts.

Abstract

The rapid growth of high-resolution, meticulously crafted AI-generated images poses a significant challenge to existing detection methods, which are often trained and evaluated on low-resolution, automatically generated datasets that do not align with the complexities of high-resolution scenarios. A common practice is to resize or center-crop high-resolution images to fit standard network inputs. However, without full coverage of all pixels, such strategies risk either obscuring subtle, high-frequency artifacts or discarding information from uncovered regions, leading to input information loss. In this paper, we introduce the High-Resolution Detail-Aggregation Network (HiDA-Net), a novel framework that ensures no pixel is left behind. We use the Feature Aggregation Module (FAM), which fuses features from multiple full-resolution local tiles with a down-sampled global view of the image.…
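
The tiling idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the 224-pixel tile size, and the choice to shift the last row and column of tiles inward instead of padding are all assumptions made for the example. It compares the pixel coverage of full-resolution tiling against a single center crop and reports the down-sampling factor a plain resize would apply.

```python
from math import ceil


def tile_boxes(height, width, tile=224):
    """Return (top, left, bottom, right) boxes of size tile x tile covering every pixel.

    Hypothetical helper: assumes the image is at least one tile in each
    dimension. The last row/column of tiles is shifted inward, so it overlaps
    its neighbour instead of running past the image border.
    """
    if height < tile or width < tile:
        raise ValueError("image must be at least one tile in each dimension")

    def starts(size):
        count = ceil(size / tile)
        # Regular stride, with the final start pinned to size - tile.
        return [min(i * tile, size - tile) for i in range(count)]

    return [
        (top, left, top + tile, left + tile)
        for top in starts(height)
        for left in starts(width)
    ]


def center_crop_coverage(height, width, crop=224):
    """Fraction of the image's pixels kept by a single center crop."""
    return (min(crop, height) * min(crop, width)) / (height * width)


def resize_scale(height, width, target=224):
    """Per-axis down-sampling factors when resizing the image to target x target."""
    return height / target, width / target


if __name__ == "__main__":
    h, w = 1024, 1536  # example image size, chosen only for illustration
    boxes = tile_boxes(h, w)
    print("tiles needed for full coverage:", len(boxes))
    print("center-crop coverage:", round(center_crop_coverage(h, w), 4))
    print("resize down-sampling factors:", resize_scale(h, w))
```

For a 1024 × 1536 image with these assumed settings, full coverage takes 35 tiles, while a single 224 × 224 center crop keeps about 3.2% of the pixels. A global view produced by resizing sees every region, but at roughly 4.6× and 6.9× lower resolution along the two axes, which is why the abstract pairs it with full-resolution tiles.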

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01·Rating 6·Confidence 4

Strengths

1. The motivation for the method is clear and is supported with math and illustrations.
2. The proposed method achieves SOTA performance on several datasets.
3. The paper proposes a novel high-resolution dataset, HiRes-50K, that may be valuable for the community.
4. The paper includes extensive ablations of the proposed method.

Weaknesses

Major weaknesses:
1. The proposed method is not compared with recent AI-generated image detection methods, like [1 - 3].
2. The proposed method has an increased inference time for high resolution images compared to the other approaches. But what is the difference in speed between the HiRes-50K and the other methods on standard resolutions, like 224 $\times$ 224?
3. I have not found the explicit list of models that are used in creating the HiRes-50K dataset. It is important to include the relevan

Reviewer 02·Rating 4·Confidence 3

Strengths

1. The paper convincingly identifies and quantifies how resizing harms detection by losing high-frequency details.
2. It introduces a new, high-quality dataset, HiRes-50K, for evaluation.

Weaknesses

1. The proposed method includes both a global and a local path, but uses the same transformer blocks to process global and local images. How do the proposed FAM, TFL, and QFE modules adaptively extract and distinguish local and global features for classification? It is unclear why these modules can effectively capture both types of features simultaneously.
2. The computational complexity of transformer blocks is quadratic. When more local patches are used, this inevitably increases the processi

Reviewer 03·Rating 6·Confidence 3

Strengths

- The motivation is clear and intuitive. Losing fine detail can mislead the detection model, especially for AIGC, where most generative models produce images that are well structured but flawed in their details.
- The model design is reasonable and theoretically justified.
- The experiments are comprehensive and demonstrate the effectiveness of the proposed network and benchmark.

Weaknesses

- Open-source AIGC image datasets and real-image datasets are already numerous. The paper does not sufficiently explain what makes HiRes-50K irreplaceable or what value it adds over existing data resources.
- The experiments rely primarily on outdated models (e.g., SD v1.4, SDXL) for generating AI-synthesized images, with limited coverage of other mainstream high-resolution generative models, especially advanced models released after 2024 (e.g., SD3.5, FLUX, Qwen-Image).
- T

Code & Models

Datasets

Mu437/HiRes-50K
dataset· 36 dl
