Remove360: Benchmarking Residuals After Object Removal in 3D Gaussian Splatting
Simona Kocour, Assia Benbihi, Torsten Sattler

TL;DR
This paper introduces Remove360, a benchmark and dataset for evaluating residual semantic cues after object removal in 3D scenes. It reveals that current methods often leave detectable information even when the object has been visually removed.
Contribution
It presents a new benchmark, dataset, and evaluation framework for assessing semantic residuals post-object removal in complex 3D scenes, exposing limitations of existing methods.
Findings
Current removal methods often leave detectable semantic cues.
Even inpainting does not fully erase residual information.
There is a significant gap between geometric removal and semantic erasure.
Abstract
An object can disappear from a 3D scene, yet still be detectable. Even after visual removal, modern vision models may infer what was originally present. In this work, we introduce a novel benchmark and evaluation framework to quantify semantic residuals, the unintended cues left behind after object removal in 3D Gaussian Splatting. We conduct experiments across a diverse set of indoor and outdoor scenes, showing that current methods often preserve semantic information despite the absence of visual geometry. Notably, even when removal is followed by inpainting, residual cues frequently remain detectable by foundation models. We also present Remove360, a real-world dataset of pre- and post-removal RGB captures with object-level masks. Unlike prior datasets focused on isolated object instances, Remove360 contains complex, cluttered scenes that enable evaluation of object removal in…
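As a rough illustration of what "still detectable" could mean in practice, the sketch below compares a segmentation model's predicted mask for the object before and after removal against the ground-truth object mask, using intersection over union (IoU). This is not the paper's metric definition, which is not given in this summary; `mask_iou` and the toy masks are hypothetical names and data chosen only for the example.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (hypothetical representation for this sketch).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two same-shaped binary masks."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Two empty masks have no overlap to measure; report 0.0 by convention here.
    return inter / union if union else 0.0


# Toy data: the ground-truth object covers 4 pixels.
gt_mask = [[0, 1, 1],
           [0, 1, 1]]
# Assumed segmenter output on the pre-removal render: the object is fully found.
pred_before = [[0, 1, 1],
               [0, 1, 1]]
# Assumed segmenter output on the post-removal render: one pixel is still labeled.
pred_after = [[0, 0, 0],
              [0, 1, 0]]

iou_before = mask_iou(pred_before, gt_mask)
iou_after = mask_iou(pred_after, gt_mask)
print(f"IoU before removal: {iou_before:.2f}, after removal: {iou_after:.2f}")
```

Under this toy setup, an IoU that stays above zero after removal would indicate that the segmenter still finds part of the object, which is the kind of residual the benchmark is concerned with. A real evaluation would run an actual segmentation model on rendered views rather than hand-written masks.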
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
- Remove360: The paper comes with a real-world dataset for object removal, Remove360. The dataset provides a diverse collection of indoor and outdoor scenes under different lighting conditions, captured before and after object removal. It also includes ground-truth annotations of the removed objects. Remove360 focuses on the multi-object removal scenario, which better matches everyday use. The dataset could serve as a baseline benchmark for future work.
- Low and high level metrics
- Regions Outside the Object: A common issue in object removal is that shadows, reflections, or indirect effects of the removed objects often remain in the scene. These residual pixels lie outside the object’s mask, yet can still reveal visual or semantic cues about what was removed. The paper’s proposed metrics primarily focus on evaluating changes within the masked region, without explicitly considering these contextual regions outside the mask. Such unaddressed residuals may still compromise
1. The paper focuses on the critical issue of invisible "semantic residuals" that may persist in scene data after 3D removal. To this end, it proposes a comprehensive evaluation from multiple dimensions, including semantics and segmentation.
2. The paper releases the Remove360 dataset, which includes pre-/post-removal RGB images and object-level masks from real-world scenes. Providing these scenes after physical object removal to serve as ground truth offers a solid basis for evaluating remov
1. The paper lacks a comparison with established evaluation metrics such as PSNR, SSIM, and LPIPS. Demonstrating cases where these traditional metrics and the proposed metrics disagree would strengthen the paper's argument.
2. Regarding the paper's assertion that "the model can detect residual information invisible to the human eye": It is possible that the inpainted shape is similar to the original, or that artifacts introduced during the removal and inpainting process coinc
1. A new dataset for measuring object removal is released, containing indoor and outdoor scenes, which benefits the community.
2. The paper is well written and the measurement method is easy to understand.
1. In the introduction, the definition of semantic residuals is unclear and lacks a rigorous mathematical or task-based formulation, which is critical for understanding and designing the subsequent evaluation metrics.
2. Although semantic residuals in 3DGS have been visualized, a deeper analysis is required to understand their underlying causes.
3. In the related work section, the authors only discuss 3D reconstruction and semantics in privacy protection applications. Object removal, however,
- The paper addresses a critical and previously underexplored aspect of 3D scene editing: the presence of semantic residuals after object removal, particularly relevant for privacy-preserving applications.
- The proposed evaluation framework combines multiple complementary metrics (semantic, segmentation, and depth-based) to provide a thorough assessment of object removal effectiveness.
- The Remove360 dataset fills a gap by providing a diverse set of real-world indoor and outdoor scenes with pre/
- The evaluation metrics rely on off-the-shelf semantic segmentation models, which may introduce errors and limit the robustness of the evaluation. The paper acknowledges this but could benefit from further discussion of potential biases and limitations.
- Limited Analysis of Failure Cases: While the paper mentions failure cases (residual artifacts, incomplete removal, over-smoothing), a more detailed analysis of specific failure modes would be valuable.
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection · Adversarial Robustness in Machine Learning
