Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data

Yizhao Xu; Hongyuan Zhu; Caiyun Liu; Tianfu Wang; Keyu Chen; Sicheng Xu; Jiaolong Yang; Nicholas Jing Yuan; Qi Zhang

arXiv:2604.13688·cs.CV·April 16, 2026

TL;DR

This paper introduces BVE, a 3D editing framework that uses a self-constructed large-scale dataset and 3D masks to make high-quality, semantically consistent, and locally invariant modifications to 3D assets without extensive retraining.

Contribution

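The paper presents a new 3D editing method built on a self-constructed dataset and a 3D masking strategy. It targets two limitations of existing approaches: multi-view editing loses information when edits are projected back to 3D, and voxel-based editing restricts both the regions that can be modified and the scale of the modifications.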

Findings

01. BVE achieves superior quality in text-guided 3D editing.

02. The method effectively maintains local invariance during modifications.

03. Extensive experiments validate the approach's effectiveness.
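
To make the local invariance property in finding 02 concrete, here is a minimal sketch of one way to check that an edit stays inside a 3D mask. It is not the paper's metric or representation: the toy occupancy grid, the function name `outside_mask_change`, and the tolerance `tol` are all assumptions introduced for illustration only.

```python
import numpy as np


def outside_mask_change(original, edited, mask, tol=1e-6):
    """Fraction of voxels OUTSIDE the edit mask whose value changed by more than tol.

    Hypothetical helper: 0.0 means the region outside the mask is untouched.
    """
    original = np.asarray(original, dtype=float)
    edited = np.asarray(edited, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not (original.shape == edited.shape == mask.shape):
        raise ValueError("original, edited, and mask must share one shape")

    keep = ~mask                      # voxels that should stay unchanged
    n_keep = int(keep.sum())
    if n_keep == 0:                   # the mask covers everything: nothing to preserve
        return 0.0

    changed = np.abs(original - edited) > tol
    return float((changed & keep).sum() / n_keep)


# Toy 4x4x4 grid: the first two slices (32 voxels) are editable.
original = np.zeros((4, 4, 4))
mask = np.zeros((4, 4, 4), dtype=bool)
mask[:2] = True

# An edit confined to the masked region.
edited = original.copy()
edited[:2] = 1.0
clean_score = outside_mask_change(original, edited, mask)

# The same edit, plus one voxel that leaks outside the mask.
edited_leaky = edited.copy()
edited_leaky[3, 0, 0] = 1.0
leaky_score = outside_mask_change(original, edited_leaky, mask)
```

Under these assumptions, `clean_score` is 0.0 because every change falls inside the mask, while `leaky_score` is 1/32 because one of the 32 protected voxels changed. A real evaluation would compare geometry and appearance in whatever representation the model uses, not raw voxel values.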

Abstract

3D editing refers to the ability to apply local or global modifications to 3D assets. Effective 3D editing requires maintaining semantic consistency by performing localized changes according to prompts, while also preserving local invariance so that unchanged regions remain consistent with the original. However, existing approaches have significant limitations: multi-view editing methods incur losses when projecting back to 3D, while voxel-based editing is constrained in both the regions that can be modified and the scale of modifications. Moreover, the lack of sufficiently large editing datasets for training and evaluation remains a challenge. To address these challenges, we propose a Beyond Voxel 3D Editing (BVE) framework with a self-constructed large-scale dataset specifically tailored for 3D editing. Building upon this dataset, our model enhances a foundational image-to-3D…
