EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Jiaxiang Tang, Zhaoshuo Li, Zekun Hao, Xian Liu, Gang Zeng, Ming-Yu Liu, Qinsheng Zhang

TL;DR
EdgeRunner introduces an auto-regressive auto-encoder that generates high-quality 3D meshes efficiently. It combines a novel mesh tokenization algorithm with compression into a fixed-length latent space, and it outperforms existing methods in quality and generalization.
Contribution
The paper presents a novel auto-regressive auto-encoder with a mesh tokenization algorithm and fixed-length latent space, improving mesh generation quality and efficiency.
Findings
Outperforms existing methods in mesh quality and diversity
Efficiently compresses meshes into 1D token sequences
Demonstrates superior generalization in point cloud and image-conditioned tasks
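To make "compressing a mesh into a 1D token sequence" concrete, here is a minimal sketch of a naive baseline tokenizer. It is not EdgeRunner's algorithm. It assumes coordinates normalized to [-1, 1] and quantized to 512 bins per axis, and it emits 9 coordinate tokens per triangle. The function names and the toy mesh are hypothetical and exist only for illustration.

```python
# Naive baseline, NOT EdgeRunner's tokenizer: every face emits 9 coordinate tokens.
RESOLUTION = 512  # quantization bins per axis (assumed to match a 512^3 grid)


def quantize(value, resolution=RESOLUTION):
    """Map a coordinate in [-1, 1] to an integer bin in [0, resolution - 1]."""
    clamped = min(max(value, -1.0), 1.0)
    return min(int((clamped + 1.0) / 2.0 * resolution), resolution - 1)


def naive_tokenize(vertices, faces):
    """Flatten a triangle mesh into a 1D token list, 9 tokens per face."""
    tokens = []
    for face in faces:
        for vertex_index in face:
            tokens.extend(quantize(c) for c in vertices[vertex_index])
    return tokens


# Hypothetical toy mesh: a square split into two triangles.
vertices = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0), (-1.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]

tokens = naive_tokenize(vertices, faces)

# Sequence length this naive scheme would need for a 4,000-face mesh.
naive_length_4000_faces = 9 * 4000
```

In this toy mesh the two vertices shared by both triangles are each written out twice. A tokenizer that exploits shared edges between adjacent faces can avoid that redundancy, and that kind of saving is what makes a more compact tokenization attractive for long sequences.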
Abstract
Current auto-regressive mesh generation methods suffer from issues such as incompleteness, insufficient detail, and poor generalization. In this paper, we propose an Auto-regressive Auto-encoder (ArAE) model capable of generating high-quality 3D meshes with up to 4,000 faces at a spatial resolution of 512^3. We introduce a novel mesh tokenization algorithm that efficiently compresses triangular meshes into 1D token sequences, significantly enhancing training efficiency. Furthermore, our model compresses variable-length triangular meshes into a fixed-length latent space, enabling training latent diffusion models for better generalization. Extensive experiments demonstrate the superior quality, diversity, and generalization capabilities of our model in both point cloud and image-conditioned mesh generation tasks.
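The abstract does not say how variable-length inputs are mapped to a fixed-length latent. One common way to get that property, shown here purely as an assumption, is cross-attention from a fixed set of query vectors onto the variable-length input features. The sketch below uses random vectors in place of learned queries, omits key and value projections, and uses hypothetical sizes (`DIM`, `NUM_LATENTS`).

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, inputs):
    """Each query attends over all inputs; the output has one row per query."""
    dim = len(queries[0])
    latents = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, x)) / math.sqrt(dim) for x in inputs]
        weights = softmax(scores)
        latents.append([sum(w * x[d] for w, x in zip(weights, inputs)) for d in range(dim)])
    return latents


rng = random.Random(0)
DIM, NUM_LATENTS = 8, 4  # hypothetical sizes, not taken from the paper

# Stand-in for learned query vectors (random here, trained in a real model).
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_LATENTS)]


def random_features(count):
    """Stand-in for per-token or per-point input features of a mesh."""
    return [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(count)]


small = cross_attend(queries, random_features(10))   # short input
large = cross_attend(queries, random_features(500))  # long input
```

Both `small` and `large` have `NUM_LATENTS` rows of `DIM` values, regardless of input length. A fixed shape like this is what lets a latent diffusion model be trained on top of the encoder.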
Peer Reviews
Decision·ICLR 2025 Poster
EdgeRunner demonstrates strengths in the mesh tokenization algorithm and the Auto-regressive Auto-encoder framework, which address previous challenges in mesh generation. It offers a mesh tokenization algorithm for efficient compression into 1D token sequences, enabling mesh generation with up to 4,000 faces at a 512^3 resolution. The model's ability to compress variable-length meshes into a fixed-length latent space facilitates the training of latent diffusion models for enhanced generalizatio
A. While the paper presents a compelling case for EdgeRunner's capabilities, a notable weakness is that it does not fundamentally differentiate itself from existing works like MeshGPT and the MeshAnything series in terms of the core generative approach.
B. Despite the improvements in tokenization and the auto-regressive framework, the generated 3D geometries may not adhere to the modeling and wiring conventions that human artists typically follow. This could potentially limit the acceptance an
1. The proposed method can generate meshes with up to 4000 faces, surpassing the capabilities of previous baselines.
2. The trained model shows strong generalization on novel inputs.
3. Extensive experiments highlight the advantages of the proposed method in achieving high-quality mesh generation.
1. The proposed method and the baselines are trained on different datasets (for example, MeshAnything does not have access to Objaverse-XL and reserves 10% of Objaverse for evaluation). As a result, the comparisons can be unfair.
2. Is the training sequence unique for each mesh? How did you define the start of the sequence?
3. Although the authors report the inference speed, there is no comparison of the inference speed against other methods when generating a similar number of faces.
4. The detail
The paper proposes a novel method for mesh tokenization that makes it possible to handle meshes with more faces and at higher resolution. It also introduces a latent space with a unified token length, enhancing the model's generalization ability. These two contributions represent the technical novelties of the paper. The paper includes a comprehensive comparison with several state-of-the-art mesh generative models, such as MeshAnything, MeshAnythingV2, and Unique3D. This thorough evaluation makes the stu
While I find no weaknesses in the method itself, I believe the presentation of the paper needs to be restructured. Important information, such as experimental settings and training datasets, currently appears only in the supplementary materials, which makes the paper difficult to follow.
Presentation:
- Line 102: The term "EdgeBreaker" appears for the first time without a reference, making this part challenging to understand.
- Line 200: It seems that the mesh tokenization approach is heavily inspire
Taxonomy
Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Human Motion and Animation
Methods: Diffusion
