Semantic Segmentation of Textured Non-manifold 3D Meshes using Transformers
Mohammadreza Heidarianbaei, Max Mehltretter, Franz Rottensteiner

TL;DR
This paper presents a texture-aware transformer model for semantic segmentation of textured 3D meshes, effectively integrating geometric and textural information for improved accuracy.
Contribution
Introduces a novel hierarchical transformer architecture that fuses face-level pixel textures with geometric descriptors for mesh segmentation.
Findings
Achieves 81.9% mF1 on the Semantic Urban Meshes (SUM) benchmark
Attains 49.7% mF1 on a newly curated cultural-heritage dataset
Outperforms existing methods significantly
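For readers unfamiliar with the metric, mF1 is commonly computed as the unweighted mean of per-class F1 scores. The sketch below assumes that definition and per-face integer labels; the summary does not say whether the paper weights faces (for example by surface area), so treat the function name `mean_f1` and its conventions as illustrative only.

```python
def mean_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores (assumed definition of mF1)."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # A class that never occurs and is never predicted contributes 0 here;
        # other conventions skip such classes instead.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy example with three classes and six labelled faces.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_f1(labels, preds, num_classes=3)
```

Because every class counts equally, a rare class with poor F1 lowers mF1 as much as a frequent one, which is why the metric is popular for imbalanced segmentation benchmarks.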
Abstract
Textured 3D meshes jointly represent geometry, topology, and appearance, yet their irregular structure poses significant challenges for deep-learning-based semantic segmentation. While a few recent methods operate directly on meshes without imposing geometric constraints, they typically overlook the rich textural information also provided by such meshes. We introduce a texture-aware transformer that learns directly from raw pixels associated with each mesh face, coupled with a new hierarchical learning scheme for multi-scale feature aggregation. A texture branch summarizes all face-level pixels into a learnable token, which is fused with geometric descriptors and processed by a stack of Two-Stage Transformer Blocks (TSTB) that allow both local and global information flow. We evaluate our model on the Semantic Urban Meshes (SUM) benchmark and a newly curated cultural-heritage…
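The abstract does not spell out how the texture branch works internally. As a rough illustration of the general idea, the sketch below pools a variable number of per-face pixel features into a single vector with attention against a query token, then concatenates it with a geometric descriptor. Everything here is an assumption for illustration: the names `pool_face_pixels` and `fuse_with_geometry`, the dot-product attention, and the concatenation are not taken from the paper, and in a real model the query would be a trained parameter.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pool_face_pixels(pixels, query):
    """Attention-pool a face's pixel features into one vector.

    pixels: list of feature vectors, one per pixel (count may differ per face).
    query:  hypothetical token vector with the same dimension as each pixel.
    """
    dim = len(query)
    # Scaled dot-product score between the query token and each pixel.
    scores = [
        sum(q * x for q, x in zip(query, pixel)) / math.sqrt(dim)
        for pixel in pixels
    ]
    weights = softmax(scores)
    # Weighted sum of pixel features gives a fixed-size texture summary.
    return [
        sum(w * pixel[i] for w, pixel in zip(weights, pixels))
        for i in range(dim)
    ]


def fuse_with_geometry(texture_token, geometry):
    """Assumed fusion: concatenate texture summary and geometric descriptor."""
    return list(texture_token) + list(geometry)


# Two faces with different pixel counts map to tokens of identical size.
query = [0.0, 0.0, 0.0]  # a zero query yields uniform attention weights
face_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
face_b = [[0.2, 0.4, 0.6], [0.2, 0.4, 0.6], [0.2, 0.4, 0.6]]
token_a = fuse_with_geometry(pool_face_pixels(face_a, query), [0.1, 0.9])
token_b = fuse_with_geometry(pool_face_pixels(face_b, query), [0.3, 0.7])
```

The point of the pooling step is that faces covering different numbers of texture pixels still yield tokens of one fixed size, which is what a transformer stack needs as input.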
