Semantic Segmentation of Textured Non-manifold 3D Meshes using Transformers

Mohammadreza Heidarianbaei; Max Mehltretter; Franz Rottensteiner

arXiv:2604.01836 · cs.CV · April 3, 2026

TL;DR

This paper presents a texture-aware transformer model for semantic segmentation of textured 3D meshes, effectively integrating geometric and textural information for improved accuracy.

Contribution

Introduces a novel hierarchical transformer architecture that fuses face-level pixel textures with geometric descriptors for mesh segmentation.

Findings

1. Achieves 81.9% mF1 on the SUM benchmark
2. Attains 49.7% mF1 on a cultural-heritage dataset
3. Significantly outperforms existing methods
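
The page does not define mF1. A minimal sketch, assuming it denotes the unweighted mean of per-class F1 scores computed over labeled mesh faces, may help when reading the figures above. The function name `mean_f1`, the toy labels, and the choice to skip classes absent from both ground truth and prediction are illustrative assumptions, not the authors' evaluation code.

```python
def mean_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 (assumed meaning of mF1)."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        if tp + fp + fn == 0:
            continue  # assumption: ignore classes absent from both label sets
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean of
        # precision and recall.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Toy example: one class label per mesh face (hypothetical data).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_f1(labels, preds, num_classes=3)
print(f"mF1 = {score:.3f}")
```

Because every class contributes equally, a rare class that is segmented poorly lowers mF1 as much as a frequent one, which is why the metric is often preferred over overall accuracy on imbalanced label sets.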

Abstract

Textured 3D meshes jointly represent geometry, topology, and appearance, yet their irregular structure poses significant challenges for deep-learning-based semantic segmentation. While a few recent methods operate directly on meshes without imposing geometric constraints, they typically overlook the rich textural information also provided by such meshes. We introduce a texture-aware transformer that learns directly from raw pixels associated with each mesh face, coupled with a new hierarchical learning scheme for multi-scale feature aggregation. A texture branch summarizes all face-level pixels into a learnable token, which is fused with geometric descriptors and processed by a stack of Two-Stage Transformer Blocks (TSTB) that allow both local and global information flow. We evaluate our model on the Semantic Urban Meshes (SUM) benchmark and a newly curated cultural-heritage…
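
The abstract does not say how the texture branch condenses a face's pixels into one token. As a rough illustration only, the sketch below assumes single-query attention pooling: a query vector (standing in for the learnable token) scores each pixel, the softmax-weighted pixel average becomes the texture token, and that token is concatenated with a geometric descriptor. The names `attention_pool` and `fuse`, the concatenation step, and the 4-value descriptor are hypothetical; the Two-Stage Transformer Blocks are not modeled here.

```python
import math
import random


def attention_pool(pixels, query):
    """Summarize a variable-length list of pixel vectors into one vector."""
    dim = len(query)
    # Scaled dot-product score between the query and every pixel.
    scores = [sum(q * x for q, x in zip(query, px)) / math.sqrt(dim)
              for px in pixels]
    # Numerically stable softmax over the pixels of this face.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted average of the pixel vectors, one value per channel.
    return [sum(w * px[i] for w, px in zip(weights, pixels))
            for i in range(dim)]


def fuse(texture_token, geometry):
    """Assumed fusion: plain concatenation of texture and geometry features."""
    return texture_token + geometry


random.seed(0)
# Random stand-in for a query that would be learned during training.
query = [random.gauss(0.0, 1.0) for _ in range(3)]
# Hypothetical face: three RGB pixels and a descriptor (normal + area).
face_pixels = [[0.2, 0.4, 0.1], [0.3, 0.5, 0.1], [0.9, 0.9, 0.8]]
face_geometry = [0.0, 0.0, 1.0, 0.5]

token = attention_pool(face_pixels, query)
face_feature = fuse(token, face_geometry)
print(len(face_feature))
```

Pooling of this kind handles faces that cover different numbers of texture pixels, since the output size depends only on the channel count. A trained model would typically project pixels into a higher-dimensional embedding before pooling.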
