Image Segmentation: Inducing graph-based learning
Aryan Singh, Pepijn Van de Ven, Ciarán Eising, Patrick Denny

TL;DR
This paper investigates the use of graph neural networks within a U-Net architecture to improve semantic segmentation across various image types, demonstrating enhanced performance over traditional CNNs and transformer models.
Contribution
It introduces a novel GNN-based U-Net model for segmentation, effectively modeling long-range dependencies and complex spatial relationships in diverse image modalities.
Findings
GNN-based U-Net outperforms CNN and transformer models on multiple datasets.
The approach effectively handles geometric distortions in fisheye images.
The method improves boundary delineation in medical images.
Abstract
This study explores the potential of graph neural networks (GNNs) to enhance semantic segmentation across diverse image modalities. We evaluate the effectiveness of a novel GNN-based U-Net architecture on three distinct datasets: PascalVOC, a standard benchmark for natural image segmentation; WoodScape, a challenging dataset of fisheye images commonly used in autonomous driving, which introduce significant geometric distortions; and ISIC2016, a dataset of dermoscopic images for skin lesion segmentation. We compare our proposed UNet-GNN model against established segmentation models based on convolutional neural networks (CNNs), including U-Net and U-Net++, as well as the transformer-based SwinUNet. Unlike these methods, which primarily rely on local convolutional operations or global self-attention, GNNs explicitly model relationships between image regions by constructing and operating on a…
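To make the idea of modeling relationships between image regions more concrete, here is a minimal sketch in plain Python. The abstract is truncated before it describes how the graph is built, so everything below is an assumption for illustration: regions are treated as nodes with small feature vectors, edges come from a k-nearest-neighbour search in feature space, and one message-passing step averages each node with its neighbours. The names `knn_graph` and `mean_aggregate` are hypothetical and are not taken from the paper, and mean aggregation is a generic stand-in rather than the authors' GNN layer.

```python
import math


def knn_graph(features, k):
    """For each node, return the indices of its k nearest neighbours
    (Euclidean distance in feature space). Assumed construction, not the paper's."""
    neighbours = []
    for i, fi in enumerate(features):
        dists = [(math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def mean_aggregate(features, neighbours):
    """One generic message-passing step: replace each node's features with
    the mean over the node itself and its neighbours."""
    updated = []
    for i, fi in enumerate(features):
        group = [fi] + [features[j] for j in neighbours[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


# Toy "feature map": four region nodes with 2-D features.
# Nodes 0 and 1 are similar to each other, as are nodes 2 and 3.
features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
neighbours = knn_graph(features, k=1)
updated = mean_aggregate(features, neighbours)
```

In this toy case each node links to its most similar partner regardless of where it sits in the list, which is the property that lets a graph connect regions that are far apart in the image but close in feature space. In a U-Net setting, such a step would operate on feature-map patches at some encoder or bottleneck stage; where exactly the paper places it is not stated in the text above.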
Taxonomy
Topics: Teaching and Learning Programming · Intelligent Tutoring Systems and Adaptive Learning · Multimodal Machine Learning Applications
Methods: Max Pooling · Convolution · Concatenated Skip Connection · U-Net
