GEM: Boost Simple Network for Glass Surface Segmentation via Vision Foundation Models
Jing Hao, Moyun Liu, Jinrong Yang, Kuo Feng Hung

TL;DR
This paper introduces GEM, a simple yet effective glass surface segmentation method built on vision foundation models. The authors also construct a large synthetic dataset, S-GSD, and report state-of-the-art results without extensive manual labeling.
Contribution
The paper combines Stable Diffusion and the Segment Anything Model (SAM) to build a synthetic dataset and a segmentation model, reducing manual annotation effort and improving performance.
Findings
GEM surpasses previous methods with a 2.1% IoU improvement.
Models trained on the synthetic S-GSD dataset perform well in zero-shot and transfer learning settings.
The approach reduces data annotation costs significantly.
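For readers unfamiliar with the metric behind the 2.1% figure, IoU (intersection over union) compares a predicted glass mask with the ground-truth mask. The following is a minimal sketch, assuming binary masks stored as nested lists; the function name `mask_iou`, the toy masks, and the empty-mask convention are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixel labels (assumed layout)


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    same_rows = len(pred) == len(target)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_on, t_on = bool(p), bool(t)
            intersection += p_on and t_on  # pixel is glass in both masks
            union += p_on or t_on          # pixel is glass in at least one mask

    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 3x3 example: 3 overlapping pixels out of 6 covered in total.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 0, 1],
          [0, 1, 1],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```

Benchmark scores are typically aggregated over a whole test set; how the paper aggregates them is not stated in this summary.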
Abstract
Detecting glass regions is a challenging task due to the inherent ambiguity in their transparency and reflective characteristics. Current solutions in this field remain rooted in conventional deep learning paradigms, requiring the construction of annotated datasets and the design of network architectures. However, the evident drawback with these mainstream solutions lies in the time-consuming and labor-intensive process of curating datasets, alongside the increasing complexity of model structures. In this paper, we propose to address these issues by fully harnessing the capabilities of two existing vision foundation models (VFMs): Stable Diffusion and Segment Anything Model (SAM). Firstly, we construct a Synthetic but photorealistic large-scale Glass Surface Detection dataset, dubbed S-GSD, without any labour cost via Stable Diffusion. This dataset consists of four different scales,…
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing · CCD and CMOS Imaging Sensors
Methods: Diffusion · Segment Anything Model · Convolution
