# TOSQ: Transparent Object Segmentation via Query-Based Dictionary Lookup with Transformers

**Authors:** Bin Ma, Ming Ma, Ruiguang Li, Jiawei Zheng, Deping Li

PMC · DOI: 10.3390/s25154700 · Sensors (Basel, Switzerland) · 2025-07-30

## TL;DR

This paper introduces TOSQ, a transformer-based model that segments transparent objects by matching image features against learnable class prototypes. It reports state-of-the-art results on Trans10K-V2 (76.63% mIoU, 95.34% Acc).

## Contribution

The novel Query Parsing Module (QPM) formulates segmentation as a dictionary lookup problem using learnable class prototypes.
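The dictionary-lookup idea can be illustrated with a minimal sketch. This is not the authors' QPM, whose internals are not described in this summary view. It only shows the general technique of scoring each pixel feature against a small set of class prototypes with scaled dot-product attention and reading the softmax weights as per-class probabilities. The function names, the toy dimensions, and the hand-set prototypes (which would be learned parameters in a real model) are assumptions for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lookup_masks(pixel_features, prototypes):
    """Match every pixel feature against every class prototype.

    pixel_features: N feature vectors (one per pixel), each of length D.
    prototypes:     K class prototype vectors (the "dictionary"), each of length D.
    Returns (probs, labels): per-pixel class probabilities and argmax labels.
    """
    # Scaled dot-product similarity, as in standard attention.
    scale = 1.0 / math.sqrt(len(prototypes[0]))
    probs = [softmax([scale * dot(f, p) for p in prototypes]) for f in pixel_features]
    labels = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return probs, labels


# Toy data (hypothetical): 3 classes, 4-dimensional features.
random.seed(0)
prototypes = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
true_classes = [0, 2, 1, 2]
# Each pixel feature is a scaled, slightly noisy copy of its class prototype.
pixel_features = [
    [5.0 * v + random.uniform(-0.1, 0.1) for v in prototypes[c]]
    for c in true_classes
]

probs, labels = lookup_masks(pixel_features, prototypes)
print(labels)  # -> [0, 2, 1, 2]
```

Because each pixel is compared against a fixed-size dictionary of K prototypes rather than classified by a per-pixel head, the class scores come directly from feature-to-prototype similarity. Reshaping `probs` to the image grid would give one soft mask per class.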

## Key findings

- TOSQ achieves 76.63% mIoU and 95.34% Acc on the Trans10K-V2 dataset.
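- The largest reported gains are in challenging categories: windows (+23.59%) and glass doors (+11.22%).
- The model pairs a SegFormer-based encoder with a decoder of stacked QPMs that progressively refine the segmentation masks, relying on the global modeling capability of transformers.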
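For readers unfamiliar with the two reported metrics, the following sketch computes pixel accuracy and mean intersection-over-union (mIoU) from a confusion matrix. It follows the standard definitions. The paper's exact evaluation protocol (for example, how background pixels are handled) is not stated in this summary, and the toy labels below are made up for illustration.

```python
def confusion_matrix(pred, target, num_classes):
    """matrix[t][p] counts pixels with ground-truth class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(row[c] for row in matrix) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy flattened label maps (hypothetical), 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(pred, target, num_classes=3)
print(round(pixel_accuracy(matrix), 4))  # -> 0.6667
print(round(mean_iou(matrix), 4))        # -> 0.5
```

mIoU penalizes both false positives and false negatives per class and weights every class equally, so rare or difficult categories influence it far more than they influence pixel accuracy.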

## Abstract

Sensing transparent objects has many applications in daily life, including robot navigation and grasping. However, the task is challenging because the scene visible through or behind a transparent object is unpredictable: such objects lack fixed visual patterns and suffer from strong background interference. This paper addresses transparent object segmentation by leveraging the intrinsic global modeling capabilities of transformer architectures. We design a Query Parsing Module (QPM) that formulates segmentation as a dictionary lookup problem (e.g., via attention-based prototype matching) over a set of learnable class prototypes used as query inputs, which differs fundamentally from conventional pixel-wise mechanisms. Based on QPM, we propose a high-performance, transformer-based, end-to-end segmentation model, Transparent Object Segmentation through Query (TOSQ). TOSQ’s encoder is based on the SegFormer backbone, and its decoder consists of a series of QPMs that progressively refine the segmentation masks. TOSQ achieves state-of-the-art performance on the Trans10K-V2 dataset (76.63% mIoU, 95.34% Acc), with particularly significant gains in challenging categories such as windows (+23.59%) and glass doors (+11.22%), demonstrating its capability in transparent object segmentation.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12349598/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12349598/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12349598/full.md

---
Source: https://tomesphere.com/paper/PMC12349598