Technical Report for ICRA 2025 GOOSE 2D Semantic Segmentation Challenge: Leveraging Color Shift Correction, RoPE-Swin Backbone, and Quantile-based Label Denoising Strategy for Robust Outdoor Scene Understanding
Chih-Chung Hsu, I-Hsuan Wu, Wen-Hai Tseng, Ching-Heng Cheng, Ming-Hsuan Wu, Jin-Hui Jiang, Yu-Jou Hsiao

TL;DR
This paper introduces a robust semantic segmentation framework for outdoor scenes, developed for the ICRA 2025 GOOSE challenge. It combines a RoPE-enhanced Swin Transformer, color shift correction, and quantile-based label denoising, and reaches a mean IoU of 0.848 on the official GOOSE test set.
Contribution
Integrates a RoPE-enhanced Swin Transformer backbone, a Color Shift Estimation-and-Correction module, and quantile-based label denoising to improve the robustness of outdoor scene segmentation under real-world conditions.
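As a rough illustration of the Rotary Position Embedding idea named above, the sketch below rotates consecutive pairs of a feature vector by a position-dependent angle. It is a generic 1D formulation in plain Python, not the report's RoPE-Swin implementation; the function names `rope_rotate` and `dot` and the `base` value of 10000 are assumptions made for this example.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply a 1D rotary position embedding to one feature vector.

    Each consecutive pair (vec[i], vec[i + 1]) is rotated by the angle
    position * base ** (-i / dim). `base` is an assumed default, not a
    value taken from the report.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("feature dimension must be even")
    rotated = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by theta.
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))
```

Because every pair is only rotated, the dot product between a rotated query at position m and a rotated key at position n depends on the offset m - n rather than on the absolute positions. That relative-position property is the usual argument for why RoPE can help spatial generalization.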
Findings
Achieved a mean IoU (mIoU) of 0.848 on the official GOOSE test set.
Demonstrated the effectiveness of the color correction and denoising strategies.
Enhanced spatial generalization with RoPE embeddings.
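For readers unfamiliar with the metric reported above, here is a minimal sketch of mean Intersection over Union on flattened label lists. It is not the challenge's official evaluation code: the function name `mean_iou` is hypothetical, and skipping classes that appear in neither the prediction nor the ground truth is an assumption of this example.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, for two equal-length lists of class ids.

    Classes absent from both prediction and ground truth are skipped
    (an assumption of this sketch; official protocols may differ).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # A correct pixel counts once in both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=3)` averages an IoU of 1/2 for class 0 and 2/3 for class 1, while class 2 is skipped because it never occurs.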
Abstract
This report presents our semantic segmentation framework developed by team ACVLAB for the ICRA 2025 GOOSE 2D Semantic Segmentation Challenge, which focuses on parsing outdoor scenes into nine semantic categories under real-world conditions. Our method integrates a Swin Transformer backbone enhanced with Rotary Position Embedding (RoPE) for improved spatial generalization, alongside a Color Shift Estimation-and-Correction module designed to compensate for illumination inconsistencies in natural environments. To further improve training stability, we adopt a quantile-based denoising strategy that downweights the top 2.5% of highest-error pixels, treating them as noise and suppressing their influence during optimization. Evaluated on the official GOOSE test set, our approach achieved a mean Intersection over Union (mIoU) of 0.848, demonstrating the effectiveness of combining color…
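The quantile-based denoising step described in the abstract can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the authors' implementation: the names `quantile_threshold` and `denoised_mean_loss` are hypothetical, the abstract does not give the weight applied to the noisy pixels (here `noisy_weight`, defaulting to 0.0), and a real training pipeline would operate on per-pixel loss tensors rather than lists.

```python
import math


def quantile_threshold(losses, q):
    """Return the q-th quantile (0 <= q <= 1) with linear interpolation."""
    ordered = sorted(losses)
    if not ordered:
        raise ValueError("losses must be non-empty")
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def denoised_mean_loss(losses, noise_fraction=0.025, noisy_weight=0.0):
    """Weighted mean loss that downweights the highest-error pixels.

    Pixels whose loss exceeds the (1 - noise_fraction) quantile are
    treated as label noise and multiplied by `noisy_weight`.
    `noisy_weight` is an assumption; the source only says "downweights".
    """
    threshold = quantile_threshold(losses, 1.0 - noise_fraction)
    weights = [noisy_weight if loss > threshold else 1.0 for loss in losses]
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / total_weight
```

With 40 pixels where 39 have loss 0.1 and one has loss 5.0, the outlier is exactly the top 2.5%. The plain mean is 0.2225, whereas the denoised mean stays at about 0.1, which shows how a small number of mislabeled pixels is kept from dominating the gradient.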
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Attention Is All You Need · Linear Layer · Stochastic Depth · Multi-Head Attention · Dense Connections · Swin Transformer · ADaptive gradient method with the OPTimal convergence rate · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer
