Segment Any Vehicle: Semantic and Visual Context Driven SAM and A Benchmark

Xiao Wang; Ziwen Wang; Wentao Wu; Anjie Wang; Jiashu Wu; Yantao Pan; Chenglong Li

arXiv:2508.04260·cs.CV·August 7, 2025

TL;DR

This paper introduces SAV, a vehicle part segmentation framework that combines a SAM-based encoder-decoder, a vehicle part knowledge graph, and a context sample retrieval encoding module, along with a new large-scale dataset, VehicleSeg10K.

Contribution

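It presents a new framework for vehicle part segmentation that addresses two limitations of SAM (its text-prompted segmentation is not publicly accessible, and the masks from its default mode carry no semantic labels) and introduces a comprehensive benchmark dataset for the task.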

Findings

01. SAV outperforms baseline models on VehicleSeg10K and other datasets.

02. The knowledge graph improves semantic understanding of vehicle parts.

03. Context retrieval enhances segmentation accuracy in diverse scenes.
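
To make the idea of context retrieval concrete, here is a minimal sketch of similarity-based sample retrieval. It is an illustration under stated assumptions, not the paper's module: this summary does not describe how SAV retrieves or encodes context samples, so the use of cosine similarity, the top-k selection, and the names `cosine_similarity`, `retrieve_context`, and `bank` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_context(
    query: Sequence[float],
    bank: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (index, similarity) for the k bank entries most similar to the query.

    `bank` stands in for a hypothetical store of feature vectors, one per
    training image; how such features would be produced is an assumption.
    """
    scored = [(i, cosine_similarity(query, feat)) for i, feat in enumerate(bank)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional features; real image features would be much larger.
    bank = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 1.0, 0.0]]
    query = [1.0, 0.05, 0.0]
    for index, score in retrieve_context(query, bank, k=2):
        print(index, round(score, 3))
```

In a pipeline of this kind, the retrieved samples (and their annotations) would be encoded and supplied to the segmentation model as extra context; that step is omitted here because the summary gives no detail about it.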

Abstract

With the rapid advancement of autonomous driving, vehicle perception, particularly detection and segmentation, has placed increasingly high demands on algorithmic performance. Pre-trained large segmentation models, especially the Segment Anything Model (SAM), have sparked significant interest and inspired new research directions in artificial intelligence. However, SAM cannot be directly applied to the fine-grained task of vehicle part segmentation: its text-prompted segmentation functionality is not publicly accessible, and the mask regions generated by its default mode lack semantic labels, which limits its utility in structured, category-specific segmentation tasks. To address these limitations, we propose SAV, a novel framework comprising three core components: a SAM-based encoder-decoder, a vehicle part knowledge graph, and a context sample retrieval encoding module. The knowledge…
