# Off-Road Autonomous Vehicle Semantic Segmentation and Spatial Overlay Video Assembly

**Authors:** Itai Dror, Omer Aviv, Ofer Hadar

PMC · DOI: 10.3390/s26061944 · Sensors (Basel, Switzerland) · 2026-03-19

## TL;DR

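This paper introduces a system for off-road autonomous vehicles that improves semantic segmentation accuracy and preserves video quality in semantically critical regions under lossy compression, targeting remote operation over bandwidth-limited links.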

## Contribution

The paper contributes a three-part solution: a large-scale off-road dataset, a Confusion-Aware Loss (CAL) that penalizes systematic misclassifications, and a spatial overlay video encoding scheme guided by semantic segmentation.

## Key findings

- The Confusion-Aware Loss, combined with cross-entropy, raises segmentation mIoU on the off-road test set from 68.66% to 70.06% (a gain of 1.40 percentage points).
- The spatial overlay encoding achieves gains of up to +5 dB PSNR and up to +40 VMAF points under lossy compression.
- The proposed framework enables efficient autonomous operation in bandwidth-limited settings.
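
To make the overlay finding concrete, here is a minimal sketch of per-pixel compositing and a PSNR comparison. It is not the paper's encoder: the function names `overlay` and `psnr`, the tiny grayscale frames, and the choice to measure PSNR over the whole frame are all assumptions made for illustration.

```python
import math

def overlay(high, low, mask):
    """Per-pixel composite: keep `high` where mask is 1, otherwise `low`.

    Hypothetical helper; frames are nested lists of grayscale values.
    """
    return [[h if m else l for h, l, m in zip(hr, lr, mr)]
            for hr, lr, mr in zip(high, low, mask)]

def psnr(ref, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized frames."""
    n = sum(len(row) for row in ref)
    mse = sum((a - b) ** 2
              for ref_row, test_row in zip(ref, test)
              for a, b in zip(ref_row, test_row)) / n
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

# Toy data (assumed): a reference frame, a heavily compressed version of it,
# and a segmentation mask marking the top row as semantically critical.
ref = [[100, 150], [200, 50]]
low = [[90, 140], [190, 40]]
mask = [[1, 1], [0, 0]]

composite = overlay(ref, low, mask)
gain_db = psnr(ref, composite) - psnr(ref, low)
```

In this toy case, protecting half of the pixels halves the mean squared error, so `gain_db` is about 3 dB. The gains reported in the paper depend on the actual codec, bitrate, and mask coverage, none of which this sketch models.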

## Abstract

Autonomous systems are expanding rapidly, driving a demand for robust perception technologies capable of navigating challenging, unstructured environments. While urban autonomy has made significant progress, off-road environments pose unique challenges, including dynamic terrain and limited communication infrastructure. This research addresses these challenges by introducing a novel three-part solution for off-road autonomous vehicles. First, we present a large-scale off-road dataset curated to capture the visual complexity and variability of unstructured environments, providing a realistic training ground that supports improved model generalization. Second, we propose a Confusion-Aware Loss (CAL) that dynamically penalizes systematic misclassifications based on class-level confusion statistics. When combined with cross-entropy, CAL improves segmentation mean Intersection over Union (mIoU) on the off-road test set from 68.66% to 70.06% and achieves cross-domain gains of up to ~0.49% mIoU on the Cityscapes dataset. Third, leveraging semantic segmentation as an intermediate representation, we introduce a spatial overlay video encoding scheme that preserves high-fidelity RGB information in semantically critical regions while compressing non-essential background regions. Experimental results demonstrate Peak Signal-to-Noise Ratio (PSNR) improvements of up to +5 dB and Video Multi-Method Assessment Fusion (VMAF) gains of up to +40 points under lossy compression, enabling efficient and reliable off-road autonomous operation. This integrated approach provides a robust framework for real-time remote operation in bandwidth-constrained environments.
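
The abstract does not give the CAL formula, so the following is only one plausible reading of "penalizes systematic misclassifications based on class-level confusion statistics", sketched for a single pixel. The weighting scheme, the additive combination with cross-entropy, and the names `confusion_weights`, `confusion_aware_loss`, and `lam` are assumptions, not the authors' definition.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confusion_weights(conf):
    """Assumed weighting: w[t][p] is the share of class-t errors predicted as p.

    `conf[t][p]` counts pixels of true class t predicted as class p.
    """
    n = len(conf)
    weights = [[0.0] * n for _ in range(n)]
    for t in range(n):
        errors = sum(conf[t]) - conf[t][t]
        if errors > 0:
            for p in range(n):
                if p != t:
                    weights[t][p] = conf[t][p] / errors
    return weights

def confusion_aware_loss(logits, target, weights, lam=1.0):
    """Cross-entropy plus an assumed penalty on probability mass that lands
    on classes the model historically confuses with the target class."""
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    penalty = sum(weights[target][j] * probs[j] for j in range(len(probs)))
    return cross_entropy + lam * penalty

# Toy confusion matrix (assumed): class 0 is only ever mistaken for class 1.
conf = [[8, 2, 0],
        [1, 9, 0],
        [0, 0, 10]]
w = confusion_weights(conf)

# Same probability on the true class, but the error mass goes to different classes.
loss_confusable = confusion_aware_loss([1.0, 2.0, 0.0], 0, w)
loss_other = confusion_aware_loss([1.0, 0.0, 2.0], 0, w)
```

Under these assumptions the cross-entropy term is the same for both predictions, yet `loss_confusable` is larger because the wrong probability mass falls on the class that the confusion statistics flag. Recomputing `w` from a running confusion matrix during training would be one way to make the penalty dynamic, as the abstract describes.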

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13030456/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13030456/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC13030456/full.md

---
Source: https://tomesphere.com/paper/PMC13030456