# SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer

**Authors:** Fachri Najm Noer Kartiman, Rasim, Yaya Wihardi, Nurul Hasanah, Oskar Natan, Bambang Wahono, Taufik Ibnu Salim

arXiv: 2508.20762 · 2025-08-29

## TL;DR

This paper introduces SKGE-Swin, an end-to-end autonomous vehicle model that uses a skip-stage Swin Transformer to extract features globally and at multiple network levels. On CARLA adversarial scenarios, it reports a higher Driving Score than previous methods.

## Contribution

The paper presents an architecture that combines a skip-stage mechanism with the Swin Transformer to give an end-to-end driving model pixel-to-pixel context awareness, retaining information from the initial to the final stages of feature extraction.

## Key findings

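- Achieves a higher Driving Score than previous methods on the CARLA platform under adversarial scenarios.
- Captures information from distant pixels through the Swin Transformer's Shifted Window-based Multi-head Self-Attention (SW-MSA).
- Includes an ablation study of each architectural component, including the skip connections and the Swin Transformer.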
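For readers unfamiliar with the Driving Score metric, the following is a minimal sketch of how it is commonly computed on CARLA: route completion multiplied by an infraction penalty. The penalty coefficients below are assumptions taken from the public CARLA Leaderboard 1.0 definition. This summary does not state the paper's exact evaluation protocol, and the function and key names are hypothetical.

```python
from typing import Dict

# Assumed per-infraction penalty coefficients (CARLA Leaderboard 1.0 style).
# The paper's exact protocol is not given in this summary.
PENALTIES: Dict[str, float] = {
    "collision_pedestrian": 0.50,
    "collision_vehicle": 0.60,
    "collision_static": 0.65,
    "red_light": 0.70,
    "stop_sign": 0.80,
}


def infraction_penalty(counts: Dict[str, int]) -> float:
    """Multiply each coefficient once per occurrence of its infraction."""
    penalty = 1.0
    for name, n in counts.items():
        penalty *= PENALTIES[name] ** n
    return penalty


def driving_score(route_completion_pct: float, counts: Dict[str, int]) -> float:
    """Driving Score = route completion (in percent) x infraction penalty."""
    return route_completion_pct * infraction_penalty(counts)


# Hypothetical run: 80% of the route completed, one vehicle collision, one red light.
example = driving_score(80.0, {"collision_vehicle": 1, "red_light": 1})
```

Because the penalty is multiplicative, a single infraction can outweigh a large gain in route completion, so the score rewards driving that is both complete and safe.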

## Abstract

Focusing on the development of an end-to-end autonomous vehicle model with pixel-to-pixel context awareness, this research proposes the SKGE-Swin architecture. This architecture utilizes the Swin Transformer with a skip-stage mechanism to broaden feature representation globally and at various network levels. This approach enables the model to extract information from distant pixels by leveraging the Swin Transformer's Shifted Window-based Multi-head Self-Attention (SW-MSA) mechanism and to retain critical information from the initial to the final stages of feature extraction, thereby enhancing its capability to comprehend complex patterns in the vehicle's surroundings. The model is evaluated on the CARLA platform using adversarial scenarios to simulate real-world conditions. Experimental results demonstrate that the SKGE-Swin architecture achieves a superior Driving Score compared to previous methods. Furthermore, an ablation study will be conducted to evaluate the contribution of each architectural component, including the influence of skip connections and the use of the Swin Transformer, in improving model performance.
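To illustrate how shifted windows let attention reach across window boundaries, here is a minimal sketch of the partitioning step used by the standard Swin Transformer. It is not the authors' code: it uses plain Python lists in place of tensors, and it omits the attention computation and the mask that Swin applies to wrapped-around regions. The helper names `cyclic_shift` and `window_partition` are illustrative.

```python
from typing import List

Grid = List[List[int]]


def cyclic_shift(grid: Grid, shift: int) -> Grid:
    """Roll rows and columns toward the top-left by `shift`, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid: Grid, window: int) -> List[List[int]]:
    """Split an H x W grid into non-overlapping window x window blocks.

    Each block is returned flattened in row-major order.
    """
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    blocks = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(r0, r0 + window)
                           for c in range(c0, c0 + window)])
    return blocks


# A 4 x 4 feature map whose cells are labelled by their flat index.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular windows (W-MSA): attention is confined to each 2 x 2 block.
regular = window_partition(grid, 2)

# Shifted windows (SW-MSA): shift by window // 2 before partitioning.
shifted = window_partition(cyclic_shift(grid, 1), 2)
```

In this toy case the first shifted window holds cells 5, 6, 9, and 10, which belong to four different regular windows. Alternating the two partitions across layers is how information propagates between windows, and over several layers between distant pixels.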

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20762/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20762/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/2508.20762/full.md

---
Source: https://tomesphere.com/paper/2508.20762