Learn for Variation: Variationally Guided AAV Trajectory Learning in Differentiable Environments

Xiucheng Wang; Zhenye Chen; Nan Cheng

arXiv:2603.18853 · eess.SY · March 26, 2026

TL;DR

This paper introduces L4V, a gradient-based trajectory learning framework for autonomous aerial vehicles that improves training stability and performance by replacing sparse rewards with dense, analytically derived policy gradients in differentiable environments.

Contribution

L4V is a novel framework that uses structured policy gradients and differentiable environment modeling to enhance AAV trajectory learning, addressing reward sparsity and training instability issues.

Findings

01. L4V outperforms baseline algorithms in mission completion time.

02. L4V achieves higher average transmission rates.

03. L4V demonstrates more efficient training with lower costs.

Abstract

Autonomous aerial vehicles (AAVs) empower sixth-generation (6G) Internet-of-Things (IoT) networks through mobility-driven data collection. However, conventional reward-driven reinforcement learning for AAV trajectory planning suffers from severe credit assignment issues and training instability, because sparse scalar rewards fail to capture the long-term and nonlinear effects of sequential movements. To address these challenges, this paper proposes Learn for Variation (L4V), a gradient-informed trajectory learning framework that replaces high-variance scalar reward signals with dense and analytically grounded policy gradients. Specifically, the coupled evolution of AAV kinematics, distance-dependent channel gains, and per-user data-collection progress is first unrolled into an end-to-end differentiable computational graph. Backpropagation through time then serves as a discrete adjoint…
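
The unrolling and backward sweep described in the abstract can be illustrated with a toy sketch. This is not the authors' L4V implementation: it optimizes open-loop velocities rather than a learned policy, omits speed limits and per-user demand, and every constant, user position, and function name (`rate`, `rollout`, `collected_data`, `adjoint_gradient`, `gradient_ascent`) is a hypothetical choice made for illustration. It assumes single-integrator kinematics q_{t+1} = q_t + DT * v_t and a free-space rate log2(1 + SNR0 / (H^2 + ||q - u||^2)), and it writes the backward pass by hand so that the costate recursion is visible.

```python
import math

# All constants below are assumed values for illustration only.
DT = 1.0          # slot length in seconds
H2 = 100.0 ** 2   # squared flight altitude in m^2
SNR0 = 1.0e6      # reference SNR at unit distance
USERS = [(200.0, 50.0), (-100.0, 300.0)]  # hypothetical ground-user positions


def rate(q, u):
    """Spectral efficiency log2(1 + SNR0 / (H^2 + ||q - u||^2))."""
    d2 = (q[0] - u[0]) ** 2 + (q[1] - u[1]) ** 2
    return math.log2(1.0 + SNR0 / (H2 + d2))


def rate_grad(q, u):
    """Analytic gradient of rate(q, u) with respect to the AAV position q."""
    dx, dy = q[0] - u[0], q[1] - u[1]
    s = H2 + dx * dx + dy * dy
    coeff = -2.0 * SNR0 / (math.log(2.0) * s * (s + SNR0))
    return (coeff * dx, coeff * dy)


def rollout(q0, controls):
    """Unroll q_{t+1} = q_t + DT * v_t and return the positions q_1..q_T."""
    traj, q = [], q0
    for v in controls:
        q = (q[0] + DT * v[0], q[1] + DT * v[1])
        traj.append(q)
    return traj


def collected_data(q0, controls):
    """Total data collected from all users along the unrolled trajectory."""
    return sum(DT * rate(q, u) for q in rollout(q0, controls) for u in USERS)


def adjoint_gradient(q0, controls):
    """Backward sweep: gradient of collected_data with respect to every v_t."""
    traj = rollout(q0, controls)
    lam = (0.0, 0.0)  # costate: sensitivity of remaining data to the position
    grads = [None] * len(controls)
    for t in reversed(range(len(controls))):
        # traj[t] is the position reached after applying controls[t].
        gx = sum(rate_grad(traj[t], u)[0] for u in USERS)
        gy = sum(rate_grad(traj[t], u)[1] for u in USERS)
        lam = (lam[0] + DT * gx, lam[1] + DT * gy)
        grads[t] = (DT * lam[0], DT * lam[1])
    return grads


def gradient_ascent(q0, controls, step=2.0, iters=100):
    """Plain gradient ascent on the velocities (no speed constraint)."""
    for _ in range(iters):
        grads = adjoint_gradient(q0, controls)
        controls = [(v[0] + step * g[0], v[1] + step * g[1])
                    for v, g in zip(controls, grads)]
    return controls
```

Every velocity receives a gradient that already accounts for its effect on all later positions, which is the dense signal the abstract contrasts with sparse scalar rewards. Calling `gradient_ascent((0.0, 0.0), [(0.0, 0.0)] * 20)` starts from a hovering AAV and returns velocities that move it toward the users; a framework such as the one described would instead backpropagate these sensitivities into policy parameters.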

Taxonomy

Topics: UAV Applications and Optimization · Age of Information Optimization · Advanced Wireless Communication Technologies