Meta-Adaptive Beam Search Planning for Transformer-Based Reinforcement Learning Control of UAVs with Overhead Manipulators under Flight Disturbances

Hazim Alzorgan; Sayed Pedram Haeri Boroujeni; Abolfazl Razi

arXiv:2603.26612·cs.RO·March 30, 2026

TL;DR

This paper introduces a reinforcement learning framework in which a learned transformer-based critic guides an adaptive short-horizon beam-search planner for UAVs with overhead manipulators, improving end-effector tracking accuracy and stability under flight disturbances.

Contribution

It develops a meta-adaptive planner that combines a transformer critic with beam search to improve control of UAV-mounted manipulators under flight disturbances.

Findings

01  10.2% reward increase over baseline

02  Reduced mean tracking error from 6% to 3%

03  29.6% improvement in combined reward-error metric

Abstract

Drones equipped with overhead manipulators offer unique capabilities for inspection, maintenance, and contact-based interaction. However, the motion of the drone and its manipulator is tightly coupled, and even small attitude changes caused by wind or control imperfections shift the end-effector away from its intended path. This coupling makes reliable tracking difficult and limits the direct use of learning-based arm controllers originally designed for fixed-base robots. These effects appear consistently in our tests whenever the UAV body experiences drift or rapid attitude corrections. To address this behavior, we develop a reinforcement-learning (RL) framework built on transformer-based double deep Q-learning (DDQN). Its core idea is an adaptive beam-search planner that applies a short-horizon beam search over candidate control sequences using the learned…
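
The short-horizon beam search over candidate control sequences mentioned in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `beam_search_plan`, the one-dimensional toy state, the stand-in critic `toy_q`, the one-step model `toy_step`, and the choice to score a sequence by summing critic values. The paper's transformer critic and its meta-adaptive adjustment of the search are not reproduced here; the beam width and horizon are fixed.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]


def beam_search_plan(
    state: State,
    actions: Sequence[int],
    q_value: Callable[[State, int], float],
    step: Callable[[State, int], State],
    horizon: int = 3,
    beam_width: int = 2,
) -> List[int]:
    """Return the action sequence with the highest summed critic score."""
    # Each beam entry holds (cumulative score, action sequence, predicted state).
    beams = [(0.0, [], state)]
    for _ in range(horizon):
        candidates = []
        for score, seq, s in beams:
            for a in actions:
                # Extend every surviving sequence by every discrete action.
                candidates.append((score + q_value(s, a), seq + [a], step(s, a)))
        # Keep only the beam_width best partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][1]


# Hypothetical toy problem: the state is a scalar tracking error and each
# action shifts it by -1, 0, or +1.
def toy_step(s: State, a: int) -> State:
    return (s[0] + a,)


# Hypothetical stand-in for a learned critic: prefers a small next-step error.
def toy_q(s: State, a: int) -> float:
    return -abs(s[0] + a)


plan = beam_search_plan((3.0,), actions=[-1, 0, 1], q_value=toy_q, step=toy_step)
print(plan)
```

In a receding-horizon setting, only the first action of the returned sequence would typically be executed before replanning from the newly observed state; whether the paper does this is not stated in the excerpt above.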
