# D3PG-Light: A Lightweight and Stable Resource Scheduling Framework for UAV-Integrated Sensing, Communication, and Computation Systems

**Authors:** Qing Cheng, Wenwen Wu, Yebo Zhou

PMC · DOI: 10.3390/s26061829 · Sensors (Basel, Switzerland) · 2026-03-13

## TL;DR

This paper introduces D3PG-Light, a lightweight deep reinforcement learning framework for efficient resource scheduling in UAV-based sensing, communication, and computation systems.

## Contribution

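D3PG-Light is a stability-enhanced, lightweight refinement of D3PG for real-time UAV resource scheduling. It combines an adaptive gradient stabilization mechanism, Long Short-Term Memory (LSTM), and feature fusion to improve training stability under UAV hardware constraints.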

## Key findings

- D3PG-Light reduces 95th-percentile latency from over 100 ms to approximately 24 ms.
- The method converges faster and learns more stably than DDPG, TD3, and the original D3PG.
- D3PG-Light requires fewer than 50 k model parameters while maintaining high performance.
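
The latency finding above is stated as a 95th percentile rather than a mean, which matters because a few slow tasks can dominate the tail while barely moving the average. The following is a minimal sketch of a nearest-rank percentile, assuming per-task latencies are available as a plain list. The function name `percentile_nearest_rank` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def percentile_nearest_rank(samples, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least q% of samples at or below it.
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-task latencies in ms (illustrative only, not from the paper).
latencies_ms = [20.0] * 18 + [120.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile_nearest_rank(latencies_ms, 95)
print(f"mean = {mean_ms:.1f} ms, p95 = {p95_ms:.1f} ms")
```

In this made-up sample, two slow tasks out of twenty leave the mean at 30 ms while the 95th percentile sits at 120 ms, which is why tail latency is the more demanding metric for time-sensitive sensing data.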

## Abstract

Unmanned Aerial Vehicles (UAVs) are gradually emerging as key platforms for Integrated Sensing, Communication, and Computation (ISCC) systems in next-generation wireless networks. However, strict resource constraints and task coupling make static allocation inefficient in dynamic environments. This paper studies a UAV-driven ISCC system in which a single UAV dynamically allocates communication bandwidth, sensing resources, and computing power. Because sensing data in mission-critical applications is highly time-sensitive, minimizing the response time is paramount. To reduce system latency while maintaining sensing quality and energy efficiency, we propose D3PG-Light, a deployment-oriented and stability-enhanced refinement of the D3PG deep reinforcement learning framework, specifically tailored for real-time resource scheduling under UAV hardware constraints. D3PG-Light incorporates an adaptive gradient stabilization mechanism, Long Short-Term Memory (LSTM), and feature fusion to enhance training stability. Simulation results based on real air–ground channel measurements show that D3PG-Light converges faster and achieves more stable learning behavior than DDPG, TD3, and the original D3PG. In particular, the proposed method reduces the 95th-percentile latency from over 100 ms to approximately 24 ms, achieves higher converged reward values, and requires fewer than 50 k model parameters. These results demonstrate the effectiveness of D3PG-Light for latency-sensitive UAV-ISCC applications.
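
To make the "fewer than 50 k model parameters" figure concrete, the sketch below counts the trainable parameters of a small LSTM-based actor and checks them against that budget. The layer sizes (16 input features, 64 hidden units, 3 outputs for bandwidth, sensing, and computing allocations) and the helper names `lstm_params` and `dense_params` are assumptions chosen for illustration; the summary does not give the actual D3PG-Light architecture.

```python
def lstm_params(input_size, hidden_size, bias_vectors=1):
    """Parameter count of one LSTM layer.

    Each of the 4 gates has an input weight matrix, a recurrent weight
    matrix, and a bias. Some frameworks keep two bias vectors per gate,
    which can be modeled with bias_vectors=2.
    """
    per_gate = (hidden_size * input_size
                + hidden_size * hidden_size
                + bias_vectors * hidden_size)
    return 4 * per_gate


def dense_params(in_features, out_features):
    """Parameter count of a fully connected layer with bias."""
    return in_features * out_features + out_features


# Hypothetical actor: LSTM -> hidden dense layer -> 3 allocation outputs.
# These sizes are illustrative assumptions, not the paper's architecture.
STATE_DIM, HIDDEN, ACTIONS = 16, 64, 3
PARAM_BUDGET = 50_000  # "fewer than 50 k" from the abstract

total = (lstm_params(STATE_DIM, HIDDEN)
         + dense_params(HIDDEN, HIDDEN)
         + dense_params(HIDDEN, ACTIONS))
print(f"total parameters: {total} (within budget: {total < PARAM_BUDGET})")
```

Under these assumed sizes the LSTM layer accounts for most of the parameters, so the hidden width is the main lever for staying within a fixed budget.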

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13029996/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13029996/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC13029996/full.md

---
Source: https://tomesphere.com/paper/PMC13029996