# Cloud-Edge Resource Scheduling and Offloading Optimization Based on Deep Reinforcement Learning

**Authors:** Lili Yin, Yunze Xie, Ze Zhao, Jie Gao

PMC · DOI: 10.3390/s26051704 · Sensors (Basel, Switzerland) · 2026-03-08

## TL;DR

This paper introduces a deep reinforcement learning algorithm to optimize task offloading in smart manufacturing environments, reducing latency and task dropouts.

## Contribution

A distributed deep reinforcement learning algorithm that combines a CNN, the Informer architecture, and Dueling and Double DQN for cloud-edge task offloading and scheduling under unknown, time-varying edge node loads.

## Key findings

- In simulation, the proposed algorithm reduces task dropout rates by 82.3–94% compared with the existing algorithms it was tested against.
- In the same simulations, it reduces average latency by 28–39.2%.
- A CNN extracts local features of edge node loads and Informer's self-attention captures long-term load trends, which lets each device make offloading decisions without knowing node loads in advance.

## Abstract

In the context of smart manufacturing, with the widespread deployment of Industrial Internet of Things (IoT) devices, a large number of computation tasks that are highly sensitive to latency and have strict deadlines have emerged, requiring real-time processing. Effectively offloading tasks to address the issues of increased latency and task dropouts caused by dynamic changes in edge node load has become a key challenge in the cloud–edge–end collaborative environment of smart manufacturing. To tackle the complex issues of unknown edge node loads and dynamic system state changes, this paper proposes a distributed algorithm based on deep reinforcement learning, utilizing convolutional neural networks (CNN) and the Informer architecture. The proposed algorithm leverages CNN to extract local features of edge node loads while utilizing Informer’s self-attention mechanism to capture long-term load variation trends, thereby effectively handling the uncertainty and dynamics inherent in node loads. Furthermore, by integrating the Dueling Deep Q-Network (DQN) and Double DQN techniques, the algorithm achieves a precise approximation of the state–action value function, further enhancing its capability to perceive system temporal characteristics and adapt to heterogeneous tasks. Each mobile device can independently make task offloading decisions and scheduling strategies based on its observations, enabling dynamic task allocation and optimization of execution order. Simulation results show that, compared to various existing algorithms, the proposed method reduces task dropout rates by 82.3–94% and average latency by 28–39.2%. Experimental results validate the significant advantages of this method in intelligent manufacturing scenarios with high load and latency-sensitive tasks.
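The abstract names Dueling DQN and Double DQN without giving formulas in this summary view. The following is a minimal sketch of the two ideas in their commonly used form, not the authors' implementation: the mean-subtracted dueling aggregation, the function names, and the toy numbers are all assumptions for illustration. In the paper, the value and advantage estimates would come from the CNN and Informer feature extractor; here they are plain lists.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine a state value V(s) and per-action advantages A(s, a) into Q(s, a).

    Assumed aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: List[float],
    next_q_target: List[float],
) -> float:
    """Double DQN target: the online network picks the next action,
    the target network evaluates it."""
    if done:
        # No bootstrapping once the episode (e.g. the task) has ended.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical example: three offloading choices for one task.
    q = dueling_q_values(state_value=1.0, advantages=[0.0, 2.0, 4.0])
    print(q)
    y = double_dqn_target(
        reward=1.0,
        gamma=0.5,
        done=False,
        next_q_online=[0.2, 0.9, 0.1],
        next_q_target=[5.0, 2.0, 7.0],
    )
    print(y)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from the plain DQN target, which would take the maximum of `next_q_target` directly and tends to overestimate values.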

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12986750/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12986750/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12986750/full.md

---
Source: https://tomesphere.com/paper/PMC12986750