# Collision-Free Robot Path Planning by Integrating DRL with Noise Layers and MPC

**Authors:** Xinzhan Hong, Qieshi Zhang, Yexing Yang, Tianqi Zhao, Zhenyu Xu, Tichao Wang, Jing Ji

PMC · DOI: 10.3390/s25206263 · Sensors (Basel, Switzerland) · 2025-10-10

## TL;DR

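This paper introduces a hybrid path planner for autonomous mobile robots: a Deep Reinforcement Learning (DRL) agent based on TD3 with a learnable noisy output layer proposes a short-horizon reference trajectory, and Model Predictive Control (MPC) refines it to avoid obstacles in dynamic environments.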

## Contribution

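The paper proposes a hybrid framework for collision-free robot path planning. A TD3 actor whose output is a learnable noisy linear layer generates stepwise control commands, and MPC refines the resulting short-horizon reference trajectory through constraint-aware optimization.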
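
The noise layer in this framework can be pictured with a small example. What follows is a minimal pure-Python sketch, not the authors' implementation: the abstract states only that the Actor's output is a noisy linear layer with learnable mean and scale parameters. The class name `NoisyLinear`, the use of independent Gaussian noise per weight, and the initial values (`sigma_init`, the uniform bound) are assumptions made for illustration.

```python
import random


class NoisyLinear:
    """Linear layer with effective weights mu + sigma * eps, eps ~ N(0, 1).

    Hypothetical illustration: mu and sigma would both be trainable
    parameters in a real actor network; here they are plain lists.
    """

    def __init__(self, in_dim, out_dim, sigma_init=0.5, seed=0):
        self.rng = random.Random(seed)
        bound = in_dim ** -0.5  # assumed initialisation range
        # Mean parameters.
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(in_dim)]
                     for _ in range(out_dim)]
        self.b_mu = [self.rng.uniform(-bound, bound) for _ in range(out_dim)]
        # Scale parameters (assumed constant initial value).
        self.w_sigma = [[sigma_init * bound] * in_dim for _ in range(out_dim)]
        self.b_sigma = [sigma_init * bound] * out_dim

    def forward(self, x, explore=True):
        """Return the layer output; noise is sampled only when explore=True."""
        out = []
        for i in range(len(self.b_mu)):
            y = self.b_mu[i]
            if explore:
                y += self.b_sigma[i] * self.rng.gauss(0.0, 1.0)
            for j, xj in enumerate(x):
                w = self.w_mu[i][j]
                if explore:
                    w += self.w_sigma[i][j] * self.rng.gauss(0.0, 1.0)
                y += w * xj
            out.append(y)
        return out


layer = NoisyLinear(in_dim=3, out_dim=2)
features = [1.0, -0.5, 0.2]
print("deterministic action:", layer.forward(features, explore=False))
print("exploratory action:  ", layer.forward(features, explore=True))
```

Because each effective weight is `mu + sigma * eps`, the output is differentiable in both `mu` and `sigma`, so a framework with automatic differentiation can train the noise scale together with the rest of the policy instead of relying on a hand-tuned noise schedule.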

## Key findings

- In experiments with varying obstacle dynamics and densities, the proposed method achieves a higher obstacle avoidance success rate than the baselines.
- Trajectory smoothness and path accuracy are significantly better than with traditional MPC.
- The hybrid approach also outperforms standalone DRL and other hybrid techniques on these metrics in complex scenarios.

## Abstract

With the rapid advancement of Autonomous Mobile Robots (AMRs) in industrial automation and intelligent logistics, achieving efficient and safe path planning in dynamic environments has become a critical challenge. These environments require robots to perceive complex scenarios and adapt their motion strategies accordingly, often under real-time constraints. Existing methods frequently struggle to balance efficiency, responsiveness, and safety, especially in the presence of continuously changing dynamic obstacles. While Model Predictive Control (MPC) and Deep Reinforcement Learning (DRL) have each shown promise in this domain, they also face limitations when applied individually, such as high computational demands or insufficient environmental exploration. To address these challenges, we propose a hybrid path planning framework that integrates an optimized DRL algorithm with MPC. We replace the Actor’s output with a learnable noisy linear layer whose mean and scale parameters are optimized jointly with the policy via backpropagation, thereby enhancing exploration while preserving training stability. TD3 produces stepwise control commands that evolve into a short-horizon reference trajectory, while MPC refines this trajectory through constraint-aware optimization to ensure timely obstacle avoidance. This complementary process combines TD3’s learning-based adaptability with MPC’s reliable local feasibility. Extensive experiments conducted in environments with varying obstacle dynamics and densities demonstrate that the proposed method significantly improves obstacle avoidance success rate, trajectory smoothness, and path accuracy compared to traditional MPC, standalone DRL, and other hybrid approaches, offering a robust and efficient solution for autonomous navigation in complex scenarios.
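
The division of labour described in the abstract (a learned policy proposes a short-horizon reference, then MPC refines it) can be sketched as below. This is a hedged illustration under several assumptions, not the paper's method: `policy` is a hand-written stand-in for the trained TD3 actor, the robot follows unicycle kinematics, obstacles are static points, and the refinement uses random-shooting MPC with a soft collision penalty in place of the constraint-aware optimizer used in the paper. All function names and constants are hypothetical.

```python
import math
import random

DT = 0.1  # control period in seconds (assumed)


def step(state, v, w):
    """Unicycle kinematics; state = (x, y, heading)."""
    x, y, th = state
    return (x + v * math.cos(th) * DT, y + v * math.sin(th) * DT, th + w * DT)


def policy(state, goal):
    """Stand-in for the trained actor: constant speed, steer toward the goal."""
    x, y, th = state
    err = math.atan2(goal[1] - y, goal[0] - x) - th
    err = math.atan2(math.sin(err), math.cos(err))  # wrap to [-pi, pi]
    return 0.5, max(-1.0, min(1.0, 2.0 * err))


def rollout(state, controls):
    """Simulate a control sequence and return the visited states."""
    traj = []
    for v, w in controls:
        state = step(state, v, w)
        traj.append(state)
    return traj


def reference(state, goal, horizon):
    """Query the policy step by step to build the reference controls and trajectory."""
    controls, s = [], state
    for _ in range(horizon):
        u = policy(s, goal)
        controls.append(u)
        s = step(s, *u)
    return controls, rollout(state, controls)


def cost(traj, ref, obstacles, safe_dist):
    """Tracking error plus a soft penalty for entering an obstacle's safety radius."""
    total = 0.0
    for (x, y, _), (rx, ry, _) in zip(traj, ref):
        total += (x - rx) ** 2 + (y - ry) ** 2
        for ox, oy in obstacles:
            d = math.hypot(x - ox, y - oy)
            if d < safe_dist:
                total += 1e3 * (safe_dist - d)
    return total


def clip(val, lo, hi):
    return max(lo, min(hi, val))


def refine(state, ref_controls, ref_traj, obstacles,
           safe_dist=0.4, samples=300, seed=0):
    """Random-shooting MPC: perturb the reference controls, keep the cheapest."""
    rng = random.Random(seed)
    best_u = list(ref_controls)
    best_c = cost(ref_traj, ref_traj, obstacles, safe_dist)
    for _ in range(samples):
        dv, dw = rng.gauss(0.0, 0.2), rng.gauss(0.0, 0.5)
        # Input bounds act as hard constraints on speed and turn rate.
        cand = [(clip(v + dv, 0.0, 1.0), clip(w + dw, -1.0, 1.0))
                for v, w in ref_controls]
        c = cost(rollout(state, cand), ref_traj, obstacles, safe_dist)
        if c < best_c:
            best_u, best_c = cand, c
    return best_u, best_c


start, goal = (0.0, 0.0, 0.0), (3.0, 0.0)
obstacles = [(1.0, 0.05)]  # placed on the straight-line reference path
ref_u, ref_traj = reference(start, goal, horizon=20)
best_u, best_c = refine(start, ref_u, ref_traj, obstacles)
print("reference cost:", cost(ref_traj, ref_traj, obstacles, 0.4))
print("refined cost:  ", best_c)
print("first control to apply:", best_u[0])
```

In a receding-horizon loop only the first refined control would be applied before the reference is regenerated from the new state, which is how the refinement could react to obstacles that move between steps.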

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12567521/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12567521/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC12567521/full.md

---
Source: https://tomesphere.com/paper/PMC12567521