# PPO-Based Reinforcement Learning Control of a Flapping-Wing Robot with a Bio-Inspired Sensing and Actuation Feather Unit

**Authors:** Saddam Hussain, Mohammed Messaoudi, Muhammad Imran, Diyin Tang

PMC · DOI: 10.3390/s26031009 · Sensors (Basel, Switzerland) · 2026-02-04

## TL;DR

This paper introduces a bio-inspired sensing and actuation feather unit (SAFU) for flapping-wing robots. A PPO-based reinforcement learning policy controls the unit so that, in simulation, it adapts to airflow disturbances and improves stability.

## Contribution

The paper introduces a novel bio-inspired sensing and actuation feather unit for flapping-wing robots, paired with a PPO-based reinforcement learning controller trained on a reduced-order bond-graph model of the coupled wing and feather dynamics.

## Key findings

- In simulation, the PPO-driven SAFU achieves fast, well-damped responses, with rise times below 0.5 s and settling times under 1.4 s.
- The system alleviates airflow-induced disturbance effects by up to 50% and holds near-zero steady-state error across varying gust conditions.
- The approach enables autonomous flow adaptation and improved dynamic stability without predefined control laws.
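
The rise-time and settling-time figures above are standard step-response metrics. The sketch below shows one common way to compute them, assuming a 10% to 90% rise-time definition and a 2% settling band; this summary does not state which definitions the paper uses. The function name `step_metrics` and the second-order test signal are hypothetical and are not data from the paper.

```python
import numpy as np

def step_metrics(t, y, target, band=0.02):
    """Return (rise_time, settling_time, steady_state_error) for a step response.

    Assumes target > 0 and that y actually reaches 90% of target.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Rise time: first crossing of 10% of target to first crossing of 90%.
    t10 = t[np.argmax(y >= 0.1 * target)]
    t90 = t[np.argmax(y >= 0.9 * target)]
    rise_time = t90 - t10

    # Settling time: first sample after the last excursion outside the band.
    outside = np.abs(y - target) > band * abs(target)
    if outside.any():
        last = np.nonzero(outside)[0][-1]
        settling_time = t[min(last + 1, len(t) - 1)]
    else:
        settling_time = t[0]

    steady_state_error = target - y[-1]
    return rise_time, settling_time, steady_state_error

# Hypothetical under-damped second-order response (illustrative only).
t = np.linspace(0.0, 5.0, 5001)
zeta, wn = 0.7, 6.0
wd = wn * np.sqrt(1 - zeta**2)
y = 1 - np.exp(-zeta * wn * t) * (
    np.cos(wd * t) + zeta / np.sqrt(1 - zeta**2) * np.sin(wd * t)
)

rise, settle, sse = step_metrics(t, y, target=1.0)
print(f"rise={rise:.3f} s, settling={settle:.3f} s, steady-state error={sse:.2e}")
```

The same function could be applied to a logged feather-displacement trace to check it against the thresholds reported here.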

## Abstract

Bio-inspired flow-sensing and actuation mechanisms offer a promising path for enhancing the stability of flapping-wing flying robots (FWFRs) operating in dynamic and noisy environments. This study introduces a bio-inspired sensing and actuation feather unit (SAFU) that mimics the covert feathers of falcons and serves simultaneously as a distributed flow sensor and an adaptive actuation element. Each electromechanical feather (EF) passively detects airflow disturbances through deflection and actively modulates its flaps through an embedded actuator, enabling real-time aerodynamic adaptation. A reduced-order bond-graph model capturing the coupled aero-electromechanical dynamics of the FWFR wing and SAFU is developed to provide a physics-based training environment for a proximal policy optimization (PPO)-based reinforcement learning controller. Through closed-loop interaction with this environment, the PPO policy autonomously learns control actions that regulate feather displacement, reduce airflow-induced loads, and improve dynamic stability without predefined control laws. Simulation results show that the PPO-driven SAFU achieves fast, well-damped responses with rise times below 0.5 s, settling times under 1.4 s, near-zero steady-state error across varying gust conditions, and up to 50% alleviation of airflow-induced disturbance effects. Overall, this work highlights the potential of bio-inspired sensing-actuation architectures, combined with reinforcement learning, as a solution for future flapping-wing drone designs, enabling enhanced resilience, autonomous flow adaptation, and intelligent aerodynamic control during operation in gusts.
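
For readers unfamiliar with PPO, the core of the method is a clipped surrogate objective that limits how far each policy update can move from the policy that collected the data. The sketch below is a generic NumPy illustration of that loss, not the authors' implementation. The function name `ppo_clip_loss` is hypothetical, and the clip range of 0.2 is a commonly used default rather than a value taken from this paper. The reward, network architecture, and hyperparameters used for the SAFU controller are not given in this summary.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate loss of PPO, returned as a value to minimize.

    logp_new:   log-probabilities of the taken actions under the current policy
    logp_old:   log-probabilities under the policy that collected the data
    advantages: advantage estimates for those actions
    """
    logp_new = np.asarray(logp_new, dtype=float)
    logp_old = np.asarray(logp_old, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the current and the data-collecting policy.
    ratio = np.exp(logp_new - logp_old)

    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages

    # Taking the elementwise minimum gives a pessimistic bound on improvement.
    return -np.mean(np.minimum(unclipped, clipped))

# Illustrative values only: the ratio of 2 is clipped to 1.2 for a positive advantage.
loss = ppo_clip_loss(logp_new=[np.log(2.0)], logp_old=[0.0], advantages=[1.0])
print(loss)
```

In a setup like the one the abstract describes, the advantages would come from rollouts in the bond-graph simulation environment, with actions corresponding to feather actuation commands.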

## Full-text entities

- **Species:** Falco (falcons, genus) [taxon 8952]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12899858/full.md

## Figures

22 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12899858/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/PMC12899858/full.md

---
Source: https://tomesphere.com/paper/PMC12899858