# LSWM: A Long–Short History World Model for Bipedal Locomotion via Reinforcement Learning

**Authors:** Jie Xue, Zhiyuan Liang, Haiming Mou, Qingdu Li, Jianwei Zhang

PMC · DOI: 10.3390/biomimetics11010040 · Biomimetics · 2026-01-05

## TL;DR

This paper introduces LSWM, a world model that improves bipedal robot locomotion in complex environments by combining reconstruction of privileged state information with prediction of future states.

## Contribution

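The LSWM framework pairs a state reconstruction module (SRM), which uses long-term history observations to reconstruct privileged information over the current short-term history, with a future state prediction module (SPM), which predicts the robot's future short-term privileged information, to improve bipedal locomotion in unstructured terrain.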

## Key findings

- LSWM achieved a 94% success rate in indoor stair-climbing tasks.
- In the same stair-climbing task, it outperformed current state-of-the-art baseline methods by at least 34%.

## Abstract

The presence of sensor noise, missing states and inadequate future prediction capabilities imposes significant limitations on the locomotion performance of bipedal robots operating in unstructured terrain. Conventional methods generally depend on long-term history observations to reconstruct single-frame privileged information. However, these methods fail to acknowledge the pivotal function of short-term history in rapid state responses and the significance of future state prediction in anticipating potential risks. The proposed framework is a Long–Short World Model (LSWM), which integrates state reconstruction and future state prediction to enhance the locomotion capabilities of bipedal robots in complex environments. The LSWM framework comprises two modules: a state reconstruction module (SRM) and a future state prediction module (SPM). The state reconstruction module employs long-term history observations to reconstruct privileged information in the current short-term history, thereby effectively improving the system’s robustness to sensor noise and enhancing state observability. The future state prediction module enhances the robot’s adaptability to complex environments and unpredictable scenarios by predicting the robot’s future short-term privileged information. We conducted extensive comparative experiments in simulation as well as in a variety of real-world indoor and outdoor environments. In the indoor stair-climbing task, LSWM achieved a 94% success rate, outperforming the current state-of-the-art baseline methods by at least 34%, thereby demonstrating its substantial performance advantages in complex and dynamic environments.
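The data flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `LongShortHistory`, `reconstruct_short_window`, and `predict_future_window` are hypothetical, the window lengths are arbitrary, and the two functions use a moving average and a linear extrapolation as stand-ins for the SRM and SPM. The summary does not specify the actual networks or what input the SPM consumes, so feeding it the reconstructed window is also an assumption.

```python
from collections import deque
from typing import Deque, List

Vector = List[float]


class LongShortHistory:
    """Rolling observation buffer exposing a long window and a short window.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, long_len: int, short_len: int) -> None:
        if not 0 < short_len <= long_len:
            raise ValueError("need 0 < short_len <= long_len")
        self.short_len = short_len
        # deque with maxlen drops the oldest observation automatically.
        self._buf: Deque[Vector] = deque(maxlen=long_len)

    def push(self, obs: Vector) -> None:
        self._buf.append(list(obs))

    def long(self) -> List[Vector]:
        """All buffered observations, oldest first."""
        return list(self._buf)

    def short(self) -> List[Vector]:
        """The most recent `short_len` observations, oldest first."""
        return list(self._buf)[-self.short_len:]


def reconstruct_short_window(
    long_hist: List[Vector], short_len: int, smooth: int = 3
) -> List[Vector]:
    """Stand-in for the SRM (assumption: a trailing moving average).

    Uses the long history to produce one estimate per frame of the
    current short-term window, rather than a single-frame estimate.
    """
    n = len(long_hist)
    out: List[Vector] = []
    for t in range(max(0, n - short_len), n):
        window = long_hist[max(0, t - smooth + 1): t + 1]
        dim = len(window[0])
        out.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return out


def predict_future_window(recon: List[Vector], horizon: int) -> List[Vector]:
    """Stand-in for the SPM (assumption: linear extrapolation).

    Extends the reconstructed short window `horizon` frames into the future.
    """
    last = recon[-1]
    prev = recon[-2] if len(recon) > 1 else recon[-1]
    step = [a - b for a, b in zip(last, prev)]
    return [[x + (k + 1) * s for x, s in zip(last, step)] for k in range(horizon)]
```

In a learned system both stand-ins would be replaced by trained networks, and the policy would presumably consume their outputs alongside the raw observations. The sketch only shows how one long buffer can serve a multi-frame reconstruction and a short-horizon prediction.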

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12838926/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12838926/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/PMC12838926/full.md

---
Source: https://tomesphere.com/paper/PMC12838926