# Artificial Intelligence for Prosthetics - challenge solutions

**Authors:** Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, Aleksei Shpilman, Ivan Sosin, Oleg Svidchenko, Aleksandra Malysheva, Daniel Kudenko, Lance Rane, Aditya Bhatt, Zhengfei Wang, Penghui Qi, Zeyang Yu, Peng Peng, Quan Yuan, Wenxin Li, Yunsheng Tian, Ruihan Yang, Pingchuan Ma, Shauharda Khadka, Somdeb Majumdar, Zach Dwiel, Yinyin Liu, Evren Tumer, Jeremy Watson, Marcel Salathé, Sergey Levine, Scott Delp

arXiv: 1902.02441 · 2019-02-08

## TL;DR

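This paper describes the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, in which participants built a controller for a musculoskeletal model that had to match a given time-varying velocity vector. It presents thirteen deep reinforcement learning solutions from top participants and the heuristics they share.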

## Contribution

It describes the challenge and collects thirteen solutions based on deep reinforcement learning, showing both the heuristics that recur across teams and the ways each team modified known algorithms.

## Key findings

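- Many solutions used similar relaxations and heuristics: reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending.
- Each team modified known algorithms differently, for example by dividing the task into subtasks, learning low-level control, or incorporating expert knowledge through imitation learning.
- All thirteen solutions presented rely on deep reinforcement learning.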
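
As a rough illustration of what discretizing the action space can mean, the sketch below snaps each component of a continuous action to the nearest of a few evenly spaced levels. It is not taken from any team's solution: the function name `discretize_action`, the `levels` parameter, and the assumption that each action component lies in `[0, 1]` are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def discretize_action(action: Sequence[float], levels: int = 2) -> List[float]:
    """Snap each action component to the nearest of `levels` evenly spaced values.

    Assumption (not from the paper): every component is meant to lie in [0, 1];
    values outside that range are clipped first.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 1.0 / (levels - 1)  # spacing between neighbouring levels
    snapped = []
    for a in action:
        clipped = min(1.0, max(0.0, a))  # keep the value inside [0, 1]
        snapped.append(round(clipped / step) * step)
    return snapped


if __name__ == "__main__":
    # With two levels, every component becomes either fully off or fully on.
    print(discretize_action([0.1, 0.9, 1.4, -0.2]))
```

With `levels=2` the policy only has to choose between two values per component, which shrinks the search space at the cost of finer control.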

## Abstract

In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector. Top participants were invited to describe their algorithms. In this work, we describe the challenge and present thirteen solutions that used deep reinforcement learning approaches. Many solutions use similar relaxations and heuristics, such as reward shaping, frame skipping, discretization of the action space, symmetry, and policy blending. However, each team implemented different modifications of the known algorithms by, for example, dividing the task into subtasks, learning low-level control, or by incorporating expert knowledge and using imitation learning.
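
To make frame skipping concrete, here is a minimal sketch under stated assumptions. The `ToyEnv` class is a hypothetical one-dimensional stand-in for the challenge simulator, not its real interface; its squared-error velocity reward and the `FrameSkip` wrapper are illustrative choices, not the organizers' or any team's implementation.

```python
from typing import Tuple


class ToyEnv:
    """Hypothetical stand-in environment whose state is a single velocity."""

    def __init__(self, target_velocity: float = 1.0, horizon: int = 10) -> None:
        self.target_velocity = target_velocity
        self.horizon = horizon
        self.velocity = 0.0
        self.t = 0

    def reset(self) -> float:
        self.velocity = 0.0
        self.t = 0
        return self.velocity

    def step(self, action: float) -> Tuple[float, float, bool]:
        self.velocity += 0.25 * action  # toy dynamics, assumed for illustration
        self.t += 1
        # Penalize the squared gap to the target velocity (assumed reward form).
        reward = -(self.velocity - self.target_velocity) ** 2
        done = self.t >= self.horizon
        return self.velocity, reward, done


class FrameSkip:
    """Repeat each chosen action for `skip` simulator steps and sum the rewards."""

    def __init__(self, env: ToyEnv, skip: int = 4) -> None:
        if skip < 1:
            raise ValueError("skip must be at least 1")
        self.env = env
        self.skip = skip

    def reset(self) -> float:
        return self.env.reset()

    def step(self, action: float) -> Tuple[float, float, bool]:
        total_reward = 0.0
        obs, done = self.env.velocity, False
        for _ in range(self.skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:  # stop repeating once the episode has ended
                break
        return obs, total_reward, done


if __name__ == "__main__":
    env = FrameSkip(ToyEnv(), skip=4)
    env.reset()
    print(env.step(1.0))
```

The agent makes one decision per `skip` simulator steps, so each episode contains fewer decisions and each decision has a larger effect on the state.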

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.02441/full.md

## Figures

42 figures with captions in the complete paper: https://tomesphere.com/paper/1902.02441/full.md

## References

52 references — full list in the complete paper: https://tomesphere.com/paper/1902.02441/full.md

---
Source: https://tomesphere.com/paper/1902.02441