# Vision-based Robotic Arm Imitation by Human Gesture

**Authors:** Cheng Xuan, Zhiqiang Tang, Jinxin Xu

arXiv: 1703.04906 · 2018-10-05

## TL;DR

This paper presents a vision-based method for robotic arm imitation built on deep reinforcement learning. The goal is for a robot to learn complex tasks by observing human gestures through a monocular camera.

## Contribution

The paper combines a modified Deep Deterministic Policy Gradient (DDPG) algorithm with a visual imitation network. The method learns from monocular camera frames and does not reconstruct a 3D environment.

## Key findings

- Expected result during training: the robot moves almost the same way the human hand does
- The method takes frames from a single monocular camera as its only input
- No 3D environment is constructed and no actual points are generated

## Abstract

One of the most efficient ways for a learning-based robotic arm to learn to handle complex tasks as humans do is to observe how humans complete those tasks and then imitate them. Our idea builds on the success of the Deep Q-Learning (DQN) algorithm in reinforcement learning and extends it to the Deep Deterministic Policy Gradient (DDPG) algorithm. We developed a learning-based method that combines a modified DDPG with a visual imitation network. Our approach acquires frames only from a monocular camera and needs neither a constructed 3D environment nor generated actual points. The result we expected during training was that the robot would move almost the same way human hands did.
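
For readers unfamiliar with DDPG, the two pieces it inherits from DQN (a bootstrapped critic target and slowly updated target networks) can be sketched as follows. This is a minimal illustration of standard DDPG only, not the authors' modified variant or their visual imitation network, neither of which this summary describes. The linear `actor` and `critic`, the parameter dictionaries, and the values of `gamma` and `tau` are all assumptions chosen for brevity.

```python
import numpy as np

# Hypothetical linear actor and critic, used only to keep the sketch short.
# A real DDPG agent would use neural networks for both.

def actor(params, state):
    """Deterministic policy mu(s): a linear map squashed to [-1, 1]."""
    return np.tanh(state @ params["W"])

def critic(params, state, action):
    """Q(s, a) as a linear function of the concatenated state and action."""
    return np.concatenate([state, action], axis=-1) @ params["w"]

def td_target(reward, done, next_state, target_actor, target_critic, gamma=0.99):
    """Critic target y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    next_action = actor(target_actor, next_state)
    next_q = critic(target_critic, next_state, next_action)
    return reward + gamma * (1.0 - done) * next_q

def soft_update(target, online, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return {k: tau * online[k] + (1.0 - tau) * target[k] for k in online}
```

In this sketch the critic would be regressed toward `td_target(...)` on a sampled batch, and `soft_update` would be applied to both target parameter sets after each gradient step. The `done` mask removes the bootstrap term on terminal transitions.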

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.04906/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1703.04906/full.md

---
Source: https://tomesphere.com/paper/1703.04906