OtoWorld: Towards Learning to Separate by Learning to Move

Omkar Ranadive; Grant Gasser; David Terpay; Prem Seetharaman

arXiv:2007.06123 · cs.SD · July 14, 2020 · 1 citation

TL;DR

OtoWorld is an interactive environment designed to advance reinforcement learning research in computer audition, where agents learn to navigate and identify sound sources using auditory cues alone.

Contribution

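It introduces an open-source environment for training agents on auditory navigation tasks. The environment combines OpenAI Gym for agent interaction, PyRoomAcoustics for acoustic simulation, and nussl for deep computer audition models, in a setting where the agent's only input is the current sound of the room.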

Findings

01. Preliminary results show agents can learn to navigate towards sound sources.

02. OtoWorld demonstrates the feasibility of auditory-based navigation tasks.

03. The environment is easily extendable for more complex auditory navigation research.

Abstract

We present OtoWorld, an interactive environment in which agents must learn to listen in order to solve navigational tasks. The purpose of OtoWorld is to facilitate reinforcement learning research in computer audition, where agents must learn to listen to the world around them to navigate. OtoWorld is built on three open source libraries: OpenAI Gym for environment and agent interaction, PyRoomAcoustics for ray-tracing and acoustics simulation, and nussl for training deep computer audition models. OtoWorld is the audio analogue of GridWorld, a simple navigation game. OtoWorld can be easily extended to more complex environments and games. To solve one episode of OtoWorld, an agent must move towards each sounding source in the auditory scene and "turn it off". The agent receives no other input than the current sound of the room. The sources are placed randomly within the room and can vary…
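The episode structure described in the abstract (random source placement, an audio-only observation, and sources that are "turned off" when the agent reaches them) can be illustrated with a small toy environment. This is a minimal sketch under stated assumptions, not OtoWorld's actual API: the class `ToyAudioRoom`, its parameters, the four-direction action set, the +1 reward per silenced source, and the inverse-square "loudness" observation are all hypothetical. The two-value observation is a crude stand-in for the PyRoomAcoustics simulation, and the `reset`/`step` methods only mimic the OpenAI Gym convention without importing it.

```python
import math
import random


class ToyAudioRoom:
    """Hypothetical stand-in for an OtoWorld-style episode (not the real API)."""

    # Assumed action set: 0 = up, 1 = down, 2 = left, 3 = right.
    ACTIONS = {0: (0.0, 1.0), 1: (0.0, -1.0), 2: (-1.0, 0.0), 3: (1.0, 0.0)}

    def __init__(self, size=10.0, n_sources=2, reach=1.0, seed=0):
        self.size = size            # side length of the square room
        self.n_sources = n_sources  # number of sounding sources per episode
        self.reach = reach          # distance at which a source is "turned off"
        self.rng = random.Random(seed)

    def reset(self):
        # Agent starts in the centre; sources are placed uniformly at random.
        self.agent = [self.size / 2, self.size / 2]
        self.sources = [
            [self.rng.uniform(0, self.size), self.rng.uniform(0, self.size)]
            for _ in range(self.n_sources)
        ]
        return self._observe()

    def _observe(self):
        # Two "ears" offset left and right of the agent. Each hears the sum of
        # 1 / (1 + d^2) over active sources (assumed toy loudness model).
        obs = []
        for dx in (-0.5, 0.5):
            ear = (self.agent[0] + dx, self.agent[1])
            obs.append(sum(1.0 / (1.0 + math.dist(ear, s) ** 2)
                           for s in self.sources))
        return obs

    def step(self, action):
        # Move one unit, clipped to the room boundaries.
        dx, dy = self.ACTIONS[action]
        self.agent[0] = min(max(self.agent[0] + dx, 0.0), self.size)
        self.agent[1] = min(max(self.agent[1] + dy, 0.0), self.size)
        # Any source within reach is silenced; assumed reward is +1 per source.
        remaining = [s for s in self.sources
                     if math.dist(self.agent, s) > self.reach]
        reward = float(len(self.sources) - len(remaining))
        self.sources = remaining
        done = not self.sources     # episode ends when every source is off
        return self._observe(), reward, done, {}


def run_random_episode(env, max_steps=5000, seed=0):
    """Roll out a uniformly random policy; return the total reward collected."""
    rng = random.Random(seed)
    env.reset()
    total = 0.0
    for _ in range(max_steps):
        _, reward, done, _ = env.step(rng.randrange(4))
        total += reward
        if done:
            break
    return total
```

A learning agent would replace the random policy in `run_random_episode` with one that maps the audio observation to an action. In this toy version the left/right loudness difference already hints at the direction of the nearest source, which is the kind of cue the agent has to discover on its own.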

Code & Models

Repositories

pseeth/otoworld (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing