The Bach Doodle: Approachable music composition with machine learning at scale

Cheng-Zhi Anna Huang; Curtis Hawthorne; Adam Roberts; Monica Dinculescu; James Wexler; Leon Hong; Jacob Howcroft

arXiv:1907.06637·cs.SD·July 17, 2019·42 cites

Open Access

TL;DR

The paper introduces the Bach Doodle, an accessible AI-powered music composition tool that allows users to create melodies and receive harmonizations in Bach's style, leveraging optimized machine learning models in the browser.

Contribution

It presents a scalable, real-time browser-based implementation of Coconet for music harmonization, with reduced latency and size, enabling millions of user interactions and data collection.
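The abstract credits part of the size and latency reduction to depth-wise separable convolutions. As a rough illustration of why that helps, the sketch below compares weight counts for a standard convolution and a separable one. The kernel size and channel counts are hypothetical values chosen for the example, not figures from the paper, and biases are ignored.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard 2D convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depth-wise separable convolution (biases ignored).

    The depth-wise step applies one kernel x kernel filter per input channel;
    the point-wise step mixes channels with a 1x1 convolution.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, for illustration only.
KERNEL, C_IN, C_OUT = 3, 128, 128
standard = standard_conv_params(KERNEL, C_IN, C_OUT)
separable = separable_conv_params(KERNEL, C_IN, C_OUT)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.1f}x")
```

Dilation widens the receptive field without adding weights, so it does not change these counts.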

Findings

01 Over 55 million queries received in three days

02 Users spent 350 years' worth of time on the doodle

03 Model runtime reduced from 40s to 2s in the browser

Abstract

To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model, Coconet (Huang et al., 2017), in the style of Bach. For users to input melodies, we designed a simplified sheet-music-based interface. To support an interactive experience at scale, we re-implemented Coconet in TensorFlow.js (Smilkov et al., 2019) to run in the browser and reduced its runtime from 40s to 2s by adopting dilated depth-wise separable convolutions and fusing operations. We also reduced the model download size to approximately 400KB through post-training weight quantization. We calibrated a speed test based on partial model evaluation time to determine if the harmonization request should be performed locally or sent to remote TPU servers. In three days, people spent 350 years…
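To make the post-training weight quantization step more concrete, here is a minimal sketch of linear 8-bit quantization in plain Python. It assumes a simple min/max affine scheme; the abstract does not specify the exact scheme or bit width used for the Doodle, and the function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_uint8(weights: List[float]) -> Tuple[List[int], float, float]:
    """Map float weights linearly onto the integers 0..255.

    Returns the quantized values plus the scale and offset needed to
    approximately recover the originals.
    """
    lo, hi = min(weights), max(weights)
    # Guard against a constant tensor, where the range would be zero.
    scale = (hi - lo) / 255 if hi > lo else 1.0
    quantized = [min(255, max(0, round((w - lo) / scale))) for w in weights]
    return quantized, scale, lo


def dequantize_uint8(quantized: List[int], scale: float, lo: float) -> List[float]:
    """Approximately recover float weights from their 8-bit codes."""
    return [q * scale + lo for q in quantized]


# Hypothetical weights, for illustration only.
weights = [-0.5, 0.0, 0.25, 1.0]
codes, scale, offset = quantize_uint8(weights)
restored = dequantize_uint8(codes, scale, offset)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step (`scale / 2`) per weight.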

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings