audio2chart: End to End Audio Transcription into playable Guitar Hero charts
Riccardo Tripodi

TL;DR
audio2chart is a novel framework that automatically generates Guitar Hero charts from raw audio, using sequence prediction models that are improved by audio conditioning, with publicly available code and pretrained models.
Contribution
This work introduces a new end-to-end neural approach for automatic chart generation directly from audio, demonstrating the effectiveness of audio conditioning and providing reproducible resources.
Findings
Unconditional baseline achieves strong performance.
Audio conditioning improves note prediction accuracy.
Public release of code and pretrained models supports reproducibility.
Abstract
This work introduces audio2chart, a framework for the automatic generation of Guitar Hero-style charts directly from raw audio. The task is formalized as a sequence prediction problem, where models are trained to generate discrete chart tokens aligned with the audio at discrete time steps. An unconditional baseline demonstrates strong predictive performance, while the addition of audio conditioning yields consistent improvements across accuracy-based metrics. This work demonstrates that incorporating audio conditioning is both feasible and effective for improving note prediction in automatic chart generation. The complete codebase for training and inference is publicly available on GitHub, supporting reproducible research on neural chart generation. A family of pretrained models is released on Hugging Face.
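To make the "discrete chart tokens aligned with the audio at discrete time steps" formulation concrete, the following is a minimal sketch of one way such a target sequence could be built. It is not the paper's tokenizer: the five-lane bitmask vocabulary, the `EMPTY` token, the step size, and the function names `chord_to_token` and `tokenize` are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

N_FRETS = 5   # assumption: five fret lanes, as in a standard Guitar Hero chart
EMPTY = 0     # hypothetical token id meaning "no note at this time step"


def chord_to_token(frets: Iterable[int]) -> int:
    """Encode a set of fret indices (0..4) as a bitmask token in 1..31."""
    mask = 0
    for fret in frets:
        if not 0 <= fret < N_FRETS:
            raise ValueError(f"fret index out of range: {fret}")
        mask |= 1 << fret
    return mask


def tokenize(
    notes: List[Tuple[float, Set[int]]],
    duration_s: float,
    step_s: float,
) -> List[int]:
    """Quantize (onset_seconds, frets) events onto a fixed time grid.

    Returns one token per time step; steps without a note hold EMPTY.
    Notes that land on the same step are merged into one chord token.
    """
    n_steps = int(round(duration_s / step_s))
    tokens = [EMPTY] * n_steps
    for onset_s, frets in notes:
        idx = int(round(onset_s / step_s))
        if 0 <= idx < n_steps:
            tokens[idx] |= chord_to_token(frets)
    return tokens


if __name__ == "__main__":
    # Three events on a 50 ms grid (the grid size is an illustrative choice).
    events = [(0.0, {0}), (0.1, {1, 2}), (0.26, {4})]
    print(tokenize(events, duration_s=0.4, step_s=0.05))
    # prints [1, 0, 6, 0, 0, 16, 0, 0]
```

Under this framing, an unconditional model would predict each token from the preceding tokens alone, while an audio-conditioned model would additionally receive audio features for the same time steps.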
Code & Models
- 🤗 3podi/charter-v1.0-20-M-best-acc (model)
- 🤗 3podi/charter-v1.0-20-M-best-acc-nonpad (model)
- 🤗 3podi/charter-v1.0-40-M-best-acc-nonpad (model) · 47 downloads
- 🤗 3podi/charter-v1.0-40-M-best-acc (model) · 229 downloads
- 🤗 3podi/charter-v1.0-20-S-best-acc (model)
- 🤗 3podi/charter-v1.0-20-S-best-acc-nonpad (model)
- 🤗 3podi/charter-v1.0-40-S-best-acc (model)
- 🤗 3podi/charter-v1.0-40-S-best-acc-nonpad (model)
Taxonomy
Topics: Music and Audio Processing · Speech Recognition and Synthesis · Machine Learning and Data Classification
