ViolinDiff: Enhancing Expressive Violin Synthesis with Pitch Bend Conditioning

Daewoong Kim; Hao-Wen Dong; Dasaem Jeong

arXiv:2409.12477 · cs.SD · February 5, 2025

TL;DR

ViolinDiff introduces a diffusion-based framework that explicitly models pitch bend contours from MIDI data to produce more realistic and expressive violin sounds, addressing the challenge of polyphonic F0 contour synthesis.

Contribution

The paper presents a novel two-stage diffusion model that explicitly incorporates pitch bend information for expressive violin sound synthesis from MIDI files.

Findings

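01 Explicit pitch bend modeling makes the generated violin sounds more realistic.

02 Quantitative metrics show improved synthesis quality over the model without explicit pitch bend modeling.

03 Listening tests favor the proposed method over the baseline without explicit pitch bend modeling.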

Abstract

Modeling the natural contour of fundamental frequency (F0) plays a critical role in music audio synthesis. However, transcribing and managing multiple F0 contours in polyphonic music is challenging, and explicit F0 contour modeling has not yet been explored for polyphonic instrumental synthesis. In this paper, we present ViolinDiff, a two-stage diffusion-based synthesis framework. For a given violin MIDI file, the first stage estimates the F0 contour as pitch bend information, and the second stage generates a mel spectrogram incorporating these expressive details. The quantitative metrics and listening test results show that the proposed model generates more realistic violin sounds than the model without explicit pitch bend modeling. Audio samples are available online: daewoung.github.io/ViolinDiff-Demo.
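To make the link between pitch bend and F0 concrete, the sketch below converts a note's pitch bend values into an F0 contour in Hz. It assumes the standard MIDI convention (signed 14-bit bend values from -8192 to 8191 and a bend range of ±2 semitones) together with A4 = 440 Hz equal temperament. The paper's actual pitch bend representation, bend range, and frame rate are not stated here, and the function names are hypothetical, so this should be read as an illustration rather than the authors' implementation.

```python
# Illustrative only: standard MIDI pitch bend to F0 conversion.
# Bend range and function names are assumptions, not taken from the paper.

A4_HZ = 440.0    # reference tuning frequency
A4_MIDI = 69     # MIDI note number of A4


def bend_to_semitones(bend: int, bend_range: float = 2.0) -> float:
    """Map a signed 14-bit pitch bend value to an offset in semitones."""
    if not -8192 <= bend <= 8191:
        raise ValueError("pitch bend must lie in [-8192, 8191]")
    return bend / 8192.0 * bend_range


def note_to_hz(midi_note: float) -> float:
    """Convert a (possibly fractional) MIDI note number to frequency in Hz."""
    return A4_HZ * 2.0 ** ((midi_note - A4_MIDI) / 12.0)


def f0_contour(midi_note: int, bends: list[int], bend_range: float = 2.0) -> list[float]:
    """Return one F0 value in Hz per pitch bend frame for a single note."""
    return [
        note_to_hz(midi_note + bend_to_semitones(b, bend_range))
        for b in bends
    ]


if __name__ == "__main__":
    # A4 with a small upward then downward bend, e.g. one vibrato cycle.
    for hz in f0_contour(69, [0, 512, 0, -512, 0]):
        print(f"{hz:.2f} Hz")
```

In a polyphonic passage, each sounding note would carry its own bend sequence, so applying `f0_contour` once per note yields the multiple simultaneous F0 contours that the abstract describes as hard to transcribe and manage.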

Code & Models

Repositories

daewoung/ViolinDiff (PyTorch, official)

Models

dawokim/ViolinDiff (model, Hugging Face)

Taxonomy

Topics: Music Technology and Sound Studies · Music and Audio Processing · Neuroscience and Music Perception