CounterFlow: A Two-Phase Inference-Time Sampling for Counterfactual Video Foley Generation

Gyubin Lee; Junwon Lee; Juhan Nam

arXiv:2605.18916·cs.MM·May 20, 2026

TL;DR

CounterFlow is a two-phase inference-time sampling method for counterfactual video Foley generation: the generated audio adopts a sound-source identity that contradicts the visual evidence while staying temporally synchronized with the silent video.

Contribution

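It proposes a dual-phase sampling scheme for pretrained flow-matching VT2A models. Phase 1 builds temporal structure from the video while suppressing the visually implied sound source; Phase 2 drops video conditioning and shapes the audio timbre toward the target prompt. The authors report substantial improvements over naive negative prompting and state-of-the-art baselines.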

Findings

01

Substantially improves counterfactual Video Foley generation over naive negative prompting and state-of-the-art baselines.

02

Proposes a metric for evaluating replacement quality that leverages a text-audio co-embedding.

03

Demonstrates the method with video examples; code is available.

Abstract

We investigate Counterfactual Video Foley Generation, which aims to adopt a sound-source identity that contradicts the visual evidence while remaining temporally synchronized to a silent video. Existing Video&Text-to-Audio (VT2A) models struggle with this, often remaining anchored to the visually implied sound source when video and text contents disagree. We present CounterFlow, an inference-time dual-phase sampling scheme for pretrained flow-matching VT2A models. Phase 1 builds a video-derived temporal structure while suppressing the visually implied source; Phase 2 drops video conditioning to focus entirely on shaping audio timbre toward the target prompt. CounterFlow substantially improves counterfactual Video Foley generation compared to naive negative prompting and state-of-the-art baselines. To evaluate replacement quality, we propose a metric leveraging a text-audio co-embedding…
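The two-phase schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how Phase 1 suppresses the visually implied source or where the phases switch, so everything below is an assumption made for illustration. That includes the `velocity(x, t, video, text)` model interface, the `switch_frac` and `guidance` parameters, the plain Euler integrator, and the classifier-free-guidance-style extrapolation used as a stand-in for suppression.

```python
import numpy as np


def two_phase_sample(velocity, x0, video, target_text, visual_text,
                     num_steps=20, switch_frac=0.5, guidance=2.0):
    """Euler-integrate a flow-matching ODE from t=0 (noise) to t=1.

    `velocity(x, t, video, text)` is a hypothetical interface to a
    pretrained VT2A velocity field; passing video=None means the video
    condition is dropped. All parameter names here are illustrative.
    """
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    switch_step = int(round(switch_frac * num_steps))

    for i in range(num_steps):
        t = i * dt
        if i < switch_step:
            # Phase 1 (assumed form): keep the video condition so the
            # trajectory picks up temporal structure, and extrapolate away
            # from the prompt naming the visually implied source.
            v_target = velocity(x, t, video, target_text)
            v_visual = velocity(x, t, video, visual_text)
            v = v_visual + guidance * (v_target - v_visual)
        else:
            # Phase 2: drop video conditioning and follow the target
            # prompt only, so the remaining steps shape timbre.
            v = velocity(x, t, None, target_text)
        x = x + dt * v  # explicit Euler step
    return x
```

With a real model, `x0` would be a noise sample in the audio latent space and the returned array would be decoded to a waveform. The switch point matters in a sketch like this because early steps of a flow-matching trajectory tend to fix coarse structure and later steps refine detail, which is one plausible reading of why the video condition is kept first and dropped later.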

Code & Models

Repositories

https://gyubin-lee.github.io/counterflow-demo
