Towards Unconstrained Audio Splicing Detection and Localization with Neural Networks

Denise Moussa; Germans Hirsch; Christian Riess

arXiv:2207.14682·cs.SD·May 6, 2024

Open Access

TL;DR

This paper introduces a Transformer-based neural network approach for detecting and localizing audio splicing in unconstrained, real-world scenarios, outperforming existing methods and addressing the limitations of handcrafted feature-based algorithms.

Contribution

It presents a novel Transformer seq2seq model for audio splicing detection and localization, capable of handling unconstrained audio samples with various post-processing disguises.

Findings

01 Outperforms existing dedicated splicing detection methods

02 Superior to general-purpose networks like EfficientNet and RegNet

03 Effective in simulated attack scenarios with post-processing operations

Abstract

Freely available and easy-to-use audio editing tools make it straightforward to perform audio splicing. Convincing forgeries can be created by combining various speech samples from the same person. Detection of such splices is important both in the public sector when considering misinformation, and in a legal context to verify the integrity of evidence. Unfortunately, most existing detection algorithms for audio splicing use handcrafted features and make specific assumptions. However, criminal investigators are often faced with audio samples from unconstrained sources with unknown characteristics, which raises the need for more generally applicable methods. With this work, we aim to take a first step towards unconstrained audio splicing detection to address this need. We simulate various attack scenarios in the form of post-processing operations that may disguise splicing. We propose…
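
The attack simulation mentioned in the abstract (splicing followed by a disguising post-processing step) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: sine tones stand in for speech segments, additive Gaussian noise stands in for the unspecified post-processing operations, and the helper names `make_tone`, `splice`, and `add_noise` are hypothetical.

```python
import math
import random

def make_tone(freq_hz, n_samples, sample_rate=16000):
    """Stand-in for a speech segment (assumption): a plain sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

def splice(segments):
    """Concatenate segments and record the sample index of each splice point."""
    signal, positions = [], []
    for seg in segments:
        if signal:
            # A splice point is where a new segment starts after an earlier one.
            positions.append(len(signal))
        signal.extend(seg)
    return signal, positions

def add_noise(signal, noise_std, rng):
    """Additive Gaussian noise, one possible disguising operation (assumption)."""
    return [x + rng.gauss(0.0, noise_std) for x in signal]

rng = random.Random(0)  # fixed seed so the example is reproducible
segments = [make_tone(220, 4000), make_tone(330, 2500), make_tone(440, 3500)]
spliced, splice_positions = splice(segments)
disguised = add_noise(spliced, noise_std=0.01, rng=rng)

print(splice_positions, len(disguised))
```

In a setup like this, `disguised` would be the input to a detector and `splice_positions` the localization target; a real experiment would use recorded speech and the post-processing operations defined in the paper.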

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Digital and Cyber Forensics

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Batch Normalization · Softmax · Inverted Residual Block