Forensic Analysis and Localization of Multiply Compressed MP3 Audio Using Transformers
Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

TL;DR
This paper introduces a transformer-based method for forensic analysis of MP3 audio that localizes temporal splices and identifies, at the frame level, which regions have been compressed once or multiple times. The authors report higher performance and greater robustness than existing techniques.
Contribution
The paper presents a transformer-based approach for detecting and localizing splices between singly and multiply compressed regions of MP3 audio signals at the frame level.
Findings
Higher accuracy in splice localization compared to existing methods
Robustness across diverse MP3 datasets
Effective identification of single and multiple compression regions
Abstract
Audio signals are often stored and transmitted in compressed formats. Among the many available audio compression schemes, MPEG-1 Audio Layer III (MP3) is very popular and widely used. Since MP3 is lossy, it leaves characteristic traces in the compressed audio, which can be used forensically to expose the history of an audio file. In this paper, we consider the scenario of audio signal manipulation done by temporal splicing of compressed and uncompressed audio signals. We propose a method to find the temporal location of the splices based on transformer networks. Our method identifies which temporal portions of an audio signal have undergone single or multiple compression at the temporal frame level, which is the smallest temporal unit of MP3 compression. We tested our method on a dataset of 486,743 MP3 audio clips. Our method achieved higher performance and demonstrated robustness…
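To make the idea of frame-level localization concrete, the following is a minimal sketch of how a sequence of per-frame labels could be turned into splice locations. It is not the authors' implementation. The function name `splice_points`, the integer label encoding, and the default sample rate are hypothetical, and the sketch ignores encoder delay and padding. The only fixed fact used is that an MPEG-1 Layer III frame covers 1152 PCM samples.

```python
from typing import List, Tuple

# An MPEG-1 Layer III frame covers 1152 PCM samples.
SAMPLES_PER_FRAME = 1152


def splice_points(frame_labels: List[int],
                  sample_rate: int = 44100) -> List[Tuple[int, float]]:
    """Return (frame_index, time_in_seconds) for each frame whose label
    differs from the previous frame's label.

    frame_labels is a hypothetical per-frame prediction, for example
    0 for single compression and 1 for multiple compression.
    Encoder delay and padding are ignored (an assumption of this sketch).
    """
    points = []
    for i in range(1, len(frame_labels)):
        if frame_labels[i] != frame_labels[i - 1]:
            points.append((i, i * SAMPLES_PER_FRAME / sample_rate))
    return points


if __name__ == "__main__":
    # Hypothetical labels: a multiply compressed region spliced into frames 3-4.
    labels = [0, 0, 0, 1, 1, 0, 0]
    for frame, seconds in splice_points(labels):
        print(f"label change at frame {frame} (~{seconds:.4f} s)")
```

At a 44.1 kHz sample rate, one frame spans roughly 26 ms, which is the finest temporal resolution such a frame-level output can offer.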
