Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
Oscar Ovanger, Levi Harris, Timothy H. Keitt

TL;DR
This paper introduces FINCH, an adaptive evidence fusion framework that combines audio and spatiotemporal data for bioacoustic classification, improving robustness and accuracy over fixed-weight methods.
Contribution
FINCH provides a novel adaptive log-linear fusion method with per-sample gating, explicitly bounding contextual influence and outperforming fixed-weight fusion in bioacoustic tasks.
Findings
FINCH outperforms fixed-weight fusion and audio-only baselines.
Achieves state-of-the-art results on the CBI dataset.
Improves robustness even with weak contextual information.
Abstract
Many machine learning systems have access to multiple sources of evidence for the same prediction target, yet these sources often differ in reliability and informativeness across inputs. In bioacoustic classification, species identity may be inferred both from the acoustic signal and from spatiotemporal context such as location and season; while Bayesian inference motivates multiplicative evidence combination, in practice we typically only have access to discriminative predictors rather than calibrated generative models. We introduce Fusion under INdependent Conditional Hypotheses (FINCH), an adaptive log-linear evidence fusion framework that integrates a pre-trained audio classifier with a structured spatiotemporal predictor. FINCH learns a per-sample gating function that estimates the reliability of contextual information from uncertainty…
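The log-linear fusion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the entropy-based `toy_gate`, and the parameters `w_max`, `scale`, and `bias` are all assumptions made for illustration, since the abstract does not specify how the gate is computed. The sketch only shows the general shape of the technique: audio log-probabilities plus a per-sample, bounded weight times context log-probabilities, renormalized over classes.

```python
import numpy as np

def log_softmax(x, axis=-1):
    """Numerically stable log-softmax over the class axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    return x - np.log(np.sum(np.exp(x), axis=axis, keepdims=True))

def entropy(logp):
    """Shannon entropy per sample; assumes finite log-probabilities."""
    return -np.sum(np.exp(logp) * logp, axis=-1)

def toy_gate(audio_logp, context_logp, w_max=1.0, scale=1.0, bias=0.0):
    """Hypothetical gate (an assumption, not the paper's learned gate).

    Gives the context more weight when the audio prediction is more
    uncertain than the context prediction. The sigmoid keeps the
    weight inside (0, w_max), so contextual influence stays bounded.
    """
    diff = entropy(audio_logp) - entropy(context_logp)
    return w_max / (1.0 + np.exp(-(scale * diff + bias)))

def fuse(audio_logp, context_logp, gate):
    """Log-linear fusion: log p is proportional to audio + gate * context."""
    fused = audio_logp + gate[:, None] * context_logp
    return log_softmax(fused, axis=-1)

# Two samples, three classes (made-up logits, for illustration only).
audio_logp = log_softmax(np.array([[2.0, 0.1, 0.1], [0.2, 0.1, 0.0]]))
context_logp = log_softmax(np.array([[0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]))

gate = toy_gate(audio_logp, context_logp)
fused_logp = fuse(audio_logp, context_logp, gate)
```

Two properties follow from the form of the fusion rather than from the toy gate: a gate of zero recovers the audio-only prediction, and a uniform (uninformative) context leaves the audio prediction unchanged for any gate value.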
Taxonomy
Topics: Animal Vocal Communication and Behavior · Music and Audio Processing · Speech and Audio Processing
