VGGSounder: Audio-Visual Evaluations for Foundation Models