TB-AVA: Text as a Semantic Bridge for Audio-Visual Parameter Efficient Finetuning