MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers