AVESFormer: Efficient Transformer Design for Real-Time Audio-Visual Segmentation