Loading paper
Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection | Tomesphere