Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection