Multimodal Speech Emotion Recognition using Cross Attention with Aligned Audio and Text