Temporal Sub-sampling of Audio Feature Sequences for Automated Audio Captioning