On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval