Word Discovery in Visually Grounded, Self-Supervised Speech Models