Loading paper
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision | Tomesphere