SViTT-Ego: A Sparse Video-Text Transformer for Egocentric Video