HVD: Human Vision-Driven Video Representation Learning for Text-Video Retrieval