VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks