CrossA11y: Identifying Video Accessibility Issues via Cross-modal Grounding