LEMON: How Well Do MLLMs Perform Temporal Multimodal Understanding on Instructional Videos?