SAP-Bench: Benchmarking Multimodal Large Language Models in Surgical Action Planning