Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale