Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents