SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models