Evaluating the Ability of Large Language Models to Reason about Cardinal Directions, Revisited