Mil-SCORE: Benchmarking Long-Context Geospatial Reasoning and Planning in Large Language Models