FEM-Bench: A Structured Scientific Reasoning Benchmark for Evaluating Code-Generating LLMs