STEM: Efficient Relative Capability Evaluation of LLMs through Structured Transition Samples