100 instances is all you need: predicting the success of a new LLM on unseen data by testing on a few instances