Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models